| | |
|---|---|
| 4 | Hardware transactional memory for GPU architectures |
| 4 | SIMD re-convergence at thread frontiers |
| 3 | Idempotent processor architecture |
| 3 | Bubble-Up: increasing utilization in modern warehouse scale computers via sensible co-locations |
| 3 | Improving GPU performance via large warps and two-level warp scheduling |
| 3 | Reducing memory interference in multicore systems via application-aware memory channel partitioning |
| 3 | Efficiently enabling conventional block sizes for very large die-stacked DRAM caches |
| 2 | Packet chaining: efficient single-cycle allocation for on-chip networks |
| 2 | A new case for the TAGE branch predictor |
| 2 | QsCores: trading dark silicon for scalable energy efficiency with quasi-specific cores |
| 2 | Parallel application memory scheduling |
| 2 | SHiP: signature-based hit predictor for high performance caching |
| 1 | Active management of timing guardband to save energy in POWER7 |
| 1 | Bundled execution of recurring traces for energy-efficient general purpose processing |
| 1 | Towards the ideal on-chip fabric for 1-to-many and many-to-1 communication |
| 1 | Proactive instruction fetch |
| 1 | Pack & Cap: adaptive DVFS and thread packing under power caps |
| 1 | ATDetector: improving the accuracy of a commercial data race detector by identifying address transfer |
| 1 | System-level integrated server architectures for scale-out datacenters |
| 1 | Multi retention level STT-RAM cache designs with a dynamic refresh scheme |
| 1 | PACMan: prefetch-aware cache management for high performance caching |
| 0 | Minimalist open-page: a DRAM page-mode scheduling policy for the many-core era |
| 0 | The NoX router |
| 0 | A systematic methodology to develop resilient cache coherence protocols |
| 0 | Dataflow execution of sequential imperative programs on multicore architectures |
| 0 | Resilient microring resonator based photonic networks |
| 0 | FeatherWeight: low-cost optical arbitration with QoS support |
| 0 | Identifying and predicting timing-critical instructions to boost timing speculation |
| 0 | Preventing PCM banks from seizing too much power |
| 0 | CRAM: coded registers for amplified multiporting |
| 0 | CoreRacer: a practical memory race recorder for multicore x86 TSO processors |
| 0 | Manager-client pairing: a framework for implementing coherence hierarchies |
| 0 | TransCom: transforming stream communication for load balance and efficiency in networks-on-chip |
| 0 | Architectural support for secure virtualization under a vulnerable hypervisor |
| 0 | Complementing user-level coarse-grain parallelism with implicit speculative parallelism |
| 0 | Pay-As-You-Go: low-overhead hard-error correction for phase change memories |
| 0 | A resistive TCAM accelerator for data-intensive computing |
| 0 | A register-file approach for row buffer caches in die-stacked DRAMs |
| 0 | Accelerating microprocessor silicon validation by exposing ISA diversity |
| 0 | Encore: low-cost, fine-grained transient fault recovery |
| 0 | Formally enhanced runtime verification to ensure NoC functional correctness |
| 0 | Residue cache: a low-energy low-area L2 cache architecture via compression and partial hits |
| 0 | A compile-time managed multi-level register file hierarchy |
| 0 | A data layout optimization framework for NUCA-based multicores |