| | |
| --- | --- |
| 332 | Evaluating MapReduce for Multi-core and Multiprocessor Systems |
| 210 | LogTM-SE: Decoupling Hardware Transactional Memory from Caches |
| 106 | Concurrent Direct Network Access for Virtual Machine Monitors |
| 89 | Thermal Herding: Microarchitecture Techniques for Controlling Hotspots in High-Performance 3D-Integrated Processors |
| 88 | A Scalable, Non-blocking Approach to Transactional Memory |
| 71 | Feedback Directed Prefetching: Improving the Performance and Bandwidth-Efficiency of Hardware Prefetchers |
| 69 | An Adaptive Shared/Private NUCA Cache Partitioning Scheme for Chip Multiprocessors |
| 59 | Extending Multicore Architectures to Exploit Hybrid Parallelism in Single-thread Applications |
| 58 | A Burst Scheduling Access Reordering Mechanism |
| 55 | HARD: Hardware-Assisted Lockset-based Race Detection |
| 51 | Perturbation-based Fault Screening |
| 50 | Application-Level Correctness and its Impact on Fault Tolerance |
| 48 | MemTracker: Efficient and Programmable Support for Memory Access Monitoring and Debugging |
| 47 | Illustrative Design Space Studies with Microarchitectural Regression Models |
| 35 | Fully-Buffered DIMM Memory Architectures: Understanding Mechanisms, Overheads and Scaling |
| 31 | An Adaptive Cache Coherence Protocol Optimized for Producer-Consumer Sharing |
| 28 | Colorama: Architectural Support for Data-Centric Synchronization |
| 28 | A Memory-Level Parallelism Aware Fetch Policy for SMT Processors |
| 27 | Modeling and Managing Thermal Profiles of Rack-mounted Servers with ThermoStat |
| 25 | A Domain-Specific On-Chip Network Design for Large Scale Cache Systems |
| 23 | Error Detection via Online Checking of Cache Coherence with Token Coherence Signatures |
| 23 | A Low Overhead Fault Tolerant Coherence Protocol for CMP Architectures |
| 23 | Interactions Between Compression and Prefetching in Chip Multiprocessors |
| 21 | Line Distillation: Increasing Cache Capacity by Filtering Unused Words in Cache Lines |
| 20 | Liquid SIMD: Abstracting SIMD Hardware using Lightweight Dynamic Mapping |
| 19 | Exploiting Postdominance for Speculative Parallelization |
| 17 | Accelerating and Adapting Precomputation Threads for Efficient Prefetching |
| 13 | Optical Interconnect Opportunities for Future Server Memory Systems |
| 11 | Implications of Device Timing Variability on Full Chip Timing |
| 7 | Improving Branch Prediction and Predicated Execution in Out-of-Order Processors |
| 3 | Interconnect-Centric Computing |
| 1 | Petascale Computing Research Challenges - A Manycore Perspective |