| Count | Paper title |
| --- | --- |
| 142 | McPAT: an integrated power, area, and timing modeling framework for multicore and manycore architectures |
| 83 | Qilin: exploiting parallelism on heterogeneous multiprocessors with adaptive mapping |
| 77 | Enhancing lifetime and security of PCM-based main memory with start-gap wear leveling |
| 64 | Characterizing flash memory: anomalies, observations, and applications |
| 51 | EazyHTM: eager-lazy hardware transactional memory |
| 43 | Application-aware prioritization mechanisms for on-chip networks |
| 39 | Flip-N-Write: a simple deterministic technique to improve PRAM write performance, energy and endurance |
| 38 | Into the wild: studying real user activity patterns to guide power optimizations for mobile architectures |
| 35 | SCARAB: a single cycle adaptive routing and bufferless network |
| 33 | Low-cost router microarchitecture for on-chip networks |
| 33 | Pseudo-LIFO: the foundation of a new family of replacement policies for last-level caches |
| 33 | A tagless coherence directory |
| 31 | Coordinated control of multiple prefetchers in multi-core systems |
| 28 | mSWAT: low-cost hardware fault detection and diagnosis for multicore systems |
| 24 | Improving cache lifetime reliability at ultra-low voltages |
| 24 | Preemptive virtual clock: a flexible, efficient, and cost-effective QOS scheme for networks-on-chip |
| 24 | Comparing cache architectures and coherency protocols on x86-64 multicore SMP systems |
| 23 | Adaptive line placement with the set balancing cache |
| 20 | Finding concurrency bugs with context-aware communication graphs |
| 19 | Light speed arbitration and flow control for nanophotonic interconnects |
| 18 | Extending the effectiveness of 3D-stacked DRAM caches with an adaptive multi-queue policy |
| 17 | ZerehCache: armoring cache architectures in high defect density technologies |
| 17 | Optimizing shared cache behavior of chip multiprocessors |
| 17 | SHARP control: controlled shared cache management in chip multiprocessors |
| 16 | Characterizing and mitigating the impact of process variations on phase change based memory systems |
| 16 | Complexity effective memory access scheduling for many-core accelerator architectures |
| 16 | Low Vccmin fault-tolerant cache with highly predictable performance |
| 16 | The BubbleWrap many-core: popping cores for sequential acceleration |
| 15 | Offline symbolic analysis for multi-processor execution replay |
| 14 | Portable compiler optimisation across embedded programs and microarchitectures using machine learning |
| 14 | BulkCompiler: high-performance sequential consistency through cooperative compiler and hardware support |
| 14 | A case for dynamic frequency tuning in on-chip networks |
| 13 | In-network coherence filtering: snoopy coherence without broadcasts |
| 13 | Architecting a chunk-based memory race recorder in modern CMPs |
| 12 | Polymorphic pipeline array: a flexible multicore accelerator with virtualized execution for mobile multimedia applications |
| 11 | Multiple clock and voltage domains for chip multi processors |
| 10 | Reducing peak power with a table-driven adaptive processor core |
| 10 | Improving memory bank-level parallelism in the presence of prefetching |
| 10 | Execution leases: a hardware-supported mechanism for enforcing strong non-interference |
| 9 | Proactive transaction scheduling for contention management |
| 9 | Characterizing the resource-sharing levels in the UltraSPARC T2 processor |
| 8 | An hybrid eDRAM/SRAM macrocell to implement first-level data caches |
| 8 | Tribeca: design for PVT variations with local recovery and fine-grained adaptation |
| 7 | ESKIMO: Energy savings using Semantic Knowledge of Inconsequential Memory Occupancy for DRAM subsystem |
| 7 | Using a configurable processor generator for computer architecture prototyping |
| 6 | Light64: lightweight hardware support for data race detection during systematic testing of parallel programs |
| 5 | A microarchitecture-based framework for pre- and post-silicon power delivery analysis |
| 5 | Ordering decoupled metadata accesses in multiprocessors |
| 4 | Variation-tolerant non-uniform 3D cache management in die stacked multicore processor |
| 3 | Tree register allocation |
| 3 | Control flow obfuscation with information flow tracking |
| 2 | DDT: design and evaluation of a dynamic program analysis for optimizing data structure usage |