142 McPAT: an integrated power, area, and timing modeling framework for multicore and manycore architectures
83 Qilin: exploiting parallelism on heterogeneous multiprocessors with adaptive mapping
77 Enhancing lifetime and security of PCM-based main memory with start-gap wear leveling
64 Characterizing flash memory: anomalies, observations, and applications
51 EazyHTM: eager-lazy hardware transactional memory
43 Application-aware prioritization mechanisms for on-chip networks
39 Flip-N-Write: a simple deterministic technique to improve PRAM write performance, energy and endurance
38 Into the wild: studying real user activity patterns to guide power optimizations for mobile architectures
35 SCARAB: a single cycle adaptive routing and bufferless network
33 Low-cost router microarchitecture for on-chip networks
33 Pseudo-LIFO: the foundation of a new family of replacement policies for last-level caches
33 A tagless coherence directory
31 Coordinated control of multiple prefetchers in multi-core systems
28 mSWAT: low-cost hardware fault detection and diagnosis for multicore systems
24 Improving cache lifetime reliability at ultra-low voltages
24 Preemptive virtual clock: a flexible, efficient, and cost-effective QOS scheme for networks-on-chip
24 Comparing cache architectures and coherency protocols on x86-64 multicore SMP systems
23 Adaptive line placement with the set balancing cache
20 Finding concurrency bugs with context-aware communication graphs
19 Light speed arbitration and flow control for nanophotonic interconnects
18 Extending the effectiveness of 3D-stacked DRAM caches with an adaptive multi-queue policy
17 ZerehCache: armoring cache architectures in high defect density technologies
17 Optimizing shared cache behavior of chip multiprocessors
17 SHARP control: controlled shared cache management in chip multiprocessors
16 Characterizing and mitigating the impact of process variations on phase change based memory systems
16 Complexity effective memory access scheduling for many-core accelerator architectures
16 Low Vccmin fault-tolerant cache with highly predictable performance
16 The BubbleWrap many-core: popping cores for sequential acceleration
15 Offline symbolic analysis for multi-processor execution replay
14 Portable compiler optimisation across embedded programs and microarchitectures using machine learning
14 BulkCompiler: high-performance sequential consistency through cooperative compiler and hardware support
14 A case for dynamic frequency tuning in on-chip networks
13 In-network coherence filtering: snoopy coherence without broadcasts
13 Architecting a chunk-based memory race recorder in modern CMPs
12 Polymorphic pipeline array: a flexible multicore accelerator with virtualized execution for mobile multimedia applications
11 Multiple clock and voltage domains for chip multi processors
10 Reducing peak power with a table-driven adaptive processor core
10 Improving memory bank-level parallelism in the presence of prefetching
10 Execution leases: a hardware-supported mechanism for enforcing strong non-interference
9 Proactive transaction scheduling for contention management
9 Characterizing the resource-sharing levels in the UltraSPARC T2 processor
8 An hybrid eDRAM/SRAM macrocell to implement first-level data caches
8 Tribeca: design for PVT variations with local recovery and fine-grained adaptation
7 ESKIMO: Energy savings using Semantic Knowledge of Inconsequential Memory Occupancy for DRAM subsystem
7 Using a configurable processor generator for computer architecture prototyping
6 Light64: lightweight hardware support for data race detection during systematic testing of parallel programs
5 A microarchitecture-based framework for pre- and post-silicon power delivery analysis
5 Ordering decoupled metadata accesses in multiprocessors
4 Variation-tolerant non-uniform 3D cache management in die stacked multicore processor
3 Tree register allocation
3 Control flow obfuscation with information flow tracking
2 DDT: design and evaluation of a dynamic program analysis for optimizing data structure usage