27 A quantitative performance analysis model for GPU architectures
16 Relaxing non-volatility for fast and energy-efficient STT-RAM caches
14 Calvin: Deterministic or not? Free will to choose
13 Thread block compaction for efficient SIMT control flow
13 CHIPPER: A low-complexity bufferless deflection router
10 Essential roles of exploiting internal parallelism of flash memory based solid state drives in high-speed data processing
10 FREE-p: Protecting non-volatile memory against both hard and soft errors
9 SolarCore: Solar energy driven multi-core architecture power management
9 HAsim: FPGA-based high-detail multicore simulation using time-division multiplexing
8 Cuckoo directory: A scalable directory for many-core systems
8 A new server I/O architecture for high speed networks
8 Beyond block I/O: Rethinking traditional storage primitives
7 Addressing system-level trimming issues in on-chip nanophotonic networks
7 Atomic Coherence: Leveraging nanophotonics to build race-free cache coherence protocols
7 A case for guarded power gating for multi-core processors
7 Efficient complex operators for irregular codes
6 CloudCache: Expanding and shrinking private caches
6 Practical and secure PCM systems by online detection of malicious write streams
6 Dynamically Specialized Datapaths for energy efficient computing
6 ACCESS: Smart scheduling for asymmetric cache CMPs
4 Shared last-level TLBs for chip multiprocessors
4 Dynamic parallelization of JavaScript applications using an ultra-lightweight speculation mechanism
4 I-CASH: Intelligently Coupled Array of SSD and HDD
4 Mercury: A fast and energy-efficient multi-level cell based Phase Change Memory system
4 Architectural framework for supporting operating system survivability
3 Achieving uniform performance and maximizing throughput in the presence of heterogeneity
3 Bloom Filter Guided Transaction Scheduling
3 HAQu: Hardware-accelerated queueing for fine-grained threading on a chip multiprocessor
3 Fast thread migration via cache working set prediction
3 Offline symbolic analysis to infer Total Store Order
3 Abstraction and microarchitecture scaling in early-stage power modeling
3 Exploiting criticality to reduce bottlenecks in distributed uniprocessors
3 Storage free confidence estimation for the TAGE branch predictor
2 MorphCache: A Reconfigurable Adaptive Multi-level Cache hierarchy
2 Efficient data streaming with on-chip accelerators: Opportunities and challenges
2 Archipelago: A polymorphic cache design for enabling robust near-threshold operation
1 Fg-STP: Fine-Grain Single Thread Partitioning on Multicores
1 Low-voltage on-chip cache architecture using heterogeneous cell sizes for high-performance processors
1 MOPED: Orchestrating interprocess message data on CMPs
1 Power shifting in Thrifty Interconnection Network
1 NUcache: An efficient multicore cache organization based on Next-Use distance
1 Hardware/software-based diagnosis of load-store queues using expandable activity logs
1 Hardware/software techniques for DRAM thermal management
0 Data-triggered threads: Eliminating redundant computation
0 Safe and efficient supervised memory systems
0 Checked Load: Architectural support for JavaScript type-checking on mobile processors