66 Graphite: A distributed parallel simulator for multicores
43 ATLAS: A scalable and high-performance scheduling algorithm for multiple memory controllers
36 Architecting for power management: The IBM POWER7™ approach
33 Designing a processor from the ground up to allow voltage/reliability tradeoffs
33 Improving read performance of Phase Change Memories via Write Cancellation and Write Pausing
32 Understanding how off-chip memory bandwidth partitioning in Chip Multiprocessors affects system performance
32 An optimized 3D-stacked memory architecture by exploiting excessive, high-density TSV bandwidth
28 A Hybrid solid-state storage architecture for the performance, energy consumption, and lifetime improvement
26 Operating system support for overlapping-ISA heterogeneous multi-core architectures
23 Application performance modeling in a virtualized environment
20 CHOP: Adaptive filter-based DRAM caching for CMP server platforms
18 Interval simulation: Raising the level of abstraction in architectural simulation
15 High performance network virtualization with SR-IOV
12 Scalable architectural support for trusted software
11 FlexiShare: Channel sharing for an energy-efficient nanophotonic crossbar
9 LiteTM: Reducing transactional state overhead
9 ESP-NUCA: A low-cost adaptive Non-Uniform Cache Architecture
9 Towards scalable, energy-efficient, bus-based on-chip networks
9 Worth their watts? - an empirical study of datacenter servers
8 Explaining cache SER anomaly using DUE AVF measurement
7 A bandwidth-aware memory-subsystem resource management using non-invasive resource profilers for large CMP systems
7 DMA cache: Using on-chip storage to architecturally separate I/O data from CPU data for improving I/O performance
6 Delay-Hiding energy management mechanisms for DRAM
5 Value Based BTB Indexing for indirect jump prediction
4 LeadOut: Composing low-overhead frequency-enhancing techniques for single-thread performance in configurable multicores
4 UNified Instruction/Translation/Data (UNITD) coherence: One protocol to rule them all
4 StimulusCache: Boosting performance of chip multiprocessors with excess cache
4 Simple virtual channel allocation for high throughput and high frequency on-chip routers
4 COMIC++: A software SVM system for heterogeneous multicore accelerator clusters
4 BOLT: Energy-efficient Out-of-Order Latency-Tolerant execution
3 HARE: Hardware assisted reverse execution
3 IADVS: On-demand performance for interactive applications
3 DMA++: on the fly data realignment for on-chip memories
3 SIF: Overcoming the limitations of SIMD devices via implicit permutation
1 Handling branches in TLS systems with Multi-Path Execution
0 High-Performance low-vcc in-order core