13  Mint: Realizing CUDA performance in 3D Stencil Methods with Annotated C
9  Hystor: Making the Best Use of Solid State Drives in High Performance Storage Systems
8  Page Placement in Hybrid Memory Systems
7  A QHD-Capable Parallel H.264 Decoder
6  Automatic generation of executable communication specifications from parallel applications
6  ZEBRA: A Data-Centric, Hybrid-Policy Hardware Transactional Memory Design
4  Coordinating Processor and Main Memory for Efficient Server Power Control
4  Transactional Conflict Decoupling and Value Prediction
4  An Idiom-finding Tool for Increasing Productivity of Accelerators
4  High Performance Linpack Benchmark: A Fault Tolerant Implementation without Checkpointing
4  Generic Topology Mapping Strategies for Large-scale Parallel Architectures
3  Karma: Scalable Deterministic Record-Replay
3  Performance Impact and Interplay of SSD Parallelism through Advanced Commands, Allocation Strategy and Data Granularity
3  Modeling the Performance of an Algebraic Multigrid Cycle on HPC Platforms
2  Using GPU to Compute Large Out-of-card FFTs
2  Controlling Cache Utilization of HPC Applications
2  Predictive Coordination of Multiple On-Chip Resources for Chip Multiprocessors
2  Scalable Fine-grained Call Path Tracing
2  Multiset Signatures for Transactional Memory
1  Active Pebbles: Parallel Programming for Data-Driven Applications
1  SecureME: A Hardware-Software Approach to Full System Security
1  Characterizing the Impact of Soft Errors on Iterative Methods in Scientific Computing
1  The elephant and the mice: the role of non-strict fine-grain synchronization for modern many-core architectures
1  An Execution Strategy and Optimized Runtime Support for Parallelizing Irregular Reductions on Modern GPUs
1  MDR: Performance model driven runtime for heterogeneous parallel platforms
1  Processing data streams with hard real-time constraints on heterogeneous systems
0  Optimizing the Datacenter for Data-Centric Workloads
0  A Composite and Scalable Cache Coherence Protocol for Large Scale CMPs
0  Automatic SIMD Vectorization of Fast Fourier Transforms for the Larrabee and AVX Instruction Sets
0  MP-PIPE: A Massively Parallel Protein-Protein Interaction Prediction Engine
0  Optimizing Throughput/Power Tradeoffs in Hardware Transactional Memory Using DVFS and Intelligent Scheduling
0  Automating GPU Computing in MATLAB
0  Cosmic Microwave Background Map-Making At The Petascale And Beyond
0  Cost-Effectively Offering Private Buffers from a Shared Cache
0  F^2BFLY: An On-Chip Free-Space Optical Network with Wavelength-Switching