176 OpenMP to GPGPU: a compiler framework for automatic translation and optimization
55 How much parallelism is there in irregular applications?
54 A comprehensive strategy for contention management in software transactional memory
50 Solving dense linear systems on platforms with multiple hardware accelerators
50 Formal verification of practical MPI programs
46 Transactional memory with strong atomicity using off-the-shelf memory protection hardware
40 Mapping parallelism to multi-cores: a machine learning based approach
37 Atomic Quake: using transactional memory in an interactive multiplayer game server
35 Committing conflicting transactions in an STM
34 Idempotent work stealing
34 Effective performance measurement and analysis of multithreaded applications
29 Serialization sets: a dynamic dependence-based parallel execution model
24 Efficient and scalable multiprocessor fair scheduling using distributed weighted round-robin
24 Detecting and tolerating asymmetric races
20 MPIWiz: subgroup reproducible replay of MPI applications
18 Compiler-assisted dynamic scheduling for effective parallelization of loop nests on multicore processors
16 Backtracking-based load balancing
15 Safe open-nested transactions through ownership
13 Petascale computing with accelerators
12 An efficient transactional memory algorithm for computing minimum spanning forest of sparse graphs
11 Efficient, portable implementation of asynchronous multi-place programs
10 Comparability graph coloring for optimizing utilization of stream register files in stream processors
8 A compiler-directed data prefetching scheme for chip multiprocessors
5 Techniques for efficient placement of synchronization primitives
1 Application-aware management of parallel simulation collections
1 A comparison of programming models for multiprocessors with explicitly managed memory hierarchies
0 How to build programmable multi-core chips