
616 Composable memory transactions
169 Energy conservation in heterogeneous server clusters
94 A framework for adaptive algorithm selection in STAPL
93 Using multiple energy gears in MPI programs on a power-scalable cluster
73 Locality aware dynamic load management for massively multiplayer games
72 Exposing speculative thread parallelism in SPEC2000
62 Automated type-based analysis of data races and atomicity
61 An evaluation of global address space languages: Co-Array Fortran and Unified Parallel C
51 Fault tolerant high performance computing by a coding approach
47 Static analysis of atomicity for programs with non-blocking synchronization
40 Applications of synchronization coverage
39 Modeling wildcard-free MPI programs for verification
37 Compiler techniques for high performance sequentially consistent Java programs
35 Teleport messaging for distributed stream programs
27 Adaptive execution techniques for SMT multiprocessor architectures
21 Revocable locks for non-blocking programming
20 Automatic multithreading and multiprocessing of C programs for IXP
19 A sampling-based framework for parallel data mining
16 Exposing disk layout to compiler for reducing energy consumption of parallel disk based systems
15 Effective communication coalescing for data-parallel applications
14 Trust but verify: monitoring remotely executing programs for progress and correctness
9 Scaling model checking of dataraces using dynamic information
7 A novel approach for partitioning iteration spaces with variable densities
6 A linear-time algorithm for optimal barrier placement
6 Extracting SMP parallelism for dense linear algebra algorithms from high-level specifications
5 Why is graphics hardware so fast?
4 System-wide performance monitors and their application to the optimization of coherent memory accesses
3 Performance modeling and optimization of parallel out-of-core tensor contractions