746 A comparison of sequential consistency with home-based lazy release consistency for software distributed shared memory
216 Energy conservation techniques for disk array-based servers
163 CQoS: a framework for enabling QoS in shared caches of CMP platforms
81 Adaptive incremental checkpointing for massively parallel systems
78 The energy efficiency of CMP vs. SMT for multimedia workloads
65 PB-LRU: a self-tuning power aware storage cache replacement algorithm for conserving disk energy
53 EXPERT: expedited simulation exploiting program behavior repetition
39 Integrating collective I/O and cooperative caching into the "clusterfile" parallel file system
36 Effective stream-based and execution-based data prefetching
36 An analysis of the impact of MPI overlap and independent progress
32 A unified framework for nonlinear dependence testing and symbolic analysis
31 Back-end assignment schemes for clustered multithreaded processors
29 Adaptive Java optimisation using instance-based learning
28 Practical and efficient point insertion scheduling method for parallel guaranteed quality Delaunay refinement
25 Evaluating support for global address space languages on the Cray X1
21 Parallel algorithms for mining frequent structural motifs in scientific data
19 Enhancing data cache reliability by the addition of a small fully-associative replication cache
19 Multilevel hierarchical matrix multiplication on clusters
18 Scaling the issue window with look-ahead latency prediction
16 Characterizing a new class of threads in scientific applications for high end supercomputers
14 Design space exploration of caches using compressed traces
13 Performance characteristics of the Cray X1 and their implications for application performance tuning
12 Detailed cache coherence characterization for OpenMP benchmarks
11 Inter-reference gap distribution replacement: an improved replacement algorithm for set-associative caches
11 Adaptive paging for a multifrontal solver
11 Cluster prefetch: tolerating on-chip wire delays in clustered microarchitectures
7 Implementation and performance evaluation of CONFLEX-G: grid-enabled molecular conformational space search program with OmniRPC
7 Cluster scheduling for explicitly-speculative tasks
5 Data forwarding through in-memory precomputation threads
5 Implicit Java array bounds checking on 64-bit architecture
4 Automatic re-scheduling of dependencies in an RPC-based grid
4 Time and space optimization for processing groups of multi-dimensional scientific queries
3 A dynamic application-driven data communication strategy
2 Applications of storage mapping optimization to register promotion
0 Impact of far-field interactions on performance of multipole-based preconditioners for sparse linear systems