| 746 | A comparison of sequential consistency with home-based lazy release consistency for software distributed shared memory |
| 216 | Energy conservation techniques for disk array-based servers |
| 163 | CQoS: a framework for enabling QoS in shared caches of CMP platforms |
| 81 | Adaptive incremental checkpointing for massively parallel systems |
| 78 | The energy efficiency of CMP vs. SMT for multimedia workloads |
| 65 | PB-LRU: a self-tuning power aware storage cache replacement algorithm for conserving disk energy |
| 53 | EXPERT: expedited simulation exploiting program behavior repetition |
| 39 | Integrating collective I/O and cooperative caching into the "clusterfile" parallel file system |
| 36 | Effective stream-based and execution-based data prefetching |
| 36 | An analysis of the impact of MPI overlap and independent progress |
| 32 | A unified framework for nonlinear dependence testing and symbolic analysis |
| 31 | Back-end assignment schemes for clustered multithreaded processors |
| 29 | Adaptive Java optimisation using instance-based learning |
| 28 | Practical and efficient point insertion scheduling method for parallel guaranteed quality Delaunay refinement |
| 25 | Evaluating support for global address space languages on the Cray X1 |
| 21 | Parallel algorithms for mining frequent structural motifs in scientific data |
| 19 | Enhancing data cache reliability by the addition of a small fully-associative replication cache |
| 19 | Multilevel hierarchical matrix multiplication on clusters |
| 18 | Scaling the issue window with look-ahead latency prediction |
| 16 | Characterizing a new class of threads in scientific applications for high end supercomputers |
| 14 | Design space exploration of caches using compressed traces |
| 13 | Performance characteristics of the Cray X1 and their implications for application performance tuning |
| 12 | Detailed cache coherence characterization for OpenMP benchmarks |
| 11 | Inter-reference gap distribution replacement: an improved replacement algorithm for set-associative caches |
| 11 | Adaptive paging for a multifrontal solver |
| 11 | Cluster prefetch: tolerating on-chip wire delays in clustered microarchitectures |
| 7 | Implementation and performance evaluation of CONFLEX-G: grid-enabled molecular conformational space search program with OmniRPC |
| 7 | Cluster scheduling for explicitly-speculative tasks |
| 5 | Data forwarding through in-memory precomputation threads |
| 5 | Implicit Java array bounds checking on 64-bit architecture |
| 4 | Automatic re-scheduling of dependencies in a RPC-based grid |
| 4 | Time and space optimization for processing groups of multi-dimensional scientific queries |
| 3 | A dynamic application-driven data communication strategy |
| 2 | Applications of storage mapping optimization to register promotion |
| 0 | Impact of far-field interactions on performance of multipole-based preconditioners for sparse linear systems |