| | |
| --- | --- |
| 160 | Proactive fault tolerance for HPC with Xen virtualization |
| 146 | Cooperative cache partitioning for chip multiprocessors |
| 31 | Executing irregular scientific applications on stream architectures |
| 31 | High performance MPI design using unreliable datagram for ultra-scale InfiniBand clusters |
| 28 | Sensitivity analysis for automatic parallelization on multi-cores |
| 26 | Scalability of the Nutch search engine |
| 26 | Scheduling FFT computation on SMP and multicore systems |
| 22 | Representation-transparent matrix algorithms with scalable performance |
| 19 | A study of process arrival patterns for MPI collective operations |
| 17 | Locality of sampling and diversity in parallel system workloads |
| 15 | Modeling correlated workloads by combining model based clustering and a localized sampling algorithm |
| 12 | Scalability analysis of SPMD codes using expectations |
| 12 | Automatic nonblocking communication for partitioned global address space programs |
| 12 | Active memory operations |
| 12 | A low-cost mixed-mode parallel processor architecture for embedded systems |
| 11 | Performance driven data cache prefetching in a dynamic software optimization system |
| 10 | Characteristics of workloads used in high performance and technical computing |
| 9 | Tradeoff between data-, instruction-, and thread-level parallelism in stream processors |
| 9 | Adaptive performance control for distributed scientific coupled models |
| 9 | Adaptive Strassen's matrix multiplication |
| 8 | GridRod: a dynamic runtime scheduler for grid workflows |
| 6 | An L2-miss-driven early register deallocation for SMT processors |
| 5 | Optimization of data prefetch helper threads with path-expression based statistical modeling |
| 4 | Optimization and bottleneck analysis of network block I/O in commodity storage systems |
| 4 | A symmetric transformation for 3-body potential molecular dynamics using force-decomposition in a heterogeneous distributed environment |
| 2 | Sequencer virtualization |
| 2 | Compression in cache design |
| 2 | Increasing cache capacity through word filtering |
| 1 | An operation stacking framework for large ensemble computations |