| Count | Title |
|---|---|
| 268 | Optimization of Sparse Matrix-vector Multiplication on Emerging Multicore Platforms |
| 159 | Falkon: Fast and Light-weight tasK executiON framework |
| 92 | Efficient Operating System Scheduling for Performance-asymmetric Multi-core Architectures |
| 76 | Exploring Event Correlation for Failure Prediction in Coalitions of Clusters |
| 60 | Implementation and Performance Analysis of Non-blocking Collective Operations for MPI |
| 59 | Anatomy of a Cortical Simulator |
| 57 | Efficient Gather and Scatter Operations on Graphics Processors |
| 56 | Inter-operating Grids through Delegated MatchMaking |
| 52 | Virtual Machine Aware Communication Libraries for High Performance Computing |
| 51 | Large-scale Maximum Likelihood-based Phylogenetic Analysis on the IBM BlueGene/L |
| 47 | Cray XT4: An Early Evaluation for Petascale Scientific Simulation |
| 46 | User-friendly and Reliable Grid Computing Based on Imperfect Middleware |
| 43 | Bounding Energy Consumption in Large-scale MPI Programs |
| 35 | The Cray BlackWidow: A Highly Scalable Vector Multiprocessor |
| 29 | GRAPE-DR: 2-Pflops Massively-Parallel Computer with 512-Core, 512-Gflops Processor Chips for Scientific Computing |
| 29 | The Ghost in the Machine: Observing the Effects of Kernel Operation on Parallel Application Performance |
| 28 | Multi-threading and One-sided Communication in Parallel LU Factorization |
| 28 | Multi-level Tiling: M for the Price of One |
| 26 | P^nMPI Tools: A Whole Lot Greater than the Sum of Their Parts |
| 26 | WRF nature run |
| 25 | Automatic Resource Specification Generation for Resource Selection |
| 25 | Evaluation of Active Storage Strategies for the Lustre Parallel File System |
| 24 | A Genetic Algorithms Approach to Modeling the Performance of Memory-bound Computations |
| 24 | Advanced Data Flow Support for Scientific Grid Workflow Applications |
| 23 | Extending stability beyond CPU millennium: a micron-scale atomistic simulation of Kelvin-Helmholtz instability |
| 21 | Scalable Security for Petascale Parallel File Systems |
| 21 | Investigation of Leading HPC I/O Performance using a Scientific-application-derived Benchmark |
| 20 | Integrating Parallel File Systems with Object-based Storage Devices |
| 19 | High-performance Ethernet-based Communications for Future Multi-core Processors |
| 18 | Performance under Failure of High-end Computing |
| 18 | Parallel Hierarchical Visualization of Large Time-varying 3D Vector Fields |
| 18 | Application Development on Hybrid Systems |
| 16 | Optimizing Center Performance through Coordinated Data Staging, Scheduling and Recovery |
| 16 | Data Access History Cache and Associated Data Prefetching Mechanisms |
| 16 | Age-Based Packet Arbitration in Large-Radix k-ary n-cubes |
| 15 | RobuSTore: A Distributed Storage Architecture with Robust and High Performance |
| 15 | Data Exploration of Turbulence Simulations using a Database Cluster |
| 14 | A Job Scheduling Framework for Large Computing Farms |
| 13 | Using MPI File Caching to Improve Parallel Write Performance for Large-scale Scientific Applications |
| 12 | DMTracker: Finding Bugs in Large-scale Parallel Programs by Detecting Anomaly in Data Movements |
| 12 | Low-Constant Parallel Algorithms for Finite Element Simulations using Linear Octrees |
| 12 | Performance and Cost Optimization for Multiple Large-scale Grid Workflow Applications |
| 12 | Anomaly Detection and Diagnosis in Grid Environments |
| 10 | Noncontiguous Locking Techniques for Parallel File Systems |
| 10 | Analyzing the Impact of Supporting Out-of-order Communication on In-order Performance with iWARP |
| 7 | Evaluating NIC Hardware Requirements to Achieve High Message Rate PGAS Support on Multi-Core Processors |
| 7 | Scaling Performance of Interior-Point Method on Large-Scale Chip Multiprocessor System |
| 7 | Performance Adaptive Power-aware Reconfigurable Optical Interconnects for HPC Systems |
| 7 | An Adaptive Mesh Refinement Benchmark for Modern Parallel Programming Languages |
| 6 | Automatic Software Interference Detection in Parallel Applications |
| 6 | Workstation Capacity Tuning using Reinforcement Learning |
| 5 | A Case for Low-complexity MP Architectures |
| 5 | Evaluating Network Information Models on Resource Efficiency and Application Performance in Lambda-Grids |
| 4 | A User-level Secure Grid File System |
| 4 | A Preliminary Investigation of a Neocortex Model Implementation on the Cray XD1 |
| 4 | A 281 Tflops calculation for X-ray protein structure analysis with special-purpose computers MDGRAPE-3 |
| 2 | First-principles calculations of large-scale semiconductor systems on the Earth Simulator |
| 1 | Variable Latency Caches for Nanoscale Processor |