7  GreenSlot: Scheduling Energy Consumption in Green Datacenters
7  Checkpointing strategies for parallel jobs
7  Evaluating the Viability of Process Replication Reliability for Exascale Systems
6  Enabling and Scaling Biomolecular Simulations of 100 Million Atoms on Petascale Machines with a Multicore-optimized Message-driven Runtime
6  Improving Communication Performance in Dense Linear Algebra via Topology Aware Collectives
5  Liszt: A Domain Specific Language for Building Portable Mesh-based PDE Solvers
5  Parallel Reduction to Condensed Forms for Symmetric Eigenvalue Problems using Aggregated Fine-Grained and Memory-Aware Kernels
5  Purlieus: Locality-aware Resource Allocation for MapReduce in a Cloud
5  Parallel Breadth-First Search on Distributed Memory Systems
4  Topology-aware data movement and staging for I/O acceleration on Blue Gene/P supercomputing systems
4  Reducing Electricity Cost Through Virtual Machine Placement in High Performance Computing Clouds
4  High-Efficiency Server Design
4  Modeling and Tolerating Heterogeneous Failures in Large Parallel Systems
4  SciHadoop: Array-based Query Processing in Hadoop
4  Scalable Stochastic Optimization of Complex Energy Systems
3  CudaDMA: Optimizing GPU Memory Bandwidth via Warp Specialization
3  Simplified Parallel Domain Traversal
3  Physis: An Implicitly Parallel Programming Model for Stencil Computations on Large-Scale GPU-Accelerated Supercomputers
3  Server-Side I/O Coordination for Parallel File Systems
3  A 'Cool' Load Balancer for Parallel Applications
3  Parallel Random Numbers: As Easy as 1, 2, 3
3  Fast Implementation of DGEMM on Fermi GPU
3  SCMFS: A File System for Storage Class Memory
3  BlobCR: Efficient Checkpoint-Restart for HPC Applications on IaaS Clouds using Virtual Disk Image Snapshots
3  Sniper: Exploring the Level of Abstraction for Scalable and Accurate Parallel Multi-Core Simulation
3  Copernicus: A New Paradigm for Parallel Adaptive Molecular Dynamics
2  Optimizing Symmetric Dense Matrix-Vector Multiplication on GPUs
2  Tiled QR factorization algorithms
2  Dymaxion: Optimizing Memory Access Patterns for Heterogeneous Systems
2  Peta-scale Phase-Field Simulation for Dendritic Solidification on the TSUBAME 2.0 Supercomputer
2  The IBM Blue Gene/Q Interconnection Network and Message Unit
2  I/O Streaming Evaluation of Batch Queries for Data-Intensive Computational Turbulence
2  Parallel Index and Query for Large Scale Data Analysis
2  Using the TOP500 to Trace and Project Technology and Architecture Trends
2  FTI: high performance Fault Tolerance Interface for hybrid systems
2  Scalable fast multipole methods on distributed heterogeneous architectures
2  A new Computational Paradigm in Multiscale Simulations: Application to Brain Blood Flow
2  System Implications of Memory Reliability in Exascale Computing
2  Multithreaded Global Address Space Communication Techniques for Gyrokinetic Fusion Applications on Ultra-Scale Platforms
2  Optimizing the Barnes-Hut Algorithm in UPC
2  Hardware, Software Co-design for Energy Efficient Seismic Modeling
1  GROPHECY: GPU Performance Projection from CPU Code Skeletons
1  QoS Support for End Users of I/O-intensive Applications using Shared Storage Systems
1  Gyrokinetic Toroidal Simulations on Leading Multi- and Manycore HPC Systems
1  Multi-Science Applications with Single Codebase - GAMER - for Massively Parallel Architectures
1  Optimized Pre-Copy Live Migration for Memory Intensive Applications
1  TRACON: Interference-Aware Scheduling for Data-Intensive Applications in Virtualized Environments
1  Flexible Resource Allocation for Reliable Virtual Cluster Computing Systems
1  Auto-Scaling to Minimize Cost and Meet Application Deadlines in Cloud Workflows
1  Large Scale Debugging of Parallel Tasks with AutomaDeD
1  Performance of the Community Earth System Model
1  On the Duality of Data-intensive File System Design: Reconciling HDFS and PVFS
1  Scalable Implementations of Accurate Excited-state Coupled Cluster Theories: Application of High-level Methods to Porphyrin-based Systems
0  Unitary Qubit Lattice Simulations of Multiscale Phenomena in Quantum Turbulence
0  An Image Compositing Solution at Scale
0  ISABELA-QA: Query-driven Data Analytics over ISABELA-compressed Extreme-Scale Scientific Data
0  Virtual I/O caching: dynamic storage cache management for concurrent workloads
0  Scalable Hashing for Shared Memory Supercomputers
0  An Early Performance Analysis of POWER7-IH HPC Systems
0  A Similarity Measure for Time, Frequency, and Dependencies in Large-Scale Workloads
0  Efficient Data Race Detection for Distributed Memory Parallel Programs
0  MAximum Multicore POwer (MAMPO) - An Automatic Multithreaded Synthetic Power Virus Generation Framework for Multicore Systems
0  Hadoop Acceleration Through Network Levitated Merge
0  Extracting Ultra-Scale Lattice Boltzmann Performance via Hierarchical and Distributed Auto-Tuning
0  Highly Scalable Ab Initio Genomic Motif Identification
0  A Distributed Look-up Architecture for Text Mining Applications using MapReduce
0  Parallelization Design for Multi-core Platforms in Density Matrix Renormalization Group toward 2-D Quantum Strongly-correlated Systems
0  A Scalable Eigensolver for Large Scale-Free Graphs Using 2D Partitioning
0  High-Performance Lattice QCD for Multi-core Based Parallel Systems Using a Cache-Friendly Hybrid Threaded-MPI Approach
0  Scaling Lattice QCD beyond 100 GPUs
0  End-to-End Network QoS via Scheduling of Flexible Resource Reservation Requests
0  Large Scale Plane Wave Pseudopotential Density Functional Theory Calculations on GPU Clusters
0  Avoiding hot-spots on two-level direct networks
0  A Fast Solver for Modeling the Evolution of Virus Populations