299 Sequoia: programming the memory hierarchy
250 CellSs: a programming model for the Cell BE architecture
166 Scalable algorithms for molecular dynamics simulations on commodity clusters
136 A memory model for scientific algorithms on graphics processors
96 Grid capacity planning with negotiation-based advance reservation for optimized QoS
85 Exploiting the performance of 32 bit floating point arithmetic in obtaining 64 bit accuracy (revisiting iterative refinement for linear systems)
84 CRUSH: controlled, scalable, decentralized placement of replicated data
70 High-performance dynamic graphics streaming for scalable adaptive graphics environment
67 From mesh generation to scientific visualization: an end-to-end approach to parallel supercomputing
57 Adaptive, transparent frequency and voltage scaling of communication phases in MPI programs
55 FFT program generation for shared memory: SMP and multicore
55 Designing a runtime system for volunteer computing
53 Evaluation of a workflow scheduler using integrated performance modelling and batch queue wait time prediction
52 Toward a doctrine of containment: grid hosting with adaptive resource control
47 A performance comparison through benchmarking and modeling of three leading supercomputers: Blue Gene/L, Red Storm, and Purple
45 Problem diagnosis in large-scale computing environments
43 Blue matter: approaching the limits of concurrency for classical molecular dynamics
40 Blocking vs. non-blocking coordinated checkpointing for large-scale fault tolerant MPI
40 Topology mapping for Blue Gene/L supercomputer
34 Designing a highly-scalable operating system: the Blue Gene/L story
33 Parallel genomic sequence-searching on an ad-hoc grid: experiences, lessons learned, and implications
30 Benchmarking XML processors for applications in grid web services
29 Toward real-time image guided neurosurgery using distributed and grid computing
29 Quantifying the potential benefit of overlapping communication and computation in large-scale scientific applications
28 The design space of data-parallel memory systems
28 Sustainable adaptive grid supercomputing: multiscale simulation of semiconductor processing across the Pacific
27 High-performance and scalable MPI over InfiniBand with reduced memory usage: an in-depth performance analysis
25 Architectures and APIs: assessing requirements for delivering FPGA performance to applications
24 Preliminary investigation of advanced electrostatics in molecular dynamics on reconfigurable computers
23 Improving grid resource allocation via integrated selection and binding
19 PBPI: a high performance implementation of Bayesian phylogenetic inference
19 MPI performance analysis tools on Blue Gene/L
19 Detecting distributed scans using high-performance query-driven visualization
19 Computing large sparse multivariate optimization problems with an application in biophysics
18 Large image correction and warping in a cluster environment
17 Locality and parallelism optimization for dynamic programming algorithm in bioinformatics
17 Evaluating grid portal security
16 Adaptive routing in high-radix Clos network
15 Hypergraph partitioning for automatic memory hierarchy management
14 Software routing and aggregation of messages to optimize the performance of HPCC RandomAccess benchmark
13 Multiple range query optimization with distributed cache indexing
13 Supporting dynamic migration in tightly coupled grid applications
11 The potential energy efficiency of vector acceleration
11 Level-wise scheduling algorithm for fat tree interconnection networks
11 A software based approach for providing network fault tolerance in clusters with uDAPL interface: MPI level design and performance evaluation
10 Design and implementation of a one-sided communication interface for the IBM eServer Blue Gene® supercomputer
7 Estimating query result sizes for proxy caching in scientific database federations
6 Performance modeling and optimization of a high energy colliding beam simulation code
6 Nested OpenMP for efficient computation of 3D critical points in multi-block CFD datasets
6 Revisiting web server workload invariants in the context of scientific web sites
5 End-system aware, rate-adaptive protocol for network transport in LambdaGrid environments
3 Modeling pulse propagation and scattering in a dispersive medium: performance of MPI/OpenMP hybrid code
3 A near-optimal real-time hardware scheduler for large cardinality crossbar switches
1 CycleMeter: detecting fraudulent peers in internet cycle sharing