Abstracts for James W. Demmel
The EECS Research Summary for 2003
Automatic Performance Tuning of Sparse Matrix Kernels
Richard Vuduc, Shoaib Kamil1, Rajesh Nishtala2, Benjamin Lee3, and Attila Gyulassy4
(Professors James W. Demmel and Katherine A. Yelick)
(DOE) DE-FG03-94ER25219, (DOE) DE-FC03-98ER25351, and (LLNL/DOE) W-7405-ENG-48
The overall performance in a variety of scientific computing and
information retrieval applications is dominated by a few
computational kernels. One important class of such operations
are sparse matrix kernels--computations with matrices having
relatively few non-zero entries. Performance tuning of sparse kernels
is a particularly tedious and time-consuming task because performance
is a complicated function of the kernel, machine, and non-zero
structure of the matrix: for every machine, a user must carefully
choose a data structure and implementation (code and transformations)
that minimize the matrix storage while achieving the best possible
performance.
The goal of this research project is to generate implementations of
particular sparse kernels tuned to a given matrix and machine. Our
work builds on Sparsity [1], an early successful
prototype for sparse matrix-vector multiply (y = A*x, where A is a
sparse matrix and x, y, are dense vectors), in the following ways:
- Architecture-specific performance bounds: by careful
analysis, we have shown that the code generated by the Sparsity
system is often within 20% of the fastest possible. This places
a limit on what additional low-level tuning (e.g., instruction
scheduling) will do, and helps identify new opportunities for
performance improvements
[2].
We have applied a similar analysis to sparse triangular solve
[3].
- New optimization techniques: we are currently
exploring a large space of techniques to exploit a variety of matrix
structures (symmetry, diagonals, bands, variable blocks, etc.) and
to increase locality (reordering to create dense structure, cache
blocking, multiple vectors, etc.). We will develop automatic
techniques for deciding when and in what combinations to apply
these methods.
- Extensions to higher-level kernels: we are
exploring new sparse kernels, such as multiplying by A and AT
simultaneously, A*AT*x (for interior point methods and the
SVD), R*A*RT (A and R are sparse; used in multigrid methods),
and Ak*x, among others.
- Implementation of an automatically tuned sparse matrix library:
we will make this work available as an implementation of the new
Sparse BLAS standard [4], augmented by a single routine
to facilitate automatic tuning.
This research is being conducted by members of the Berkeley
Benchmarking and OPtimization (BeBOP) project
[5].
- [1]
- E.-J. Im, "Optimizing the Performance of Sparse Matrix-Vector
Multiplication," PhD thesis, UC Berkeley, May 2000.
- [2]
- R. Vuduc, J. Demmel, K. Yelick, et al., "Performance
Optimizations and Bounds for Sparse-Matrix Vector Multiply,"
Supercomputing, November 2002.
- [3]
- R. Vuduc, S. Kamil, et al., "Automatic Performance Tuning
and Analysis of Sparse Triangular Solve," Int. Conf.
Supercomputing, Workshop on Performance Optimization of High-level
Languages and Libraries, June 2002.
- [4]
- BLAS Technical Forum: http://www.netlib.org/
blas/blast-forum.
- [5]
- The BeBOP Homepage: http://www.cs.berkeley.
edu/~richie/bebop.
1Undergraduate (EECS)
2Undergraduate (EECS)
3Undergraduate (EECS)
4Undergraduate (EECS)
More information (http://www.cs.berkeley.edu/~richie/bebop) or
Send mail to the author : (richie@cs.berkeley.edu)
Simulation of Microelectromechanical Systems
David Bindel, David Garmire, Shyam Lakshmin, and Jiawang Nie
(Professor James W. Demmel)
NTONC 29823
SUGAR is a simulation tool for MEMS that inherits its name and philosophy from the IC simulator SPICE. SUGAR is used by researchers at the Berkeley Sensor and Actuator Center, and also at outside sites. Currently, about 500 users have filled out the registration form, and there are many unregistered users as well.
We are working to extend SUGAR in several ways:
- Applying homotopy methods to discover electrostatic pull-in, buckling of mechanical components, and other significant bifurcations
- Building fast reduced-order models that incorporate nonlinear effects
- Studying sensitivity of microsystem performance to deviations from nominal geometry and material properties
- Extending the set of models available to the user
- Building a framework for comparing results from optical measurement tools to simulation results, in order to improve the fidelity of the simulation and to better understand the behavior of different designs
We have also deployed a version of SUGAR as a web service running on the Millennium cluster. Also, the SUGAR software was used in a graduate introduction to MEMS course.
There now exists an open source SUGAR repository under the BSD software license on sourceforge.net at http://sourceforge.net/projects/mems. Other relevant information may be found through this website.
SUGAR is a collaborative effort of many graduate students and professors. Affiliated graduate students include David Bindel, Jason V. Clark, David Garmire, Babak Jamshidi, Rishi Kant, Shyam Lakshmin, Jiawang Nie. Affiliated professors include Alice Agogino, Zhaojun Bai, James Demmel, Sanjay Govindjee, Ming Gu, and Kristofer S. J. Pister.
More information (http://www.cs.berkeley.edu/~strive/sugar) or
Send mail to the author : (strive@eecs.berkeley.edu)