Books
- T. El-Ghazawi, W. Carlson, T. Sterling, and K. A. Yelick, UPC: Distributed Shared-Memory Programming, Wiley-Interscience, Hoboken, NJ: Wiley, 2005.
- A. W. Trivelpiece, R. Biswas, J. Dongarra, P. Paul, and K. A. Yelick, Assessment of High-End Computing Research and Development in Japan: Final Report, Baltimore, MD: World Technology Evaluation Center, Inc., 2004.
Book chapters or sections
- E. J. Im, I. Bustany, C. Ashcraft, J. Demmel, and K. A. Yelick, "Performance tuning of matrix triple products based on matrix structure," in Applied Parallel Computing: State of the Art in Scientific Computing. Proc. 7th Intl. Workshop (PARA 2004): Revised Selected Papers, J. Dongarra, K. Madsen, and J. Wasniewski, Eds., Lecture Notes in Computer Science, Vol. 3732, Berlin, Germany: Springer-Verlag, 2006, pp. 740-746.
- R. Vuduc, A. Gyulassy, J. Demmel, and K. A. Yelick, "Memory hierarchy optimizations and performance bounds for Sparse A^T Ax," in Computational Science: Proc. Intl. Conf. on Computational Science (ICCS 2003), P. M. A. Sloot, D. Abramson, A. V. Bogdanov, J. J. Dongarra, A. Y. Zomaya, and Y. E. Gorbachev, Eds., Lecture Notes in Computer Science, Vol. 2659, Berlin, Germany: Springer-Verlag, 2003, pp. 705-714.
Articles in journals or magazines
- S. Williams, J. Shalf, L. Oliker, S. Kamil, P. Husbands, and K. A. Yelick, "Scientific computing kernels on the cell processor," Intl. J. Parallel Programming, vol. 35, no. 3, pp. 263-298, June 2007.
- R. Nishtala, R. W. Vuduc, J. Demmel, and K. A. Yelick, "When cache blocking of sparse matrix vector multiply works and why," Applicable Algebra in Engineering, Communication and Computing, vol. 18, no. 3, pp. 297-311, May 2007.
- E. Givelberg and K. A. Yelick, "Distributed immersed boundary simulations in Titanium," SIAM J. on Scientific Computing, vol. 28, no. 4, pp. 1361-1378, July 2006.
- K. A. Yelick, P. N. Hilfinger, S. L. Graham, D. Bonachea, J. Su, A. Kamil, K. Datta, P. Colella, and T. Wen, "Parallel languages and compilers: Perspective from the Titanium experience," The Intl. J. High Performance Computing Applications, vol. SI, pp. 1-23, June 2006.
- R. Vuduc, J. Demmel, and K. A. Yelick, "OSKI: A library of automatically tuned sparse matrix kernels," J. Physics: Conference Series, vol. 16, no. 1, pp. 521-530, 2005.
- J. Demmel, J. Dongarra, V. Eijkhout, E. Fuentes, A. Petitet, R. Vuduc, R. C. Whaley, and K. A. Yelick, "Self-adapting linear algebra algorithms and software," Proc. IEEE, vol. 93, no. 2, pp. 293-312, Feb. 2005.
- K. A. Yelick, L. Semenzato, G. Pike, C. Miyamoto, B. Liblit, A. Krishnamurthy, P. N. Hilfinger, S. L. Graham, D. Gay, P. Colella, and A. Aiken, "Titanium: A high-performance Java dialect," Concurrency: Practice and Experience, vol. 10, no. 11-13, pp. 825-836, Sep. 1998.
- S. Chakrabarti, J. Demmel, and K. A. Yelick, "Models and scheduling algorithms for mixed data and task parallel programs," J. Parallel and Distributed Computing: Special Issue on Dynamic Load Balancing, vol. 47, no. 1, pp. 168-184, Nov. 1997.
Articles in conference proceedings
- H. Gahvari, M. Hoemmen, J. Demmel, and K. A. Yelick, "Benchmarking sparse matrix-vector multiply in five minutes," in Proc. 2007 SPEC Benchmark Workshop, Warrenton, VA: Standard Performance Evaluation Corporation, 2007, 11 pp.
- S. Williams, J. Shalf, L. Oliker, S. Kamil, P. Husbands, and K. A. Yelick, "The potential of the Cell processor for scientific computing," in Proc. 3rd Conf. on Computing Frontiers, New York, NY: ACM Press, 2006, pp. 9-20.
- C. Bell, D. Bonachea, R. Nishtala, and K. A. Yelick, "Optimizing bandwidth limited problems using one-sided communication and overlap," in Proc. 20th Intl. Parallel and Distributed Processing Symp., Piscataway, NJ: IEEE Press, 2006, 10 pp.
- A. Kamil, J. Su, and K. A. Yelick, "Making sequential consistency practical in Titanium," in Proc. 2005 ACM/IEEE Supercomputing Conf., Los Alamitos, CA: IEEE Computer Society Press, 2005, 15 pp.
- B. C. Lee, R. W. Vuduc, J. Demmel, and K. A. Yelick, "Best Paper Prize: Performance models for evaluation and automatic tuning of symmetric sparse matrix-vector multiply," in Proc. 2004 Intl. Conf. on Parallel Processing (ICPP 2004), R. Eigenmann, Ed., Vol. 1, Los Alamitos, CA: IEEE Computer Society, 2004, pp. 169-176.
- W. Chen, D. Bonachea, J. Duell, P. Husbands, C. Iancu, and K. A. Yelick, "A performance analysis of the Berkeley UPC compiler," in Proc. 17th Annual Intl. Conf. on Supercomputing, New York, NY: ACM Press, 2003, pp. 63-73.
- R. Vuduc, J. Demmel, K. A. Yelick, S. Kamil, R. Nishtala, and B. Lee, "Performance optimizations and bounds for sparse matrix-vector multiply," in Proc. ACM/IEEE 2002 Conf. on Supercomputing (SC '02), Los Alamitos, CA: IEEE Computer Society, 2002, 35 pp.
- B. R. Gaeke, P. Husbands, X. S. Li, L. Oliker, K. A. Yelick, and R. Biswas, "Memory-intensive benchmarks: IRAM vs. cache-based machines," in Proc. 16th Intl. Parallel and Distributed Processing Symp., Piscataway, NJ: IEEE Press, 2002, pp. 30-36.
- R. H. Arpaci-Dusseau, E. Anderson, N. Treuhaft, D. E. Culler, J. M. Hellerstein, D. A. Patterson, and K. A. Yelick, "Cluster I/O with River: Making the fast case common," in Proc. 6th Workshop on I/O in Parallel and Distributed Systems (IOPADS 1999), New York, NY: ACM Press, 1999, pp. 10-22.
Technical Documentation
- P. N. Hilfinger, D. Bonachea, K. Datta, D. Gay, S. L. Graham, A. Kamil, B. Liblit, G. Pike, J. Su, and K. A. Yelick, "Titanium Language Reference Manual (Version 2.20)," 2006.
Technical Reports
- K. Asanovic, R. Bodik, J. Demmel, T. Keaveny, K. Keutzer, J. D. Kubiatowicz, E. A. Lee, N. Morgan, G. Necula, D. A. Patterson, K. Sen, J. Wawrzynek, D. Wessel, and K. A. Yelick, "The Parallel Computing Laboratory at U.C. Berkeley: A Research Agenda Based on the Berkeley View," EECS Department, University of California, Berkeley, Tech. Rep. UCB/EECS-2008-23, March 2008.
- J. Demmel, M. F. Hoemmen, M. Mohiyuddin, and K. A. Yelick, "Avoiding Communication in Computing Krylov Subspaces," EECS Department, University of California, Berkeley, Tech. Rep. UCB/EECS-2007-123, Oct. 2007.
- K. Asanovic, R. Bodik, B. C. Catanzaro, J. J. Gebis, P. Husbands, K. Keutzer, D. A. Patterson, W. L. Plishker, J. Shalf, S. W. Williams, and K. A. Yelick, "The Landscape of Parallel Computing Research: A View from Berkeley," EECS Department, University of California, Berkeley, Tech. Rep. UCB/EECS-2006-183, Dec. 2006.
- J. Z. Su, T. Wen, and K. A. Yelick, "Compiler and Runtime Support for Scaling Adaptive Mesh Refinement Computations in Titanium," EECS Department, University of California, Berkeley, Tech. Rep. UCB/EECS-2006-87, June 2006.
- A. A. Kamil and K. A. Yelick, "Concurrency Analysis for Parallel Programs with Textually Aligned Barriers," EECS Department, University of California, Berkeley, Tech. Rep. UCB/EECS-2006-41, April 2006.
- P. N. Hilfinger, D. O. Bonachea, K. Datta, D. Gay, S. L. Graham, B. R. Liblit, G. Pike, J. Z. Su, and K. A. Yelick, "Titanium Language Reference Manual, version 2.19," EECS Department, University of California, Berkeley, Tech. Rep. UCB/EECS-2005-15, Nov. 2005.
- R. Nishtala, R. W. Vuduc, J. W. Demmel, and K. A. Yelick, "Performance Modeling and Analysis of Cache Blocking in Sparse Matrix Vector Multiply," EECS Department, University of California, Berkeley, Tech. Rep. UCB/CSD-04-1335, 2004.
- B. C. Lee, R. W. Vuduc, J. W. Demmel, K. A. Yelick, M. de Lorimier, and L. Zhong, "Performance Optimizations and Bounds for Sparse Symmetric Matrix-Multiple Vector Multiply," EECS Department, University of California, Berkeley, Tech. Rep. UCB/CSD-03-1297, 2003.
- M. Narayanan and K. A. Yelick, "Generating Permutation Instructions from a High-Level Description," EECS Department, University of California, Berkeley, Tech. Rep. UCB/CSD-03-1287, 2003.
- W. Chen, A. Krishnamurthy, and K. Yelick, "Polynomial-time Algorithms for Enforcing Sequential Consistency for SPMD Programs with Arrays," EECS Department, University of California, Berkeley, Tech. Rep. UCB/CSD-03-1272, Sep. 2003.
- R. Vuduc, A. Gyulassy, J. Demmel, and K. A. Yelick, "Memory Hierarchy Optimizations and Performance Bounds for Sparse A^T Ax," EECS Department, University of California, Berkeley, Tech. Rep. UCB/CSD-03-1232, Feb. 2003.
- P. N. Hilfinger, D. Bonachea, D. Gay, S. Graham, B. Liblit, G. Pike, and K. Yelick, "Titanium Language Reference Manual," EECS Department, University of California, Berkeley, Tech. Rep. UCB/CSD-01-1163, Nov. 2001.
- B. Liblit, A. Aiken, and K. Yelick, "Data Sharing Analysis for Titanium," EECS Department, University of California, Berkeley, Tech. Rep. UCB/CSD-01-1165, Nov. 2001.
- A. Krishnamurthy, D. E. Culler, and K. Yelick, "Empirical Evaluation of Global Memory Support on the Cray-T3D and Cray-T3E," EECS Department, University of California, Berkeley, Tech. Rep. UCB/CSD-98-991, Aug. 1998.
- A. Krishnamurthy, K. E. Schauser, C. J. Scheiman, D. E. Culler, K. Yelick, and R. Y. Wang, "Evaluation of Architectural Support for Global Address-Based Communication in Large-Scale Parallel Machines," EECS Department, University of California, Berkeley, Tech. Rep. UCB/CSD-98-984, Jan. 1998.
- S. Chakrabarti, E. Deprit, E. Im, J. Jones, A. Krishnamurthy, C. Wen, and K. Yelick, "Multipol: A Distributed Data Structure Library," EECS Department, University of California, Berkeley, Tech. Rep. UCB/CSD-95-879, July 1995.
Software
- K. A. Yelick and J. Demmel, "OSKI -- Optimized Sparse Kernel Interface," 2006.
- E. Givelberg and K. A. Yelick, "IB Using Titanium," 2005.
- K. A. Yelick, P. N. Hilfinger, S. L. Graham, and P. Colella, "The Titanium Compiler," 2003.
- W. Chen, D. Bonachea, J. Duell, P. Husbands, C. Iancu, K. A. Yelick, and D. E. Culler, "The Berkeley UPC Compiler," 2003.