Talks:
(pdf) March 2013; SIAM CSE; Boston, USA;
Scalable Numerical Algorithms for Electronic Structure Calculations
(pdf) February 2013; Berkeley, USA;
Communication-Avoiding Parallel Algorithms for Dense Linear Algebra and Tensor Computations
(pdf) January 2013; University of Southern California; Los Angeles, USA;
A parallel tensor framework for Coupled Cluster
(pdf) Sep. 2012; seminar; Lawrence Livermore National Laboratory; Livermore, CA;
Scalable numerical algorithms for electronic structure calculations
(pdf) July 2012; University of Tokyo; Tokyo, Japan;
2.5D algorithms for distributed-memory computing
(pdf) July 2012; VECPAR; Kobe, Japan;
Matrix multiplication on multidimensional torus networks
(pdf) June 2012; SIAM ALA; Valencia, Spain;
2.5D Algorithms for dense linear algebra
(pdf) Feb. 2012; SIAM PP; Savannah, GA;
Topology-aware parallel algorithms for symmetric tensor contractions
(pdf) Nov. 2011; ACM/IEEE Supercomputing; Seattle, WA;
Improving communication performance in dense linear algebra via topology-aware collectives
(pdf) Sep. 2011; CS 294 lecture; Berkeley, CA;
2.5D algorithms: from hardware to theory and back
(pdf) Sep. 2011; Bordeaux, France;
Communication-optimal parallel 2.5D matrix multiplication and LU factorization algorithms
(pdf) Aug. 2011; seminar; Argonne National Laboratory; Argonne, IL;
Reducing communication in dense matrix/tensor computations
(pdf) Apr. 2010; IPDPS; Atlanta, GA;
Highly Scalable Parallel Sorting
Posters:
(pdf) Dec. 2011; CSGF conference; Arlington, VA;
Cyclops Tensor Framework
(pdf) Jul. 2011; CSGF conference; Arlington, VA;
2.5D algorithms for dense linear algebra
(pdf) Nov. 2009; ACM/IEEE Supercomputing; Portland, OR;
Performance Comparison of Intrepid, Jaguar and Ranger Using Scientific Applications