Electrical Engineering
      and Computer Sciences

Electrical Engineering and Computer Sciences

COLLEGE OF ENGINEERING

UC Berkeley

Autotuning Sparse Matrix-Vector Multiplication for Multicore

Jong-Ho Byun, Richard Lin, Katherine A. Yelick and James Demmel

EECS Department
University of California, Berkeley
Technical Report No. UCB/EECS-2012-215
November 28, 2012

http://www.eecs.berkeley.edu/Pubs/TechRpts/2012/EECS-2012-215.pdf

Sparse matrix-vector multiplication (SpMV) is an important kernel in scientific and engineering computing. Straightforward parallel implementations of SpMV often perform poorly, and with the increasing variety of architectural features in multicore processors, it is getting more difficult to determine the sparse matrix data structure and corresponding SpMV implementation that optimize performance. In this paper we present pOSKI, an autotuning system for SpMV that automatically searches over a large set of possible data structures and implementations to optimize SpMV performance on multicore platforms. pOSKI explores a design space that depends on both the nonzero pattern of the sparse matrix, typically not known until run-time, and the architecture, which is explored off-line as much as possible, in order to reduce tuning time. We demonstrate significant performance improvements compared to previous serial and parallel implementations, and compare performance to upper bounds based on architectural models.


BibTeX citation:

@techreport{Byun:EECS-2012-215,
    Author = {Byun, Jong-Ho and Lin, Richard and Yelick, Katherine A. and Demmel, James},
    Title = {Autotuning Sparse Matrix-Vector Multiplication for Multicore},
    Institution = {EECS Department, University of California, Berkeley},
    Year = {2012},
    Month = {Nov},
    URL = {http://www.eecs.berkeley.edu/Pubs/TechRpts/2012/EECS-2012-215.html},
    Number = {UCB/EECS-2012-215},
    Abstract = {Sparse matrix-vector multiplication (SpMV) is an important kernel in scientific and engineering computing. Straightforward parallel implementations of SpMV often perform poorly, and with the increasing variety of architectural features in multicore processors, it is getting more difficult to determine the sparse matrix data structure and corresponding SpMV implementation that optimize performance. In this paper we present pOSKI, an autotuning system for SpMV that automatically searches over a large set of possible data structures and implementations to optimize SpMV performance on multicore platforms. pOSKI explores a design space that depends on both the nonzero pattern of the sparse matrix, typically not known until run-time, and the architecture, which is explored off-line as much as possible, in order to reduce tuning time. We demonstrate significant performance improvements compared to previous serial and parallel implementations, and compare performance to upper bounds based on architectural models.}
}

EndNote citation:

%0 Report
%A Byun, Jong-Ho
%A Lin, Richard
%A Yelick, Katherine A.
%A Demmel, James
%T Autotuning Sparse Matrix-Vector Multiplication for Multicore
%I EECS Department, University of California, Berkeley
%D 2012
%8 November 28
%@ UCB/EECS-2012-215
%U http://www.eecs.berkeley.edu/Pubs/TechRpts/2012/EECS-2012-215.html
%F Byun:EECS-2012-215