volkov@cs.berkeley.edu

I am a doctoral candidate in Computer Science. My advisor is Prof. James Demmel.

I hold BS and MS degrees in Applied Mathematics and Physics from the Moscow Institute of Physics and Technology.

Publications

- 2011. Alcantara *et al*. *Building an Efficient Hash Table on the GPU*, in GPU Computing Gems Jade Edition, 39–54.
- 2009. Datta *et al*. *Auto-tuning the 27-point stencil for multicore*, 4th International Workshop on Automatic Performance Tuning (iWAPT).
- 2008. Volkov and Demmel. *Benchmarking GPUs to tune dense linear algebra*, SC08.
- 2008. Datta *et al*. *Stencil computation optimization and autotuning on state-of-the-art multicore architectures*, SC08.
- 2008. Garland *et al*. *Parallel computing experiences with CUDA*, IEEE Micro 28, 4, 13–27.
- 2008. Volkov and Kazian. *Fitting FFT onto the G80 Architecture*, CS 258 final project report, University of California, Berkeley.
- 2008. Volkov and Demmel. *LU, QR and Cholesky factorizations using vector capabilities of GPUs*, Technical Report No. UCB/EECS-2008-49, EECS Department, University of California, Berkeley, May 13, 2008.
- 2007. Volkov and Demmel. *Using GPUs to accelerate the bisection algorithm for finding eigenvalues of symmetric tridiagonal matrices*, Technical Report No. UCB/EECS-2007-179, EECS Department, University of California, Berkeley, December 29, 2007.

Presentations

- 2012. Volkov. *Intro to MIC performance*, presented at BeBOP meeting.
- 2011. Volkov. *Unrolling parallel loops*, tutorial talk at SC11.
- 2010. Volkov. *Better performance at lower occupancy*, GPU Technology Conference 2010 (GTC 2010).
- 2010. Volkov. *Use registers and multiple outputs per thread on GPU*, International Workshop on Parallel Matrix Algorithms and Applications 2010 (PMAA'10).
- 2010. Volkov. *Programming inverse memory hierarchy: case of stencils on GPUs*, International Conference on Parallel Computational Fluid Dynamics 2010 (ParCFD 2010).
- 2008. Volkov and Demmel. *Benchmarking GPUs to tune dense linear algebra*, 2008 ACM/IEEE Conference on Supercomputing (SC08).
- 2008. Volkov and Demmel. *Using GPUs to accelerate linear algebra routines*, poster at PAR Lab Winter Retreat, January 9, 2008.

GPU codes

- 2009. LU and other factorizations: source code, discussion at NVIDIA forum.
- 2008. Matrix-matrix multiply (SGEMM): source code, discussion at NVIDIA forum. This code is included in NVIDIA CUBLAS.
- 2008. My FFT prototype was 3x faster than NVIDIA CUFFT. The ideas behind it were used in OpenCL FFT and later in CUFFT.
- 2007. Matrix-matrix multiply for ATI GPUs using DirectX.

Citation statistics (collected May 2012)