Vasily Volkov
| E-mail: | volkov@cs.berkeley.edu |
| Phone: | (510) 289-1469 |
| Office: | 517 Soda Hall |
I am a doctoral candidate in Computer Science with a Designated Emphasis in Computational Science and Engineering, affiliated with the Berkeley Benchmarking and Optimization Group (BeBOP) and the Parallel Computing Laboratory (Par Lab).
My advisor is Prof. James Demmel.
Previously, I received BS and MS degrees in Applied Mathematics and Physics from the Moscow Institute of Physics and Technology.
Publications
- Alcantara, D. A., Volkov, V., Sengupta, S., Mitzenmacher, M., Owens, J. D., and Amenta, N. 2011. Building an Efficient Hash Table on the GPU, GPU Computing Gems Jade Edition.
- Demmel, J., Dongarra, J., Fox, A., Williams, S., Volkov, V., and Yelick, K. 2009. Accelerating time-to-solution for computational science and engineering, SciDAC Review 15.
- Datta, K., Williams, S., Volkov, V., Carter, J., Oliker, L., Shalf, J., and Yelick, K. 2009. Auto-tuning the 27-point stencil for multicore, 4th International Workshop on Automatic Performance Tuning (iWAPT).
- Volkov, V., and Demmel, J. W. 2008. Benchmarking GPUs to tune dense linear algebra, 2008 ACM/IEEE Conference on Supercomputing (SC08). (Best Student Paper award.)
- Datta, K., Murphy, M., Volkov, V., Williams, S., Carter, J., Oliker, L., Patterson, D., Shalf, J., and Yelick, K. 2008. Stencil computation optimization and autotuning on state-of-the-art multicore architectures, 2008 ACM/IEEE Conference on Supercomputing (SC08).
- Garland, M., LeGrand, S., Nickolls, J., Anderson, J., Hardwick, J., Morton, S., Phillips, E., Zhang, Y., and Volkov, V. 2008. Parallel computing experiences with CUDA, IEEE Micro 28, 4, 13–27.
- Volkov, V., and Kazian, B. 2008. Fitting FFT onto the G80 Architecture, CS 258 final project report, University of California, Berkeley.
- Volkov, V., and Demmel, J. W. 2008. LU, QR and Cholesky factorizations using vector capabilities of GPUs, Technical Report No. UCB/EECS-2008-49, EECS Department, University of California, Berkeley, May 13, 2008. (Also LAPACK Working Note 202.)
- Volkov, V., and Demmel, J. W. 2008. Using GPUs to accelerate linear algebra routines, Poster at Par Lab Winter Retreat, January 9, 2008.
- Volkov, V., and Demmel, J. W. 2007. Using GPUs to accelerate the bisection algorithm for finding eigenvalues of symmetric tridiagonal matrices, Technical Report No. UCB/EECS-2007-179, EECS Department, University of California, Berkeley, December 29, 2007. (Also LAPACK Working Note 197.)
Presentation slides
- Volkov, V. 2011. Unrolling parallel loops, tutorial talk at SC11.
- Volkov, V. 2010. Better performance at lower occupancy, GPU Technology Conference 2010 (GTC 2010).
- Volkov, V. 2010. Use registers and multiple outputs per thread on GPU, International Workshop on Parallel Matrix Algorithms and Applications 2010 (PMAA'10).
- Volkov, V. 2010. Programming inverse memory hierarchy: case of stencils on GPUs, International Conference on Parallel Computational Fluid Dynamics 2010 (ParCFD 2010).
- Volkov, V. 2009. Optimizing GPU codes, CScADS Workshop on Libraries and Autotuning for Petascale Applications 2009.
- Volkov, V., and Demmel, J. 2009. Using register files, strip mining and global synchronization on GPUs, 10th US National Congress on Computational Mechanics (USNCCM 10).
- Catanzaro, B., Volkov, V., Su, B.-Y., Sundaram, N., Demmel, J., and Keutzer, K. 2009. CPU-GPU hybrid eigensolvers for symmetric eigenproblems, SIAM Conference on Applied Linear Algebra 2009 (LA09).
- Volkov, V., and Demmel, J. W. 2008. Benchmarking GPUs to tune dense linear algebra, 2008 ACM/IEEE Conference on Supercomputing (SC08).
GPU codes