Electrical Engineering and Computer Sciences

COLLEGE OF ENGINEERING

UC Berkeley

Quantifying the Energy Efficiency of Object Recognition and Optical Flow

Michael Anderson, Forrest Iandola and Kurt Keutzer

EECS Department
University of California, Berkeley
Technical Report No. UCB/EECS-2014-22
March 28, 2014

http://www.eecs.berkeley.edu/Pubs/TechRpts/2014/EECS-2014-22.pdf

In this report, we analyze the computational and performance aspects of current state-of-the-art object recognition and optical flow algorithms. First, we identify important algorithms for object recognition and optical flow, then we perform a pattern decomposition to identify key computations. We include profiles of the runtime and energy efficiency (GFLOPS/W) for our implementation of these applications on a commercial architecture. Finally, we include an analysis of memory-bandwidth boundedness for optical flow to identify opportunities for communication-avoiding algorithms.

Our results were measured on an Intel i7-4770K (Haswell) reference platform. A five-layer convolutional neural network used for object classification achieves 0.70 GFLOPS/W, which is 21% of the theoretical compute bound for this Haswell processor. On the Horn-Schunck, Lucas-Kanade, and Brox optical flow methods, our implementations achieve 0.0338, 0.0103, and 0.0203 GFLOPS/W, respectively. Our implementation achieves 7.9% of the theoretical bandwidth bound, assuming no cross-iteration memory optimization, for Horn-Schunck optical flow using the Jacobi solver, and 9.7% of the bandwidth bound for the conjugate-gradient solver. To improve performance, we will focus first on increasing bandwidth utilization, then on doing cross-iteration memory optimizations such as blocking and tiling the Jacobi solver and employing communication-avoiding linear solvers.

We also compare the runtime-accuracy tradeoffs for each optical flow method. We find that each method has distinct advantages over the other methods in terms of the runtime-accuracy tradeoff, so we will continue to develop and support all three methods in the future.
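As a rough reading aid for the efficiency figures above, the following Python sketch back-computes the compute bound implied by "0.70 GFLOPS/W, which is 21% of the theoretical compute bound". It is not taken from the report: the helper name `implied_bound` is hypothetical, and because the 21% figure is rounded, the result (about 3.3 GFLOPS/W) is only approximate. The optical flow figures are set against that same implied compute bound purely for illustration; the abstract itself evaluates optical flow against a bandwidth bound, which this sketch does not model.

```python
def implied_bound(achieved: float, fraction_of_bound: float) -> float:
    """Return the bound implied by an achieved value and the fraction of the bound it represents.

    Hypothetical helper for illustration; not part of the report.
    """
    if not 0.0 < fraction_of_bound <= 1.0:
        raise ValueError("fraction_of_bound must be in (0, 1]")
    return achieved / fraction_of_bound


# Figures quoted in the abstract, in GFLOPS/W.
CNN_EFFICIENCY = 0.70
CNN_FRACTION_OF_COMPUTE_BOUND = 0.21  # rounded in the abstract, so the result is approximate
FLOW_EFFICIENCY = {
    "Horn-Schunck": 0.0338,
    "Lucas-Kanade": 0.0103,
    "Brox": 0.0203,
}

# Compute bound implied by the CNN result.
compute_bound = implied_bound(CNN_EFFICIENCY, CNN_FRACTION_OF_COMPUTE_BOUND)
print(f"implied compute bound: {compute_bound:.2f} GFLOPS/W")

# Illustrative only: optical flow efficiency relative to the implied compute bound.
for name, efficiency in FLOW_EFFICIENCY.items():
    print(f"{name}: {efficiency / compute_bound:.2%} of the implied compute bound")
```

Under these assumptions, each optical flow method lands at roughly one percent or less of the implied compute bound, which is consistent with the abstract's emphasis on memory bandwidth rather than arithmetic throughput as the first target for optimization.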


BibTeX citation:

@techreport{Anderson:EECS-2014-22,
    Author = {Anderson, Michael and Iandola, Forrest and Keutzer, Kurt},
    Title = {Quantifying the Energy Efficiency of Object Recognition and Optical Flow},
    Institution = {EECS Department, University of California, Berkeley},
    Year = {2014},
    Month = {Mar},
    URL = {http://www.eecs.berkeley.edu/Pubs/TechRpts/2014/EECS-2014-22.html},
    Number = {UCB/EECS-2014-22},
    Abstract = {In this report, we analyze the computational and performance aspects of current state-of-the-art object recognition and optical flow algorithms. First, we identify important algorithms for object recognition and optical flow, then we perform a pattern decomposition to identify key computations. We include profiles of the runtime and energy efficiency (GFLOPS/W) for our implementation of these applications on a commercial architecture. Finally, we include an analysis of memory-bandwidth boundedness for optical flow to identify opportunities for communication-avoiding algorithms.

Our results were measured on an Intel i7-4770K (Haswell) reference platform. A five-layer convolutional neural network used for object classification achieves 0.70 GFLOPS/W, which is 21% of the theoretical compute bound for this Haswell processor. On the Horn-Schunck, Lucas-Kanade, and Brox optical flow methods, our implementations achieve 0.0338, 0.0103, and 0.0203 GFLOPS/W, respectively. Our implementation achieves 7.9% of the theoretical bandwidth bound, assuming no cross-iteration memory optimization, for Horn-Schunck optical flow using the Jacobi solver, and 9.7% of the bandwidth bound for the conjugate-gradient solver. To improve performance, we will focus first on increasing bandwidth utilization, then on doing cross-iteration memory optimizations such as blocking and tiling the Jacobi solver and employing communication-avoiding linear solvers.

We also compare the runtime-accuracy tradeoffs for each optical flow method. We find that each method has distinct advantages over the other methods in terms of the runtime-accuracy tradeoff, so we will continue to develop and support all three methods in the future.}
}

EndNote citation:

%0 Report
%A Anderson, Michael
%A Iandola, Forrest
%A Keutzer, Kurt
%T Quantifying the Energy Efficiency of Object Recognition and Optical Flow
%I EECS Department, University of California, Berkeley
%D 2014
%8 March 28
%@ UCB/EECS-2014-22
%U http://www.eecs.berkeley.edu/Pubs/TechRpts/2014/EECS-2014-22.html
%F Anderson:EECS-2014-22