Quantifying the Energy Efficiency of Object Recognition and Optical Flow

Michael Anderson, Forrest Iandola and Kurt Keutzer

EECS Department
University of California, Berkeley
Technical Report No. UCB/EECS-2014-22
March 28, 2014

http://www.eecs.berkeley.edu/Pubs/TechRpts/2014/EECS-2014-22.pdf

In this report, we analyze the computational and performance aspects of current state-of-the-art object recognition and optical flow algorithms. First, we identify important algorithms for object recognition and optical flow, then we perform a pattern decomposition to identify key computations. We include profiles of the runtime and energy efficiency (GFLOPS/W) for our implementation of these applications on a commercial architecture. Finally, we include an analysis of memory-bandwidth boundedness for optical flow to identify opportunities for communication-avoiding algorithms.

Our results were measured on an Intel i7-4770K (Haswell) reference platform. A five-layer convolutional neural network used for object classification achieves 0.70 GFLOPS/W, which is 21% of the theoretical compute bound for this Haswell processor. For the Horn-Schunck, Lucas-Kanade, and Brox optical flow methods, our implementations achieve 0.0338, 0.0103, and 0.0203 GFLOPS/W, respectively. For Horn-Schunck optical flow, our implementation achieves 7.9% of the theoretical bandwidth bound (assuming no cross-iteration memory optimization) using the Jacobi solver, and 9.7% of the bandwidth bound using the conjugate-gradient solver. To improve performance, we will focus first on increasing bandwidth utilization, then on cross-iteration memory optimizations such as blocking and tiling the Jacobi solver and employing communication-avoiding linear solvers.
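The figures in this paragraph can be related to one another with simple arithmetic. The sketch below is illustrative only: the helper names (`implied_bound`, `headroom`) are hypothetical and not taken from the report. Because the abstract gives rounded percentages, the derived values are approximations, not the bounds the report itself computes.

```python
def implied_bound(achieved: float, fraction_of_bound: float) -> float:
    """Bound implied by an achieved value and the fraction of the bound it represents."""
    if not 0.0 < fraction_of_bound <= 1.0:
        raise ValueError("fraction_of_bound must be in (0, 1]")
    return achieved / fraction_of_bound


def headroom(fraction_of_bound: float) -> float:
    """Maximum possible improvement factor if the bound were reached exactly."""
    return implied_bound(1.0, fraction_of_bound)


# Figures quoted in the abstract.
cnn_gflops_per_watt = 0.70
cnn_fraction_of_compute_bound = 0.21
flow_gflops_per_watt = {
    "Horn-Schunck": 0.0338,
    "Lucas-Kanade": 0.0103,
    "Brox": 0.0203,
}
bandwidth_fraction = {"Jacobi": 0.079, "conjugate-gradient": 0.097}

# Approximate compute bound implied by "0.70 GFLOPS/W is 21% of the bound".
cnn_bound = implied_bound(cnn_gflops_per_watt, cnn_fraction_of_compute_bound)

# Energy efficiency of each optical flow method relative to the CNN.
relative_to_cnn = {
    name: value / cnn_gflops_per_watt
    for name, value in flow_gflops_per_watt.items()
}

# Upper limit on speedup from better bandwidth utilization alone,
# before any cross-iteration memory optimization.
bandwidth_headroom = {
    solver: headroom(frac) for solver, frac in bandwidth_fraction.items()
}

if __name__ == "__main__":
    print(f"implied compute bound: {cnn_bound:.2f} GFLOPS/W")
    for name, ratio in relative_to_cnn.items():
        print(f"{name}: {ratio:.1%} of the CNN's GFLOPS/W")
    for solver, factor in bandwidth_headroom.items():
        print(f"{solver}: up to {factor:.1f}x from bandwidth utilization")
```

Read this way, the optical flow implementations run at a few percent of the CNN's energy efficiency, and both Horn-Schunck solvers sit roughly an order of magnitude below the bandwidth bound, which is consistent with the stated plan to address bandwidth utilization first.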

We also compare the runtime-accuracy tradeoffs of the three optical flow methods. We find that each method has distinct advantages over the others in this tradeoff, so we will continue to develop and support all three methods in the future.


BibTeX citation:

@techreport{Anderson:EECS-2014-22,
    Author = {Anderson, Michael and Iandola, Forrest and Keutzer, Kurt},
    Title = {Quantifying the Energy Efficiency of Object Recognition and Optical Flow},
    Institution = {EECS Department, University of California, Berkeley},
    Year = {2014},
    Month = {Mar},
    URL = {http://www.eecs.berkeley.edu/Pubs/TechRpts/2014/EECS-2014-22.html},
    Number = {UCB/EECS-2014-22},
    Abstract = {In this report, we analyze the computational and performance aspects of current state-of-the-art object recognition and optical flow algorithms. First, we identify important algorithms for object recognition and optical flow, then we perform a pattern decomposition to identify key computations. We include profiles of the runtime and energy efficiency (GFLOPS/W) for our implementation of these applications on a commercial architecture. Finally, we include an analysis of memory-bandwidth boundedness for optical flow to identify opportunities for communication-avoiding algorithms.

Our results were measured on an Intel i7-4770K (Haswell) reference platform. A five-layer convolutional neural network used for object classification achieves 0.70 GFLOPS/W, which is 21% of the theoretical compute bound for this Haswell processor. For the Horn-Schunck, Lucas-Kanade, and Brox optical flow methods, our implementations achieve 0.0338, 0.0103, and 0.0203 GFLOPS/W, respectively. For Horn-Schunck optical flow, our implementation achieves 7.9% of the theoretical bandwidth bound (assuming no cross-iteration memory optimization) using the Jacobi solver, and 9.7% of the bandwidth bound using the conjugate-gradient solver. To improve performance, we will focus first on increasing bandwidth utilization, then on cross-iteration memory optimizations such as blocking and tiling the Jacobi solver and employing communication-avoiding linear solvers.

We also compare the runtime-accuracy tradeoffs of the three optical flow methods. We find that each method has distinct advantages over the others in this tradeoff, so we will continue to develop and support all three methods in the future.}
}

EndNote citation:

%0 Report
%A Anderson, Michael
%A Iandola, Forrest
%A Keutzer, Kurt
%T Quantifying the Energy Efficiency of Object Recognition and Optical Flow
%I EECS Department, University of California, Berkeley
%D 2014
%8 March 28
%@ UCB/EECS-2014-22
%U http://www.eecs.berkeley.edu/Pubs/TechRpts/2014/EECS-2014-22.html
%F Anderson:EECS-2014-22