Electrical Engineering and Computer Sciences

COLLEGE OF ENGINEERING

UC Berkeley

Towards Energy Efficient MapReduce

Yanpei Chen, Laura Keys and Randy H. Katz

EECS Department
University of California, Berkeley
Technical Report No. UCB/EECS-2009-109
August 5, 2009

http://www.eecs.berkeley.edu/Pubs/TechRpts/2009/EECS-2009-109.pdf

Energy considerations are important for Internet datacenter operators, and MapReduce is a common Internet datacenter application. In this work, we use the energy efficiency of MapReduce as a new perspective for increasing Internet datacenter productivity. We offer a framework to analyze software energy efficiency in general, and MapReduce energy efficiency in particular. We characterize the performance of the Hadoop implementation of MapReduce under different workloads. We also introduce quantitative models to guide operators and developers in improving the performance of MapReduce/Hadoop. A major, but somewhat unsurprising, finding is that for workloads where the work rate is proportional to the amount of resources used, improving performance as measured by traditional metrics such as job duration is equivalent to improving performance as measured by energy consumed.
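The final claim of the abstract can be illustrated with a small back-of-the-envelope sketch. This is not the model from the report. It assumes, purely for illustration, that each node draws constant power while the job runs, that the aggregate work rate is the node count times a per-node rate, and that energy is power multiplied by duration. All function and parameter names below are hypothetical.

```python
def job_duration_s(work_units, nodes, rate_per_node):
    """Job duration in seconds, assuming the aggregate work rate is
    proportional to the resources used: nodes * rate_per_node."""
    return work_units / (nodes * rate_per_node)


def job_energy_j(work_units, nodes, rate_per_node, watts_per_node):
    """Job energy in joules, assuming constant per-node power draw
    for the whole job: energy = power * duration."""
    duration = job_duration_s(work_units, nodes, rate_per_node)
    return nodes * watts_per_node * duration


if __name__ == "__main__":
    # Illustrative numbers only; they are not measurements from the report.
    for rate in (10.0, 20.0):
        t = job_duration_s(1000.0, 4, rate)
        e = job_energy_j(1000.0, 4, rate, 200.0)
        print(f"rate_per_node={rate} duration={t:.1f} s energy={e:.0f} J")
```

Under these assumptions, doubling the per-node work rate halves both the duration and the energy, so optimizing for one metric also optimizes for the other. The equivalence need not hold once power draw varies with load or the work rate stops scaling with the resources used.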


BibTeX citation:

@techreport{Chen:EECS-2009-109,
    Author = {Chen, Yanpei and Keys, Laura and Katz, Randy H.},
    Title = {Towards Energy Efficient MapReduce},
    Institution = {EECS Department, University of California, Berkeley},
    Year = {2009},
    Month = {Aug},
    URL = {http://www.eecs.berkeley.edu/Pubs/TechRpts/2009/EECS-2009-109.html},
    Number = {UCB/EECS-2009-109},
    Abstract = {Energy considerations are important for Internet datacenter operators, and MapReduce is a common Internet datacenter application. In this work, we use the energy efficiency of MapReduce as a new perspective for increasing Internet datacenter productivity. We offer a framework to analyze software energy efficiency in general, and MapReduce energy efficiency in particular. We characterize the performance of the Hadoop implementation of MapReduce under different workloads. We also introduce quantitative models to guide operators and developers in improving the performance of MapReduce/Hadoop. A major, but somewhat unsurprising, finding is that for workloads where the work rate is proportional to the amount of resources used, improving performance as measured by traditional metrics such as job duration is equivalent to improving performance as measured by energy consumed.}
}

EndNote citation:

%0 Report
%A Chen, Yanpei
%A Keys, Laura
%A Katz, Randy H.
%T Towards Energy Efficient MapReduce
%I EECS Department, University of California, Berkeley
%D 2009
%8 August 5
%@ UCB/EECS-2009-109
%U http://www.eecs.berkeley.edu/Pubs/TechRpts/2009/EECS-2009-109.html
%F Chen:EECS-2009-109