Failure Analysis of Internet Services

Archana S. Ganapathi

EECS Department
University of California, Berkeley
Technical Report No. UCB/CSD-03-1255
July 2003

http://www.eecs.berkeley.edu/Pubs/TechRpts/2003/CSD-03-1255.pdf

We present operational characteristics and failure data for several large-scale Internet services. We use case studies and data to broaden the range of metrics in this analysis. We find that operator-induced errors have the greatest impact and are also the hardest failures to mask. Failure-mitigation techniques, such as configuration checking, online testing, and fault/load injection, improve Internet service availability.


BibTeX citation:

@techreport{Ganapathi:CSD-03-1255,
    Author = {Ganapathi, Archana S.},
    Title = {Failure Analysis of Internet Services},
    Institution = {EECS Department, University of California, Berkeley},
    Year = {2003},
    Month = {Jul},
    URL = {http://www.eecs.berkeley.edu/Pubs/TechRpts/2003/5456.html},
    Number = {UCB/CSD-03-1255},
    Abstract = {We present operational characteristics and failure data for several large-scale Internet services. We use case studies and data to broaden the range of metrics in this analysis. We find that operator-induced errors have the greatest impact and are also the hardest failures to mask. Failure-mitigation techniques, such as configuration checking, online testing, and fault/load injection, improve Internet service availability.}
}

EndNote citation:

%0 Report
%A Ganapathi, Archana S.
%T Failure Analysis of Internet Services
%I EECS Department, University of California, Berkeley
%D 2003
%@ UCB/CSD-03-1255
%U http://www.eecs.berkeley.edu/Pubs/TechRpts/2003/5456.html
%F Ganapathi:CSD-03-1255