Electrical Engineering and Computer Sciences

COLLEGE OF ENGINEERING

UC Berkeley

Data Triage

Frederick Ralph Reiss

EECS Department
University of California, Berkeley
Technical Report No. UCB/EECS-2007-79
June 1, 2007

http://www.eecs.berkeley.edu/Pubs/TechRpts/2007/EECS-2007-79.pdf

Enterprise networks are becoming more complex and more vital to daily operations. To cope with these changes, network administrators need new tools for troubleshooting problems quickly in the face of ever more sophisticated adversaries. Passive network monitoring with declarative queries can provide the combination of responsiveness, focus, and flexibility that administrators need. But networks are subject to high-speed bursts of data, and keeping the cost of passive monitoring hardware under control is a major problem. In this dissertation, I propose an approach to passive network monitoring in which the monitor is provisioned for the average data rate on the network. This average rate is generally an order of magnitude or more lower than the peak rate. I describe Data Triage, an architecture that wraps a general-purpose streaming query processor with a software fallback mechanism that uses approximate query processing to provide timely answers during bursts. I analyze the policy issues that this architecture exposes and present Delay Constraints, an API and associated scheduling algorithm for managing Data Triage. I then describe my work on novel query approximation techniques to make Data Triage's fallback mechanism work with an important class of monitoring queries. Finally, I describe a deployment study of Data Triage in the context of a prototype end-to-end network monitoring system at Lawrence Berkeley National Laboratory.
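The idea of provisioning for the average rate and falling back to approximation during bursts can be illustrated with a toy sketch. This is not the dissertation's implementation: the function name `triage_window`, the `capacity` parameter (a stand-in for a delay constraint), and the proportional-scaling estimate are all assumptions made for illustration, and the query is assumed to be a simple per-key count.

```python
from collections import Counter


def triage_window(tuples, capacity):
    """Process one time window of (key, value) tuples.

    `capacity` is a hypothetical stand-in for a delay constraint: the number
    of tuples the exact query path can handle before the window's deadline.
    Returns (estimated per-key counts, number of tuples that were shed).
    """
    exact_part = tuples[:capacity]   # handled by the full query processor
    shed_part = tuples[capacity:]    # overflow: summarized, not processed

    # Exact per-key counts for the tuples that fit within the budget.
    exact_counts = Counter(key for key, _ in exact_part)

    # Cheap summary of the shed tuples (assumption: only their total count).
    shed_total = len(shed_part)

    # Approximate answer: assume shed tuples follow the same key
    # distribution as the processed ones and scale the exact counts.
    processed = len(exact_part)
    scale = (processed + shed_total) / processed if processed else 0.0
    estimate = {key: count * scale for key, count in exact_counts.items()}
    return estimate, shed_total


# Quiet window: everything fits, so the answer is exact.
quiet_estimate, quiet_shed = triage_window([("a", 0), ("b", 0), ("a", 0)], 4)

# Burst window: half of the tuples are shed and accounted for by scaling.
burst_estimate, burst_shed = triage_window([("a", 0), ("b", 0)] * 4, 4)
```

In the quiet window nothing is shed and the counts are exact; in the burst window the answer is still produced on time, at the cost of being an estimate.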

Advisor: Joseph M. Hellerstein


BibTeX citation:

@phdthesis{Reiss:EECS-2007-79,
    Author = {Reiss, Frederick Ralph},
    Title = {Data Triage},
    School = {EECS Department, University of California, Berkeley},
    Year = {2007},
    Month = {Jun},
    URL = {http://www.eecs.berkeley.edu/Pubs/TechRpts/2007/EECS-2007-79.html},
    Number = {UCB/EECS-2007-79},
    Abstract = {Enterprise networks are becoming more complex and more vital to daily operations. To cope with these changes, network administrators need new tools for troubleshooting problems quickly in the face of ever more sophisticated adversaries. Passive network monitoring with declarative queries can provide the combination of responsiveness, focus, and flexibility that administrators need. But networks are subject to high-speed bursts of data, and keeping the cost of passive monitoring hardware under control is a major problem.

In this dissertation, I propose an approach to passive network monitoring in which the monitor is provisioned for the average data rate on the network. This average rate is generally an order of magnitude or more lower than the peak rate. I describe Data Triage, an architecture that wraps a general-purpose streaming query processor with a software fallback mechanism that uses approximate query processing to provide timely answers during bursts. I analyze the policy issues that this architecture exposes and present Delay Constraints, an API and associated scheduling algorithm for managing Data Triage. I then describe my work on novel query approximation techniques to make Data Triage's fallback mechanism work with an important class of monitoring queries. Finally, I describe a deployment study of Data Triage in the context of a prototype end-to-end network monitoring system at Lawrence Berkeley National Laboratory.}
}

EndNote citation:

%0 Thesis
%A Reiss, Frederick Ralph
%T Data Triage
%I EECS Department, University of California, Berkeley
%D 2007
%8 June 1
%@ UCB/EECS-2007-79
%U http://www.eecs.berkeley.edu/Pubs/TechRpts/2007/EECS-2007-79.html
%F Reiss:EECS-2007-79