Ling Huang

EECS Department, University of California, Berkeley

Technical Report No. UCB/EECS-2007-119

September 24, 2007

http://www2.eecs.berkeley.edu/Pubs/TechRpts/2007/EECS-2007-119.pdf

As the Internet evolves into a valuable and critical service platform for our business and daily life, there has been growing interest in large-scale distributed monitoring for network infrastructures. The monitoring systems collect and aggregate information describing status and performance of networked systems. Remote monitor sensors are typically deployed throughout the network, generating numerous large and widely distributed, continuous or discrete time-series data streams that represent large-scale information flows from multiple vantage points. Existing research in system monitoring and data management involves periodically pushing all monitored data to a central Network Operation Center (NOC) for sophisticated analysis and anomaly detection. However, such a "periodic push" approach suffers from timescale and size scalability limitations. Many anomalies occur on much smaller time scales than typical polling periods. Detecting events on a second (or sub-second) time scale requires collecting monitoring data at that same granularity, which dramatically increases the volume of measurement data transmitted through the network. Similarly, an order of magnitude (or more) increase in the number of monitors also causes a massive growth in the volume of collected data, and could overload the central processing site's network capacity, especially for networks such as sensor networks, wireless networks, and enterprise networks.

In this dissertation, we design and develop D-Trigger as a general framework for efficient online anomaly detection. D-Trigger addresses the lack of efficiency and flexibility in today's distributed monitoring and anomaly detection systems, and proposes a general framework which gracefully integrates a variety of decision-making and optimization algorithms for online detection. The key goals and accomplishments in this dissertation are to: 1) enable real-time detection where the system's state is tracked continuously, so even the smallest anomalies will be exposed; 2) significantly reduce the data collected for anomaly detection, thus reducing the communication burden placed on the network; 3) guarantee desired detection accuracy even with the reduced amount of collected data. To achieve the three goals, D-Trigger combines in-network processing at distributed local sites, and decision making at the NOC. The combination of distributed local processing strategies, sophisticated detection algorithms, and theoretical analysis tools enables D-Trigger to perform in-network tracking with very high detection accuracy and low communication overhead. In addition, D-Trigger is able to accommodate a broad set of statistical learning algorithms for the detection of various unusual events, including botnet attacks, volume anomalies in an ISP network, electric power grid anomalies, etc.
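The trade-off described above, local processing at the monitors with decision making at the NOC, can be illustrated with a small sketch. This is not the dissertation's algorithm: it is a generic slack-based filter for a distributed SUM threshold, written only to make the idea concrete. The names `Monitor`, `Coordinator`, `run`, `slack`, and `threshold` are all hypothetical. In this sketch each monitor stays silent while its value is within `slack` of the last value it reported, so the coordinator's estimate of the SUM differs from the true SUM by at most the number of monitors times `slack`.

```python
from dataclasses import dataclass, field


@dataclass
class Monitor:
    """Hypothetical local site: reports only when its value drifts beyond `slack`."""
    slack: float
    last_sent: float = 0.0

    def observe(self, value):
        """Return the value to send to the coordinator, or None to stay silent."""
        if abs(value - self.last_sent) > self.slack:
            self.last_sent = value
            return value
        return None


@dataclass
class Coordinator:
    """Hypothetical central site (the NOC role) tracking an approximate SUM."""
    threshold: float
    reported: dict = field(default_factory=dict)

    def update(self, monitor_id, value):
        # Remember the most recent value each monitor chose to report.
        self.reported[monitor_id] = value

    def estimate(self):
        # Monitors that never reported are implicitly counted as 0.
        return sum(self.reported.values())

    def triggered(self):
        return self.estimate() > self.threshold


def run(streams, slack, threshold):
    """Replay per-monitor streams; return (messages_sent, alarm flag per time step)."""
    monitors = [Monitor(slack) for _ in streams]
    noc = Coordinator(threshold)
    messages = 0
    alarms = []
    for step in zip(*streams):
        for i, value in enumerate(step):
            msg = monitors[i].observe(value)
            if msg is not None:
                noc.update(i, msg)
                messages += 1
        alarms.append(noc.triggered())
    return messages, alarms


if __name__ == "__main__":
    # Two monitors, four time steps; a periodic push would send 8 messages.
    streams = [[10, 11, 10, 30], [10, 10, 12, 31]]
    print(run(streams, slack=5, threshold=50))
```

A larger `slack` sends fewer messages but widens the error bound on the estimated SUM, which is the accuracy-versus-communication trade-off the abstract refers to.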

Our work on D-Trigger has resulted in an efficient detection system which is capable of detecting a wide range of anomaly types in distributed systems in near real time with bounded detection error. The system can be applied to a wide variety of monitoring problems and domains, ranging from simple monitor functions (e.g., SUM, AVG, MIN, and MAX) to complex mathematical functions, and spanning areas such as sensor networks, enterprise networks, and even power distribution networks.

Advisor: Anthony D. Joseph


BibTeX citation:

@phdthesis{Huang:EECS-2007-119,
    Author= {Huang, Ling},
    Title= {D-Trigger: A General Framework for Efficient Online Detection},
    School= {EECS Department, University of California, Berkeley},
    Year= {2007},
    Month= {Sep},
    Url= {http://www2.eecs.berkeley.edu/Pubs/TechRpts/2007/EECS-2007-119.html},
    Number= {UCB/EECS-2007-119},
    Abstract= {As the Internet evolves into a valuable and critical 
service platform for our business and daily life, there has 
been growing interest in large-scale distributed monitoring
for network infrastructures. The monitoring systems collect 
and aggregate information describing status and performance of networked systems. Remote monitor sensors are typically deployed throughout the network generating numerous 
large and widely-distributed, continuous or discrete timeseries data streams, representing large-scale information flows from multiple vantage points. Existing research in system monitoring and data management involves periodically pushing all monitored data to a central Network Operation Center (NOC) for sophisticated analysis and anomaly detection. However, such a ``periodic push'' 
approach suffers from timescale and size scalability limitations. Many anomalies occur on much smaller time scales than typical polling periods. Detecting events on a second (or sub-second) time scale requires that the volume of measurement data transmitted through the network increase dramatically because the monitoring data must be collected on a second (or sub-second) time scale. Similarly, an order of magnitude (or more) increase in the number of monitors also causes a massive growth in the volume of collected 
data, and could overload the central processing site's network capacity, especially for networks such as sensor networks, wireless networks, and enterprise networks. 

In this dissertation, we design and develop D-Trigger as a 
general framework for efficient online anomaly detection. 
D-Trigger addresses the lack of efficiency and flexibility 
in today's distributed monitoring and anomaly detection systems, and proposes a general framework which gracefully integrates a variety of decision-making and optimization algorithms for online detection. The key goals and accomplishments in this dissertation are to: 1) enable real-time detection where the system's state is tracked continuously, so even the smallest anomalies will be exposed; 2) significantly reduce the data collected for 
anomaly detection, thus reducing the communication burden placed on the network; 3) guarantee desired detection accuracy even with the reduced amount of collected data. 
To achieve the three goals, D-Trigger combines in-network 
processing at distributed local sites, and decision making at the NOC. The combination of distributed local processing strategies, sophisticated detection algorithms, and theoretical analysis tools enables D-Trigger to perform in-network tracking with very high detection accuracy and low communication overhead. In addition, D-Trigger is able to accommodate a broad set of statistical learning algorithms for the detection of various unusual events, including botnet attacks, volume anomalies in an ISP network, electric power grid anomalies, etc.

Our work on D-Trigger has resulted in an efficient detection system which is capable of detecting a wide range of anomaly types in distributed systems in near real time with bounded detection error. The system can be applied to a wide variety of monitoring problems and domains, ranging from simple monitor functions (e.g., SUM, AVG, MIN, and MAX) to complex mathematical functions, and spanning areas such as sensor networks, enterprise networks, and even power distribution networks.},
}

EndNote citation:

%0 Thesis
%A Huang, Ling 
%T D-Trigger: A General Framework for Efficient Online Detection
%I EECS Department, University of California, Berkeley
%D 2007
%8 September 24
%@ UCB/EECS-2007-119
%U http://www2.eecs.berkeley.edu/Pubs/TechRpts/2007/EECS-2007-119.html
%F Huang:EECS-2007-119