Electrical Engineering and Computer Sciences



UC Berkeley


2009 Research Summary

RIOT Backplane/Chukwa/Monitoring Tools


Ariel Rabkin and Andrew Konwinski

Any large distributed system has at least one system, and often several, for managing and aggregating monitoring data, particularly telemetry, trace data, and alarms. A system such as the RAD Lab's proposed director has unusually demanding data collection needs: it requires both large batch tasks for detecting trends and anomalies and low-latency processing for real-time control.

We are building a system, called Chukwa, to meet both needs. Chukwa separates the application-specific logic for collection and analysis from the problem of scalable processing and data retention. It does this by building on Hadoop, using Hadoop's scalable MapReduce processing for batch analysis and long-term data storage. Indexing and low-latency analysis are major research priorities for us. Chukwa is being developed in cooperation with Yahoo!.
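
To make the separation described above concrete, here is a minimal sketch of a batch analysis written in the MapReduce style, in plain Python. It is not Chukwa's or Hadoop's API: the record format, the sample data, and the names `map_errors`, `reduce_sum`, and `run_batch` are all assumptions made for illustration. The point is only that the application-specific logic (the mapper and reducer) stays apart from the generic machinery that groups and processes the records.

```python
from collections import defaultdict

# Hypothetical record format (an assumption, not Chukwa's actual format):
# "<epoch_seconds> <host> <level> <message>"
SAMPLE_LOG = [
    "1230768000 web01 ERROR disk full",
    "1230768010 web01 INFO request served",
    "1230768030 web02 ERROR timeout",
    "1230768065 web01 ERROR disk full",
]


def map_errors(line):
    """Application-specific map step: emit ((host, minute), 1) for each ERROR record."""
    ts, host, level, _msg = line.split(" ", 3)
    if level == "ERROR":
        minute = int(ts) // 60 * 60  # truncate the timestamp to the start of its minute
        yield (host, minute), 1


def reduce_sum(key, values):
    """Application-specific reduce step: total the counts emitted for one key."""
    return key, sum(values)


def run_batch(records, mapper, reducer):
    """Generic driver standing in for the scalable processing layer.

    It groups mapper output by key (the "shuffle") and applies the reducer
    to each group. A real deployment would hand this work to a cluster.
    """
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)
    return dict(reducer(key, values) for key, values in sorted(groups.items()))


error_counts = run_batch(SAMPLE_LOG, map_errors, reduce_sum)
for (host, minute), count in error_counts.items():
    print(host, minute, count)
```

Under these assumptions, a different analysis would mean swapping in a different mapper and reducer while leaving `run_batch` untouched, which is the kind of decoupling the summary describes.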