Electrical Engineering and Computer Sciences


UC Berkeley


2010 Research Summary

Scalable Consistency Adjustable Data Storage (SCADS)

Michael Armbrust, Armando Fox, Michael Franklin, Nick Lanham, David A. Patterson, Beth Trushkowsky and Jesse Trutna

Sun Microsystems, Google, Microsoft, Hewlett-Packard, Cisco Systems, Oracle, IBM, Fujitsu, NetApp, Siemens, VMware and Facebook

Modern user-facing web applications such as Facebook, Flickr, Yelp, the Amazon storefront, and the various Google properties present new challenges for storing and querying data at the multiple-terabyte scale. Although generally tolerant of stale reads, such systems require response times measured in milliseconds, face availability requirements nearing 100%, and must scale under bursty and exponentially increasing data storage loads. Under these requirements, traditional usage patterns change: ad-hoc queries against the production system become dangerous, and intrusive migration schemes become infeasible. We believe there is an opportunity for a highly scalable system that provides reasoned trade-offs between consistency, availability, and performance in this space. We propose a new system, SCADS, that provides for the declarative specification of an application's consistency and performance requirements, takes advantage of utility computing to provide cost-effective rapid scale-up and scale-down, pre-computes queries to decrease response time, and uses machine learning models to anticipate performance problems and predict the runtime of new queries before execution.