FlashBack is a peer-to-peer replication middleware designed specifically for power-constrained devices. FlashBack provides a reliable storage layer for personal area networks (PANs) by spreading backup data among peers. It is designed to be scalable, fault-tolerant and opportunistic in the presence of fixed infrastructure. It also leverages heterogeneity within the PANs. This project examines the challenges presented by mobile personal devices. We propose the FlashBack architecture and algorithms, and present experimental results that illustrate FlashBack's performance characteristics.
1Staff, Intel Research Laboratory at Seattle
2Staff, Intel Research Laboratory at Seattle
This project examines the feasibility of peer-to-peer full-text keyword search of the Web. Two classes of keyword search techniques are in use or have been proposed: flooding of queries over an overlay network (as in Gnutella), and intersection of index lists stored in a distributed hash table (DHT). A simple model of capacity and workload suggests that the Internet does not have enough capacity to make naive use of either of these techniques attractive for Web search. We then present a number of existing and novel optimizations, such as caching and precomputation, bloom filters, gap compression and adaptive set intersections  with clustering. We estimate their effect on performance, and conclude that in combination these optimizations would bring the problem to within an order of magnitude of feasibility. To achieve the last order of magnitude, we propose using Fagin's algorithm (FA)  in conjunction with a ranking function to generate incremental results. This may allow savings in communication if users terminate their queries early.
We intend to build a DHT-based search engine, and compare its performance characteristics with Gnutella and KaZaA.
Recent work in the networking and OS research communities have developed (and continue to improve) highly scalable peer-to-peer systems called distributed hash tables (DHTs). These systems can provide the foundation for designing a massively distributed query engine. Such a query engine can be utilized not only to query over a wide-area network (tapping into distributed resources), but also query data that naturally resides in the network without the need to transport the data to a central collection point. We believe such as system will be useful in many applications, particularly for network monitoring.
DHTs are able to map a key (a sequence of bits) to a node in the network in the face of simultaneous nodes entering, leaving, and failing in the network. Above this mapping, the common hash table function, put() and get() can be implemented. We extend this interface slightly to allow for query processing.
We use existing hash-based distributed/parallel relational join algorithms (such as the symmetric hash join) while relaxing traditional consistency semantics (ACID) to allow the system to scale to over ten-thousand machines in simulation, and (so-far) dozens in an actual implementation over the Millennium Cluster.
Our primary research focus is on supporting continuous queries, which appear to be natural to the class of applications we are interested in supporting. In addition to relational joins, aggregation is also needed for many of the applications. Key properties for any operator include low communication, little or no synchronization, and are non-blocking. These open the door to questions of dynamic query optimization, approximate answers (such as online aggregation), and multi-query sharing.
1International Computer Science Institute
Typically, database systems use a static query execution plan for executing a given query. Adaptive query execution refers to adapting the query execution plan in response to changing characteristics of data. Recently, there has been a lot of interest in this problem, e.g., .
In this work, we try to apply competitive analysis techniques to the adaptive query exeuction problem to understand the problem better. We are currently addressing a simplified version of the problem where the queries only contain join operations, and the joins are implemented using the symmetric hash join operator . We also assume that the execution model of the online algorithms is similar to the Eddies mechanism proposed by Avnur and Hellerstein . We are later planning to extend the analysis to more complicated cost models, as well as to join algorithms that spill data to disk.
Current database systems treat database relations as a whole when executing queries over them. For example, for executing the declaratively specified query R JOIN S JOIN T, a typical database system considers only two execution plans: (1) join R with S, and join the result with T, and (2) join S with T, and join the result with R. But in many cases, it might be significantly cheaper to partition the relations involved into multiple parts, and to use different execution plans for different partitions of the relations. For example, for the above example query, partitioning S into two partitions S1 and S2, and executing different plans for the two subqueries R JOIN S1 JOIN T, and R JOIN S2 JOIN T, may lead to faster execution times, as well as lower memory footprint.
Currently, we are trying to define the problem of horizontal data partitioning formally, and trying to develop exhaustive algorithms for finding the optimal partitioning of the database relations that minimize the number of intermediate tuples generated. Such an exhaustive algorithm will probably be infeasible in practice, and we are planning to develop heuristic algorithms to find good partitioning (using summary information such as histograms present on the relations).
Most relational query optimizers make use of information about the costs of accessing tuples and data structures on various storage devices. This information can at times be off by several orders of magnitude due to human error in configuration setup, sudden changes in load, or hardware failure. We are interested the following questions:
1IBM Almaden Research Center
Continuous query (CQ) systems have been proposed for a variety of critical, high-throughput, long-running tasks. Examples include network monitoring tasks such as intrusion or denial-of-service detection, phone call record processing, web-log, or click-stream processing. These applications can be implemented as continuously processing dataflows, a generalization of database query plans. To provide high-throughput and low-latencies, CQ dataflows can be scaled by parallelizing them across a network of workstations, also known as a cluster or a shared-nothing platform.
Due to the long-running nature of CQ dataflows, two types of problems that arise in a cluster must be addressed for this method of scaling dataflows to be viable. First, as workload and runtime conditions evolve, detrimental imbalances in resource availability can arise across the cluster. CQ dataflows must continue to run efficiently in the face of these imbalances. Second, the machines on which CQ dataflows are running are bound to unexpectedly fail. Without using any redundancy, a single failure can seriously hamper or defeat the purpose of the CQ dataflow altogether. The CQ system must prevent data loss and provide high-availability for critical, long-running applications.
Our solution is to embed the logic for load-balancing and fault-tolerance into the communication abstraction, Flux, that is used to glue together a dataflow's constituent operators. In the absence of faults, Flux manages the state and input data partitioning of CQ operators to rebalance and keep them running efficiently. It also coordinates redundant processing of CQ operators and automatic fail-over to provide high-availability. The Flux mechanism is tunable so that an application designer can trade reliability for performance in portions of a CQ dataflow that do not require high-reliability. By judiciously embedding Flux in a CQ dataflow, an application designer can provide controlled degradation for the dataflow in the face of resource imbalances and machine faults that arise unexpectedly.
We are currently implementing Flux in the TelegraphCQ code base and developing robust CQ applications using Flux to demonstrate its utility.