Scale-Independent
Query Processing
with PQL
Michael Armbrust
http://www.cs.berkeley.edu/~marmbrus/

Applications Grow Rapidly

Success Disaster

Databases Often the Culprit

Easy to add stateless application servers. [1]

Harder to scale stateful storage systems. [2]

[1] Michael Armbrust, Armando Fox, Rean Griffith, Anthony D. Joseph, Randy Katz, Andy Konwinski, Gunho Lee, David Patterson, Ariel Rabkin, Ion Stoica, and Matei Zaharia. 2010. A view of cloud computing. Commun. ACM 53, 4 (April 2010), 50-58. DOI=10.1145/1721654.1721672 http://doi.acm.org/10.1145/1721654.1721672 online
[2] John Adams. "Billions of Hits: Scaling Twitter" Chirp 2010. online

Traditional Relational Databases

Data Independence

Dirty Secret: Declarative languages hide complexity.

Traditional Relational Databases

Minimum Average Cost Optimization

Dirty Secret: Fails for the 0.1%! http://www.flickr.com/photos/stephen_downes/6246522381/

NoSQL 'Solution'

PIQL Solution

Guarantee scalability for sophisticated relational queries

Architecture Overview

PIQL uses a distributed key/value store [1] for:

[1] Michael Armbrust, Armando Fox, David Patterson, Nick Lanham, Haruki Oh, Beth Trushkowsky, and J. Trutna. SCADS: Scale-independent Storage for Social Computing Applications. In Conference on Innovative Data Systems Research (CIDR), 2009. online
[2] Beth Trushkowsky, Peter Bodík, Armando Fox, Michael J. Franklin, Michael I. Jordan, and David A. Patterson. 2011. The SCADS director: scaling a distributed storage system under stringent performance requirements. In Proceedings of FAST'11 online

Outline

Scale-Independent Optimization

Bound work required to answer all queries [1]

Given the set of all queries in an application $Q$. Let $Exec(q_i)$ denote the number of operations performed in the worst case by a query and $c_{ops}^{q_i}$ be a constant.

$$\forall q_i \in Q \exists c_{ops}^{q_i} : Exec(q_i) < c_{ops}^{q_i}$$
[1] M. Armbrust, K. Curtis, T. Kraska, A. Fox, M. J. Franklin, and D. A. Patterson. Piql: Success-tolerant query processing in the cloud. PVLDB, 5(3):181–192, 2011. online

Mapping Queries to K/V Ops

Example

SELECT *
FROM Posts
WHERE topicId = <id>
ORDER BY timestamp
LIMIT 10
$c_{ops}=1$ GetRange

Mapping Queries to Key/Value Ops

Index Scan

$c_{ops}=1$ GetRange

Mapping Queries to Key/Value Ops

Foreign Key Join

$c_{ops}=|ChildPlan|$ Gets

Mapping Queries to Key/Value Ops

(Sorted) Index Join

$c_{ops}=|Child Plan|$ GetRanges

Handling Harder Queries

Ex: Timeline queries

SELECT p.text
FROM Post p, Subscriptions s
WHERE s.owner = <currentUser> AND
      s.topicId = p.topicId AND
      s.approved = true
ORDER BY p.timestamp

Why is this safe to run? Developers have more information than optimizer.

Language Extensions

  • Bounding Data Returned

    PAGINATE turns an unbounded query into an unbounded set of bounded queries

Language Extensions

  • Bounding Data Returned

    PAGINATE turns an unbounded query into an unbounded set of bounded queries

  • Bounding Intermediate Results

    Cardinality constraints in the schema allows developers to inform the optimizer of natural relationship limits.

Bound Intermediate and Final Results

SLO Compliance Prediction

Suggests solutions to scale-dependent queries and helps choose acceptable cardinalities

Measure Single-Op
Performance

Predict Query SLO
Compliance

Evaluation

TPC-W Scale Experiment

Throughput

Linear scaling as application grows.

Latency

Constant response time as application grows.

Scale-Independent Precomputation

Precomputable Queries

Precomputation Workflow

Example: Precomputable Query

Show the 10 most recent updates tagged with two words:

SELECT t1.docId 
FROM tags t1, tags t2, docs d
WHERE d.docId = t1.docId AND
      t1.docId = t2.docId AND
      t1.tag = <tag1> AND
      t2.tag = <tag2>
ORDER BY d.timestamp DESC
PAGINATE 10

Example: Precomputing the Answer

Created a view that precomputes all two-tag combinations for each document

CREATE VIEW twoTags
SELECT t1.tag, t2.tag, d.timestamp,
       t1.docId
FROM tags t1, tags t2, docs d
WHERE d.docId = t1.docId AND
      t1.docId = t2.docId AND
      t1.tag = <tag1> AND 
      t2.tag = <tag2> 
ORDER BY d.timestamp
PAGINATE 10
		

View Selection Algorithm

PIQL creates a view that contains the results of a query for any set of parameters as adjacent rows.

[1] Ceri, S. and Widom, J. (1991) Deriving Production Rules for Incremental View Maintenance. In: VLDB.

Bounding Storage Resources

PIQL ensures the view will grow linearly with the size of the base relations

Let $V$ be the set of all created views required to answer the queries in $Q$. Let $c_{size}^{v_i}$ be a constant for a view $v_i$ and let $r_{size}^{v_i}$ be a relation in $R$. PIQL bounds the storage independent of scale by ensuring that:

$$\forall v_i \in V \exists c_{size}^{v_i}, r_{size}^{v_i} \in R : c_{size}^{v_i}|r_{size}^{v_i}| \ge |v_i|$$

Algorithm for Bounding View Size

Find edge cover in the functional dependencies

Size Bound for TwoTags View

CREATE VIEW twoTags
SELECT t1.tag, t2.tag, 
       d.timestamp, t1.docId
FROM tags t1, tags t2, docs d
WHERE d.docId = t1.docId AND
      t1.id = t2.id
$|twoTagsView| \le K|Tags|$

Bounding Maintenance Operations

PIQL ensures each update can be processed with a bounded number of operations

Let $Update(r_i)$ denote the number of operations performed in the worst case by index and view maintenance when updating a single tuple in $r_i$, and let $c_{ops}^{r_i}$ be a constant. The PIQL optimizer will only create views where it can ensure the maintenance cost obeys the following:

$$\forall r_i \in R \exists c_{ops}^{r_i} : Update(r_i) < c_{ops}^{r_i}$$

Bounding Maintenance Operations

Reuse optimization techniques on delta queries.

Original Query
CREATE VIEW twoTags
SELECT t1.tag, t2.tag, 
       d.timestamp, t1.docId
FROM tags t1, tags t2, docs d
WHERE d.docId = t1.docId AND
      t1.id = t2.id
t1 Delta Query
CREATE RULE dt1
ON INSERT t1
INSERT INTO twoTags
SELECT <tag>, t2.tag, 
       d.timestamp, d.docId
FROM tags t2, documents d 
WHERE t2.docId = <docId> AND
      d.id = <docId>

Splitting Unbounded Updates

To increase expressivity allow unbounded updates to be split

Due to cost requires explicit authorization from developer

SLO Compliance of Updates

Results for running two tags query on increasing cluster sizes

99% Response Time (ms) Log-Scale

Read

Write

On-Demand
# of Machines

SLO Compliance of Updates

Results for running two tags query on increasing cluster sizes

99% Response Time (ms) Log-Scale

Read

Write

On-Demand
Precomputed
# of Machines

SLO Compliance of Updates

Results for running two tags query on increasing cluster sizes

99% Response Time (ms) Log-Scale

Read

Write

On-Demand
Precomputed
# of Machines

Future Work

Managing contention

PIQL currently allows data models with inherent contention:

Challenges

Expand the operator space

Other possible execution strategies include:

Challenges

Handling partial failures

Consistency requirements can delay update processing

Alternative: Eventually consistent view maintenance

Challenges

Language Integration

Developers prefer language bindings to opaque SQL strings.

case class Tag(var docId: Int, var tag: String) extends AvroPair
	
val twoTags = tags.as('t1).join(tags, 't2)
                  .where('t1_docId === 't2_docId)
                  .where('t1_tag === 0.?)
                  .where('t2_tag === 1.?)
                  .orderBy('timestamp)
                  .toPiql()

Challenges

Conclusion

By adding scale independence to the relational model, PIQL makes it easier to build web application that work even as they explode in popularity.

Techniques:

Acknowledgements

Michael Franklin, Armando Fox, David Patterson, Kristal Curtis, Tim Kraska, Stephen Tu, Eric Liang, deck.js

Learn More

http://www.cs.berkeley.edu/~marmbrus/

Questions?

/

#