Scale-Independent
Query Processing
with P
QL
Michael Armbrust
http://www.cs.berkeley.edu/~marmbrus/
Applications Grow Rapidly
Success Disaster
Databases Often the Culprit
Easy to add stateless application servers. [1]
Harder to scale stateful storage systems. [2]
[1] Michael Armbrust, Armando Fox, Rean Griffith, Anthony D. Joseph, Randy Katz, Andy Konwinski, Gunho Lee, David Patterson, Ariel Rabkin, Ion Stoica, and Matei Zaharia. 2010. A view of cloud computing. Commun. ACM 53, 4 (April 2010), 50-58. DOI=10.1145/1721654.1721672 http://doi.acm.org/10.1145/1721654.1721672
online
[2] John Adams. "Billions of Hits: Scaling Twitter" Chirp 2010.
online
Traditional Relational Databases
Data Independence
- Pro:
- Declarative queries specify what data, not how to retrieve it.
- Allows optimization and changes to physical/logical schema without changes to application.
- Con: Problems hidden until they cause issues in production.
Dirty Secret: Declarative languages hide complexity.
Traditional Relational Databases
Minimum Average Cost Optimization
http://www.flickr.com/photos/stephen_downes/6246522381/
NoSQL 'Solution'
- Pro: Provides predictable performance for simple operations.
-
Con: Simple operations force developers to reinvent complex abstractions.
PIQL Solution
Guarantee scalability for sophisticated relational queries
- Built on existing scalable key/value stores
- Developers retain productivity benefits of SQL
- Optimizer ensures all queries are scalable even for outliers
- Indexes and views created as needed when they can be proven scalable
- Model for reasoning about high-quantile SLO compliance
Architecture Overview
PIQL uses a distributed key/value store [1] for:
- Operations like
get(key) and getRange(prefix, limit)
- Scalability with predictable performance [2]
- Consistency
[1] Michael Armbrust, Armando Fox, David Patterson, Nick Lanham, Haruki Oh, Beth Trushkowsky, and J. Trutna. SCADS: Scale-independent Storage for Social Computing Applications. In Conference on Innovative Data Systems Research (CIDR), 2009.
online
[2] Beth Trushkowsky, Peter Bodík, Armando Fox, Michael J. Franklin, Michael I. Jordan, and David A. Patterson. 2011. The SCADS director: scaling a distributed storage system under stringent performance requirements. In Proceedings of FAST'11
online
Outline
- Answering Scale-Independent Queries On-Demand
- Mapping relational queries to bounded key/values store ops
- Language extensions
- SLO compliance prediction
- Expanding Scale Independence with Precomputation
- Transforming scale-dependent queries
- Bounding storage and maintenance costs
- Future Work
Scale-Independent Optimization
Bound work required to answer all queries [1]
Given the set of all queries in an application $Q$. Let $Exec(q_i)$ denote the number of operations performed in the worst case by a query and $c_{ops}^{q_i}$ be a constant.
$$\forall q_i \in Q \exists c_{ops}^{q_i} : Exec(q_i) < c_{ops}^{q_i}$$
[1] M. Armbrust, K. Curtis, T. Kraska, A. Fox, M. J. Franklin, and D. A. Patterson. Piql: Success-tolerant query processing in the cloud. PVLDB, 5(3):181–192, 2011.
online
Mapping Queries to K/V Ops
Example
SELECT *
FROM Posts
WHERE topicId = <id>
ORDER BY timestamp
LIMIT 10
|
 |
$c_{ops}=1$ GetRange |
Mapping Queries to Key/Value Ops
Index Scan
 |
$c_{ops}=1$ GetRange |
Mapping Queries to Key/Value Ops
Foreign Key Join
 |
$c_{ops}=|ChildPlan|$ Gets |
Mapping Queries to Key/Value Ops
(Sorted) Index Join
 |
$c_{ops}=|Child Plan|$ GetRanges |
Handling Harder Queries
Ex: Timeline queries
SELECT p.text
FROM Post p, Subscriptions s
WHERE s.owner = <currentUser> AND
s.topicId = p.topicId AND
s.approved = true
ORDER BY p.timestamp
|
 |
Why is this safe to run? Developers have more information than optimizer.
Bound Intermediate and Final Results
SLO Compliance Prediction
Suggests solutions to scale-dependent queries and helps choose acceptable cardinalities
Measure Single-Op Performance |
Predict Query SLO Compliance |
Evaluation
Workloads
- TPC-W - Standard e-commerce database benchmark
- SCADr - Clone of twitter
Qualitative Analysis
- Introduced suggested cardinality constraints.
- PIQL automatically selected indexes.
- Prediction model accurately predicted SLO compliance.
Methodology
- Ran each benchmark on EC2 using SCADS
- Increasing both data and system size (weak scaling).
- Measured throughput and 99th percentile response time.
TPC-W Scale Experiment
Throughput Linear scaling as application grows. |
Latency Constant response time as application grows. |
Scale-Independent Precomputation
Precomputation Workflow
Example: Precomputable Query
Show the 10 most recent updates tagged with two words:
SELECT t1.docId
FROM tags t1, tags t2, docs d
WHERE d.docId = t1.docId AND
t1.docId = t2.docId AND
t1.tag = <tag1> AND
t2.tag = <tag2>
ORDER BY d.timestamp DESC
PAGINATE 10
Example: Precomputing the Answer
Created a view that precomputes all two-tag combinations for each document
CREATE VIEW twoTags
SELECT t1.tag, t2.tag, d.timestamp,
t1.docId
FROM tags t1, tags t2, docs d
WHERE d.docId = t1.docId AND
t1.docId = t2.docId AND
t1.tag = <tag1> AND
t2.tag = <tag2>
ORDER BY d.timestamp
PAGINATE 10
View Selection Algorithm
PIQL creates a view that contains the results of a query for any set of parameters as adjacent rows.
- Parameterized predicates
- Ordering
- Check adjacency
- Add keys for safe maintenance [1]
[1] Ceri, S. and Widom, J. (1991) Deriving Production Rules for Incremental View Maintenance. In: VLDB.
Bounding Storage Resources
PIQL ensures the view will grow linearly with the size of the base relations
Let $V$ be the set of all created views required to answer the queries in $Q$. Let $c_{size}^{v_i}$ be a constant for a view $v_i$ and let $r_{size}^{v_i}$ be a relation in $R$. PIQL bounds the storage independent of scale by ensuring that:
$$\forall v_i \in V \exists c_{size}^{v_i}, r_{size}^{v_i} \in R : c_{size}^{v_i}|r_{size}^{v_i}| \ge |v_i|$$
Algorithm for Bounding View Size
Find edge cover in the functional dependencies
- Primary Key: $(keyAttributes) \rightarrow (otherAttributes)$
- Cardinality Constraint:
$(keyAttributes) \xrightarrow{cardinality} (constrainedFields)$
- Equality Predicate:
$attribute_1 \leftrightarrow attribute_2$
Size Bound for TwoTags View
CREATE VIEW twoTags
SELECT t1.tag, t2.tag,
d.timestamp, t1.docId
FROM tags t1, tags t2, docs d
WHERE d.docId = t1.docId AND
t1.id = t2.id
$|twoTagsView| \le K|Tags|$
Bounding Maintenance Operations
PIQL ensures each update can be processed with a bounded number of operations
Let $Update(r_i)$ denote the number of operations performed in the worst case by index and view maintenance when updating a single tuple in $r_i$, and let $c_{ops}^{r_i}$ be a constant. The PIQL optimizer will only create views where it can ensure the maintenance cost obeys the following:
$$\forall r_i \in R \exists c_{ops}^{r_i} : Update(r_i) < c_{ops}^{r_i}$$
Bounding Maintenance Operations
Reuse optimization techniques on delta queries.
Original QueryCREATE VIEW twoTags
SELECT t1.tag, t2.tag,
d.timestamp, t1.docId
FROM tags t1, tags t2, docs d
WHERE d.docId = t1.docId AND
t1.id = t2.id
|
t1 Delta QueryCREATE RULE dt1
ON INSERT t1
INSERT INTO twoTags
SELECT <tag>, t2.tag,
d.timestamp, d.docId
FROM tags t2, documents d
WHERE t2.docId = <docId> AND
d.id = <docId>
|
Splitting Unbounded Updates
To increase expressivity allow unbounded updates to be split
- Parallel Updates
- Requirement: unbounded over only one relation
- Cost: Possibly spiky computation requirements
- Partial/Lazy Updates
- Requirement: limited/paginated original query, update follows sort order
- Cost: Increased staleness for later pages
Due to cost requires explicit authorization from developer
SLO Compliance of Updates
Results for running two tags query on increasing cluster sizes
| 99% Response Time (ms) Log-Scale |
Read |
Write |
On-Demand
|
| # of Machines |
SLO Compliance of Updates
Results for running two tags query on increasing cluster sizes
| 99% Response Time (ms) Log-Scale |
Read |
Write |
On-Demand
Precomputed
|
| # of Machines |
SLO Compliance of Updates
Results for running two tags query on increasing cluster sizes
| 99% Response Time (ms) Log-Scale |
Read |
Write |
On-Demand
Precomputed
|
| # of Machines |
Future Work
Managing contention
PIQL currently allows data models with inherent contention:
- Global counter in a single record updated by many users
- Indexing hot-spot due to temporal locality of inserts
Challenges
- Model for expressing locality of updates and understanding contention
- Detection of problematic data structures
- Automatically suggest approximation and sampling
Expand the operator space
Other possible execution strategies include:
- Pushing computation to the store (Bigtable co-processors)
- Distributed low-latency execution (Dremel)
Challenges
- Defining additional scale-independent operators
- SLO Compliance modeling in the face of stragglers
Handling partial failures
Consistency requirements can delay update processing
Alternative: Eventually consistent view maintenance
- Process update partially during failure
- Repair inconstancies after failure is resolved
Challenges
- For what types of queries is this possible?
- What bookkeeping needs to be done?
Language Integration
Developers prefer language bindings to opaque SQL strings.
case class Tag(var docId: Int, var tag: String) extends AvroPair
val twoTags = tags.as('t1).join(tags, 't2)
.where('t1_docId === 't2_docId)
.where('t1_tag === 0.?)
.where('t2_tag === 1.?)
.orderBy('timestamp)
.toPiql()
Challenges
- Type-safety for queries and results
- Support for scale-independent UDFs
Conclusion
By adding scale independence to the relational model, PIQL makes it easier to build web application that work even as they explode in popularity.
Techniques:
- Scale independent plan selection
- PIQL language extensions
- Automatic index / view selection
- SLO compliance modeling
Acknowledgements
Michael Franklin, Armando Fox, David Patterson, Kristal Curtis, Tim Kraska, Stephen Tu, Eric Liang,
deck.js
Learn More
http://www.cs.berkeley.edu/~marmbrus/
Questions?
←
→
/
#