Reynold S. Xin

I am currently on leave at Databricks, a startup aiming to revolutionize what you can do with data. I am a PhD student at UC Berkeley in the AMPLab, advised by Michael Franklin.

Research Projects

Below is a list of projects that I have worked on during my PhD:

Shark: An open-source SQL query engine. It uses Spark as the physical execution engine and can run HiveQL queries up to 100x faster than Hive without losing the fault-tolerance and scale-out properties of MapReduce.

GraphX: A project that proposes a new way to think about graph computation.

Apache Spark: One of the most popular Big Data frameworks.

CrowdDB: A pioneering database system that incorporates crowd-sourced query processing. The project presents a vision in which humans are simply resources that database systems can use to answer queries.

Readings in Databases: I maintain an online list of papers essential to understanding database systems.