David Mandelin

Office:
517 Soda Hall
Email:
mandelin@cs.berkeley.edu

I am a fifth-year Ph.D. student in computer science at UC Berkeley. My advisor is Rastislav Bodík.

Research

I am primarily interested in programming languages and software engineering.

Prospector

My current project is Prospector, a "programmer's search engine" for Java that creates code snippets to help programmers use complex APIs, which pretty much means all APIs nowadays. It's not just a search engine, though, because it grinds up class declarations and collections of real-world example code, and then puts them back together as easily understood code snippets. This places Prospector in the broader area of code mining, which means inferring information about software from collections of sample code or test behavior (in contrast to direct analysis).
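
As a rough illustration of the "puts them back together" step, snippet construction can be pictured as a search over a graph of type conversions. The sketch below is a toy under stated assumptions, not Prospector's actual algorithm: the `EDGES` table is written by hand (using real `java.io` constructors) rather than mined from code, and `find_snippet` is a hypothetical helper name.

```python
from collections import deque

# Toy "signature graph": each edge converts a value of one Java type into
# another. The entries use real java.io constructors, but the graph itself is
# a hand-written assumption, not data mined by Prospector.
EDGES = [
    ("String", "File", "new File({})"),
    ("String", "FileReader", "new FileReader({})"),
    ("File", "FileReader", "new FileReader({})"),
    ("FileReader", "Reader", "{}"),  # widening to a supertype, no code needed
    ("Reader", "BufferedReader", "new BufferedReader({})"),
]


def find_snippet(source_type, target_type, variable):
    """Breadth-first search for the shortest conversion chain."""
    queue = deque([(source_type, variable)])
    seen = {source_type}
    while queue:
        current_type, expression = queue.popleft()
        if current_type == target_type:
            return expression
        for from_type, to_type, template in EDGES:
            if from_type == current_type and to_type not in seen:
                seen.add(to_type)
                queue.append((to_type, template.format(expression)))
    return None  # no chain of known conversions reaches the target


print(find_snippet("String", "BufferedReader", "path"))
```

Given a `String` named `path` and the target type `BufferedReader`, the search returns the Java expression `new BufferedReader(new FileReader(path))`. Breadth-first order is used here only because shorter chains tend to be easier to read.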

We are both improving the search technology and developing an Eclipse plugin that integrates Prospector with Java Content Assist (a.k.a. code completion).

For more information about Prospector:

Working prototype plugin. Tested with Eclipse 3.1RC2. It will probably work with other Eclipse 3.1 versions, but I haven't tested it. Note: New versions of Eclipse do not automatically scan for newly installed plugins, so you may need to run Eclipse once with the -dev command line option after copying in the files. Click here for more information about installing plugins.

Slides from PLDI coming soon

Web demo (also has information about how to use Prospector and how it works)

Slides. From a talk given at the 2004 Open Source Quality (OSQ) project retreat.

Paper (PLDI 2005): Jungloid Mining: Helping to Navigate the API Jungle.
With Lin Xu, Rastislav Bodík, and Doug Kimelman.

Other Research

During a summer internship at IBM T.J. Watson Research Center, I designed (along with Doug Kimelman and Danny Yellin) an algorithm for finding correspondences between related systems models. The problem is a sort of "fuzzy matching" problem for graphs with attributes. The matching is fuzzy enough to include many-to-one and none-to-one matches. The algorithm is designed around a Bayesian model of the probability that each pair of graph elements corresponds. Our prototype implementation shows promise: it's more accurate and precise than a simple algorithm that picks off the node pairs with the most similar names. See our paper for more information:

Paper: A Bayesian Approach to Diagram Matching with Application to Architectural Models.
David Mandelin, Doug Kimelman, Danny Yellin. (ICSE 2006)
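
To make the idea of "a Bayesian model of the probability that each pair of graph elements corresponds" concrete, here is a minimal sketch of Bayes' rule applied to a single candidate pair. It is not the model from the paper: the features, the independence assumption, and every number are hypothetical, and `posterior_match` is an illustrative name.

```python
def posterior_match(prior, evidence):
    """Posterior probability that two graph elements correspond.

    `evidence` is a list of (p_given_match, p_given_nonmatch) pairs, one per
    observed feature, treated as conditionally independent (a simplifying
    assumption of this sketch).
    """
    match = prior
    nonmatch = 1.0 - prior
    for p_given_match, p_given_nonmatch in evidence:
        match *= p_given_match
        nonmatch *= p_given_nonmatch
    return match / (match + nonmatch)


# Hypothetical numbers: 10% of pairs correspond a priori; similar names are
# seen in 80% of true matches but only 5% of non-matches; both nodes having
# the same kind is seen in 90% of matches and 30% of non-matches.
p = posterior_match(0.10, [(0.80, 0.05), (0.90, 0.30)])
print(round(p, 3))
```

With these made-up numbers the posterior comes out at roughly 0.84, so two modest pieces of evidence lift a 10% prior to a fairly confident match. A name-only heuristic would ignore the second feature entirely.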

Before Prospector, I worked with Glenn Ammons a bit on Strauss, another application of code mining. Strauss learns finite automaton specifications for C APIs, such as X, from run-time traces of procedure call sequences. I helped test Cable, a tool to help Strauss users debug learned specifications. I also ported Cable to Java and designed and experimented with some alternate user interfaces. Finally, I designed and implemented a tweak to the generic automaton learner used by Strauss that makes it more effective for learning API specifications. The tweak is based on the observation that most API procedures can be invoked some subset of zero, one, or many times, but generally not exactly 3 or 239 times.

Paper: Debugging Temporal Specifications with Concept Analysis.
Glenn Ammons, David Mandelin, Rastislav Bodík, James Larus (PLDI 2003).
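
The "zero, one, or many" observation from the Strauss paragraph above can be illustrated with a small trace abstraction. This is only a sketch of the observation, not the actual change made to the learner: the call names `open`, `write`, and `close` and the helper `abstract_counts` are hypothetical.

```python
from itertools import groupby


def abstract_counts(trace):
    """Collapse each run of identical consecutive calls to 'one' or 'many'."""
    summary = []
    for name, run in groupby(trace):
        length = sum(1 for _ in run)
        summary.append((name, "one" if length == 1 else "many"))
    return summary


# Hypothetical traces: they differ only in how often `write` repeats.
short_trace = ["open", "write", "write", "write", "close"]
long_trace = ["open"] + ["write"] * 239 + ["close"]
print(abstract_counts(short_trace) == abstract_counts(long_trace))
```

Both traces reduce to the same summary, so a learner working on the abstracted form has no reason to produce a specification that demands exactly 3 or exactly 239 calls.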

I'm also interested in statistical learning and other AI techniques. I've taken courses on statistical learning theory and done a couple of course projects: an automaton learner for Strauss specifications based on the EM algorithm; and a program that learns how to find features of web pages, such as the name of the author or a hardware vendor's support link, based on examples, using AdaBoost to boost very weak learners.
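
For readers unfamiliar with boosting, the following is a minimal sketch of AdaBoost over one-dimensional threshold stumps. It is a textbook-style toy, not the course project: the data set is invented, and the web-page features mentioned above are not modeled.

```python
import math


def train_adaboost(xs, ys, rounds):
    """AdaBoost over 1-D threshold stumps; labels in ys are +1 or -1."""
    n = len(xs)
    weights = [1.0 / n] * n
    ensemble = []  # list of (alpha, threshold, sign)
    for _ in range(rounds):
        best = None
        for threshold in sorted(set(xs)):
            for sign in (1, -1):
                # Stump predicts `sign` when x >= threshold, else `-sign`.
                error = sum(
                    w for x, y, w in zip(xs, ys, weights)
                    if (sign if x >= threshold else -sign) != y
                )
                if best is None or error < best[0]:
                    best = (error, threshold, sign)
        error, threshold, sign = best
        error = min(max(error, 1e-10), 1 - 1e-10)  # keep the log finite
        alpha = 0.5 * math.log((1 - error) / error)
        ensemble.append((alpha, threshold, sign))
        # Raise the weight of misclassified examples, lower the rest.
        weights = [
            w * math.exp(-alpha * y * (sign if x >= threshold else -sign))
            for x, y, w in zip(xs, ys, weights)
        ]
        total = sum(weights)
        weights = [w / total for w in weights]
    return ensemble


def predict(ensemble, x):
    """Sign of the alpha-weighted vote of all stumps."""
    score = sum(
        alpha * (sign if x >= threshold else -sign)
        for alpha, threshold, sign in ensemble
    )
    return 1 if score >= 0 else -1


# Invented data: the labels alternate in blocks, so no single stump fits them.
xs = list(range(10))
ys = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]
ensemble = train_adaboost(xs, ys, rounds=3)
print([predict(ensemble, x) for x in xs] == ys)
```

No single threshold classifies this data better than 70% correct, yet the weighted vote of three stumps labels every point correctly, which is the sense in which very weak learners get "boosted".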

Business

Greg Waldoch and I are the founders and owners of Koboldsoft, a provider of online gaming software. We plan to release our first product, RPZen, later this year.

Work Experience

I had a lot of experience as a programmer before starting grad school. In spring and summer of 2002, I worked for the Niagara project at UW-Madison, implementing the index manager for the Niagara XML database in C++. In summer and fall of 2001, I worked for the Wisconsin Center for Education Research on miscellaneous tasks relating to an audiovisual application they were developing: Python scripting, adapting an open-source MP3 decoder to produce waveform visualizations, a little hacking on AbiWord, and testing.

Before that, I worked a long time for the Division of Information Technology at UW-Madison. The last thing I did was lead development of WISDM, the web data mart for the UW System's financial information. Before that, I developed various Access applications, and before that, I even did some programming on OS/390 mainframes.

Coursework

PL
  
CS 263 Design of Programming Languages

CS 264 Program Analysis

CS 265 Compiler Optimization and Code Generation

CS 294 Software Synthesis

CS 294 Techniques for Automated Deduction

EE 219C Model Checking and Computational Logic (audited/not for credit)

AI/Statistics

CS 281A Statistical Learning Theory

CS 281B Advanced Topics in Learning and Decision Making

Stat 200B Introduction to Statistics at Graduate Level

Stat 205A Probability Theory

The Rest

CS 262A Advanced Topics in Computer Systems

CS 270 Combinatorial Algorithms and Data Structures

Software

Random New York Times Crossword Link

Useful if you like to do crosswords from the archives. Check off the days that you want to play, and it will generate a link to a random puzzle from one of those days.

svgViewer

This is a simple SVG viewer I created so I could more conveniently view and print GraphViz graphs.

Type Discovery Prototype

A description of what I've been working on lately along with some sketchy prototype code.

Total Fragmentation

Along with Bill McCloskey and AJ Shankar, I created a Scorched Earth-style game called Total Fragmentation. See the Total Fragmentation page for details and download.



Page updated 25 Feb 2007