Use of Randomization in Program Analysis

Sumit Gulwani
(Professor George C. Necula)
Microsoft Research, (NSF) CCR-0081588, (NSF) CCR-0085949 , and (NSF) CCR-987517

Analyzing programs is provably hard and we have learned to pay a price in terms of loss of completeness (or precision), and the complexity and running time of the analysis. This project is trying to explore how randomization can be used to obtain simplicity and efficiency at the expense of making soundness probabilistic.

We have used some randomized techniques for the problem of discovering affine equalities in a program [1]. This problem arises in several program analyses like compiler optimizations (e.g., constant propagation, common sub-expression elimination, elimination of conditionals), discovering loop invariants, and translation validation. The key idea of the randomized algorithm for this problem is to execute a code fragment on a few random inputs, but in such a way that all paths are covered on each run. This makes it possible to rule out invalid relationships even with very few runs. The algorithm is based on two main techniques. First, both branches of a conditional are executed on each run and at joint points we perform an affine combination of the joining states. Secondly, in the branches of an equality conditional we adjust the data values on the fly to reflect the truth value of the guarding Boolean expression. This increases the number of affine equalities that the analysis discovers. The algorithm is simpler to implement than alternative deterministic versions, has better computational complexity, and has an extremely small probability of error for even a small number of runs. This algorithm is an example of how randomization can provide a trade-off between the cost and complexity of program analysis, and a small probability of unsoundness. We are currently exploring if these ideas can be extended to handle other analyses involving inequalities, non-linear assignments, memory expressions, and dependent conditionals.

It is also interesting to see if randomized techniques can be helpful in the context of theorem proving. We have demonstrated this for the congruence closure problem involving uninterpreted functions and the theory of linear equalities, and are yet to explore if these ideas can be extended to handle a combination of other convex theories too.

[1]
S. Gulwani and G. C. Necula, "Discovering Affine Equalities Using Random Interpretation," ACM Symp. Principles of Programming Languages, New Orleans, LA, January 2003.

Send mail to the author : (gulwani@eecs.berkeley.edu)


Edit this abstract