Electrical Engineering and Computer Sciences


UC Berkeley


2009 Research Summary

Grammar Learning Using Bayesian Nonparametrics

Percy Shuo Liang, Slav Orlinov Petrov, Michael Jordan and Daniel Klein

In grammar learning applications, such as inducing a grammar from unannotated sentences or refining an existing grammar, an important problem is how to select the complexity of the grammar, i.e., how many symbols to allocate. Much of the previous work has focused on procedural approaches to controlling this complexity. Bayesian nonparametrics offers a declarative framework for specifying one's prior beliefs over the number of symbols, which could possibly be unbounded.

We have developed an extension of probabilistic context-free grammars using hierarchical Dirichlet processes (HDP-PCFG) [1] and a variational inference algorithm for learning such grammars.
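
As a rough illustration of how a prior can cover an unbounded number of symbols, the sketch below draws symbol weights from a truncated stick-breaking construction, the standard building block of Dirichlet process priors. It is a minimal sketch and not the implementation from [1]: the function name `stick_breaking`, the concentration value `alpha=1.0`, and the truncation level of 10 are all assumptions chosen for illustration.

```python
import random


def stick_breaking(alpha, truncation, rng):
    """Draw truncated stick-breaking weights over `truncation` symbols.

    Hypothetical helper for illustration only. Each step breaks off a
    Beta(1, alpha) fraction of the remaining stick; the last symbol
    absorbs whatever mass is left so the weights sum to one.
    """
    weights = []
    remaining = 1.0
    for _ in range(truncation - 1):
        frac = rng.betavariate(1.0, alpha)
        weights.append(remaining * frac)
        remaining *= 1.0 - frac
    weights.append(remaining)
    return weights


rng = random.Random(0)  # fixed seed so the draw is repeatable
beta = stick_breaking(alpha=1.0, truncation=10, rng=rng)
for symbol, weight in enumerate(beta):
    print(f"symbol {symbol}: prior weight {weight:.4f}")
```

With a small `alpha`, most of the mass tends to land on the first few symbols; a larger `alpha` tends to spread it over more symbols. The truncation only bounds the computation, so it can be raised when more symbols are needed.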

[1] P. Liang, S. Petrov, M. I. Jordan, and D. Klein, "The Infinite PCFG Using Hierarchical Dirichlet Processes," Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP/CoNLL), 2007.