Grammar Learning Using Bayesian Nonparametrics
Percy Shuo Liang, Slav Orlinov Petrov, Michael Jordan and Daniel Klein
In grammar learning applications, such as inducing a grammar from unannotated sentences or refining an existing grammar, selecting the complexity of the grammar — that is, deciding how many symbols to allocate — is an important problem. Much previous work has focused on procedural approaches to controlling grammar complexity. Bayesian nonparametrics offer a declarative framework for specifying one's prior beliefs over the number of symbols, which may be unbounded.
We have developed an extension of probabilistic context-free grammars based on hierarchical Dirichlet processes (the HDP-PCFG), together with a variational inference algorithm for learning such grammars.
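The core nonparametric ingredient can be illustrated with a small sketch (not the authors' implementation): a Dirichlet process prior over an unbounded symbol set is commonly represented by the GEM stick-breaking construction, truncated at a finite number of symbols for inference. The function name, truncation level, and concentration parameter below are illustrative choices.

```python
import numpy as np

def stick_breaking(alpha, num_symbols, rng):
    """Truncated GEM stick-breaking: a prior over weights for an
    (effectively unbounded) set of grammar symbols, truncated at
    num_symbols for tractable inference."""
    # Draw Beta(1, alpha) stick proportions; a smaller alpha
    # concentrates mass on fewer symbols, i.e. a simpler grammar.
    betas = rng.beta(1.0, alpha, size=num_symbols)
    # Fraction of the stick remaining before each break.
    remaining = np.concatenate(([1.0], np.cumprod(1.0 - betas[:-1])))
    weights = betas * remaining
    # Fold the truncation remainder into the last atom so the
    # weights form a proper distribution.
    weights[-1] += 1.0 - weights.sum()
    return weights

rng = np.random.default_rng(0)
weights = stick_breaking(alpha=1.0, num_symbols=20, rng=rng)
```

With a small concentration parameter, most of the probability mass falls on the first few symbols, which is how the prior expresses a preference for compact grammars while leaving the number of symbols open-ended.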
- P. Liang, S. Petrov, M. I. Jordan, and D. Klein, "The Infinite PCFG Using Hierarchical Dirichlet Processes," Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP/CoNLL), 2007.