Real-world problems such as machine translation involve complex dependencies. Generative models provide an elegant and flexible framework for modeling those dependencies, but when used for classification they appear to be less robust to model misspecification than discriminative models. In this talk, we present methods for leveraging the advantages of generative models within the discriminative framework.
In the first part of the talk, we tackle the word alignment problem from natural language processing. We formulate it as a weighted bipartite matching problem and show how to learn the weights using a large-margin approach for structured prediction. By providing a flexible discriminative modeling framework, we were able to cut the Alignment Error Rate in half compared to the previous best-performing generative models for word alignment.
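To make the matching formulation concrete, here is a minimal sketch of alignment as weighted bipartite matching, together with the standard Alignment Error Rate. It is an illustration under stated assumptions, not the method from the talk: the score matrix is made up rather than learned with a large-margin objective, the brute-force search only suits toy sentences, and the names `best_alignment` and `alignment_error_rate` are hypothetical.

```python
from itertools import permutations


def best_alignment(scores):
    """Return a max-weight matching as a set of (source, target) index pairs.

    scores[i][j] is an assumed score for aligning source word i to target
    word j (a stand-in for learned edge weights). Brute force over all
    assignments, so only usable on tiny inputs.
    Assumes len(scores) <= len(scores[0]).
    """
    n_src, n_tgt = len(scores), len(scores[0])
    best_pairs, best_total = set(), float("-inf")
    for perm in permutations(range(n_tgt), n_src):
        # Drop edges with non-positive score so that words may stay unaligned.
        pairs = {(i, j) for i, j in enumerate(perm) if scores[i][j] > 0}
        total = sum(scores[i][j] for i, j in pairs)
        if total > best_total:
            best_pairs, best_total = pairs, total
    return best_pairs


def alignment_error_rate(predicted, sure, possible):
    """AER = 1 - (|A & S| + |A & P|) / (|A| + |S|), where S is a subset of P."""
    a_s = len(predicted & sure)
    a_p = len(predicted & possible)
    return 1.0 - (a_s + a_p) / (len(predicted) + len(sure))


# Toy 3-word sentence pair with made-up scores.
scores = [
    [2.0, 0.1, -1.0],
    [0.3, 1.5, 0.2],
    [-0.5, 0.4, 1.8],
]
alignment = best_alignment(scores)

# Hypothetical gold annotations: "sure" links are a subset of "possible" links.
sure = {(0, 0), (1, 1)}
possible = sure | {(1, 2)}
aer = alignment_error_rate(alignment, sure, possible)
```

In practice the matching would be solved with a polynomial-time algorithm (for example the Hungarian method) rather than by enumeration, and each edge score would come from a weighted combination of features.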
In the second part of the talk, we study probabilistic topic models, which have been popular for modeling latent structure in text documents (as bags of words) or images (as bags of visual words). They are usually trained as generative models with maximum likelihood estimation, though this can be suboptimal if the goal is classification. In contrast, we present a discriminative version of the Latent Dirichlet Allocation (LDA) model that attempts to uncover the latent structure in the documents while optimizing its predictive power for the classification task. We show positive results on the 20 Newsgroups dataset for document classification.
(joint work with Fei Sha, Ben Taskar, Dan Klein and Michael I. Jordan)