Modeling Annotated Data

David M. Blei
(Professor Michael I. Jordan)
Microsoft Fellowship, (NSF) IIS-9988642, and (ONR) N00014-01-1-0890

We consider the problem of modeling annotated data: data with multiple types, where an instance of one type (such as a caption) serves as a description of the other type (such as an image). We describe three hierarchical mixture models aimed at such data, culminating in the Corr-LDA model, a latent variable model that is effective at both joint clustering and automatic annotation. We conduct experiments to test these models using the Corel database of images and captions.


Send mail to the author: blei@cs.berkeley.edu

