
|
Dan Klein
Associate Professor
Computer Science Division
University of California at Berkeley
Contact Information
| Email | |
| Mail | Dan Klein, Soda Hall, Berkeley, CA 94720-1776 |
| Phone | (510) 643-0805 (email works best) |
Research
My research focuses on the automatic organization of natural language
information. Some topics of interest to me are:
- Unsupervised language acquisition
- Machine translation
- Efficient algorithms for NLP
- Information extraction
- Linguistically rich models of language
- Integrating symbolic and statistical methods for NLP
- Organization of the web
My group's web page (the Berkeley
Natural Language Processing Group).
Our agent, the Overmind, won the AIIDE 2010 StarCraft
AI competition!
Background
My education, in reverse order.
Some fellowships / awards:
Some paper awards I've won:
- Best Paper Award, ACL 2003, for "Accurate Unlexicalized
Parsing" with Chris Manning
- Best Paper Award, EMNLP 2004, for "Max-Margin Parsing"
with Ben Taskar, Mike Collins, Chris Manning, and Daphne Koller
- Best Student Paper Award, NAACL 2006, for "Prototype-Driven
Learning for Sequence Models" with Aria Haghighi
- Best Paper Award, ACL 2009, for "K-Best A* Parsing" with Adam Pauls
- Best Paper Award, NAACL 2010, for "Coreference Resolution in a Modular, Entity-Centered Model" with Aria Haghighi
An out-of-date CV. [pdf]
Teaching
I am currently teaching
cs188, the undergraduate introduction to artificial intelligence.
Last term I taught
cs288, the graduate statistical NLP course (once known as cs294-5, -7, and -19).
My tutorials are below, in the publication list.
Publications
-
2010
- A Game-Theoretic Approach to Generating Spatial Descriptions, Dave Golland, Percy Liang, and Dan Klein, In proceedings of EMNLP 2010. [pdf]
- A Simple Domain-Independent Probabilistic Approach to Generation, Gabor Angeli, Percy Liang,
and Dan Klein, In proceedings of EMNLP 2010. [pdf]
- Learning Programs: A Hierarchical Bayesian Approach, Percy Liang, Michael Jordan, and Dan Klein, In proceedings of ICML 2010. [pdf]
- Learning Better Monolingual Models with Unannotated Bilingual Text, David Burkett, John Blitzer, and Dan Klein, In proceedings of CoNLL 2010. [pdf]
- An Entity-Level Approach to Information Extraction, Aria Haghighi and Dan Klein, In proceedings of ACL 2010. [pdf]
- Discriminative Modeling of Extraction Sets for Machine Translation, John DeNero and Dan Klein, In proceedings of ACL 2010. [pdf]
- Top-Down K-Best A* Parsing, Adam Pauls, Dan Klein, and Chris Quirk, In proceedings of ACL 2010. [pdf]
- Hierarchical A* Parsing with Bridge Outside Scores, Adam Pauls and Dan Klein, In proceedings of ACL 2010. [pdf]
- Simple, Accurate Parsing with an All-Fragments Grammar, Mohit Bansal and Dan Klein, In proceedings of ACL 2010. [pdf]
- Phylogenetic Grammar Induction, Taylor Berg-Kirkpatrick and Dan Klein, In proceedings of ACL 2010. [pdf]
- Finding Cognate Groups using Phylogenies, David LW Hall and Dan Klein, In proceedings of ACL 2010. [pdf]
- Coreference Resolution in a Modular, Entity-Centered Model, Aria Haghighi and Dan Klein, In proceedings of NAACL 2010. [pdf]
- Joint Parsing and Alignment with Weakly Synchronized Grammars, David Burkett, John Blitzer, and Dan Klein, In proceedings of NAACL 2010. [pdf]
- Type-Based MCMC, Percy Liang, Michael Jordan, and Dan Klein, In proceedings of NAACL 2010. [pdf]
- Painless Unsupervised Learning with Features, Taylor Berg-Kirkpatrick, John DeNero, and Dan Klein, In proceedings of NAACL 2010. [pdf]
- Unsupervised Syntactic Alignment with Inversion Transduction Grammars, Adam Pauls, David Chiang, and Kevin Knight, In proceedings of NAACL 2010. [pdf]
- Probabilistic grammars and hierarchical Dirichlet processes, Percy Liang, Michael Jordan, and Dan Klein, Book chapter in The Oxford Handbook of Applied Bayesian Analysis 2009. [pdf]
-
2009
- Consensus Training for Consensus Decoding in Machine Translation, Adam Pauls, John DeNero, and Dan Klein, In proceedings of EMNLP 2009. [pdf]
- Asynchronous Binarization for Synchronous Grammars, John DeNero, Adam Pauls, and Dan Klein, In proceedings of ACL-IJCNLP Short Paper Track 2009. [pdf]
- Better Word Alignments with Supervised ITG Models, Aria Haghighi, John Blitzer, John DeNero, and Dan Klein, In proceedings of ACL-IJCNLP 2009. [pdf]
- Simple Coreference Resolution with Rich Syntactic and Semantic Features, Aria Haghighi and Dan Klein, In proceedings of EMNLP 2009. [pdf]
- Efficient Parsing for Transducer Grammars, John DeNero, Mohit Bansal, Adam Pauls, and Dan Klein, In proceedings of NAACL 2009. [pdf]
- Convergence Bounds for Language Evolution by Iterated Learning, Anna N. Rafferty, Thomas L. Griffiths, and Dan Klein, In Proceedings of the 31st Annual Conference of the Cognitive Science Society 2009. [pdf]
- Learning Semantic Correspondences with Less Supervision, Percy Liang, Michael Jordan, and Dan Klein, In proceedings of ACL 2009. [pdf] [slides]
- Learning from Measurements in Exponential Families, Percy Liang, Michael Jordan, and Dan Klein, In proceedings of ICML 2009. [pdf] [slides]
- Online EM for Unsupervised Models, Percy Liang and Dan Klein, In proceedings of NAACL 2009. [pdf] [slides]
- K-Best A* Parsing, Adam Pauls and Dan Klein, In Proceedings of ACL 2009. [pdf]
- Hierarchical Search for Parsing, Adam Pauls and Dan Klein, In Proceedings of NAACL 2009. [pdf]
- Efficient Inference in Phylogenetic InDel Trees, Alexandre Bouchard-Côté, Michael I. Jordan, and Dan Klein, In proceedings of NIPS 2009. [pdf]
- Improved Reconstruction of Protolanguage Word Forms, Alexandre Bouchard-Côté, Thomas Griffiths, and Dan Klein, In proceedings of NAACL 2009. [pdf]
-
2008
- Coarse-to-Fine Syntactic Machine Translation using Language Projections, Slav Petrov, Aria Haghighi and Dan Klein, In proceedings of EMNLP 2008. [pdf] [bib] [slides]
- Sparse Multi-Scale Grammars for Discriminative Latent Variable Parsing, Slav Petrov and Dan Klein, In proceedings of EMNLP 2008. [pdf] [bib] [slides]
- Two Languages are Better than One (for Syntactic Parsing), David Burkett and Dan Klein, In proceedings of EMNLP 2008. [pdf]
- Sampling Alignment Structure under a Bayesian Translation Model, John DeNero, Alex Bouchard-Côté, and Dan Klein, In proceedings of EMNLP 2008. [pdf]
- Fully Distributed EM for Very Large Datasets, Jason Wolfe, Aria Haghighi, and Dan Klein, In proceedings of ICML 2008. [pdf] [slides]
- Learning Bilingual Lexicons from Monolingual Corpora, Aria Haghighi, Taylor Berg-Kirkpatrick, and Dan Klein, In proceedings of ACL 2008. [pdf] [slides]
- Structured Compilation: Trading off Structure for Features, Percy Liang, Hal Daume, and Dan Klein, In proceedings of ICML 2008. [pdf] [slides]
- Analyzing the Errors of Unsupervised Induction, Percy Liang and Dan Klein, In proceedings of ACL 2008. [pdf] [slides]
- The Complexity of Phrase Alignment Models, John DeNero and Dan Klein, In proceedings of ACL Short Paper Track 2008. [pdf] [slides]
- Discriminative Log-Linear Grammars with Latent Variables, Slav Petrov and Dan Klein, In proceedings of NIPS 2008. [pdf] [bib] [slides]
- Efficient Sentence Segmentation using Syntactic Features, Benoit Favre, Dilek Hakkani-Tur, Slav Petrov, and Dan Klein, In proceedings of SLT 2008. [pdf] [bib] [slides]
- A Probabilistic Approach to Language Change, Alexandre Bouchard-Côté, Thomas Griffiths, and Dan Klein, In proceedings of NIPS 2008. [pdf] [slides]
- Agreement-Based Learning, Percy Liang, Dan Klein, and Michael Jordan, In proceedings of NIPS 2008. [pdf] [slides]
-
2007
- Mixture-of-Parents Maximum Entropy Markov Models, David Rosenberg, Dan Klein, and Ben Taskar, In proceedings of Uncertainty in Artificial Intelligence (UAI) 2007. [pdf]
- A Probabilistic Approach to Diachronic Phonology, Alexandre Bouchard-Côté, Percy Liang, Thomas Griffiths, and Dan Klein, In proceedings of EMNLP 2007. [pdf] [slides]
- The Infinite PCFG using Hierarchical Dirichlet Processes, Percy Liang, Slav Petrov, Michael Jordan, and Dan Klein, In proceedings of EMNLP 2007. [pdf] [slides]
- Learning Structured Models for Phone Recognition, Slav Petrov, Adam Pauls, and Dan Klein, In proceedings of EMNLP-CoNLL 2007. [pdf] [slides] [bib]
- A* Search via Approximate Factoring, Aria Haghighi, John DeNero, and Dan Klein, In proceedings of AAAI (Nectar Track) 2007. [pdf]
- Learning and Inference for Hierarchically Split PCFGs, Slav Petrov and Dan Klein, In proceedings of AAAI (Nectar Track) 2007. [pdf] [slides] [bib]
- Unsupervised Coreference Resolution in a Nonparametric Bayesian Model, Aria Haghighi and Dan Klein, In proceedings of ACL 2007. [pdf] [slides] [bib]
- Tailoring Word Alignments to Syntactic Machine Translation, John DeNero and Dan Klein, In proceedings of ACL 2007. [pdf] [slides]
- Improved Inference for Unlexicalized Parsing, Slav Petrov and Dan Klein, In proceedings of HLT-NAACL 2007. [pdf] [slides] [bib]
- Approximate Factoring for A* Search, Aria Haghighi, John DeNero, and Dan Klein, In proceedings of HLT-NAACL 2007. [pdf] [slides] [bib]
-
2006
- Learning Accurate, Compact, and Interpretable Tree Annotation, Slav Petrov, Leon Barrett, Romain Thibaux, and Dan Klein, In proceedings of COLING-ACL 2006. [pdf] [slides] [bib]
- Non-Local Modeling with a Mixture of PCFGs, Slav Petrov, Leon Barrett, and Dan Klein, In proceedings of CoNLL 2006. [pdf] [slides] [bib]
- An End-to-End Discriminative Approach to Machine Translation, Percy Liang, Alexandre Bouchard-Côté, Dan Klein, and Ben Taskar, In proceedings of COLING-ACL 2006. [pdf] [slides] [bib]
- Why Generative Phrase Models Underperform Surface Heuristics, John DeNero, Dan Gillick, James Zhang, and Dan Klein, Workshop on Statistical Machine Translation at HLT-NAACL 2006. [pdf] [slides] [bib]
- Alignment by Agreement, Percy Liang, Ben Taskar, and Dan Klein, In proceedings of NAACL 2006. [pdf] [slides] [bib]
- Prototype-Driven Learning for Sequence Models, Aria Haghighi and Dan Klein, In proceedings of HLT-NAACL 2006. [pdf] [slides] [bib]
- Prototype-Driven Grammar Induction, Aria Haghighi and Dan Klein, In proceedings of COLING-ACL 2006. [pdf] [slides] [bib]
- Word Alignment Via Quadratic Assignment, Simon Lacoste-Julien, Ben Taskar, Dan Klein, and Michael Jordan, In proceedings of NAACL 2006. [pdf] [bib]
-
2005
- A Discriminative Matching Approach to Word Alignment, Ben Taskar, Simon Lacoste-Julien, and Dan Klein, In proceedings of EMNLP 2005. [pdf] [bib]
- The Unsupervised Learning of Natural Language Structure, Dan Klein, Ph.D. Thesis, Stanford University 2005. [pdf]
- Unsupervised Learning of Field Segmentation Models for Information Extraction, Trond Grenager, Dan Klein, and Chris Manning, In Proceedings of the Association for Computational Linguistics (ACL) 2005. [pdf]
-
2004
- Corpus-Based Induction of Syntactic Structure: Models of Dependency and Constituency, Dan Klein and Chris Manning, In Proceedings of the Association for Computational Linguistics (ACL) 2004. [pdf]
- Max-Margin Parsing, Ben Taskar, Dan Klein, Michael Collins, Daphne Koller, and Chris Manning, In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP) 2004. [pdf]
- Review of Data-Oriented Parsing, edited by Rens Bod, Remko Scha, and Khalil Sima'an, Dan Klein, Computational Linguistics 2004.
-
2003
- Accurate Unlexicalized Parsing, Dan Klein and Chris Manning, In Proceedings of the Association for Computational Linguistics (ACL) 2003. [pdf]
- Factored A* Search for Models over Sequences and Trees, Dan Klein and Chris Manning, In Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI) 2003. [pdf]
- A* Parsing: Fast Exact Viterbi Parse Selection, Dan Klein and Chris Manning, In Proceedings of the North American Chapter of the Association for Computational Linguistics (NAACL) 2003. [pdf]
- Named Entity Recognition with Character-Level Models, Dan Klein, Joseph Smarr, Huy Nguyen, and Chris Manning, In Proceedings of the Conference on Natural Language Learning (CoNLL) 2003. [pdf]
- Feature-Rich Part-of-Speech Tagging with a Cyclic Dependency Network, Kristina Toutanova, Dan Klein, Chris Manning, and Yoram Singer, In Proceedings of the North American Chapter of the Association for Computational Linguistics (NAACL) 2003. [pdf]
- Spectral Learning, Sepandar Kamvar, Dan Klein, and Chris Manning, In Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI) 2003. [pdf]
-
2002
- A Generative Constituent-Context Model for Improved Grammar Induction, Dan Klein and Chris Manning, In Proceedings of the Association for Computational Linguistics (ACL) 2002. [pdf]
- Parsing and Hypergraphs, Dan Klein and Chris Manning, Bunt, Carroll, and Satta, eds., New Developments in Parsing Technology, Kluwer Academic Publishers 2002.
- Fast Exact Inference with a Factored Model for Natural Language Processing, Dan Klein and Chris Manning, In Advances in Neural Information Processing Systems 15 (NIPS) 2002. [pdf]
- Conditional Structure versus Conditional Estimation in NLP Models, Dan Klein and Chris Manning, In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP) 2002. [pdf]
- Combining Heterogeneous Classifiers for Word-Sense Disambiguation, Dan Klein, Kristina Toutanova, Tolga Ilhan, Sepandar Kamvar, and Chris Manning, ACL Workshop on Word Sense Disambiguation 2002. [pdf]
- From Instance-level Constraints to Space-Level Constraints: Making the Most of Prior Knowledge in Data Clustering, Dan Klein, Sepandar Kamvar, and Chris Manning, In Proceedings of the International Conference on Machine Learning (ICML) 2002. [pdf]
- Interpreting and Extending Classical Agglomerative Clustering Algorithms using a Model-Based Approach, Sepandar Kamvar, Dan Klein, and Chris Manning, In Proceedings of the International Conference on Machine Learning (ICML) 2002. [pdf]
- Evaluating Strategies for Similarity Search on the Web, Taher Haveliwala, Aristides Gionis, Dan Klein, and Piotr Indyk, In Proceedings of the International World Wide Web Conference (WWW) 2002. [pdf]
-
2001
- Natural Language Grammar Induction Using a Constituent-Context Model, Dan Klein and Chris Manning, In Advances in Neural Information Processing Systems (NIPS) 2001. [pdf]
- Distributional Phrase Structure Induction, Dan Klein and Chris Manning, In Proceedings of the Conference on Natural Language Learning (CoNLL) 2001. [pdf]
- Parsing and Hypergraphs, Dan Klein and Chris Manning, In Proceedings of the International Workshop on Parsing Technologies (IWPT) 2001. [pdf]
- Parsing with Treebank Grammars: Empirical Bounds, Theoretical Models, and the Structure of the Penn Treebank, Dan Klein and Chris Manning, In Proceedings of the Association for Computational Linguistics (ACL) 2001. [pdf]
- An O(n^3) Agenda-Based Chart Parser for Arbitrary Probabilistic Context-Free Grammars, Dan Klein and Chris Manning, Stanford Technical Report 2001. [pdf]
-
Tutorials
- Structured Bayesian Nonparametric Models with Variational Inference, Presented at ACL 2007 with Percy Liang. [pdf]
- Introduction to Classification: Likelihoods, Margins, Features, and Kernels, Presented at NAACL 2007. [pdf]
- Machine Learning for Natural Language Processing: New Developments and Challenges, Presented at NIPS 2006.
- Max-Margin Methods for NLP: Estimation, Structure, and Applications, Presented at ACL 2005 with Ben Taskar. [pdf]
- Maxent Models, Conditional Estimation, and Optimization, without the Magic, Presented at NAACL 2003 and ACL 2003 with Chris Manning. [pdf slides] [pdf handouts]
- Lagrange Multipliers without Permanent Scarring. Permanently in rough draft form, it seems! [pdf-draft]
Personal
I do actually exist outside of the CS/linguistics world. I took
karate for most of my life, and then spent many years with ballroom
dance. Competitive ballroom dance is just like karate, but with more music
and less scowling. I competed and taught for the Stanford
Ballroom Dance Team, and previously competed for the Cornell
Team and the Oxford Team.
|