# Learning with imperfect data

### Mehryar Mohri

Courant Institute of Mathematical Sciences

### Abstract

Earlier learning theory and algorithms were developed for an ideal
world. Modern large-scale data sets and applications raise problems
that must be addressed for learning to be effective: training points
are often poorly labeled, the sample can be biased, the distributions
may drift with time, and the sample points may not be i.i.d.

This talk will address the specific problem of domain adaptation, which
arises when the distribution of the labeled source data differs
somewhat from that of the target domain. It will present novel
theoretical results for adaptation and provide algorithmic solutions
derived from that theory. It will also report some preliminary
experimental results.

Joint work with Yishay Mansour and Afshin Rostamizadeh.

Mehryar Mohri is a Professor of Computer Science at the Courant
Institute of Mathematical Sciences in New York. He did his
undergraduate studies at Ecole Polytechnique and his graduate and
Ph.D. studies in mathematics and computer science in Paris at Ecole
Normale Superieure d'Ulm and University Paris 7 - Denis Diderot.

Mohri worked for about ten years (1995-2004) at AT&T Research, formerly
AT&T Bell Labs, where, in his last four years, he served as the Head
of the Speech Algorithms Department and as a Technology Leader,
overseeing research projects in machine learning, text and speech
processing, and the design of general algorithms.

Dr. Mohri is also a research consultant at Google Research. His
current topics of interest are machine learning, theory and
algorithms, text and speech processing, and computational biology.