A Statistical Model for Integrating Narrowband Cues in Speech

EECS Joint Colloquium Distinguished Lecture Series

Dr. Larry Saul
AT&T Research

Wednesday, February 28, 2001
Hewlett Packard Auditorium, 306 Soda Hall
4:00-5:00 p.m.

Abstract:

Problems in voice processing present a number of unifying challenges for researchers in artificial intelligence, statistical learning, and signal processing. How are sensory cues distributed in frequency and time? How is information combined from multiple sources? How can we match the incredible robustness of human listeners? These are fundamental questions that will frame the future of man-machine communication. With these issues in mind, I will describe a statistical model that we have investigated for combining narrowband cues in speech. The model is inspired by two old ideas in human speech perception: (i) Fletcher's hypothesis (1953) that independent detectors, working in narrow frequency bands, account for the robustness of auditory strategies, and (ii) Miller and Nicely's analysis (1955) that perceptual confusions in noisy bandlimited speech are correlated with phonetic features. I will show that the structure of our model makes it naturally robust to many types of noise and filtering. Finally, I will sketch directions for future work, concluding that problems in voice processing provide an exceptionally rich proving ground for modern methods in pattern recognition and machine learning. (Joint work with Mazin Rahim and Jont Allen, AT&T Labs.)



Biography:

Dr. Larry Saul is a principal technical staff member in the Speech and Image Processing center at AT&T Labs. He joined the center in 1996. His research aims to advance all aspects of man-machine communication, particularly the areas of voice processing and automatic speech recognition. He also has broad interests in artificial intelligence, statistical learning, pattern recognition, and neural computation. He received his bachelor's degree in Physics, summa cum laude, from Harvard College and his PhD in Physics from MIT. He was an NSF postdoctoral fellow at the Center for Biological and Computational Learning at MIT. Dr. Saul was named one of the Top 100 Young Innovators of 1999 by Technology Review for inventing the concept of Markov Processes on Curves, which could lead to further breakthroughs in speech recognition technology. He is currently a member of the editorial board for the Journal of Machine Learning Research and has also served on program committees for the conferences on Neural Information Processing Systems (1997-1999) and Artificial Intelligence and Statistics (1999, 2001).



231cory@EECS.Berkeley.EDU