Electrical Engineering and Computer Sciences


UC Berkeley


2009 Research Summary

Discriminative Features for Large Vocabulary Speech Recognition


Arlo Faria, Nelson Morgan and Suman Ravuri

Defense Advanced Research Projects Agency (DARPA) GALE program

Large-vocabulary speech recognition can be improved by using discriminative features produced with a multi-layer perceptron (MLP) that classifies phones based on a local acoustic context. We have found that performance can be further improved by preparing the training data with idealized features, using forward-backward alignment with hidden Markov models corresponding to the reference word transcriptions. Additionally, we have substantially decreased MLP training times by sampling the training data such that all phone classes are nearly uniformly distributed. Future work will explore MLP structures that process many streams of information, possibly exploiting massively parallel computing with specialized hardware.
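
The class-balanced sampling mentioned above can be illustrated with a minimal sketch. It is not the project's actual procedure: the function name `balanced_sample`, the per-class cap `per_class`, and the toy label counts are all hypothetical, and capping each phone class at a fixed number of frames is only one plausible way to make the class distribution nearly uniform.

```python
import random
from collections import Counter, defaultdict


def balanced_sample(labels, per_class, seed=0):
    """Return sorted frame indices with at most `per_class` frames per phone label.

    Hypothetical helper for illustration; not taken from the project.
    """
    rng = random.Random(seed)
    # Group frame indices by their phone label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    chosen = []
    for label in sorted(by_class):
        indices = by_class[label]
        # Rare classes keep all of their frames; frequent ones are subsampled.
        k = min(per_class, len(indices))
        chosen.extend(rng.sample(indices, k))
    return sorted(chosen)


# Toy frame-level phone labels with a heavily skewed distribution (made-up counts).
labels = ["sil"] * 1000 + ["aa"] * 200 + ["t"] * 50 + ["zh"] * 5
subset = balanced_sample(labels, per_class=50)
counts = Counter(labels[i] for i in subset)
print(len(labels), "->", len(subset), dict(counts))
```

In this toy case the frequent classes are cut to the cap while the rarest class keeps every frame, so the subset is much smaller than the full set and far closer to uniform. A smaller training set is one reason MLP training time could drop, though the sketch says nothing about the effect on recognition accuracy.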