Electrical Engineering and Computer Sciences


UC Berkeley


2008 Research Summary

Discriminative Features for Large Vocabulary Speech Recognition

Arlo Faria and Nelson Morgan

Large-vocabulary speech recognition can be improved by using discriminative features produced with a multi-layer perceptron (MLP) that classifies phones based on a local acoustic context. We have found that performance can be further improved by preparing the training data with idealized features, using forward-backward alignment with hidden Markov models corresponding to the reference word transcriptions. Additionally, we have substantially decreased MLP training times by sampling the training data such that all phone classes are nearly uniformly distributed. Future work will explore MLP structures that process many streams of information, possibly exploiting massively parallel computing with specialized hardware.
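
The summary does not say how the training data were sampled, so the following is only a minimal sketch of one way to make phone classes nearly uniform: cap the number of frames drawn from each class. The function name `sample_balanced`, the `per_class` cap, and the toy label sequence are all hypothetical and are not taken from the project.

```python
import random
from collections import Counter, defaultdict


def sample_balanced(labels, per_class, seed=0):
    """Return sorted frame indices with at most `per_class` frames per phone label.

    Assumption: each training frame carries a single phone label, and balancing
    is done by randomly downsampling the frequent classes.
    """
    rng = random.Random(seed)

    # Group frame indices by their phone label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    # Draw up to `per_class` frames from each class without replacement.
    chosen = []
    for label in sorted(by_class):
        frames = by_class[label]
        k = min(per_class, len(frames))
        chosen.extend(rng.sample(frames, k))
    return sorted(chosen)


# Made-up frame labels in which silence dominates, as is common in speech data.
labels = ["sil"] * 800 + ["ah"] * 150 + ["t"] * 50
subset = sample_balanced(labels, per_class=50)
counts = Counter(labels[i] for i in subset)
print(len(labels), "frames reduced to", len(subset))
print(dict(counts))
```

In this toy case the 1000 labeled frames shrink to 150, with 50 frames per class. A smaller, balanced training set of this kind is one plausible reason training time would drop, although the actual procedure used in the project may differ.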