Electrical Engineering and Computer Sciences

COLLEGE OF ENGINEERING

UC Berkeley

2008 Research Summary

Using Laughter in Speaker Recognition

Mary Tai Knox, Nikki Mirghafori¹ and Nelson Morgan

Audio communication contains a wealth of information in addition to spoken words. Specifically, laughter provides cues regarding the emotional state of the speaker, topic changes in a conversation, and the speaker's identity. Our current goal is to develop an automatic speaker recognition system that relies on features from laughter segments.

Because most speaker recognition datasets do not consistently transcribe laughter, we first need to build an automatic laughter segmenter. Previously, we used neural networks trained on short-term features (including Mel-cepstral coefficients, pitch, and energy) to compute the probability that each frame was laughter. This system had an 8% equal error rate (EER). While the EER was quite promising, we found that within a laughter segment the output probability varied more than desired, causing the system to classify sequential frames as both laughter and non-laughter. We are currently working to improve our results so that the audio is more consistently marked with a single start and stop time for each laughter segment.
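The fragmentation problem described above can be illustrated with a minimal sketch. It is not our segmenter: the function name `frames_to_segments`, the 0.5 decision threshold, the 10 ms frame step, and the probability values are all assumptions chosen for illustration. The sketch thresholds per-frame laughter probabilities and groups consecutive laughter frames into segments, showing how a fluctuating output within one laugh yields several short segments instead of one.

```python
def frames_to_segments(probs, threshold=0.5, frame_step_ms=10):
    """Group consecutive frames with probability >= threshold into
    (start_ms, stop_ms) segments. Threshold and frame step are assumed values."""
    segments = []
    start = None  # index of the first frame of the currently open segment
    for i, p in enumerate(probs):
        is_laughter = p >= threshold
        if is_laughter and start is None:
            start = i  # a laughter segment opens at this frame
        elif not is_laughter and start is not None:
            segments.append((start * frame_step_ms, i * frame_step_ms))
            start = None  # the segment closes at this frame
    if start is not None:
        # close a segment that runs to the end of the audio
        segments.append((start * frame_step_ms, len(probs) * frame_step_ms))
    return segments


# Hypothetical per-frame probabilities for a single laugh (made-up values).
noisy = [0.1, 0.2, 0.8, 0.9, 0.4, 0.7, 0.9, 0.3, 0.8, 0.6, 0.2, 0.1]
segments = frames_to_segments(noisy)
print(segments)  # three separate segments for what is one laugh
```

In this toy input, two frames dip below the threshold in the middle of the laugh, so the laugh is split into three segments. Output of this kind is what we would like to replace with a single start and stop time per laughter segment.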

¹ International Computer Science Institute