EUSIPCO'2002

Paper data
Posteriors correction using feedback synthesis loop in robust ASR

Glotin Herve, erss-cnrs

Page numbers in the proceedings:
Volume III pp 603-606

Language and Speech Recognition

Paper abstract
Current Automatic Speech Recognition (ASR) systems perform poorly on noisy speech. We propose a new strategy to reinforce ASR robustness, based on a feedback loop from recognition posteriors to signal synthesis. The key idea is to use the phoneme posteriors generated by recognition to compute, at each frame, an acoustic image (AI) and then its correlation with the input signal. The AI is the weighted sum of the clean phoneme spectra, where the weights are taken directly as the corresponding phoneme posteriors. The correlation between the AI and the input spectrum gives a Recognition Index (RI). We then show how a simple correction function applied to the posterior distribution using the RI improves the Word Error Rate in a continuous speech recognition task compared to a state-of-the-art ASR system (J-RASTA).
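
The two per-frame quantities described in the abstract can be illustrated with a minimal sketch. It rests on assumptions the abstract does not settle: the function names, the toy spectra, and the use of a Pearson correlation over spectral bins for the RI are all hypothetical choices, and the posterior correction function is left out because the abstract does not give its form.

```python
import numpy as np

def acoustic_image(posteriors, clean_spectra):
    """Weighted sum of clean phoneme spectra for one frame.

    posteriors:    shape (n_phonemes,), the recognizer's posteriors for the frame
    clean_spectra: shape (n_phonemes, n_bins), one clean spectrum per phoneme
    """
    return posteriors @ clean_spectra

def recognition_index(image, input_spectrum):
    """Correlation between the acoustic image and the input spectrum.

    Assumption: a Pearson correlation over spectral bins; the abstract only
    says "correlation". Returns 0.0 if either vector has no variance.
    """
    a = image - image.mean()
    b = input_spectrum - input_spectrum.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0

# Toy data (hypothetical): 3 phonemes, 4 spectral bins.
clean_spectra = np.array([
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 1.0],
])
posteriors = np.array([0.7, 0.2, 0.1])   # sums to 1
frame = np.array([0.9, 0.1, 0.0, 0.1])   # observed spectrum for this frame

ai = acoustic_image(posteriors, clean_spectra)
ri = recognition_index(ai, frame)
print(ai, ri)
```

In this sketch, an RI close to 1 means the spectrum implied by the posteriors resembles the observed frame, while a low RI suggests the posteriors for that frame are unreliable.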

A PDF version is available here