PERFORMANCE ANALYSIS OF THE AURORA LARGE VOCABULARY BASELINE SYSTEM (WedAmOR2)
Author(s) :
Naveen Parihar (Institute for Signal and Information Processing, USA)
Joseph Picone (Institute for Signal and Information Processing, USA)
David Pearce (Speech and MultiModal Group, Motorola Labs, UK)
Hans-Günter Hirsch (Department of Electrical and Computer Science, Niederrhein University, Germany)
Abstract : In this paper, we present the design and analysis of the baseline recognition system used for the ETSI Aurora large vocabulary (ALV) evaluation. The experimental paradigm is presented along with the results of a number of experiments designed to minimize the computational requirements of the system. The ALV baseline system achieved a word error rate (WER) of 14.0% on the standard 5K Wall Street Journal task, and required 4 xRT (times real time) for training and 15 xRT for decoding on an 800 MHz Pentium processor. Increasing the sampling frequency from 8 kHz to 16 kHz improved performance significantly only for the noisy test conditions. Utterance detection yielded significant improvements only on the noisy conditions under mismatched training. Use of the DSR standard VQ-based compression algorithm did not result in a significant degradation. Model mismatch and microphone mismatch resulted in relative increases in WER of 300% and 200%, respectively.
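The abstract relies on three quantities: WER, the real-time factor (xRT), and relative WER increase. The sketch below shows how they are conventionally computed. It is a minimal illustration, not the evaluation's scoring software. The function names are hypothetical, and the numbers in the usage section are made-up inputs chosen only to show the arithmetic, not results from the paper.

```python
def word_error_rate(substitutions, deletions, insertions, reference_words):
    """WER in percent: all word errors divided by the number of reference words."""
    if reference_words <= 0:
        raise ValueError("reference_words must be positive")
    errors = substitutions + deletions + insertions
    return 100.0 * errors / reference_words


def real_time_factor(processing_seconds, audio_seconds):
    """xRT: processing time divided by the duration of the audio processed."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return processing_seconds / audio_seconds


def relative_wer_increase(baseline_wer, degraded_wer):
    """Relative increase in percent of degraded_wer over baseline_wer."""
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return 100.0 * (degraded_wer - baseline_wer) / baseline_wer


if __name__ == "__main__":
    # Hypothetical inputs, for illustration only.
    print(word_error_rate(5, 3, 2, 100))      # 10 errors over 100 reference words
    print(real_time_factor(150.0, 10.0))      # 150 s of compute for 10 s of speech
    print(relative_wer_increase(10.0, 40.0))  # WER rising from 10% to 40%
```

Under these definitions, a relative increase of 300% means the WER is four times its reference value, and 15 xRT means decoding takes fifteen times as long as the speech itself.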
