EUSIPCO'2002 - Conference Proceedings



Title:
A Comparison of Sinusoidal Model Variants for Speech and Audio Representation

Author(s):
Jensen Jesper, Technical University of Delft, Delft, The Netherlands.
Heusdens Richard, Technical University of Delft, Delft, The Netherlands.

Page numbers in the proceedings:
Volume I pp 479-482

Session:
Echo Cancellation and Speech Enhancement

Paper abstract:
Two sinusoidal model variants for speech and audio representation are compared: the traditional constant-amplitude, constant-frequency sinusoidal model, and a generalized model in which amplitudes can vary exponentially with time. Two classes of methods for estimating the model parameters are reviewed: matching pursuit (MP) and subspace-based schemes. Newton-optimized versions of these schemes are also included in the study. The influence of model type and parameter estimation scheme on model performance was evaluated in simulation experiments with audio and speech signals. As expected, the exponential model outperforms the traditional sinusoidal model in segments with large signal level variations. For the non-optimized estimation schemes, the subspace method generally performs better than the MP method (an SNR gain of 2-7 dB was observed). Newton optimization improves the modeling performance significantly in all cases; after optimization, MP performs slightly better than the subspace method (an SNR gain of 1-2 dB).
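
The difference between the two model variants can be illustrated with a small sketch. It is not the authors' implementation: it assumes a greedy, matching-pursuit-style grid search in which each candidate component is a sinusoid with envelope exp(-d*n), fitted to the residual by least squares. Setting the damping grid to [0.0] gives the constant-amplitude model, while a grid with non-zero values gives a simple exponential-amplitude model. The function names (`fit_component`, `greedy_model`, `snr_db`), the grids, and the synthetic test segment are all hypothetical choices made for illustration. Subspace estimation and Newton optimization are not shown.

```python
import math


def fit_component(residual, dampings, freqs):
    """Search a (damping, frequency) grid for the single damped sinusoid
    that removes the most energy from `residual`; return that component."""
    n_samples = len(residual)
    best_gain = 0.0
    best_component = [0.0] * n_samples
    for d in dampings:
        envelope = [math.exp(-d * n) for n in range(n_samples)]
        for w in freqs:
            # Cosine/sine pair sharing one envelope: covers any amplitude and phase.
            c = [envelope[n] * math.cos(w * n) for n in range(n_samples)]
            s = [envelope[n] * math.sin(w * n) for n in range(n_samples)]
            cc = sum(v * v for v in c)
            ss = sum(v * v for v in s)
            cs = sum(u * v for u, v in zip(c, s))
            rc = sum(u * v for u, v in zip(residual, c))
            rs = sum(u * v for u, v in zip(residual, s))
            det = cc * ss - cs * cs
            if det < 1e-9:
                continue  # basis pair is (nearly) degenerate, skip it
            # Solve the 2x2 normal equations for the least-squares weights.
            a = (rc * ss - rs * cs) / det
            b = (rs * cc - rc * cs) / det
            gain = a * rc + b * rs  # energy of the projection onto span{c, s}
            if gain > best_gain:
                best_gain = gain
                best_component = [a * u + b * v for u, v in zip(c, s)]
    return best_component


def greedy_model(signal, n_components, dampings, n_freqs=256):
    """Greedily extract `n_components` components; return the modeled signal."""
    # Frequency grid strictly between 0 and pi (assumed resolution).
    freqs = [math.pi * k / n_freqs for k in range(1, n_freqs)]
    residual = list(signal)
    for _ in range(n_components):
        component = fit_component(residual, dampings, freqs)
        residual = [r - m for r, m in zip(residual, component)]
    return [x - r for x, r in zip(signal, residual)]


def snr_db(signal, model):
    """Signal-to-modeling-error ratio in dB."""
    signal_energy = sum(x * x for x in signal)
    error_energy = sum((x - m) ** 2 for x, m in zip(signal, model))
    if error_energy == 0.0:
        return float("inf")
    return 10.0 * math.log10(signal_energy / error_energy)


# Synthetic segment with a large level variation: one decaying sinusoid.
N = 128
x = [math.exp(-0.02 * n) * math.cos(2 * math.pi * 0.1 * n + 0.3) for n in range(N)]

snr_const = snr_db(x, greedy_model(x, 1, dampings=[0.0]))
snr_exp = snr_db(x, greedy_model(x, 1, dampings=[0.0, 0.01, 0.02, 0.04]))
print(f"constant-amplitude model:    {snr_const:.1f} dB")
print(f"exponential-amplitude model: {snr_exp:.1f} dB")
```

On this decaying segment the exponential-amplitude grid should give the higher SNR, since the constant-amplitude component cannot follow the level change. This only mirrors the qualitative trend stated in the abstract; the numbers are unrelated to the SNR gains reported in the paper.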