Paper data
A Comparison of Sinusoidal Model Variants for Speech and Audio Representation

Jensen Jesper, Technical University of Delft, Delft, The Netherlands.
Heusdens Richard, Technical University of Delft, Delft, The Netherlands.

Page numbers in the proceedings:
Volume I pp 479-482

Echo Cancellation and Speech Enhancement

Paper abstract
Two sinusoidal model variants for speech and audio representation are compared: the traditional constant-amplitude, constant-frequency sinusoidal model, and a generalized model in which amplitudes can vary exponentially with time. Two classes of methods for estimating the model parameters are reviewed: matching pursuit (MP) and subspace-based schemes. Newton-optimized versions of these schemes are also included in the study. The influence of model type and parameter estimation scheme on model performance was evaluated in simulation experiments with audio and speech signals. As expected, the exponential model outperforms the traditional sinusoidal model in segments with large signal level variations. For the non-optimized estimation schemes, the subspace method generally performs better than the MP method (an SNR gain of 2-7 dB was observed). Newton optimization improves the modeling performance significantly in all cases, and results in slightly better performance with MP than with the subspace method (an SNR gain of 1-2 dB).
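
Illustrative sketch (not taken from the paper): the difference between the two model variants can be seen on a toy signal. The Python code below fits a single component to a synthetic decaying sinusoid, once with a constant-amplitude sinusoid and once with an exponentially damped sinusoid, and compares the resulting SNRs. The parameter search is a plain grid search over frequency and damping, which resembles one greedy selection step of a matching pursuit; it is not the subspace method and involves no Newton optimization. All function names, the test signal, and the grid values are assumptions made for this illustration.

```python
import math


def fit_component(signal, freq, damping):
    """Least-squares fit of exp(-damping*n) * (a*cos(2*pi*freq*n) + b*sin(2*pi*freq*n))."""
    idx = range(len(signal))
    c = [math.exp(-damping * n) * math.cos(2 * math.pi * freq * n) for n in idx]
    s = [math.exp(-damping * n) * math.sin(2 * math.pi * freq * n) for n in idx]
    # Solve the 2x2 normal equations for the amplitudes a and b.
    cc = sum(v * v for v in c)
    ss = sum(v * v for v in s)
    cs = sum(u * v for u, v in zip(c, s))
    xc = sum(x * v for x, v in zip(signal, c))
    xs = sum(x * v for x, v in zip(signal, s))
    det = cc * ss - cs * cs
    a = (xc * ss - xs * cs) / det
    b = (xs * cc - xc * cs) / det
    return [a * u + b * v for u, v in zip(c, s)]


def residual_energy(signal, approx):
    return sum((x - y) ** 2 for x, y in zip(signal, approx))


def snr_db(signal, approx):
    err = residual_energy(signal, approx)
    if err == 0.0:
        return math.inf
    return 10.0 * math.log10(sum(x * x for x in signal) / err)


def best_fit(signal, freqs, dampings):
    """Grid search: keep the (freq, damping) pair with the smallest residual."""
    best = None
    for d in dampings:
        for f in freqs:
            approx = fit_component(signal, f, d)
            err = residual_energy(signal, approx)
            if best is None or err < best[0]:
                best = (err, approx)
    return best[1]


def compare_models(n_samples=128):
    # Hypothetical test segment: a sinusoid whose level drops strongly over time.
    signal = [
        math.exp(-0.025 * n) * math.cos(2 * math.pi * 0.1 * n + 0.3)
        for n in range(n_samples)
    ]
    # Normalized frequencies strictly between 0 and 0.5 (cycles per sample).
    freqs = [k / 200 for k in range(1, 100)]
    const_fit = best_fit(signal, freqs, [0.0])  # constant-amplitude model
    exp_fit = best_fit(signal, freqs, [0.0, 0.01, 0.02, 0.03, 0.04])  # damped model
    return snr_db(signal, const_fit), snr_db(signal, exp_fit)


if __name__ == "__main__":
    snr_const, snr_exp = compare_models()
    print(f"constant-amplitude model SNR: {snr_const:.1f} dB")
    print(f"exponential model SNR:        {snr_exp:.1f} dB")
```

Because the damping grid of the exponential model includes zero damping, its best fit can never be worse than that of the constant-amplitude model on the same frequency grid. On a segment with a large level variation, such as this one, the gap should be clearly visible.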

