AUDIO SOURCE SEGMENTATION USING SPECTRAL CORRELATION FEATURES FOR AUTOMATIC INDEXING OF BROADCAST NEWS (FriAmOR6)
Author(s) :
Shoichi Matsunaga (NTT Cyber Space Laboratories, Japan)
Osamu Mizuno (NTT East, Japan)
Katsutoshi Ohtsuki (NTT Cyber Space Laboratories, Japan)
Yoshihiko Hayashi (NTT Cyber Space Laboratories, Japan)
Abstract : This paper proposes a new segmentation procedure to detect audio source intervals for automatic indexing of broadcast news. The procedure is composed of an audio source detection part and a part that smooths the detected sequences. The detection part uses three new acoustic feature parameters that are based on spectral cross-correlation: spectral stability, white noise similarity, and sound spectral shape. These parameters make it possible to capture the audio sources more accurately than can be done with conventional parameters. The smoothing part has a new merging method that drops erroneous detection results of short duration. Audio source classification experiments are conducted on broadcast news segments. Performance is increased by 6.6% when the proposed parameters are used and by 3.1% when the proposed merging method is used, showing the usefulness of our approach. Experiments confirm the impact of this proposal on broadcast news indexing.
