VOICE SEPARATION OF OVERLAPPING SPEECH USING TRACKING TECHNIQUES AND THE GATING PROCESS (WedPmOR6)
Author(s) :
Ilyas Potamitis (WCL, Greece)
Panos Zervas (WCL, Greece)
Nikos Fakotakis (WCL, Greece)
Abstract : This paper investigates the use of tracking techniques successfully applied to aircraft tracking and navigation to segment possibly overlapping speech of multiple static speakers in an enclosure. The tracking technique applied, namely probabilistic data association (PDA) in conjunction with the interacting multiple model (IMM) estimator, directly accounts for measurement origin uncertainty, i.e., which direction of arrival (DOA) measurement comes from which speaker, and rejects spurious DOAs. The estimated DOAs are used by a single microphone array to provide separation through its directional receptive field. Based on the prediction of the IMM filter, which constructs a permissible DOA region (gate) for each speaker, we elaborate on the concept and application of the so-called ‘gating process’, which can be used to initialize and terminate speech tracks and thus serves as a voice activity detector (VAD). The effectiveness of the approach is illustrated by an extensive simulation study on tracking and separating three static speakers having a conversation with partially overlapping speech and long pauses.
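
The gating idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the predicted DOAs and the innovation variance are fixed by hand here, whereas in the paper they come from the IMM filter prediction. The function names, the gate threshold of 9.0 (a 3-sigma gate for a scalar measurement), and the rule that a speaker counts as active when at least one DOA falls in its gate are all assumptions made for illustration.

```python
def wrap_angle(deg):
    """Wrap an angle difference to the interval [-180, 180) degrees."""
    return (deg + 180.0) % 360.0 - 180.0


def in_gate(measured_doa, predicted_doa, innovation_var, gate_threshold=9.0):
    """Check whether a DOA measurement lies inside a speaker's gate.

    The gate is the set of DOAs whose squared residual, normalized by the
    innovation variance, does not exceed gate_threshold (assumed value).
    """
    residual = wrap_angle(measured_doa - predicted_doa)
    return residual * residual / innovation_var <= gate_threshold


def gate_frame(measurements, predictions, innovation_var, gate_threshold=9.0):
    """Return, for each speaker, the DOA measurements validated by its gate."""
    return {
        speaker: [
            z for z in measurements
            if in_gate(z, predicted, innovation_var, gate_threshold)
        ]
        for speaker, predicted in predictions.items()
    }


# Hypothetical frame: three static speakers and three DOA measurements (degrees).
predictions = {"A": 30.0, "B": 90.0, "C": 150.0}
measurements = [32.0, 148.5, 75.0]

validated = gate_frame(measurements, predictions, innovation_var=4.0)

# Assumed VAD rule: a speaker is active if any measurement falls in its gate.
active = {speaker: bool(doas) for speaker, doas in validated.items()}
```

In this hypothetical frame the measurement at 75 degrees falls outside every gate and is treated as spurious, so speaker B has no validated DOA and would be considered silent. A practical tracker would typically require several consecutive frames with or without validated measurements before starting or ending a track.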
