Author(s) :
Sungyub Yoo (Univ of Pittsburgh, USA)
J. Robert Boston (Univ of Pittsburgh, USA)
John D. Durrant (Univ of Pittsburgh, USA)
Kristie Kovacyk (Univ of Pittsburgh, USA)
Stacey Karn (Univ of Pittsburgh, USA)
Susan Shaiman (Univ of Pittsburgh, USA)
Amro El-Jaroudi (Univ of Pittsburgh, USA)
Ching-Chung Li (Univ of Pittsburgh, USA)
Abstract : Consonants are generally recognized as more critical than vowels to speech intelligibility, but we suggest that the important information is carried by transient speech components rather than by the quasi-steady-state components of both consonants and vowels. Fixed-frequency filters cannot uniquely separate transients from the more steady-state vowel formants and consonant hubs, even though vowel formants are predominantly low frequency and consonant hubs are predominantly high frequency. To study the relative intelligibility of the transient and steady-state components, we employed an algorithm based on time-frequency analysis to extract quasi-steady-state energy from the speech signal, leaving a residual signal of predominantly transient components. Psychometric functions were measured for recognition of processed and unprocessed monosyllabic words. The transient components accounted for approximately 2% of the energy of the original speech, yet were nearly as intelligible as the original speech. As hypothesized, the quasi-steady-state components contained far more energy but were significantly less intelligible.