Published in: Journal of the Association for Research in Otolaryngology, Issue 4/2018

09.05.2018 | Research Article

Factors Affecting Speech Reception in Background Noise with a Vocoder Implementation of the FAST Algorithm

Authors: Shaikat Hossain, Raymond L. Goldsworthy


Abstract

Speech segregation in background noise remains a difficult task for individuals with hearing loss. Several signal processing strategies have been developed to improve the efficacy of hearing assistive technologies in complex listening environments. The present study measured speech reception thresholds in normal-hearing listeners attending to a vocoder based on the Fundamental Asynchronous Stimulus Timing algorithm (FAST: Smith et al. 2014), which triggers pulses based on the amplitudes of channel magnitudes in order to preserve envelope timing cues, with two different reconstruction bandwidths (narrowband and broadband) to control the degree of spectrotemporal resolution. Five types of background noise were used including same male talker, female talker, time-reversed male talker, time-reversed female talker, and speech-shaped noise to probe the contributions of different types of speech segregation cues and to elucidate how degradation affects speech reception across these conditions. Maskers were spatialized using head-related transfer functions in order to create co-located and spatially separated conditions. Results indicate that benefits arising from voicing and spatial cues can be preserved using the FAST algorithm but are reduced with a reduction in spectral resolution.
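The spatialization step mentioned in the abstract can be illustrated with a minimal sketch. Applying a head-related transfer function amounts to convolving a mono signal with a left-ear and a right-ear impulse response. Everything specific below is an assumption made for illustration: the names `spatialize` and `toy_hrir` are hypothetical, the impulse responses are toy delayed impulses rather than measured head-related responses, and the sampling rate, delay, and gain are arbitrary. The abstract does not state which transfer-function set or source angles the study used.

```python
import numpy as np

def spatialize(mono, hrir_left, hrir_right):
    """Convolve a mono signal with a left/right impulse-response pair."""
    left = np.convolve(mono, hrir_left)
    right = np.convolve(mono, hrir_right)
    return np.stack([left, right], axis=0)  # shape: (2, n_samples)

def toy_hrir(delay_samples, gain, length=64):
    """Hypothetical stand-in for a measured response: a delayed, scaled impulse."""
    h = np.zeros(length)
    h[delay_samples] = gain
    return h

fs = 16000                              # assumed sampling rate in Hz
rng = np.random.default_rng(0)
target = rng.standard_normal(fs)        # placeholder for the target speech
masker = rng.standard_normal(fs)        # placeholder for the masker

# Frontal source: identical responses at both ears (toy assumption).
front = (toy_hrir(0, 1.0), toy_hrir(0, 1.0))
# Lateral source: the right ear receives a delayed, attenuated copy (toy assumption).
side = (toy_hrir(0, 1.0), toy_hrir(10, 0.5))

# Co-located: target and masker both rendered from the front.
colocated = spatialize(target, *front) + spatialize(masker, *front)
# Spatially separated: target from the front, masker from the side.
separated = spatialize(target, *front) + spatialize(masker, *side)
```

In the co-located mixture both ears receive the same signal, whereas in the separated mixture the masker differs between the ears in timing and level, which is the kind of interaural difference that spatial cues rely on.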
References
Arbogast TL, Mason CR, Kidd G (2002) The effect of spatial separation on informational and energetic masking of speech. J Acoust Soc Am 112:2086–2098
Balakrishnan U, Freyman RL (2008) Speech detection in spatial and non-spatial speech maskers. J Acoust Soc Am 123:2680–2691
Başkent D, Gaudrain E (2016) Musician advantage for speech-on-speech perception. J Acoust Soc Am 139:EL51–EL56
Blauert J (1997) Spatial hearing: the psychophysics of human sound localization. MIT Press, Cambridge
Bolia RS et al (2000) A speech corpus for multitalker communications research. J Acoust Soc Am 107:1065–1066
Brokx JPL, Nooteboom SG (1982) Intonation and the perceptual separation of simultaneous voices. J Phon 10:23–36
Bronkhorst AW (2000) The cocktail party phenomenon: a review of research on speech intelligibility in multiple-talker conditions. Act Acust U Acust 86(1):117–128
Brungart DS (2001a) Evaluation of speech intelligibility with the coordinate response measure. J Acoust Soc Am 109:2276–2279
Brungart DS (2001b) Informational and energetic masking effects in the perception of two simultaneous talkers. J Acoust Soc Am 109:1101–1109
Brungart D, Simpson B (2002) Within-ear and across-ear interference in a cocktail-party listening task. J Acoust Soc Am 112:2985–2995
Carlile S, Corkhill C (2015) Selective spatial attention modulates bottom-up informational masking of speech. Sci Rep 5(8662):1–7
Cherry EC (1953) Some experiments on the recognition of speech, with one and with two ears. J Acoust Soc Am 25(5):975–979
Churchill T et al (2014) Spatial hearing benefits demonstrated with presentation of acoustic temporal fine structure cues in bilateral cochlear implant listeners. J Acoust Soc Am 136:1246–1256
Cooke M (2006) A glimpsing model of speech perception in noise. J Acoust Soc Am 119(3):1562–1573
Darwin C, Hukin R (2000) Effectiveness of spatial cues, prosody, and talker characteristics in selective attention. J Acoust Soc Am 107:970–977
Darwin C, Brungart D, Simpson B (2003) Effects of fundamental frequency and vocal-tract length changes on attention to one of two simultaneous talkers. J Acoust Soc Am 114:2913–2922
Dorman MF et al (1998) The recognition of sentences in noise by normal-hearing listeners using simulations of cochlear-implant signal processors with 6–20 channels. J Acoust Soc Am 104:3583–3585
Dubbelboer F, Houtgast T (2008) The concept of signal-to-noise ratio in the modulation domain and speech intelligibility. J Acoust Soc Am 124:3937–3946
Durlach NI, Mason CR, Kidd G Jr, Arbogast TL, Colburn HS, Shinn-Cunningham B (2003) Note on informational masking. J Acoust Soc Am, in press
Fitch WT, Giedd J (1999) Morphology and development of the human vocal tract: a study using magnetic resonance imaging. J Acoust Soc Am 106:1511–1522
Freyman RL et al (1999) The role of perceived spatial separation in the unmasking of speech. J Acoust Soc Am 106(6):3578–3588
Freyman R, Balakrishnan U, Helfer K (2001) Spatial release from informational masking in speech recognition. J Acoust Soc Am 109:2112–2122
Freyman RL, Balakrishnan U, Helfer KS (2008) Spatial release from masking with noise-vocoded speech. J Acoust Soc Am 124:1627–1637
Fuller CD et al (2014) Gender categorization is abnormal in cochlear implant users. J Assoc Res Otolaryngol 15:1037–1048
Gallun FJ, Mason CR, Kidd G (2005) Binaural release from informational masking in a speech recognition task. J Acoust Soc Am 118:1614–1625
Gaudrain E, Başkent D (2015) Factors limiting vocal-tract length discrimination in cochlear implant simulations. J Acoust Soc Am 137:1298–1308
Goldsworthy R (2015) Correlations between pitch and phoneme perception in cochlear implant users and their normal hearing peers. J Assoc Res Otolaryngol 16(6):797–809
Hillenbrand JM, Clark MJ (2009) The role of F0 and formant frequencies in distinguishing the voices of men and women. Atten Percept Psychophys 71(5), pp. 16
Hirsh IJ (1948) The influence of interaural phase on interaural summation and inhibition. J Acoust Soc Am 20:536–544
Hirsh IJ (1950) The relation between localization and intelligibility. J Acoust Soc Am 22:196–200
van Hoesel RJ, Tyler RS (2003) Speech perception, localization, and lateralization with bilateral cochlear implants. J Acoust Soc Am 113:1617–1630
Jorgensen S, Ewert SD, Dau T (2013) A multi-resolution envelope-power based model for speech intelligibility. J Acoust Soc Am 134(1):436–446
Kates JM (2011) Spectro-temporal envelope changes caused by temporal fine structure modification. J Acoust Soc Am 129(6):3981–3990
Kidd G Jr et al (2007) Informational masking. Springer handbook of auditory research 29: auditory perception of sound sources, edited by W. Yost (Springer, New York), pp. 143–190
Kidd G Jr et al (1998) Release from masking due to spatial separation of sources in the identification of nonspeech auditory patterns. J Acoust Soc Am 104:422–431
Kidd G Jr, Mason C, Gallun F (2005) Combining energetic and informational masking for speech identification. J Acoust Soc Am 118:982–992
Leek M, Brown ME, Dorman MF (1991) Informational masking and auditory attention. Percept Psychophys 50:205–214
Li T, Fu QJ (2011) Voice gender discrimination provides a measure of more than pitch-related perception in cochlear implant users. Int J Audiol 50:498–502
Marrone N, Mason CR, Kidd G Jr (2008) Tuning in the spatial dimension: evidence from a masked speech identification task. J Acoust Soc Am 124:1146–1158
Moon IJ, Won J-H, Park M-H, Ives DT, Nie K, Heinz MG, Lorenzi C, Rubinstein JT (2014) Optimal combination of neural temporal envelope and fine structure cues to explain speech identification in background noise. J Neurosci 34:12145–12154
Moore BCJ (2012) An introduction to the psychology of hearing. 6. The Netherlands, Brill
Oxenham AJ, Kreft HA (2014) Speech perception in tones and noise via cochlear implants reveals influence of spectral resolution on temporal processing. Trends Hear 18:1–14
Ping L et al (2017) Implementation and preliminary evaluation of ‘C-tone’: a novel algorithm to improve lexical tone recognition in Mandarin-speaking cochlear implant users. Cochlear Implants Int 18(5):240–249
Poissant SF, Whitmal NA III, Freyman RL (2006) Effects of reverberation and masking on speech intelligibility in cochlear implant simulations. J Acoust Soc Am 119:1606–1615
Pollack I (1975) Auditory informational masking. J Acoust Soc Am 57:S5
Qin MK, Oxenham AJ (2003) Effects of simulated cochlear-implant processing on speech reception in fluctuating maskers. J Acoust Soc Am 114:446–454
Shannon R et al (1995) Speech recognition with primarily temporal cues. Science 270:303–304
Skuk VG, Schweinberger SR (2014) Influences of fundamental frequency, formant frequencies, aperiodicity and spectrum level on the perception of voice gender. J Speech Lang Hear Res 57(1):285–296
Smith DR, Patterson RD (2005) The interaction of glottal-pulse rate and vocal-tract length in judgements of speaker size, sex, and age. J Acoust Soc Am 118:3177–3186
Smith ZM et al (2014) Hearing better with interaural time differences and bilateral cochlear implants. J Acoust Soc Am 135(4):2190–2191
Stickney G et al (2004) Cochlear implant speech recognition with speech maskers. J Acoust Soc Am 116(2):1081–1091
Stone MA, Moore BCJ (2014) On the near non-existence of “pure” energetic masking release for speech. J Acoust Soc Am 135(4):1967–1977
Stone MA et al (2011) The importance for speech intelligibility of random fluctuations in “steady” background noise. J Acoust Soc Am 130(5):2874–2881
Stone MA, Fullgrabe C, Moore BCJ (2012) Notionally steady background noise acts primarily as a modulation masker of speech. J Acoust Soc Am 132(1):317–326
Swaminathan J et al (2016) Role of binaural temporal fine structure and envelope cues in cocktail-party listening. J Neurosci 36(31):8250–8257
Vandali AE et al (2005) Pitch ranking ability of cochlear implant recipients: a comparison of sound-processing strategies. J Acoust Soc Am 117(5):3126–3138
Vandali AE, Dawson PW, Arora K (2016) Results using the OPAL strategy in Mandarin speaking cochlear implant recipients. Int J Audiol Jun 22, pp. 1–12
Watson CS (2005) Some comments on informational masking. Acta Acoust 91:502–512
Yost B (2006) Informational masking: what is it?, in paper presented at the 2006 Computational and Systems Neuroscience (Cosyne) meeting
Zirn S et al (2016) Perception of interaural phase differences with envelope and fine structure coding strategies in bilateral cochlear implant users. Trends Hear 20:2331216516665608
Zurek PM (1993) Binaural advantages and directional effects in speech intelligibility. Acoustical factors affecting hearing aid performance, edited by G.A. Studebaker & I. Hochberg, pp. 255–275
Metadata
Title
Factors Affecting Speech Reception in Background Noise with a Vocoder Implementation of the FAST Algorithm
Authors
Shaikat Hossain
Raymond L. Goldsworthy
Publication date
09.05.2018
Publisher
Springer US
Published in
Journal of the Association for Research in Otolaryngology / Issue 4/2018
Print ISSN: 1525-3961
Electronic ISSN: 1438-7573
DOI
https://doi.org/10.1007/s10162-018-0672-x
