Introduction
It is well known that noise overexposure can impair the auditory system by producing a sensorineural hearing loss, seen in a permanent elevation of pure-tone detection thresholds. This has led to the interpretation that sound stimulation producing only a temporary threshold shift (TTS), but not a permanent threshold shift (PTS), does not permanently damage the auditory system. However, it has been reported that, despite normal sensitivity to pure tones, some listeners complain about having listening difficulties in challenging acoustical situations (Hind et al. 2011; Kumar et al. 2007; Saunders and Haggard 1989; Tremblay et al. 2015).
Recent animal studies have shown that noise overexposure producing TTS can in fact lead to the loss of auditory nerve (AN) fiber synapses, without damaging the sensitive hair cells in the cochlea (Kujawa and Liberman 2009). As this neuronal degeneration does not result in a PTS, it has been termed “hidden” hearing loss (Schaette and McAlpine 2011). Kujawa and Liberman (2009) demonstrated in mice that “hidden” hearing loss, or more accurately cochlear synaptopathy (CS) (for a review, see Liberman and Kujawa 2017), resulting from carefully controlled noise exposure, did not alter hearing thresholds. It was further shown that the magnitude-level function of distortion-product otoacoustic emissions (DPOAE) remained unaffected in the same mice. These results indicate that the outer hair cells (OHC) were not damaged by the noise exposure. The amplitude of the auditory brainstem response (ABR) wave-I, on the other hand, was reduced at supra-threshold sound pressure levels (SPL). Wave-I is thought to reflect the action potentials of the AN, and should therefore be sensitive to a loss of AN fiber synapses. It has been suggested that a selective (meaning predominant) loss of medium- and low-spontaneous rate (SR) fibers could account for the reduction of supra-threshold ABR wave-I magnitudes, while still preserving normal thresholds (Furman et al. 2013). A reanalysis of the data from Furman et al. (2013) concluded that there was indeed a loss of high-SR fibers at a ratio of about 1:3 with loss of low- and medium-SR fibers (Marmel et al. 2015). Thus, although medium- and low-SR fibers may be more affected than high-SR fibers, all fibers are likely affected to some degree. In fact, Bourien et al. (2014) showed that changes in ABR wave-I amplitudes are more likely to be due to loss of high-SR fibers than of medium- and low-SR fibers. Additionally, Lobarinas et al. (2013) reported that, even in the case of a substantial loss of inner hair cells (IHC) and AN fibers, behavioral pure-tone thresholds remained unchanged, suggesting that even a substantial loss of high-SR fibers would not produce PTS. Nevertheless, many hypotheses about CS in humans (including this study) start with an assumption that low-SR fibers are more affected than other fibers and that the spiking rate of the high-SR fibers saturates at supra-threshold levels (e.g., Bharadwaj et al. 2015; Mehraei et al. 2016; Paul et al. 2017; Valero et al. 2018).
Noise-induced CS has been observed in several non-human mammalian species, such as mice (Furman et al. 2013; Kujawa and Liberman 2009), guinea pigs (Lin et al. 2011; Liu et al. 2012), rats (Lobarinas et al. 2017), and rhesus macaques (Valero et al. 2017). CS has also been reported as a natural phenomenon in the normally aging (non-exposed) mouse ear (Sergeyenko et al. 2013). Noise exposure seems to accelerate this natural degeneration of the AN (Fernandez et al. 2015). In humans, there is some evidence of such age-related CS (Makary et al. 2011; Viana et al. 2015; Wu et al. 2018). Elderly subjects show losses of over 60 % of their synapses compared to younger subjects (Wu et al. 2018). In addition, the loss of peripheral axons in normal aging humans is significantly greater than the loss of spiral ganglion cells (SGC) (Viana et al. 2015), as reported in mice (Sergeyenko et al. 2013). This suggests that SGC survive for months after the loss of their peripheral axons (Kujawa and Liberman 2015). However, noise-induced CS has not yet been clearly demonstrated in living humans, and the potential perceptual consequences remain unknown (Oxenham 2016; Plack et al. 2014), despite attempts to identify them in large studies (e.g., Grose et al. 2017; Le Prell et al. 2018; Lopez-Poveda et al. 2017; Prendergast et al. 2017).
Animal studies suggest that CS is reflected in electroencephalographic (EEG) evoked response measurements, such as ABR wave-I (Furman et al. 2013; Kujawa and Liberman 2009) or envelope following responses (EFR) (Parthasarathy and Kujawa 2018; Shaheen et al. 2015). Some researchers have attempted to relate changes in evoked responses to self-reported estimates of noise exposure in humans (Prendergast et al. 2017). To date, no correlation has been found. However, noise exposure scores derived from self-reported questionnaires of lifetime noise exposure rely on the subjective recall of noisy events. Furthermore, they are generally based on numerous assumptions limiting their reliability (Coughlin 1990). Other studies have found correlations between evoked responses and behavioral measures of temporal processing at supra-threshold levels in individual normal-hearing (NH) threshold listeners (Bharadwaj et al. 2015; Mehraei et al. 2016). In these studies, poorly performing listeners were hypothesized to suffer from CS. The inconclusive outcome of the human studies can be attributed, in part, to the impossibility of directly assessing the status of the AN fiber synapses in living humans. Non-invasive evoked responses can be recorded in both humans and non-human animals. Comparing these measures across different species could help to connect carefully controlled, experimentally induced CS in non-human animals to its (potential) presence in humans. However, evoked responses measured using surface (scalp) electrodes represent the far-field sum of the activity of large populations of neurons, which might not be sensitive to specific local neuronal damage, or may require carefully designed stimuli and recording techniques to reveal such loss.
In the present study, EFRs were measured as a function of stimulus level using both deep and shallow modulations of sinusoidally amplitude-modulated (SAM) tones. The listeners had either normal audiometric thresholds or a mild hearing impairment above 3 kHz. We hypothesized that a preferential loss of medium- and low-SR fibers would reduce the EFR magnitudes at high supra-threshold stimulus levels, whereas the responses at lower levels would remain unaffected. We therefore predicted that the slope of the EFR magnitude-level functions at supra-threshold input levels would differ depending on whether or not medium- and low-SR fibers were present. We expected that such a reduction or slope change would be more pronounced in the EFRs elicited by shallowly modulated tones than in those elicited by deeply modulated tones. This was based on the argument that high-intensity shallowly modulated stimuli are preferentially encoded by medium- and low-SR fibers (Bharadwaj et al. 2014, 2015). For hearing-impaired (HI) listeners, the EFR magnitude-level functions at both modulation depths were recorded with the stimulus presented only at a frequency where the listeners’ audiograms were within the normal range, to increase the likelihood of the presence of CS. It has been proposed that CS might be a precursor of subsequent hair-cell damage (Kujawa and Liberman 2015; Liberman and Kujawa 2017; Sergeyenko et al. 2013). It was assumed that listeners who already show a threshold elevation (and therefore hair-cell dysfunction) at higher audiometric frequencies potentially suffer from CS at lower audiometric frequencies with normal thresholds.
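As a minimal illustration of the two stimulus types described above, the following sketch generates a SAM tone for a given modulation depth. The carrier frequency, modulation frequency, modulation depths, duration, and sampling rate are hypothetical values chosen only for illustration; they are not the parameters used in the present study.

```python
import numpy as np

def sam_tone(fc, fm, depth, dur, fs):
    """Return a sinusoidally amplitude-modulated (SAM) tone.

    fc: carrier frequency (Hz); fm: modulation frequency (Hz);
    depth: modulation depth m in [0, 1]; dur: duration (s);
    fs: sampling rate (Hz).
    """
    if not 0.0 <= depth <= 1.0:
        raise ValueError("depth must lie in [0, 1]")
    t = np.arange(int(round(dur * fs))) / fs
    # Envelope 1 + m*sin(2*pi*fm*t) imposed on a sinusoidal carrier.
    envelope = 1.0 + depth * np.sin(2 * np.pi * fm * t)
    return envelope * np.sin(2 * np.pi * fc * t)

# Hypothetical parameter values (assumptions, not taken from the study).
deep = sam_tone(fc=2000.0, fm=100.0, depth=0.85, dur=0.5, fs=48000)
shallow = sam_tone(fc=2000.0, fm=100.0, depth=0.25, dur=0.5, fs=48000)
```

With a shallow modulation, the envelope fluctuates over a narrower range around the carrier amplitude than with a deep modulation, so at high stimulus levels the envelope never dips to low instantaneous levels.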
As the history of noise exposure in both NH threshold listeners and HI listeners in this study is unknown, and given that estimates of lifetime noise exposure have failed to predict CS in humans in previous studies (e.g., Prendergast et al. 2017), the present study focused on individual differences in EFR magnitude-level functions and their potential relation to CS. To assist with the interpretation of the obtained EFRs and of the potential effect of CS on them, a computational model of the AN was used to study the effects of a differential loss of the different AN fiber types on the EFR magnitude-level functions. The aim of the study was thus to investigate whether a computational model of the AN with simulated CS can account for individual patterns observed in the EFR magnitude-level functions recorded in audiometrically homogeneous listeners at the stimulus frequencies at which they were excited (below 3 kHz).
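One common way to quantify an EFR is as the spectral magnitude of the response at the modulation frequency of the stimulus. The sketch below shows this computation on a synthetic signal. The function name `magnitude_at`, the sampling rate, and the signal amplitudes are assumptions made for illustration; no recorded data are used, and this is not necessarily the analysis pipeline of the present study.

```python
import numpy as np

def magnitude_at(signal, fs, freq):
    """Single-sided spectral amplitude of `signal` at the FFT bin nearest `freq`."""
    n = len(signal)
    spectrum = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    k = int(np.argmin(np.abs(freqs - freq)))
    # DC and (for even n) Nyquist bins are not doubled in a single-sided spectrum.
    scale = 1.0 if k == 0 or (n % 2 == 0 and k == n // 2) else 2.0
    return scale * np.abs(spectrum[k]) / n

# Synthetic stand-in for a response: a 100 Hz component (the assumed
# modulation frequency) plus an unrelated 50 Hz component.
fs = 1000.0
t = np.arange(1000) / fs
synthetic = 0.5 * np.sin(2 * np.pi * 100.0 * t) + 0.2 * np.sin(2 * np.pi * 50.0 * t)
efr_like = magnitude_at(synthetic, fs, 100.0)  # approximately 0.5
```

Repeating such a measurement across stimulus levels would yield a magnitude-level function of the kind discussed here.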
Conclusions
EFR magnitude-level functions recorded from a group of young NH threshold listeners showed individual differences for deeply and shallowly modulated tones, indicating differences in neural supra-threshold encoding of envelope modulations. Similar differences for mild HI listeners measured at an audiometrically normal center frequency supported the idea of coexisting hearing loss due to hair-cell dysfunction and supra-threshold deficits at frequencies of normal sensitivity.
A model of AN activity was able to account for the monotonic growth with level observed in the recorded EFR magnitude-level functions of the NH threshold listeners. Hair-cell dysfunction, with or without a postulated steeply sloping threshold elevation at extended audiometric frequencies beyond 8 kHz, was not sufficient to explain the non-monotonic trends obtained in the EFR data for some of the NH threshold listeners. Similarly, hair-cell dysfunction alone could not account for the EFR data recorded in the HI listeners. This suggests that additional damage, namely CS, must be included in the model to account for the recorded EFRs. A loss of all types of AN fibers (including high-SR fibers) in a specific cochlear frequency range needed to be implemented in the model to account for the data of some NH threshold listeners showing reduced EFR magnitudes at mid-stimulation levels. A loss of exclusively medium- and low-SR fibers had no impact on the simulated EFR magnitude-level functions, which were essentially the same as those obtained in non-synaptopathic simulations. The same was found for CS in HI listeners, where a large loss of all three AN fiber types had to be included in a broad CF range to match the measured results.
Overall, the data and the simulations suggest that, when using SAM tones in quiet as sound stimuli, EFRs are dominated by high-SR fibers, and that off-frequency neurons increasingly contribute to the EFR with increasing stimulus level. The finding that the envelope is better encoded at off-frequency CFs (rather than on-frequency) when SAM tones in quiet are presented at high stimulus levels must be considered when using EFRs to investigate supra-threshold coding with these stimulus paradigms. An in-depth modeling and experimental analysis of the effect of noise maskers used to fully attenuate off-frequency contributions could be of interest when investigating the use of EFRs to diagnose CS in living human listeners. In addition, parallel electrophysiological studies in humans and non-human animals where CS has been characterized (e.g., mice), together with the use of species-specific computational models, are needed to quantify the potential consequences of CS in humans.