INTRODUCTION
An important ability of the auditory system is spatial hearing. This ability enables the localization of sound sources in auditory space (see Middlebrooks and Green 1991, for a review) and improves the understanding of speech in environments with interfering sound sources (e.g., Bronkhorst 2000). The complete set of cues underlying spatial hearing can be derived from the head-related transfer functions of the two ears relative to the location of the sound source. Two of these cues are interaural disparities in time (interaural time difference, ITD) and interaural disparities in level (interaural level difference, ILD). When presented in isolation under headphone conditions, sounds with ITD and ILD information are commonly not localized in space outside the listener’s head but rather lateralized towards one side inside the listener’s head. Both ITDs and ILDs, separately as well as in combination, allow a listener to lateralize an acoustic stimulus. Under free-field conditions, ITDs are predominantly used at low frequencies and ILDs are predominantly used at high frequencies (Rayleigh 1907). This spectral dominance of ITD and ILD is commonly referred to as the Duplex theory and is widely acknowledged in the literature (e.g., Macpherson and Middlebrooks 2002). It is, however, less clear what the contribution of the different cues in the different frequency bands is in more realistic conditions with broadband stimuli.
At low frequencies, the ITD is a reliable cue and small changes in ITD can be detected. At high frequencies, the ITD of the fine structure becomes imperceptible (Rayleigh 1907; Blauert 1984; Moore 2014; Brughera et al. 2013) and hence unavailable as a cue for lateralization. For narrowband signals, ITD detection thresholds have been shown to be lowest between 700 and 1000 Hz and to increase towards lower and higher frequencies (Klumpp and Eady 1956; Brughera et al. 2013). The upper frequency limit of fine structure ITD detection was shown to be at about 1.5 kHz (Moore 2014; Zwislocki and Feldman 1956; Klumpp and Eady 1956; Brughera et al. 2013). Above this limit, ITD cues in the envelope of a signal can still be detected when imposed on carrier frequencies well above 1.5 kHz (Henning 1974; McFadden and Pasanen 1976; Bernstein and Trahiotis 1994; Nuetzel and Hafter 1976; Leakey et al. 1958). At high frequencies, the wavelength of the sound waves becomes small in comparison with the size of the head. This reduces the sound intensity at the contralateral ear relative to the ipsilateral ear and gives rise to ILD cues (Blauert 1984). ILD detection thresholds have been shown to be approximately constant over a broad range of frequencies, except at 1 kHz where the threshold has been shown to be higher (Yost and Dye 1988; Grantham 1984; Rowland and Tobias 1967; Mills 1960).
Previous studies have suggested that the frequency-specific detection thresholds of the interaural disparities are related to their relative contribution to spatial hearing. One approach used to derive the spectral weighting is to invert discrimination thresholds for narrowband signals and to calculate sensitivity under the assumption that high sensitivity indicates a high relative contribution, and hence a high weighting for the cue in that frequency band. For ILDs, this approach leads to a constant ILD weighting across frequency bands and for ITDs to a weighting that is maximal between 700 and 1000 Hz and decreases towards high and low frequencies (Raatgever 1980; Stern et al. 1988; Buchholz et al. 2018).
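The threshold-inversion approach described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `weights_from_thresholds`, the normalization to a sum of one, and the threshold values are hypothetical and are not taken from the cited studies.

```python
import numpy as np

def weights_from_thresholds(thresholds):
    """Turn detection thresholds into relative weights.

    Sensitivity is taken as the reciprocal of the threshold, and the
    sensitivities are normalized so that the weights sum to one.
    """
    thresholds = np.asarray(thresholds, dtype=float)
    if np.any(thresholds <= 0):
        raise ValueError("thresholds must be positive")
    sensitivity = 1.0 / thresholds
    return sensitivity / sensitivity.sum()

# Hypothetical ITD thresholds (in microseconds) for five narrowband signals.
# The values are made up; they only mimic a minimum near 700 to 1000 Hz.
center_freqs_hz = [250, 500, 800, 1000, 1250]
itd_thresholds_us = [40.0, 20.0, 10.0, 10.0, 20.0]

itd_weights = weights_from_thresholds(itd_thresholds_us)
for freq, weight in zip(center_freqs_hz, itd_weights):
    print(f"{freq:5d} Hz: weight {weight:.3f}")
```

With thresholds that are constant across frequency, as reported for ILDs over most of the range, the same function returns equal weights for all bands.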
However, detection (McFadden and Pasanen 1976) and discrimination (Heller and Richards 2010; Trahiotis and Bernstein 1990) thresholds of ITD and ILD as well as the lateralization extent (Heller and Trahiotis 1996) are affected by the presence of signals in remote spectral regions, a phenomenon known as binaural interference. For example, the ITD threshold of a probe in a frequency band is increased in the presence of an interfering signal in a frequency band lower than the probe frequency (Best et al. 2007). Thus, the spectral weights obtained by inverting thresholds for narrowband signals in isolation might not be applicable for broadband signals.
Furthermore, Buell and Hafter (1991) showed that binaural information is summed across frequency bands only if the information belongs to the same auditory object. Thus, when target and interfering signal form separate auditory objects, no binaural interference occurs. In addition, it has been shown that pre- and post-cursors, i.e., signals preceding and following a target, can reduce the detection threshold of a masked signal and make the signal perceptually ‘pop out’ of a simultaneous interfering signal, also referred to as auditory enhancement (Viemeister 1980). Byrne et al. (2011) showed that such an enhancement paradigm led to a 4–5 dB increase in perceived level of the target signal in a binaural centering task. Thus, auditory enhancement might also affect the apparent spectral weighting of the interaural disparities.
In the present study, we investigated how the auditory system integrates either of the binaural cues of ITD or ILD across frequency for broadband signals in a lateralization task. An observer-response weighting analysis paradigm was used to determine relative contributions of spectral bands to sound lateralization. Previously, this analysis has been used to estimate the spectral and temporal weights for the judgment of spectral shape (Lutfi and Jesteadt 2006; Berg 1990), for spectral weights of loudness (Jesteadt et al. 2014; Joshi et al. 2016; Leibold et al. 2007; Leibold et al. 2009; Oberfeld et al. 2012), for level discrimination (Lutfi 1989; Kortekaas et al. 2003), and for temporal weights of ITDs and ILDs (Stecker and Hafter 2002; Brown and Stecker 2010; Brown and Stecker 2011; Stecker et al. 2013; Stecker 2014; Stecker 2018; Dye et al. 2005). Using an observer-response weighting analysis enables the estimation of weights using stimuli with interaural disparities above threshold while taking binaural interference into consideration. In the first experiment, the spectral weights were derived by imposing semi-random permutations of ITD or ILD in multiple frequency bands and asking listeners to lateralize (left or right) the stimulus. This was done separately for ITD and ILD. In a second experiment, the spectral weights of ITDs and ILDs were derived in the presence of pre- and post-cursors to investigate the effect of auditory enhancement on the spectral weights. The ranges of ITD and ILD values were chosen to be independent of frequency to account for potential differences in ITD and ILD detection thresholds between the stimulus used in the current study and the isolated pure tones or narrowband noises used previously in the literature.
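The general idea of an observer-response weighting analysis can be illustrated with a small simulation. The sketch below is not the analysis used in the present study: it assumes a simulated listener with known weights, cue values drawn uniformly at random in arbitrary units, and a logistic regression of the left/right responses on the per-band cue values, fitted with Newton-Raphson. All names and parameter values are hypothetical.

```python
import numpy as np

def fit_logistic_weights(cues, responses, n_iter=25):
    """Regress binary responses (1 = right, 0 = left) on per-band cues.

    Fits P(right) = sigmoid(b0 + cues @ b) by Newton-Raphson and returns
    the slopes b normalized so that their absolute values sum to one.
    """
    design = np.column_stack([np.ones(len(cues)), cues])
    beta = np.zeros(design.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-design @ beta))
        gradient = design.T @ (responses - p)
        hessian = (design * (p * (1.0 - p))[:, None]).T @ design
        # A tiny ridge term keeps the linear solve numerically stable.
        beta += np.linalg.solve(hessian + 1e-9 * np.eye(len(beta)), gradient)
    slopes = beta[1:]
    return slopes / np.abs(slopes).sum()

# Simulated listener (assumption): 11 bands, the lowest band dominates.
rng = np.random.default_rng(0)
n_trials, n_bands = 5000, 11
true_weights = np.full(n_bands, 0.05)
true_weights[0] = 0.5

# Independent random cue value per band and trial (arbitrary units).
cues = rng.uniform(-1.0, 1.0, size=(n_trials, n_bands))
# Decision variable: weighted sum of cues plus internal noise.
decision = cues @ true_weights + rng.normal(0.0, 0.3, size=n_trials)
responses = (decision > 0).astype(float)

estimated_weights = fit_logistic_weights(cues, responses)
print(np.round(estimated_weights, 2))
```

Because the cue values are varied independently across bands, each regression slope reflects how strongly that band's cue influenced the response, and the normalized slopes can be read as relative weights.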
DISCUSSION
In the present study, the spectral weighting for a stimulus consisting of 11 simultaneously presented 1-ERB-wide noise bands with ITDs or ILDs was investigated. It was shown that the highest weight for ITDs was given to the frequency band with the lowest center frequency, and the highest weight for ILD was given to the frequency band with the highest center frequency. The remaining bands received substantially lower weights than these edge bands. This “edge effect” was also found when reducing the overall bandwidth of the stimulus. When presenting interaurally uncorrelated noise as the edge bands, no change in weight was observed, resulting in a weighting function with equal spectral weights. The auditory enhancement paradigm in experiment 2 led to an increase of the weight of the on-frequency band. This enhancement of the weight was found for ITDs at low frequencies and for ILDs at low and high frequencies.
The results from this study are in general agreement with the Duplex theory (Rayleigh 1907; Macpherson and Middlebrooks 2002): ITDs receive the highest weight at low frequencies and ILDs at high frequencies. However, when assuming that ITD information is prominent at low frequencies and that the amount of useful information gradually decreases towards high frequencies (Klumpp and Eady 1956; Brughera et al. 2013), and vice versa for ILDs (Mills 1960), the findings of the present study show a different pattern. Instead of a gradual change of the weights, a sharp transition from high to low weights was found at the frequency bands located at the spectral edges of the stimulus. Such an edge effect has previously been shown in temporal weighting functions using a similar method (e.g., Stecker and Hafter 2002). Generally, the auditory system seems to weight information at the edges more strongly, as found for the binaural edge pitch effect (Klein and Hartmann 1981) or for loudness perception of multiple spectral or temporal components (Joshi et al. 2016; Oberfeld et al. 2012). However, this edge effect is at odds with estimates of spectral weights based on ITD and ILD thresholds where a gradual change of the spectral weight for ITDs has been shown (Raatgever 1980; Stern et al. 1988; Buchholz et al. 2018). A reason for this difference might be that the listeners in the current study had to integrate the binaural information across frequencies and, thus, binaural interference (McFadden and Pasanen 1976), i.e., across-channel interference, was taken into consideration.
Besides the possibility of additional effects being present when lateralizing broadband stimuli as in the present study compared with the isolated narrowband stimuli as used in previous studies, there might exist a fundamental difference in the applied methods. A direct comparison of these methods using identical stimuli would be required to rule out this systematic factor.
The current study was designed to reduce within-channel interference by separating the 1-ERB-wide noise bands by 1-ERB-wide spectral gaps. Thus, only little energy “leaked” into neighboring auditory channels. However, binaural auditory filters have been shown to be wider than monaural auditory filters (van de Par and Kohlrausch 1999; Kolarik and Culling 2010; van der Heijden and Trahiotis 1998; Bernstein and Oxenham 2006; Holube et al. 1998). Assuming wider binaural auditory filters, this might have introduced a within-channel interference of the interaural disparities. This possible interference might have led to a reduction of the weight in a given frequency band as conflicting binaural information from two or more bands was integrated within one binaural auditory filter. Thus, the frequency bands at the spectral edges with only a single neighboring band were less affected by within-channel interference than the remaining bands. This explanation is supported by the conditions with altered edge frequency bands where bands were removed or replaced with uncorrelated noise. When a band was removed, the interference for the new edge band was reduced and thus the spectral weight of this band increased. When the edge band was replaced with uncorrelated noise, in contrast, the interference remained constant and thus the weight was unchanged. However, the edge effect only occurred if usable interaural information was available to the auditory system. This was the case mainly at low frequencies for ITDs and at high frequencies for ILDs. Yet, in the condition with removed edge bands, the weight of the lowest frequency band for ILDs also increased in comparison with the reference condition, which suggests that enough ILD information was available in the mid-frequency range but not in the low-frequency range. An alternative explanation is that the ITD of 0 μs was a stronger opposing cue at the lower frequency band and thus the ILD weight was lower.
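The spectral layout described above (11 noise bands, each 1 ERB wide, separated by 1-ERB-wide gaps) can be made concrete with a short calculation. This is a sketch under assumptions: the ERB-number formula of Glasberg and Moore is used because it is widely used, although the text above does not state which ERB scale was applied, and the lowest band edge of 300 Hz is purely hypothetical.

```python
import numpy as np

def hz_to_erb_number(f_hz):
    """ERB-number scale (Glasberg and Moore formula, frequency in Hz)."""
    return 21.4 * np.log10(4.37 * f_hz / 1000.0 + 1.0)

def erb_number_to_hz(erb_number):
    """Inverse of hz_to_erb_number."""
    return (10.0 ** (erb_number / 21.4) - 1.0) * 1000.0 / 4.37

def band_edges(lowest_edge_hz, n_bands=11, band_width_erb=1.0, gap_erb=1.0):
    """Lower and upper edges (Hz) of bands spaced regularly on the ERB scale."""
    start = hz_to_erb_number(lowest_edge_hz)
    lower = start + np.arange(n_bands) * (band_width_erb + gap_erb)
    upper = lower + band_width_erb
    return erb_number_to_hz(lower), erb_number_to_hz(upper)

# Hypothetical lowest edge; the actual value is not given in this text.
lower_hz, upper_hz = band_edges(lowest_edge_hz=300.0)
for lo, hi in zip(lower_hz, upper_hz):
    print(f"{lo:7.1f} Hz to {hi:7.1f} Hz")
```

On this scale every band and every gap spans one monaural ERB, so a binaural filter that is wider than one ERB would overlap at least one neighboring band, which is the situation considered in the paragraph above.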
Because of the behavioral nature of the present study, other mechanisms might underlie the results. One alternative interpretation for the increased weights at the edge frequencies and for the conditions with imposed auditory enhancement (experiment 2) could be a perceptual separation of one of the noise bands. In the light of this interpretation, if none of the noise bands were perceptually separated, there would be, statistically, only a very small probability that the cue of one band dominates the perception. Any attribute in the stimulus that helps to separate one noise band from the others might then result in the judgment of the listener being dominated by the cue imposed on that specific band. Such a separation could happen due to spectral placement at the edge (see also Klein and Hartmann 1981), or by preceding or following sounds in the same spectral region (Viemeister 1980; Byrne et al. 2011). If no other mechanism leading to perceptual weighting existed, then all weights would be equally low in the absence of a cue supporting separation. The edge frequencies would then receive high weights as a consequence of perceptual separation (lowest for ITD due to the presence of phase locking, and ILD at high frequencies). This explanation would require that all bands are processed independently from each other and that the importance of the cues is constant across frequency. The connection to phenomena like binaural interference and the gradual decrease of phase locking from low to high frequencies as well as a mutual interaction of spectral components in such listening conditions might provide some additional insights into the mechanisms underlying the behavioral data.
The mechanisms in the auditory system that might underlie a frequency weighting are not fully known. Previous studies have found specific sensitivity to spectral edges in the dorsal cochlear nucleus (DCN) (Reiss and Young 2005). While the DCN has been shown to play a role in sound localization both in azimuth and elevation and projects directly to the inferior colliculus (IC) (May 2000), it bypasses binaural structures. Thus, it is not clear if the monaural DCN might affect binaural processing. One might also speculate that adaptation effects as observed at the level of the IC might lead to an enhancement of spectral edges due to the asymmetry in excitation around the edge band (Nelson and Young 2010). The perceptually measured frequency weighting might, however, be the compound result of various such phenomena along the auditory pathway. Hence, physiological studies will have to provide insight into the exact origin of the observed weights.
Assuming that the ITD and ILD detection thresholds for the stimulus of the current study are similar to the thresholds proposed in the literature for narrowband stimuli, a specific value of ITD or ILD could lead to differences in the amount of lateralization when presented in different frequency bands. Hence, one might interpret a lower weight of ITD at higher frequencies as a consequence of the ITD being close to or even below detection threshold there, whereas at a lower frequency the corresponding ITD was well above detection threshold. The data of the current study are partially in agreement with this interpretation. However, the weights are not equal to zero even at the highest frequencies, which indicates a small contribution of these cues there. In order to quantify this in more detail, the detection threshold and the role of the magnitude of the cue above detection threshold need to be investigated with more complex stimuli such as those used in the present study.
Sensitivity to a specific stimulus attribute and the perceptual weights likely provide different information about sensory processing. It has, however, been argued that these two measures might be interconnected (Leibold et al. 2009; Kortekaas et al. 2003). This aspect has been discussed in the light of signal detection theory and, for example, the interaction between streaming and masking (Lutfi et al. 2012; Chang et al. 2016). In the current study, it is unclear if the detectability of, for example, ITD in frequency bands above 1.5 kHz might have an impact on the derived weights. It is challenging to compare weights derived from sensitivity for isolated pure tones (Brughera et al. 2013) with those of narrowband noises (Buchholz et al. 2018), as sensitivity for these stimuli differs. In order to shed light on these factors, a comparison across these data might be possible by considering these different stimuli in conditions with a constant d′ and evaluating the performance in the signal detection theory framework outlined in Lutfi et al. (2012). Such a framework would then allow the detectability of each cue in each frequency band to be included through a combination of d′ values. The results of the present study and of the previous studies provide a good starting point to extend the current point of view in this direction. The paradigm applied in the present study allows the overall performance to be evaluated even in the presence of the many possible parameter combinations, at the cost that detailed information on the interactions might be obscured.
Multiple studies showed that spectral bandwidth plays a role in processing of binaural information, but in a non-trivial way. Thavam and Dietz (2019) directly compared ITD thresholds for pure tones, tone complexes, and noises of different bandwidth and spectral shape. In their data, a lower ITD threshold was found for a white noise filtered between 600 and 1000 Hz than for pure tones or tone complexes. The ITD threshold was similarly low, however, for a 20- to 1400-Hz noise. Hence, congruent information across frequency ranges exceeding one auditory filter might be beneficial in the processing of ITD, while incongruent information, as in binaural interference experiments, might be detrimental. These results suggest that the weights derived in the present study might differ for other stimulus parameters, spectral shape, and salience of the provided cues.
In experiment 2, spectral weights were determined while using an auditory enhancement paradigm (Viemeister 1980; Byrne et al. 2011). The weights at the cued bands were found to be increased at low and high frequencies for ILDs and at low frequencies for ITDs with respect to the reference condition. The results are in line with findings on ITD and ILD detection thresholds as discussed above. Comparing the weights across the enhanced bands, it is apparent that the edge frequency bands at low and high frequencies receive the highest weights for ITDs and ILDs, respectively. This is not in line with threshold measurements of ITDs and ILDs. While ILD thresholds are constant over frequency except at around 1 kHz (Yost and Dye 1988; Grantham 1984; Rowland and Tobias 1967), ITD thresholds have been shown to be lowest at around 800 Hz (Klumpp and Eady 1956; Brughera et al. 2013).
The reason for the change of the spectral weights using the auditory enhancement paradigm remains unclear. A first possibility is an internal gain of the enhanced band. This might lead to an increase in loudness of that band (Viemeister 1980) or to an otherwise perceptually separated band. Second, both the enhancement and the binaural processing might happen at the same stage of the auditory system, which might lead to a more efficient coding of information in the enhanced band. The underlying mechanism in the auditory system for the auditory enhancement effect is unclear (Feng et al. 2018; Beim et al. 2015). Some studies suggested the auditory nerve as the origin of the auditory enhancement (Summerfield et al. 1987; Palmer et al. 1995); in other studies, higher stages such as the inferior colliculus (Nelson and Young 2010; Feng et al. 2018) or the auditory cortex (Feng et al. 2018; Carcagno et al. 2014) were suggested as the possible origins. Carcagno et al. (2013) stated that the auditory enhancement likely occurs at a stage of the auditory system where the monaural auditory pathways have converged, which is in agreement with the findings in the current study where the auditory enhancement paradigm leads to increased weights. Third, the increased weights could be due to reduced binaural interference. Best et al. (2007) and Woods and Colburn (1992) showed that binaural interference is reduced when auditory information is grouped. However, grouping has been argued not to be a reason for auditory enhancement (Summerfield et al. 1987; Byrne et al. 2011); thus, it might also not be the underlying mechanism for the increased spectral weights. The current study can neither prove nor rule out any of the three reasons, and further work is needed to link auditory enhancement and binaural perception.
Previous studies have shown that ITD information carried by the temporal fine structure (TFS) of the signal can be used by the auditory system up to about 1500 Hz, while envelope ITD information can also be used at higher frequencies. In the present study, listeners seemed to primarily rely on low-frequency TFS information when judging the lateralization with ITD cues. This is in line with findings showing that TFS cues are weighted higher than envelope cues (Moore et al. 2018). Additionally, when using the enhancement paradigm, listeners only gave a higher weighting to the low-frequency bands with TFS information but not to the high-frequency bands where only envelope information is available.
In the current study, the spectral weighting of ITDs and ILDs has been investigated separately. However, to lateralize a sound, both ITDs and ILDs are used jointly. Thus, future work might investigate a common lateralization weighting function.
Even though the edge frequency bands were found to receive the highest weight, the other weights were also found to be above zero. Thus, all frequencies were found to contribute to the lateralization.