Hearing in Noise
Audiologists are keenly aware of the problems people with hearing loss experience in background noise. These problems are generally disproportionate to any difficulties experienced in quiet, and they are often the catalyst for an initial visit to an audiologist. People with mild hearing loss often complain that speech is unclear in noisy situations, but that they have no other hearing difficulties. For people with greater degrees of hearing loss, background noise can render speech completely unintelligible, leading to frustration and isolation.
Helping people hear speech in a noisy background has become a primary focus for our profession. Many new digital hearing aids come equipped with a digital noise reduction system, and may also incorporate static or adaptive directional microphone systems. Wireless FM receivers can be attached to many BTE style hearing aids, enabling users to mitigate the effects of background noise, distance and reverberation.
Interestingly, although directional microphone systems and FM systems can significantly improve speech intelligibility in noise, hearing aid based digital noise reduction systems have not been able to accomplish the same goal (e.g. Alcántara et al., 2003). Given that a failure to find a significant effect does not prove a significant effect does not exist, it is possible that hearing aid based digital noise reduction systems can improve speech intelligibility in noise, but the tests have not been sensitive enough to demonstrate this finding. However, since directional microphones and FM systems have been shown to be effective using similar tests, the poor performance of hearing aid based digital noise reduction systems suggests that beneficial effects would likely be fairly small.
The success of directional and FM technologies in contrast to the absent or limited success of hearing aid based digital noise reduction suggests that either digital noise reduction is difficult to implement correctly, or that there is something wrong in general with the current hearing aid based digital noise reduction approach. In this discussion I will argue that digital noise reduction can only improve speech intelligibility in a very limited sense, even when correctly implemented, due to the nature of the hearing in noise problem. Although this technology may be valuable for other purposes, there is a better solution for providing the best speech intelligibility in noise.
Directional Microphones and FM systems
Let us begin by considering how directional microphones and FM systems succeed in improving speech intelligibility in noise. Long before the implementation of adaptive directional microphone responses and beamforming microphone arrays, evidence was piling up in favor of the use of directional microphones and FM systems (e.g. Valente, Fabry, & Potts, 1995).
Both technologies provide a means by which the listener can control the input to the hearing aid. With FM systems, the placement of the transmitting microphone determines the input to the hearing aid very precisely, such that the desired input is strongly favored and the surrounding noise is easily overcome. With directional microphone systems, listeners control the input by turning their head toward the signal of interest. Sounds arriving from other directions are attenuated according to the directional characteristics of the microphone(s), thereby improving the signal-to-noise ratio. Beamforming and adaptive directional technologies might further improve the signal-to-noise ratio by allowing more precise directional characteristics that favor the signal of interest and attenuate noise, although they work essentially on the same principle as standard directional microphones.
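To make the attenuation concrete, consider an idealized first-order directional pattern. The sketch below is an illustration only, not a model of any particular hearing aid. The function name, the pattern parameter a (0.5 gives a cardioid, 1.0 an omnidirectional microphone), and the -80 dB floor are assumptions chosen for the example.

```python
import math

def directional_gain_db(angle_deg, a=0.5, floor_db=-80.0):
    """Gain in dB, relative to on-axis, of the idealized first-order
    pattern a + (1 - a) * cos(angle). Hypothetical helper for illustration."""
    response = abs(a + (1.0 - a) * math.cos(math.radians(angle_deg)))
    # Clamp very small responses so the null does not produce log10(0).
    if response <= 10 ** (floor_db / 20.0):
        return floor_db
    return 20.0 * math.log10(response)

# The listener faces the talker (0 degrees); competing sound arrives from
# the side (90 degrees) or from behind (180 degrees).
for angle in (0, 90, 180):
    print(angle, round(directional_gain_db(angle), 1))
```

In this idealized case a sound from the side is reduced by about 6 dB relative to the talker in front, and a sound from directly behind falls in the null of the pattern. Real microphones on a real head will not reach these figures, but the principle is the same: the listener selects the signal of interest by choosing where to face.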
Digital Noise Reduction Systems
Digital noise reduction systems are completely different from directional and FM technologies in that they do not require the listener to manually specify the signal of interest (e.g. via head orientation), but instead rely on properties of the acoustic signal. This offers a potential advantage: because these systems rely on the acoustic input to determine the signal of interest, there is no training required for correct operation, and they can therefore be used by very young children and individuals with cognitive impairments.
Ironically, the latter apparent benefit (i.e. no action required from the user) is the precise reason why such systems cannot effectively handle the speech in noise problem. In order to remove or reduce noise, or enhance the signal relative to the noise, it is first necessary to determine what part of the acoustic signal is noise. This is where the problem lies! Noise is any part of the acoustic signal that is not part of the signal of interest. But without listener input, there is no way to specify the actual signal of interest! It is only possible to specify the probable signal of interest, as a surrogate for the actual signal of interest. Correspondingly, noise can only be defined as what is probably not the signal of interest.
With hearing aid technology, the common assumption is that speech is the signal of interest, and non-speech sounds are noise. Notwithstanding the inappropriateness of this assumption relative to music and other pleasant or important environmental sounds (e.g. a bubbling brook or a warning siren), this assumption seems to be quite reasonable. Speech is arguably the most important acoustic signal, and problems with speech understanding are typically the reason for wearing hearing aids in the first place. If hearing aids can help someone understand speech, it might be acceptable for the hearing aids to reduce the quality of music and remove some desirable sounds. In other words, the trade-off may be worthwhile.
The problem with this solution is that while the signal of interest is probably speech, the background noise is also probably speech. Most noise is simply speech that a listener wants to ignore. At a party, a mall, a place of worship, a restaurant, an office or a classroom, the noise is mainly produced by other people speaking. Moreover, the speech of interest may frequently become noise (and vice versa), as a listener's attention shifts from one person to another. For instance, when someone goes to a restaurant, they want to hear what the waiter is saying. However, when the waiter is speaking to people at other tables, that same speech is noise! There are no inherent acoustical differences between the signal of interest and the noise in this case. Both are determined subjectively according to the listener. Because digital noise reduction systems require no input from the user, they cannot know which speech the listener wants to hear, and they cannot therefore attenuate this type of noise.
The success of directional microphones and FM systems can easily be attributed to the fact that these systems allow the listener to essentially tell the hearing aid who or what they want to hear, either via head orientation or microphone placement. Because digital noise reduction systems operate with the assumption that all speech is desirable, they will always fail to solve the noise problem except in those rare circumstances where only non-speech noise exists.
The Nature of the Noise Problem
When we come to the realization that all undesirable sounds are noise, and that these sounds may be acoustically similar to the signal of interest (e.g. when you are trying to hear one person at a party), the noise problem begins to look very different. For a listener to clearly hear speech in noise, not only must the speech be audible, the speech must be distinguishable from the noise. This problem of separation is known as auditory scene analysis (see Bregman, 1990 for a comprehensive review of auditory scene analysis). When faced with a number of concurrent sounds, such as a cacophony of voices at a party, all of the acoustic information must be correctly segregated and assigned to correct sources before it is useful. Insofar as sounds are audible, the listener needs to determine which sounds belong to which speaker. Only after this has been accomplished can a listener pay attention to a single source.
Research in auditory scene analysis has shown that we rely on fine spectral information to accomplish this task (Bregman, 1990). In order to separate out voices, we rely on the pattern of resonances associated with the talker we are listening to, and we rely on the pitch and harmonic structure of the talker's voice. This is a practical strategy to use, since every person has slightly different resonances created by a particular mouth shape and size, and every person has a unique voice, which is created by a particular set of vocal folds and vocal tract. In other words, the natural human ability to track speech in noise takes advantage of natural features of the world.
When we remember that people are well adapted to functioning in a complex world, we are thinking ecologically, and this is a valuable perspective. If I am listening to a group of people speaking, I am able to follow what a particular person is saying because that person is distinctive in terms of vocal quality, pitch and harmonic structure, resonance pattern, and even timing (e.g. Darwin et al., 2003). These are the subtle qualities that my human brain exploits.
Given that hearing loss often results in a loss of spectral resolution, due to a broadening of the auditory filters (Glasberg & Moore, 1986), it is quite reasonable that people with hearing loss might have difficulty performing auditory scene analysis. This provides a plausible account for the disproportionate difficulty that such people experience in noisy situations. Even when the problem of inaudibility has been addressed through appropriate amplification, the loss of spectral resolution will impair the separation of sound and put the person with hearing loss at a disadvantage (Baer & Moore, 1994).
Aiding Scene Analysis
It is worthwhile to consider how the various approaches to handling noise could aid or impair the scene analysis process. Directional microphones and FM systems allow listeners control over the spatial source of the primary hearing aid input, which could thereby simplify the scene analysis process by attenuating sounds belonging to undesirable auditory streams. Similarly, in an ideal situation where the signal of interest was speech, and the competing sounds were all non-speech, the noise reduction system might make the process of scene analysis easier. However, the ideal situation is unlikely to be the case in most circumstances, as discussed above. This suggests that digital noise reduction will not likely facilitate auditory scene analysis.
Moreover, it is possible for a digital noise reduction system to interfere with the scene analysis process. Noise reduction systems all operate by increasing or attenuating a portion of the spectrum, and they are therefore a source of spectral distortion. If a noise reduction system is using a small number of analysis bands, it will not be able to precisely locate the noise in the spectrum, and will not be able to attenuate the noise without also attenuating significant speech information. In contrast, with a large number of analysis bands, such a system may be able to more precisely spectrally locate noise, but by modifying the spectrum in narrow bands, more spectral distortion would be created. In an environment with multiple speech inputs, where scene analysis becomes paramount, this processing might therefore degrade the very information that is most important.
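The first half of this trade-off, the collateral loss of speech information when few bands are used, can be illustrated with a toy calculation. This is a sketch under simplifying assumptions, not a description of any commercial algorithm: the spectrum is divided into 16 equal bins, noise occupies two adjacent bins, and a hypothetical system turns down every analysis band that contains any noise. All names are invented for the example, and the number of bins is assumed to be divisible by the number of bands.

```python
def attenuated_bins(noise_bins, n_bins, n_bands):
    """Bins that get turned down when every band containing noise is attenuated."""
    width = n_bins // n_bands  # assumes n_bins is divisible by n_bands
    hit_bands = {b // width for b in noise_bins}
    return {i for i in range(n_bins) if i // width in hit_bands}

def collateral(noise_bins, n_bins, n_bands):
    """Number of noise-free bins attenuated along with the noise."""
    return len(attenuated_bins(noise_bins, n_bins, n_bands) - set(noise_bins))

noise = {5, 6}  # noise confined to two adjacent bins
for bands in (2, 4, 16):
    print(bands, "bands:", collateral(noise, 16, bands), "noise-free bins attenuated")
```

With two bands, six noise-free bins are attenuated together with the two noisy ones. With sixteen bands, none are. The sketch does not capture the second half of the trade-off, namely that reshaping the spectrum in many narrow bands is itself a form of spectral distortion.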
Preserving Spectral Integrity
An ecologically sensitive solution is a solution that takes into account the way in which we function in our complex world. Accordingly, when we are talking about hearing speech in noise, we have to consider the manner by which the brain tracks speech in background noise, in the real world. Given that we rely on fine spectral information and resonance patterns to keep speech in focus, this information must be maintained. In other words, we need to make the preservation of spectral integrity a goal.
As a first step towards this goal, we should consider the effects of the hearing aid processing on spectral integrity. A common type of processing in hearing aids is multichannel compression. Hearing aids currently on the market are available with as few as two or as many as fifteen channels. What are the effects of these multichannel compression systems on spectral integrity?
In any given compression channel, gain is always fluctuating, depending on the stimulus in the previous instant. If a loud input sound just occurred in that channel, the gain will be lower than if a soft sound just occurred in the same channel. Thus, at any given time, the gain applied to the spectrum will have a shape based on the sounds processed in a previous moment. Some channels will be at maximum gain levels, while other channels will be at minimum gain levels, thereby subjecting the spectrum to a continuous spectral distortion (O'Brien, 2002).
This problem is greater when a large number of compression channels are used, especially in conjunction with fast compression time constants (Plomp, 1988). The output of the hearing aid across the spectrum is never consistently related to the current input, but is always subjected to spectral distortion. Cutting a picture into narrow strips and sliding each strip up and down quickly could create an analogous distortion in the visual world. Although it would still be possible to make out the picture, clarity would be severely reduced. In the auditory domain, we call this spectral smearing (Boothroyd et al., 1996).
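The dependence of the gain shape on the preceding sound can be sketched numerically. The example below is a deliberately crude illustration, not a model of a real hearing aid: it assumes four independent channels, a static compression rule with a 50 dB threshold, a 2:1 ratio and 30 dB of maximum gain, and it assumes that the gain in each channel is set entirely by the level in the previous instant. All names and values are hypothetical.

```python
def channel_gain_db(level_db, threshold_db=50.0, ratio=2.0, max_gain_db=30.0):
    """Static compression rule: full gain below threshold, reduced gain above it."""
    if level_db <= threshold_db:
        return max_gain_db
    return max_gain_db - (level_db - threshold_db) * (1.0 - 1.0 / ratio)

def gain_shape(previous_levels_db):
    """Gain per channel, set by the level each channel saw in the previous instant."""
    return [channel_gain_db(level) for level in previous_levels_db]

current_input = [60.0, 60.0, 60.0, 60.0]  # the same flat input in both cases

for previous in ([80.0, 40.0, 40.0, 40.0], [40.0, 40.0, 40.0, 80.0]):
    gains = gain_shape(previous)
    output = [x + g for x, g in zip(current_input, gains)]
    print("previous", previous, "-> gains", gains, "-> output", output)
```

The same flat input emerges with a 15 dB dip in the lowest channel in one case and in the highest channel in the other, purely because of what came before. This is the sliding-strips effect in the picture analogy: the more independent channels there are, the more ways the spectrum can be reshaped from moment to moment.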
Spectral smearing disrupts information that we use to understand speech, such as formant intensity relationships in vowels and vowel-to-consonant transitions. An "ee" sound is different from an "ah" sound and an "oo" sound by virtue of its formant locations and intensities. Similarly, an "s" sound is distinguished from an "sh" sound on the basis of the spectral shape of high frequency turbulent noise (Stevens, 2002).
When a hearing aid actively distorts the spectrum, it degrades the quality of spectral information, and makes speech understanding in noise more difficult. Research suggests that processing with more than four or five channels is potentially detrimental (Keidser & Grant, 2001), although the specific number of channels depends on the type of compression employed. Slow compression will produce much less spectral smearing for a given number of channels.
To preserve spectral integrity and to benefit the process of auditory scene analysis, perhaps the best course of action would be to avoid multichannel compression. However, there are several strong arguments in favor of its use. The first is the most obvious -- multiple channels of compression increase fitting flexibility. Gain for soft and loud sounds can be set independently for different frequencies, which is a desirable feature. The second reason to use multichannel compression is that single channel systems can compromise the audibility of soft high-frequency consonants. Gain is decreased following any speech sound that exceeds the compression threshold (kneepoint). Vowels tend to be fairly intense, and will often cause a reduction in hearing aid gain. Given that compression systems typically have release times in excess of 100 ms, which is roughly the average length of a phoneme, this gain reduction will usually persist long after the vowel has ended, and may suppress the trailing high frequency consonant. This problem is easily remedied in multichannel compression, with the soft high-frequency consonants processed in a separate channel from the more intense low-frequency vowels.
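The consonant suppression described here can be put into rough numbers. The following sketch rests on assumptions that are not taken from any specific product: a single-channel compressor with a 50 dB threshold and a 2:1 ratio, a 75 dB vowel, and a gain reduction that recovers exponentially (in dB) with the stated release time constant. Formal release time definitions used in hearing aid standards differ from this simple time constant, so the values are indicative only.

```python
import math

def gain_reduction_db(level_db, threshold_db=50.0, ratio=2.0):
    """Gain reduction caused by a sound above the compression threshold."""
    excess = max(0.0, level_db - threshold_db)
    return excess * (1.0 - 1.0 / ratio)

def residual_reduction_db(initial_reduction_db, time_since_vowel_ms, release_ms):
    """Gain reduction still in force some time after the vowel has ended,
    assuming an exponential recovery with time constant release_ms."""
    return initial_reduction_db * math.exp(-time_since_vowel_ms / release_ms)

vowel_reduction = gain_reduction_db(75.0)  # reduction triggered by the vowel
for release_ms in (100.0, 5.0):
    left = residual_reduction_db(vowel_reduction, 20.0, release_ms)
    print(f"release {release_ms:.0f} ms: {left:.1f} dB of reduction remains after 20 ms")
```

Under these assumptions, a consonant beginning 20 ms after the vowel still loses roughly 10 dB of gain with a 100 ms release, whereas with a very short release the gain has almost fully recovered. A multichannel system sidesteps the problem by a different route, since the vowel and the consonant largely fall in different channels.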
There are currently two ecologically sensitive solutions that are offered in the array of commercially available hearing aids. The first is the use of only two (or possibly three) channels. Although a two-channel system sounds simple and old-fashioned, it provides adequate fitting flexibility for almost any hearing loss. It also solves the problem of compression-related consonant suppression, as the low frequency vowels are processed in a separate channel from the high frequency consonants. And it accomplishes these things without introducing very much spectral distortion. A system with two independent channels still introduces some spectral noise, but because this is limited to low versus high frequency gain, the net effect is a continuously changing spectral tilt.
Research in auditory scene analysis indicates that spectral tilt is a feature the brain can use to track an auditory stimulus (Bregman, 1990). Might this interfere with speech perception? If the spectral tilt is continuously changing in a way that does not provide useful information, it is unlikely that this information would be used by the auditory system. Importantly, because changes in spectral tilt are wideband effects, the fine spectral information and overall resonance patterns should be subject to little distortion, and tracking of speech in noise should not be adversely affected. Gatehouse (2000) has provided some support for this idea. He found some listeners received significant benefit in noise with a fast, two-channel compression hearing aid.
Another solution, which has been introduced in the Bernafon Symbio hearing aid, is called ChannelFree Processing. This processing does not divide inputs into discrete channels and thus does not degrade the integrity of the spectrum. It maintains the information that the brain needs to perform scene analysis in noisy environments.
Symbio's unique signal processing is able to provide the advantages of multichannel compression without the corresponding spectral distortion. With a high-end digital processor, it is actually not necessary to have multiple channels in order to have flexibility. Accordingly, channels are mainly used to prevent compression-related consonant suppression, as described above. This issue can be solved without resorting to the use of multiple channels (Scheller, 2002).
If compression time constants can be shortened and synchronized such that the compression activated by an intense low-frequency vowel releases by the time the trailing high-frequency consonant is processed, the high-frequency consonant can be amplified with the appropriate gain. In other words, if compression works quickly and accurately, it does not need to be done in separate channels. Attempts in the past to design a fast compressor were thwarted by perceptual pumping effects, since fast (linear) compressors can cause distortion in the low frequencies. Bernafon has solved this issue by using a non-linear gain control that is sensitive to level and frequency. This gain control can therefore operate much more quickly without low-frequency distortion, and can provide maximum flexibility without introducing channels.
Digital noise reduction systems in modern hearing aids cannot effectively reduce background noise because background noise is defined by the listener, and is often not acoustically different from the signal of interest. Truly effective noise reduction strategies provide a means by which the listener can specify the signal of interest, such as by changing the orientation of a directional hearing aid. Beyond this, hearing in noise is essentially a problem of auditory scene analysis, which suggests that the most effective processing strategy will be one that maintains spectral integrity.
Two solutions to this problem were discussed. The first solution is to use a simple two-channel hearing aid, which provides adequate flexibility without introducing very much distortion. The second is to do away with channels completely, and find new solutions to the problems with traditional compression, as has been done with the Bernafon Symbio.
It is important to consider ecological issues when designing hearing aids to help people hear in noise. Tracking speech in noise is not just a matter of maintaining a particular signal-to-noise ratio, because the human brain requires the spectral information in the signal to be intact (i.e. undistorted). The preservation of spectral integrity is required for our natural auditory scene-analyzing abilities to function, and this should not be sacrificed in an effort to artificially reduce background noise.
Alcántara, J. I., Moore, B. C. J., Kühnel, V., and Launer, S. (2003). Evaluation of a noise reduction system in a commercial digital hearing aid. International Journal of Audiology, 42, 34-42.
Baer, R., and Moore, B. C. J. (1994). Effects of spectral smearing on the intelligibility of sentences in the presence of competing speech. Journal of the Acoustical Society of America, 95(4), 2277-2280.
Boothroyd, A., Mulhearn, B., Gong, J., and Ostroff, J. (1996). Effects of spectral smearing on phoneme and word recognition. Journal of the Acoustical Society of America, 100(3), 1807-1818.
Bregman, A. S. (1990). Auditory Scene Analysis: the perceptual organization of sound. MIT Press.
Darwin, C. J., Brungart, D. S., and Simpson, B. D. (2003). Effects of fundamental frequency and vocal-tract length changes on attention to one of two simultaneous talkers. Journal of the Acoustical Society of America, 114(5), 2913-2922.
Gatehouse, S. (2000). Aspects of auditory ecology and psychoacoustic function as determinants of benefits from and candidature for non-linear processing in hearing aids. IHCON 2000, Lake Tahoe.
Glasberg, B. R., and Moore, B. C. J. (1986). Auditory filter shapes in subjects with unilateral and bilateral cochlear impairments. Journal of the Acoustical Society of America, 79(4), 1020-1033.
Keidser, G., and Grant, F. (2001). The preferred number of channels (one, two, or four) in NAL-NL1 prescribed wide dynamic range compression (WDRC) devices. Ear and Hearing, 22(6), 516-527.
O'Brien, A. (2002). More channels are better, right? Audiology Online, www.audiologyonline.com.
Plomp, R. (1988). The negative effect of amplitude compression in multichannel hearing aids in light of the modulation transfer function. Journal of the Acoustical Society of America, 83(6), 2322-2327.
Stevens, K. N. (2002). Toward a model of lexical access based on acoustic landmarks and distinctive features. Journal of the Acoustical Society of America, 111(4), 1872-1891.
Scheller, T. (2002). Temporal aspects of hearing aid signal processing. Audiology Online, www.audiologyonline.com.
Valente, M., Fabry, D. A., and Potts, L. G. (1995). Recognition of speech in noise with hearing aids using dual microphones. Journal of the American Academy of Audiology, 6(6), 440-449.