Hearing aid simulation

What our baseline hearing aid simulates, with examples.

Our challenge entrants are going to use machine learning to develop better hearing aid processing for listening to speech in noise (SPIN). We’ll provide a baseline hearing aid model for entrants to improve on. The figure below shows our baseline system, where the yellow box to the left is where the simulated hearing aid sits (labelled “Enhancement model”).

The draft baseline system (where SPIN is speech in noise, HL is hearing loss, SI is speech intelligibility, and L & R are Left and Right).

We decided to base our simulated hearing aid on the open Master Hearing Aid (openMHA), which is an open-source software platform for real-time audio signal processing. This was developed by the University of Oldenburg, HörTech gGmbH, Oldenburg, and the BatAndCat Corporation, USA. The original version was developed as one of the outcomes of the Cluster of Excellence Hearing4all project. The openMHA platform includes:

  • a software development kit (C/C++ SDK) including an extensive signal processing library for algorithm development and a set of Matlab and Octave tools to support development and off-line testing
  • real-time runtime environments for standard PC platforms and mobile ARM platforms
  • a set of baseline reference algorithms that forms a complete hearing aid system, including multi-band dynamic compression and amplification, directional microphones, binaural beamformers and coherence filters, single-channel noise reduction, and feedback control.

We have written a Python wrapper for the core openMHA system for ease of use within machine learning frameworks. We developed a simple, generic hearing aid configuration and translated the Camfit compressive fitting as encoded by openMHA. Camfit compressive is a prescription, based on Moore et al. (1999), that takes a listener’s audiogram and determines the right settings for the hearing aid.

Some aspects of modern digital hearing aids that we’ve decided to simulate are:

  • differential microphones, and
  • a multiband compressor for dynamic compression.
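
To make the idea of multiband dynamic compression concrete, here is a minimal sketch of the static gain rule for a single compression band. It is not the openMHA implementation or our baseline configuration: the function name, band names, thresholds, ratios and gains are all hypothetical values chosen for illustration.

```python
def compressor_gain_db(level_db, threshold_db, ratio, linear_gain_db):
    """Static gain (in dB) applied by one compression band.

    Below the threshold the band applies a fixed linear gain. Above it,
    each extra dB of input produces only 1/ratio dB of extra output,
    so the gain falls as the input level rises.
    """
    if level_db <= threshold_db:
        return linear_gain_db
    excess_db = level_db - threshold_db
    return linear_gain_db - excess_db * (1.0 - 1.0 / ratio)


# Hypothetical per-band settings: (threshold in dB SPL, ratio, linear gain in dB).
# More gain and more compression at high frequencies, as is typical for
# age-related hearing loss. These numbers are NOT a real prescription.
BANDS = {
    "low": (45.0, 1.5, 10.0),
    "mid": (45.0, 2.0, 20.0),
    "high": (45.0, 3.0, 30.0),
}

for name, (threshold_db, ratio, linear_gain_db) in BANDS.items():
    gains = [
        compressor_gain_db(level_db, threshold_db, ratio, linear_gain_db)
        for level_db in (40.0, 65.0, 85.0)
    ]
    print(name, [round(g, 1) for g in gains])
```

A real compressor also smooths the level estimate over time using attack and release time constants, which this static sketch leaves out.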

We’ve decided not to simulate the following, because they tend to be implemented in proprietary forms that we can’t replicate exactly in our open-source algorithm:

  • coordination of gross processing parameters across ears,
  • binaural processing involving some degree of signal exchange between left and right devices,
  • gain changes influenced by speech-to-noise ratio estimators,
  • frequency shifting or scaling, and
  • dual or adaptive time-constant wide dynamic range compression.

We are using the Oldenburg Hearing Device (OlHeaD) Head Related Transfer Function (HRTF) Database (Denk et al. 2018) to replicate the signals that would be received by the front and rear microphones of the hearing aid and also at the eardrums of the wearer.
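
Replicating the signal at a microphone or eardrum in this way amounts to convolving the source signal with the impulse response measured for that receiver position and source direction. The sketch below illustrates this with plain Python lists. The impulse responses are toy values, not data from the OlHeaD database, and the function and key names are hypothetical.

```python
def convolve(signal, impulse_response):
    """Direct-form linear convolution of two sequences of floats."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out


def simulate_receivers(source, impulse_responses):
    """Return the signal at each receiver position for one source direction."""
    return {name: convolve(source, ir) for name, ir in impulse_responses.items()}


# Toy impulse responses (assumed values): each is a pure delay with a gain,
# so the rear microphone hears the source one sample later than the front
# microphone, and the eardrum one sample later again.
toy_irs = {
    "front_mic": [1.0, 0.0, 0.0],
    "rear_mic": [0.0, 0.8, 0.0],
    "eardrum": [0.0, 0.0, 0.5],
}

source = [1.0, -1.0, 0.5]
signals = simulate_receivers(source, toy_irs)
for name, samples in signals.items():
    print(name, samples)
```

In practice the impulse responses are hundreds of samples long, so an FFT-based convolution from a numerical library would be the usual choice.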

Audio examples of hearing aid processing

Here is an example of speech in noise processed by the simulated hearing aid for a moderate level of hearing loss. We can hear that the shape of the frequency spectrum has been modified to suit the listener’s specific pattern of hearing loss.

Here’s the original noisy signal where the noise is generated by a washing machine.
Here’s the same signal processed by the simulated hearing aid for a listener with a moderate level of hearing loss (Pure Tone Average of 38 dB HL). For illustration purposes, this is presented here at an overall level that is similar to that of the original signal.
Here’s the noisy signal as it would be perceived by the listener wearing the hearing aid. Without the aid, the original noisy signal would be near inaudible.
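
The Pure Tone Average mentioned above summarises an audiogram as a single number: the mean hearing threshold over a small set of frequencies. The sketch below assumes the common four-frequency average over 500, 1000, 2000 and 4000 Hz. Other frequency sets are also in use, and the audiogram shown is a made-up example rather than one of our listeners.

```python
from statistics import mean


def pure_tone_average(audiogram, frequencies_hz=(500, 1000, 2000, 4000)):
    """Mean hearing threshold (dB HL) over the chosen audiometric frequencies.

    `audiogram` maps frequency in Hz to threshold in dB HL.
    """
    return mean(audiogram[f] for f in frequencies_hz)


# Hypothetical audiogram with a sloping high-frequency loss.
example_audiogram = {250: 20, 500: 25, 1000: 30, 2000: 45, 4000: 52, 8000: 65}

print(pure_tone_average(example_audiogram))
```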

Information about our hearing loss model can be found here.

The target speech comes from our new 40-speaker British English speech database, while the speech interferer noise comes from the SLR83 database, which comprises recordings of male and female speakers of English from various parts of the UK and Ireland.

Acknowledgements

We are grateful to the developers of the openMHA platform for the use of their software. Special thanks are due to Hendrik Kayser and Tobias Herzke. We are also grateful to Brian Moore, Michael Stone and colleagues for the Camfit compressive prescription, and to the people involved in the preparation of the OlHeaD HRTF (particularly Florian Denk) and SLR83 databases. The feature image is taken from Denk et al. (2018).

References

Demirsahin, I., Kjartansson, O., Gutkin, A., & Rivera, C. E. (2020). Open-source Multi-speaker Corpora of the English Accents in the British Isles. Available at http://www.openslr.org/83/

Denk, F., Ernst, S. M., Ewert, S. D., & Kollmeier, B. (2018). Adapting hearing devices to the individual ear acoustics: Database and target response correction functions for various device styles. Trends in Hearing, 22, 2331216518779313.

Moore, B. C. J., Alcántara, J. I., Stone, M. A., & Glasberg, B. R. (1999). Use of a loudness model for hearing aid fitting: II. Hearing aids with multi-channel compression. British Journal of Audiology, 33(3), 157-170.

Hearing loss simulation

What our hearing loss algorithms simulate, with audio examples to illustrate hearing loss.

Our challenge entrants are going to use machine learning to develop better processing of speech in noise (SPIN) for hearing aids. For a machine learning algorithm to learn new ways of processing audio for the hearing impaired, it needs to estimate how the sound will be degraded by any hearing loss. Hence, we need an algorithm to simulate hearing loss for each of our listeners. The diagram below shows our draft baseline system, which was detailed in a previous blog post. The hearing loss simulation is part of the prediction model. The Enhancement Model to the left is effectively the hearing aid, and the Prediction Model to the right estimates how someone will perceive the intelligibility of the speech in noise.

The draft baseline system (where SPIN is speech in noise, DRC is Dynamic Range Compression, HL is Hearing Loss, SI is Speech Intelligibility and L & R are Left and Right).

There are different causes of hearing loss, but we’re concentrating on the most common type that happens when you age (presbycusis). RNID (formerly Action on Hearing Loss) estimate that more than 40% of people over the age of 50 have a hearing loss, and this rises to 70% of people who are older than 70.

The aspects of hearing loss we’ve decided to simulate are

  1. The loss of ability to sense the quietest sounds (increase in absolute threshold).
  2. The abnormally rapid growth in perceived loudness as an audible sound increases in level (loudness recruitment) (Moore et al. 1996).
  3. How the ear has a poorer ability to discriminate the frequency of sounds (impaired frequency selectivity).
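
The first two aspects can be illustrated together with a deliberately simplified level mapping. This is not the model we use, which processes the signal itself in frequency bands. The sketch only shows the idea: sounds below the raised threshold are inaudible, and above it loudness grows faster than normal until it "catches up" at a high level. The function name, the straight-line mapping and the 100 dB catch-up level are all assumptions made for illustration.

```python
def equivalent_normal_level_db(level_db, impaired_threshold_db,
                               normal_threshold_db=0.0,
                               catch_up_level_db=100.0):
    """Level a normal-hearing listener would need for similar loudness.

    Returns None when the sound is below the impaired threshold (inaudible).
    Assumes impaired_threshold_db < catch_up_level_db.
    """
    if level_db < impaired_threshold_db:
        return None  # threshold elevation: the sound cannot be heard
    if level_db >= catch_up_level_db:
        return level_db  # loudness has caught up with normal hearing
    # Recruitment: the range from impaired threshold to catch-up level is
    # stretched onto the wider range from normal threshold to catch-up level.
    slope = (catch_up_level_db - normal_threshold_db) / (
        catch_up_level_db - impaired_threshold_db
    )
    return catch_up_level_db - slope * (catch_up_level_db - level_db)


# With a 50 dB threshold, a 25 dB rise in the sound (50 to 75 dB)
# is perceived like a 50 dB rise (0 to 50 dB) for normal hearing.
for level_db in (40.0, 50.0, 75.0, 100.0):
    print(level_db, equivalent_normal_level_db(level_db, impaired_threshold_db=50.0))
```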

Audio examples of hearing loss

Here are two samples of speech in noise processed through the simulator. In each audio example there are three versions of the same sentence:

  1. Unimpaired hearing
  2. Mild hearing impairment
  3. Moderate to severe hearing impairment
0 dB signal to noise ratio

And here is an example where the noise is louder:

Noisier: -10 dB signal to noise ratio
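
For readers unfamiliar with signal to noise ratios, the sketch below shows one common way to mix speech and noise at a chosen ratio: scale the noise so that the ratio of the RMS levels, expressed in dB, equals the target. This is an illustration with toy signals and hypothetical function names, and it does not necessarily match how the examples above were generated.

```python
import math


def rms(samples):
    """Root-mean-square level of a sequence of samples."""
    return math.sqrt(sum(v * v for v in samples) / len(samples))


def mix_at_snr(speech, noise, snr_db):
    """Scale `noise` to the target speech-to-noise ratio, then add it to `speech`.

    Returns the mixture and the gain applied to the noise.
    """
    noise_gain = rms(speech) / (rms(noise) * 10 ** (snr_db / 20))
    mixture = [s + noise_gain * n for s, n in zip(speech, noise)]
    return mixture, noise_gain


# Toy signals (assumed values) standing in for real recordings.
speech = [1.0, -1.0, 1.0, -1.0]
noise = [0.5, 0.5, -0.5, -0.5]

for snr_db in (0.0, -10.0):
    mixture, noise_gain = mix_at_snr(speech, noise, snr_db)
    print(snr_db, round(noise_gain, 3))
```

At 0 dB the speech and noise have equal RMS levels. At -10 dB the noise amplitude is about 3.16 times that of the speech.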

Acknowledgements

The hearing loss model we’re using was generously supplied by Michael Stone at the University of Manchester as MATLAB code and translated by us into Python. The original code was written by members of the Auditory Perception Group at the University of Cambridge, ca. 1991-2013, including Michael Stone, Brian Moore, Brian Glasberg and Thomas Baer. Information about the model can be found primarily in Nejime and Moore (1997), but also in Nejime and Moore (1998), Baer and Moore (1993 and 1994), and Moore and Glasberg (1993).

The original speech recordings come from the ARU corpus, University of Liverpool (Hopkins et al. 2019). This corpus is freely available at the link in the reference below.

References

Baer, T., & Moore, B. C. (1993). Effects of spectral smearing on the intelligibility of sentences in noise. The Journal of the Acoustical Society of America, 94(3), 1229-1241.

Baer, T., & Moore, B. C. (1994). Effects of spectral smearing on the intelligibility of sentences in the presence of interfering speech. The Journal of the Acoustical Society of America, 95(4), 2277-2280.

Hopkins, C., Graetzer, S., & Seiffert, G. (2019). ARU adult British English speaker corpus of IEEE sentences (ARU speech corpus) version 1.0 [data collection]. Acoustics Research Unit, School of Architecture, University of Liverpool, United Kingdom. DOI: 10.17638/datacat.liverpool.ac.uk/681. Retrieved from http://datacat.liverpool.ac.uk/681/.

Moore, B. C., & Glasberg, B. R. (1993). Simulation of the effects of loudness recruitment and threshold elevation on the intelligibility of speech in quiet and in a background of speech. The Journal of the Acoustical Society of America, 94(4), 2050-2062.

Moore, B. C., Glasberg, B. R., & Vickers, D. A. (1996). Factors influencing loudness perception in people with cochlear hearing loss. In B. Kollmeier (Ed.), Psychoacoustics, Speech and Hearing Aids (pp. 7-18). World Scientific, Singapore.

Nejime, Y., & Moore, B. C. (1997). Simulation of the effect of threshold elevation and loudness recruitment combined with reduced frequency selectivity on the intelligibility of speech in noise. The Journal of the Acoustical Society of America, 102(1), 603-615.

Nejime, Y., & Moore, B. C. (1998). Evaluation of the effect of speech-rate slowing on speech intelligibility in noise using a simulation of cochlear hearing loss. The Journal of the Acoustical Society of America, 103(1), 572-576.