Core Software
The following software is available to download:
- Scene generator
- Hearing aid processor baseline
- Hearing loss model
- Speech intelligibility model
The code is provided as a Python package with accompanying Unix shell scripts, and can process either a single scene or the complete Clarity dataset in bulk (as sketched below).
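To make the two modes of operation concrete, the following is a minimal sketch of a bulk-processing driver. The process_scene helper, the metadata path, and the list-of-dicts layout of scenes.json are illustrative assumptions, not the package's actual API.

```python
import json
from pathlib import Path

DATA_ROOT = Path("clarity_data")  # assumed dataset root, for illustration

def process_scene(scene: dict) -> None:
    """Hypothetical stand-in for the per-scene pipeline
    (generate, amplify, simulate hearing loss, score)."""
    print(f"Processing scene {scene['scene']}")

# The list-of-dicts structure assumed for scenes.json is illustrative.
with open(DATA_ROOT / "metadata" / "scenes.json") as fp:
    scenes = json.load(fp)

for scene in scenes:        # bulk mode: iterate over the complete dataset
    process_scene(scene)    # single-scene mode would call this just once
```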
A. Scene generator
Fully open-source Python code for generating the hearing aid input signals for each scene.
- Inputs: Target and interferer signals, BRIRs, RAVEN project (rpf) files, and scene description JSON files
- Outputs: Mixed target+interferer signals for each hearing aid channel, plus the direct path (simulating a measurement close to the eardrum). Reverberated pre-mixed signals can also be generated optionally.
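As an illustration of the mixing step, the sketch below convolves dry target and interferer signals with their BRIRs and sums them at a chosen SNR. All file names, the two-channel BRIR layout, and the example SNR value are assumptions; in the actual generator these parameters come from the scene description JSON and rpf files.

```python
import numpy as np
import soundfile as sf
from scipy.signal import fftconvolve

target, fs = sf.read("target.wav")           # dry target speech (mono)
interferer, _ = sf.read("interferer.wav")    # dry interferer (mono)
brir_t, _ = sf.read("brir_target.wav")       # BRIR, shape (n_taps, 2)
brir_i, _ = sf.read("brir_interferer.wav")

# Convolve each source with its left/right impulse responses.
tgt = np.stack([fftconvolve(target, brir_t[:, ch]) for ch in range(2)], axis=1)
itf = np.stack([fftconvolve(interferer, brir_i[:, ch]) for ch in range(2)], axis=1)

# Trim to a common length and scale the interferer for the desired SNR.
n = min(len(tgt), len(itf))
tgt, itf = tgt[:n], itf[:n]
snr_db = 3.0  # example value; the real SNR comes from the scene metadata
gain = np.sqrt(np.sum(tgt**2) / np.sum(itf**2)) * 10 ** (-snr_db / 20)
mix = tgt + gain * itf

sf.write("mixed_CH1.wav", mix, fs)  # illustrative output name
```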
B. Baseline hearing aid processor
The baseline hearing aid processor is based on openMHA. Python code configures openMHA with a Camfit compressive fitting for a specific listener's audiogram; this comprises a Python implementation of the Camfit compressive prescription and Python code for driving openMHA.
This configuration of openMHA includes multiband dynamic compression and non-adaptive differential processing. The intention was to produce a basic hearing aid without the various signal-processing features that are common in high-end hearing aids but, being implemented in proprietary forms, cannot be replicated exactly.
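The following toy compressor illustrates the idea of multiband dynamic compression: split the signal into bands, track each band's envelope, and reduce gain above a threshold. The band edges, threshold, and ratio are arbitrary example values, the envelope tracking is far cruder than openMHA's, and this is not the Camfit prescription.

```python
import numpy as np
from scipy.signal import butter, sosfilt

def compress_band(x, fs, ratio=2.0, thresh_db=-40.0):
    """Static compression above thresh_db with the given ratio (toy values)."""
    # Crude envelope: rectify and low-pass at 50 Hz.
    env = sosfilt(butter(2, 50, "low", fs=fs, output="sos"), np.abs(x))
    env_db = 20 * np.log10(np.maximum(env, 1e-8))
    over = np.maximum(env_db - thresh_db, 0.0)
    gain_db = over * (1.0 / ratio - 1.0)   # reduce level above threshold
    return x * 10 ** (gain_db / 20)

def multiband_compress(x, fs, edges=(250, 1000, 4000)):
    """Split into bands at the given edge frequencies and compress each."""
    bands, lo = [], 50.0
    for hi in (*edges, fs / 2 - 1):
        sos = butter(4, [lo, hi], "bandpass", fs=fs, output="sos")
        bands.append(compress_band(sosfilt(sos, x), fs))
        lo = hi
    return np.sum(bands, axis=0)  # recombine the compressed bands
```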
The main inputs and outputs for the processor are as follows:
- Inputs: Mixed scene signals for each hearing aid channel, a listener ID drawn from the scene-listener pairs identified in ‘scenes_listeners.json’, and the entry for that ID in the listener metadata JSON file ‘listeners.json’ (see the lookup sketch after this list)
- Outputs: The stereo hearing aid output signal,
<scene>_<listener>_HA-output.wav
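A sketch of how the two metadata files might be used to pair scenes with listeners is given below. The JSON schemas shown (scene IDs mapping to lists of listener IDs, and listener IDs mapping to audiogram entries) and the example IDs are assumptions for illustration.

```python
import json

with open("scenes_listeners.json") as fp:
    scenes_listeners = json.load(fp)   # assumed: {"S06001": ["L0064", ...], ...}
with open("listeners.json") as fp:
    listeners = json.load(fp)          # assumed: {"L0064": {...audiogram...}, ...}

scene_id = "S06001"                    # illustrative scene ID
for listener_id in scenes_listeners[scene_id]:
    audiogram = listeners[listener_id]           # would drive the Camfit fitting
    out_name = f"{scene_id}_{listener_id}_HA-output.wav"
    print(f"Would fit for {listener_id} and write {out_name}")
```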
C. Hearing loss model
Open-source Python implementation of the Cambridge Auditory Group Moore/Stone/Baer/Glasberg hearing loss model.
- Inputs: A stereo WAV signal, e.g., the output of the baseline hearing aid processor, and a set of audiograms (left and right ears).
- Outputs: The signal after simulating the hearing loss specified by the set of audiograms (stereo WAV file),
<scene>_<listener>_HL-output.wav
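As a very loose illustration of this stage, the sketch below applies audiogram-dependent attenuation to each ear by interpolating the hearing loss across frequency. The real Moore/Stone/Baer/Glasberg model also simulates effects such as loudness recruitment and spectral smearing, none of which are reproduced here; the audiogram values and file names are invented.

```python
import numpy as np
import soundfile as sf

# Invented example audiogram (dB HL at standard audiometric frequencies).
audiogram_freqs = np.array([250, 500, 1000, 2000, 4000, 8000])  # Hz
audiogram_l = np.array([20, 25, 35, 45, 60, 65], dtype=float)   # left ear
audiogram_r = np.array([15, 20, 30, 50, 65, 70], dtype=float)   # right ear

x, fs = sf.read("S06001_L0064_HA-output.wav")  # assumed stereo HA output
n = len(x)
freqs = np.fft.rfftfreq(n, 1 / fs)

out = np.empty_like(x)
for ch, hl in enumerate((audiogram_l, audiogram_r)):
    # Interpolate the hearing loss onto the FFT grid; apply as attenuation.
    loss_db = np.interp(freqs, audiogram_freqs, hl)
    spectrum = np.fft.rfft(x[:, ch]) * 10 ** (-loss_db / 20)
    out[:, ch] = np.fft.irfft(spectrum, n)

sf.write("S06001_L0064_HL-output.wav", out, fs)
```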
D. Speech intelligibility model
Python implementation of a binaural intelligibility model, the Modified Binaural Short-Time Objective Intelligibility measure (MBSTOI). This is an experimental baseline tool that will be used in the stage 1 evaluation of entrants (see Rules). Note that MBSTOI requires the signals to be time-aligned, both broadband and within one-third octave bands; a minimal alignment sketch is given after the list below.
- Inputs: HL-model output signals, the listener’s audiogram, the reference target signal (i.e., the premixed target signal convolved with the BRIR with the reflections “turned off”, specified as ‘target_anechoic’), and scene metadata
- Outputs: Predicted intelligibility score
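Because MBSTOI expects time-aligned inputs, some alignment preprocessing is needed before scoring. The sketch below shows broadband alignment via cross-correlation; the per-third-octave-band alignment mentioned above would apply the same idea to band-filtered signals. Function and signal names are illustrative.

```python
import numpy as np
from scipy.signal import correlate, correlation_lags

def align(reference: np.ndarray, processed: np.ndarray) -> np.ndarray:
    """Shift `processed` so it best matches `reference` (broadband)."""
    lags = correlation_lags(len(reference), len(processed))
    lag = lags[np.argmax(correlate(reference, processed))]
    shifted = np.roll(processed, lag)
    # Zero out the samples that np.roll wrapped around.
    if lag > 0:
        shifted[:lag] = 0.0
    elif lag < 0:
        shifted[lag:] = 0.0
    return shifted

# e.g., per ear: proc_aligned = align(ref[:, 0], proc[:, 0])
```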