Software
The following software is available to download:
- Scene generator
- Hearing aid processor baseline
- Hearing loss model
- Speech intelligibility model
The code is a Python package with accompanying Unix shell scripts, and it can process either a single scene or the complete Clarity dataset in bulk.
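As an illustration of the two modes, the sketch below iterates over the scene descriptions; the function names, file paths, and JSON field names here are hypothetical, and the shipped shell scripts may organise this differently.

```python
import json
from pathlib import Path

def process_scene(scene: dict) -> None:
    """Stand-in for the per-scene pipeline (mix, amplify, evaluate)."""
    print(f"Processing scene {scene['scene']}")  # 'scene' field is assumed

def process_all(metadata_path: str) -> None:
    """Bulk mode: run the per-scene pipeline over every scene in the dataset."""
    scenes = json.loads(Path(metadata_path).read_text())
    for scene in scenes:
        process_scene(scene)

if __name__ == "__main__":
    process_all("metadata/scenes.json")  # illustrative path
```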
A. Scene generator
Fully open-source Python code for generating the hearing aid input signals for each scene; a minimal sketch of the core mixing step is shown after the list below.
- Inputs: target and interferer signals, BRIRs, RAVEN project (rpf) files, scene description JSON files
- Outputs: mixed target+interferer signals for each hearing aid channel, plus the direct-path signal (simulating a measurement close to the eardrum). Reverberant pre-mixed signals can also be generated optionally.
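A minimal sketch of the mixing step, assuming mono source signals and two-channel BRIRs held as NumPy arrays; the shipped generator also handles the multi-microphone hearing aid channels, rpf files, and the level handling described in the scene metadata.

```python
import numpy as np
from scipy.signal import fftconvolve

def mix_scene(target, interferer, brir_target, brir_interferer):
    """Convolve each source with its BRIR and sum into a stereo mixture.

    target, interferer: 1-D float arrays; brir_*: arrays of shape (taps, 2).
    """
    channels = []
    for ch in range(2):  # left and right hearing aid channels
        t = fftconvolve(target, brir_target[:, ch])
        i = fftconvolve(interferer, brir_interferer[:, ch])
        n = max(len(t), len(i))
        # Zero-pad to a common length before summing the two sources.
        channels.append(np.pad(t, (0, n - len(t))) + np.pad(i, (0, n - len(i))))
    return np.stack(channels, axis=1)  # shape: (samples, 2)
```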
B. Baseline hearing aid processor
The baseline hearing aid processor is based on openMHA. The Python code configures openMHA with a Camfit compressive fitting for a specific listener's audiogram. This includes a Python implementation of the Camfit compressive prescription and Python code for driving openMHA.
This configuration of openMHA includes multiband dynamic compression and non-adaptive differential processing. The intention was to produce a basic hearing aid without the various aspects of signal processing that are common in high-end hearing aids but tend to be implemented in proprietary forms and so cannot be replicated exactly. A simplified, illustrative gain calculation is sketched after the input/output list below.
The main inputs and outputs for the processor are as follows:
- Inputs: mixed scene signals for each hearing aid channel, a listener ID drawn from the scene-listener pairs in 'scenes_listeners.json', and that listener's entry in the listener metadata file 'listeners.json'
- Outputs: the stereo hearing aid output signal, <scene>_<listener>_HA-output.wav
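For illustration only, the sketch below derives per-frequency insertion gains from a listener's audiogram using the classic half-gain rule; this is a stand-in for, not a reproduction of, the package's Camfit compressive prescription, and the 'listeners.json' field names are assumptions.

```python
import json

def half_gain_rule(listener_id: str, listeners_path: str = "listeners.json") -> dict:
    """Toy prescription: insertion gain = half the hearing loss at each frequency.

    Assumes listeners.json maps listener IDs to entries containing
    'audiogram_levels_l' and 'audiogram_levels_r' lists in dB HL.
    """
    with open(listeners_path) as fp:
        listener = json.load(fp)[listener_id]
    return {
        ear: [0.5 * hl for hl in listener[f"audiogram_levels_{ear}"]]
        for ear in ("l", "r")
    }
```

In the actual baseline, the prescribed parameters are written into an openMHA configuration that performs the multiband compression itself.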
C. Hearing loss model
Open-source Python implementation of the Cambridge Auditory Group Moore/Stone/Baer/Glasberg (MSBG) hearing loss model; a sketch of the surrounding file I/O is shown after the list below.
- Inputs: A stereo wav audio signal, e.g., the output of the baseline hearing aid processor, and a set of audiograms (both L and R ears).
- Outputs: the signal after simulating the hearing loss specified by the set of audiograms (stereo wav file), <scene>_<listener>_HL-output.wav
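A hedged sketch of the I/O wrapping around the per-ear simulation; apply_hearing_loss is a hypothetical placeholder for the MSBG model call, and 16-bit PCM input is assumed.

```python
import numpy as np
from scipy.io import wavfile

def apply_hearing_loss(mono, audiogram, fs):
    """Hypothetical placeholder for the per-ear MSBG hearing loss simulation."""
    raise NotImplementedError

def simulate_listener(in_wav, out_wav, audiogram_l, audiogram_r):
    fs, signal = wavfile.read(in_wav)             # stereo HA-output wav
    signal = signal.astype(np.float32) / 32768.0  # assumes 16-bit PCM input
    out = np.stack(
        [apply_hearing_loss(signal[:, 0], audiogram_l, fs),
         apply_hearing_loss(signal[:, 1], audiogram_r, fs)],
        axis=1,
    )
    # Clip before converting back to 16-bit integers to avoid wrap-around.
    wavfile.write(out_wav, fs, (np.clip(out, -1.0, 1.0) * 32767).astype(np.int16))
```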
D. Speech intelligibility model
Python implementation of a binaural intelligibility model, Modified Binaural Short-Time Objective Intelligibility (MBSTOI). This is an experimental baseline tool that will be used in the stage 1 evaluation of entrants (see Rules). Note that MBSTOI requires signal time-alignment (and alignment within one-third octave bands); a broadband alignment sketch follows the list below.
- Inputs: the HL-model output signals, the listener's audiogram, the reference target signal (i.e., the pre-mixed target signal convolved with the BRIR with the reflections 'turned off', specified as 'target_anechoic'), and the scene metadata
- Outputs: predicted intelligibility score
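Because MBSTOI is sensitive to misalignment, any processing delay introduced by an entrant's system must be compensated before scoring. Below is a minimal broadband alignment sketch using cross-correlation; the band-wise alignment within one-third octave bands mentioned above is omitted here.

```python
import numpy as np
from scipy.signal import correlate

def align(reference, processed):
    """Trim both signals so `processed` is maximally correlated with `reference`."""
    lag = int(np.argmax(correlate(processed, reference))) - (len(reference) - 1)
    if lag > 0:
        processed = processed[lag:]   # processed lags: drop its leading samples
    else:
        reference = reference[-lag:]  # processed leads: drop reference samples
    n = min(len(reference), len(processed))
    return reference[:n], processed[:n]
```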