The 2nd Clarity Workshop on Machine Learning Challenges for Hearing Aids (Clarity-2022)


The workshop will be a one-day online event held on 29th June. Timings and session details are provided below. All times are UK local time (i.e., UTC+1).

9:00 Welcome [slides]
9:10 The Clarity Prediction Overview / Results [slides]
9:40 Challenge Papers: Session I
10:50 Challenge Papers: Session II
12:00 Challenge Papers: Session III
13:10 Invited Talk - Theo Goverts, Vrije Universiteit Amsterdam
14:10 Prizes and conclusions [slides]
14:20 CPC discussion + Future Directions

Invited Talk

Theo Goverts

audiologist (MPE), Amsterdam UMC

Speech recognition in realistic scenarios: insights from binaural recordings in natural acoustic environments.



In a study together with Steve Colburn (Boston University), we were interested in the acoustic characterization of realistic scenarios for speech recognition (Goverts & Colburn, 2020). The essential acoustic information in such scenarios is a bilateral vibration pattern stimulating the eardrums or the microphone membranes of hearing devices on both sides. This bilateral vibration pattern is the input for an interplay of bottom-up and top-down processing, leading to the actual perception of speech.

We therefore made bilateral recordings in a variety of environments that were considered relevant by experts and listeners with impaired hearing, e.g. at home, on a city walk, and on public transport. Recordings were made using simple in-the-concha microphones and a data recorder. We first looked at speech-likeness in the recordings using a non-intrusive modulation-spectrum-based measure. We analysed absolute values, interaural differences and temporal dynamics in eight environments.

Furthermore, we looked at binaural parameters: Interaural Level Differences, Interaural Time Differences and Interaural Coherence. We analysed absolute values and temporal dynamics in the same eight environments.
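As a rough illustration of the three binaural parameters named above, the following sketch estimates them for a single frame of a two-channel recording. It is not the analysis code of the cited study: the function name `interaural_parameters`, the broadband (non-filterbank) processing, the 1 ms lag range and the sign conventions are all assumptions made for this example. Interaural coherence is taken here as the peak of the normalised cross-correlation, which is one common definition.

```python
import math


def interaural_parameters(left, right, fs, max_lag_s=0.001):
    """Estimate broadband ILD (dB), ITD (s) and interaural coherence for one frame.

    Hypothetical helper for illustration only.
    Conventions assumed here: positive ILD means the left channel is louder;
    positive ITD means the left channel leads (the right channel is delayed).
    """
    if len(left) != len(right):
        raise ValueError("left and right must have the same length")
    n = len(left)
    energy_l = sum(x * x for x in left)
    energy_r = sum(x * x for x in right)
    if energy_l == 0.0 or energy_r == 0.0:
        raise ValueError("both channels must contain signal energy")

    # Interaural level difference: energy ratio expressed in decibels.
    ild_db = 10.0 * math.log10(energy_l / energy_r)

    # Normalised cross-correlation over a limited range of lags.
    max_lag = int(round(max_lag_s * fs))
    norm = math.sqrt(energy_l * energy_r)
    best_lag, best_corr = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        acc = 0.0
        # Correlate left[i] with right[i + lag], staying inside the frame.
        for i in range(max(0, -lag), min(n, n - lag)):
            acc += left[i] * right[i + lag]
        corr = acc / norm
        if corr > best_corr:
            best_lag, best_corr = lag, corr

    itd_s = best_lag / fs      # lag of the correlation peak, in seconds
    coherence = best_corr      # height of the correlation peak
    return ild_db, itd_s, coherence


# Synthetic test frame: a Hann-windowed 500 Hz tone burst on the left, and the
# same burst delayed by 8 samples and halved in amplitude on the right.
fs = 16000
burst = [
    math.sin(2 * math.pi * 500 * i / fs) * (0.5 - 0.5 * math.cos(2 * math.pi * i / 255))
    for i in range(256)
]
left = [0.0] * 64 + burst + [0.0] * 192
right = [0.0] * 72 + [0.5 * b for b in burst] + [0.0] * 184

ild_db, itd_s, coherence = interaural_parameters(left, right, fs)
print(f"ILD = {ild_db:.2f} dB, ITD = {itd_s * 1e3:.2f} ms, IC = {coherence:.3f}")
```

Because the right channel in this synthetic frame is an exact delayed and attenuated copy of the left, the coherence comes out close to 1. In real recordings with reverberation and multiple sources it would typically be lower and would vary from frame to frame, which is where an analysis of temporal dynamics comes in.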

Results show large variance in speech-likeness both within and between environments. Furthermore, some environments show large interaural differences in speech-likeness. The results also show that useful acoustic information is relatively sparse in realistic environments, which increases processing effort, especially for listeners with impaired hearing.

Recently we replicated some of the analyses using the ARTE recordings (Weisser & Buchholz, 2019), yielding comparable results.

The implications of these studies for clinical audiology will be discussed, with a focus on insights relevant to the aim of improving hearing-aid processing to optimise the intelligibility of speech in noise for hearing-impaired (HI) listeners, as in the Clarity project.

  • Goverts, S. T., & Colburn, H. S. (2020). Binaural Recordings in Natural Acoustic Environments: Estimates of Speech-Likeness and Interaural Parameters. Trends in Hearing, 24.
  • Weisser, A., & Buchholz, J. M. (2019). Conversational speech levels and signal-to-noise ratios in realistic acoustic conditions. Journal of the Acoustical Society of America, 145(1), 349-360.


Theo Goverts is an audiologist (medical physics expert), researcher and residency director at Amsterdam University Medical Center. His research focuses on speech recognition in realistic scenarios and child, language and hearing.

Challenge paper sessions will consist of oral presentations allowing 15 minutes per team and 5 minutes for Q&A.

Challenge Papers: Session I

MBI-Net: A Non-Intrusive Multi-Branched Speech Intelligibility Prediction Model for Hearing Aids [Report] [slides]
Ryandhimas E. Zezario1,2, Fei Chen3, Chiou-Shann Fuh1, Hsin-Min Wang2, Yu Tsao2 (1National Taiwan University; 2Academia Sinica; 3Southern University of Science and Technology of China)
Conformer-based Fusion of Text, Audio, and Listener Characteristics for Predicting Speech Intelligibility of Hearing Aid Users [Report]
Naoyuki Kamo1, Kenichi Arai1, Atsunori Ogawa1, Shoko Araki1, Tomohiro Nakatani1, Keisuke Kinoshita1, Marc Delcroix1, Tsubasa Ochiai1, and Toshio Irino2 (1NTT Corporation, Japan; 2Wakayama University, Japan)
OBISHI: Objective Binaural Intelligibility Score for the Hearing Impaired [Report] [slides]
Candy Olivia Mawalim, Benita Angela Titalim and Masashi Unoki (Japan Advanced Institute of Science and Technology)

Challenge Papers: Session II

Speech Intelligibility Prediction for Hearing-Impaired Listeners with the bBSIM-STI Model [Report]
Saskia Röttges1, Jana Roßbach1, Christopher F. Hauth1, Thomas Biberger1, Bernd T. Meyer1, Rainer Huber2, Jan Rennies2, Thomas Brand1 (1Carl von Ossietzky University, Oldenburg, Germany; 2Fraunhofer IDMT, Oldenburg, Germany)
Non-intrusive Speech Intelligibility Prediction from Binaural Signals Processed for Hearing Aid Users [Report]
Alex F. McKinney1 and Benjamin Cauchi2 (1Durham University, UK; 2OFFIS, Oldenburg, Germany)
Exploiting Hidden Representations from a DNN-based Speech Recogniser for Speech Intelligibility Prediction in Hearing-impaired Listeners [Report] [slides]
Zehai Tu, Ning Ma, Jon Barker (University of Sheffield)
Unsupervised Uncertainty Measures of Automatic Speech Recognition for Non-intrusive Speech Intelligibility Prediction [Report] [slides]
Zehai Tu, Ning Ma, Jon Barker (University of Sheffield)

Challenge Papers: Session III

Speech Intelligibility Prediction for Hearing-Impaired Listeners with Phoneme Classifiers based on Deep Learning [Report] [slides]
Jana Roßbach1, Rainer Huber2, Saskia Röttges1, Christopher F. Hauth1, Thomas Biberger1, Thomas Brand1, Bernd T. Meyer1, Jan Rennies2 (1Carl von Ossietzky University, Oldenburg, Germany; 2Fraunhofer IDMT, Oldenburg, Germany)
Predicting Speech Intelligibility using SAMII: Spike Activity Mutual Information Index [Report] [slides]
Franklin Alvarez and Waldo Nogueira (Medizinische Hochschule Hannover, Germany)
ELO-SPHERES Intelligibility Prediction Model for the Clarity Prediction Challenge 2022 [Report]
Mark Huckvale1, Mike Brookes2, Pierre Guiraud2, Tim Green1, Gaston Hilkhuysen1, Alastair H. Moore2, Patrick A. Naylor2, Stuart Rosen1, Rebecca Vos2 (1University College London, UK; 2Imperial College London, UK)