Programme
The workshop will be a one-day, online event held on 29th June. Timings and session details are provided below. All times are UK local time (i.e., UTC+1).
| Time | Session |
| --- | --- |
| 9:00 | Welcome [slides] |
| 9:10 | The Clarity Prediction Overview / Results [slides] |
| 9:40 | Challenge Papers: Session I |
| 10:40 | Break |
| 10:50 | Challenge Papers: Session II |
| 11:50 | Break |
| 12:00 | Challenge Papers: Session III |
| 13:00 | Break |
| 13:10 | Invited Talk - Theo Goverts, Vrije Universiteit Amsterdam |
| 14:00 | Break |
| 14:10 | Prizes and conclusions [slides] |
| 14:20 | CPC discussion + Future Directions |
| 15:00 | Close |
Invited Talk
Theo Goverts, audiologist (MPE), Amsterdam UMC
Speech recognition in realistic scenarios: insights from binaural recordings in natural acoustic environments.
Synopsis
In a study together with Steve Colburn (Boston University), we were interested in the acoustic characterization of realistic scenarios for speech recognition (Goverts & Colburn, 2020). The essential acoustic information in such scenarios is a bilateral vibration pattern stimulating the eardrums, or the microphone membranes of hearing devices, at both sides. This bilateral vibration pattern is the input to an interplay of bottom-up and top-down processing, leading to the actual perception of speech. We therefore made bilateral recordings in a variety of environments that were considered relevant by experts and by listeners with impaired hearing, e.g. at home, on a city walk, and on public transport. Recordings were made using simple in-the-concha microphones and a data recorder. We first looked at speech-likeness in the recordings using a non-intrusive, modulation-spectrum-based measure, analysing absolute values, interaural differences, and temporal dynamics in eight environments. We then looked at the binaural parameters interaural level difference, interaural time difference, and interaural coherence, again analysing absolute values and temporal dynamics in the same eight environments. Results show large variance in speech-likeness both within and between environments, and some environments show large interaural differences in speech-likeness. They also show that useful acoustic information is relatively sparse in realistic environments, putting more strain on processing effort, especially for listeners with impaired hearing. Recently we replicated some of the analyses using the ARTE recordings (Weisser & Buchholz, 2019), yielding comparable results. The implications of these studies for clinical audiology will be discussed, with a focus on insights relevant to the aim of improving hearing-aid processing to optimise the intelligibility of speech in noise for hearing-impaired listeners, as in the Clarity project.

- Goverts, S. T., & Colburn, H. S. (2020). Binaural recordings in natural acoustic environments: Estimates of speech-likeness and interaural parameters. Trends in Hearing, 24. doi.org/10.1177/2331216520972858
- Weisser, A., & Buchholz, J. M. (2019). Conversational speech levels and signal-to-noise ratios in realistic acoustic conditions. Journal of the Acoustical Society of America, 145(1), 349-360. doi.org/10.1121/1.5087567
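For context on the interaural parameters mentioned in the abstract, the sketch below shows one simple way to estimate frame-wise ILD, ITD and interaural coherence from a stereo recording. It is a minimal illustration under assumed settings (20 ms frames, a ±1 ms ITD search window, a hypothetical file name) and is not the analysis pipeline of Goverts & Colburn (2020), which works from calibrated in-the-concha recordings and additionally computes a modulation-spectrum-based speech-likeness measure.

```python
# Illustrative sketch only: frame-wise estimates of interaural level difference
# (ILD), interaural time difference (ITD) and interaural coherence (IC) from a
# binaural (stereo) recording. Frame length, lag window and file name are
# assumptions made for this example, not the parameters used in the study.

import numpy as np
import soundfile as sf


def interaural_parameters(path, frame_ms=20.0, max_lag_ms=1.0):
    x, fs = sf.read(path)                     # expects a two-channel file
    assert x.ndim == 2 and x.shape[1] == 2, "binaural (stereo) recording required"
    left, right = x[:, 0], x[:, 1]

    n = int(fs * frame_ms / 1000)             # samples per analysis frame
    max_lag = int(fs * max_lag_ms / 1000)     # ITD search range in samples
    lags = np.arange(-(n - 1), n)             # lags of the full cross-correlation
    eps = 1e-12

    ild, itd, ic = [], [], []
    for i in range(len(left) // n):
        l = left[i * n:(i + 1) * n]
        r = right[i * n:(i + 1) * n]

        # ILD: ratio of short-term powers, in dB
        ild.append(10 * np.log10((np.sum(l ** 2) + eps) / (np.sum(r ** 2) + eps)))

        # Normalised cross-correlation; IC is its peak within +/- max_lag_ms,
        # and the ITD estimate is the lag of that peak
        xcorr = np.correlate(l, r, mode="full")
        ncc = np.abs(xcorr) / (np.sqrt(np.sum(l ** 2) * np.sum(r ** 2)) + eps)
        k = np.argmax(ncc * (np.abs(lags) <= max_lag))
        ic.append(ncc[k])
        itd.append(1000.0 * lags[k] / fs)

    return np.array(ild), np.array(itd), np.array(ic)


ild, itd, ic = interaural_parameters("binaural_recording.wav")
print(f"median ILD {np.median(ild):.1f} dB, "
      f"median |ITD| {np.median(np.abs(itd)):.2f} ms, median IC {np.median(ic):.2f}")
```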
Bio
Theo Goverts is an audiologist (medical physics expert), researcher, and residency director at Amsterdam University Medical Center. His research focuses on speech recognition in realistic scenarios, and on child language and hearing.
Challenge Papers: Session I
- MBI-Net: A Non-Intrusive Multi-Branched Speech Intelligibility Prediction Model for Hearing Aids [Report] [slides] (1National Taiwan University; 2Academia Sinica; 3Southern University of Science and Technology, China)
- Conformer-based Fusion of Text, Audio, and Listener Characteristics for Predicting Speech Intelligibility of Hearing Aid Users [Report] (1NTT Corporation, Japan; 2Wakayama University, Japan)
- OBISHI: Objective Binaural Intelligibility Score for the Hearing Impaired [Report] [slides] (Japan Advanced Institute of Science and Technology)
Challenge Papers: Session II
- Speech Intelligibility Prediction for Hearing-Impaired Listeners with the bBSIM-STI Model [Report] (1Carl von Ossietzky University, Oldenburg, Germany; 2Fraunhofer IDMT, Oldenburg, Germany)
- Non-intrusive Speech Intelligibility Prediction from Binaural Signals Processed for Hearing Aid Users [Report] (1Durham University, UK; 2OFFIS, Oldenburg, Germany)
- Exploiting Hidden Representations from a DNN-based Speech Recogniser for Speech Intelligibility Prediction in Hearing-impaired Listeners [Report] [slides] (University of Sheffield)
- Unsupervised Uncertainty Measures of Automatic Speech Recognition for Non-intrusive Speech Intelligibility Prediction [Report] [slides] (University of Sheffield)
Challenge Papers: Session III
- Speech Intelligibility Prediction for Hearing-Impaired Listeners with Phoneme Classifiers based on Deep Learning [Report] [slides] (1Carl von Ossietzky University, Oldenburg, Germany; 2Fraunhofer IDMT, Oldenburg, Germany)
- Predicting Speech Intelligibility using SAMII: Spike Activity Mutual Information Index [Report] [slides] (Medizinische Hochschule Hannover, Germany)
- ELO-SPHERES Intelligibility Prediction Model for the Clarity Prediction Challenge 2022 [Report] (1University College London, UK; 2Imperial College London, UK)