The Clarity Workshop on Machine Learning Challenges for Hearing Aids (Clarity-2021)

Programme

The workshop is now over. Papers, videos and slides for the talks and keynotes are available via the links in the schedule below. There is also a YouTube Playlist for all talks.

The workshop was run over two days. A half-day of tutorials on the 16th was followed by the main workshop event on the 17th. All times below are UTC+1, i.e., local time in the UK.

Tutorials - 16th September

15:00 Welcome
15:10 Tutorial 1: Hearing loss and hearing-aid signal processing (Karolina Smeds, ORCA Europe, WS Audiology)
15:50 Tutorial 2: Objective intelligibility and quality measures of hearing-aid processed speech (James M. Kates, University of Colorado)
16:30 Break
16:40 Tutorial 3: The Clarity Challenge: Materials and methods (Clarity Team) [YouTube]
17:20 Discussion
18:00 Close

Workshop - 17th September

14:00 Welcome [YouTube]
14:10 Invited Talk: An acoustician’s experience of using a hearing aid (Barry Gibbs, University of Liverpool)
14:40 The Clarity Challenge Overview / Results [YouTube]
15:10 Break
15:20 Challenge papers: Beamforming approaches
16:45 Break
16:55 Challenge papers: Non-beamforming approaches
18:15 General papers
18:35 Break
18:45 Future Perspectives Session
18:50 Invited Talk: Machine listening in dynamic environments (Christine Evers, University of Southampton)
19:30 Future Challenges - Clarity Prediction Challenge and the 2nd Enhancement Challenge [YouTube]
19:50 Closing Discussion
20:00 Close

Tutorials

Karolina Smeds
ORCA Europe, WS Audiology

Hearing loss and hearing-aid signal processing

[YouTube] [Slides]

Synopsis

Hearing loss leads to several unwanted effects. Loss of audibility for soft sounds is one, but even when amplification is used to restore audibility, many suprathreshold deficits remain. The most common type of hearing loss is cochlear hearing loss, in which hair cells or nerve synapses in the cochlea are damaged. Ageing and noise exposure are the most common causes. This type of hearing loss is associated with atypical loudness perception and difficulties in noisy situations: background noise masks speech, for instance, to a greater degree than it does for a person with healthy hair cells. This is why listening to speech in noisy backgrounds is such an important topic to work on. A brief introduction to signal processing in hearing aids will be presented. With frequency-specific amplification and compression (automatic gain control, AGC), hearing aids usually do a good job of compensating for reduced audibility and for atypical suprathreshold loudness perception. It is more difficult, however, to compensate for the increased masking effect, and some example strategies will be presented. Finally, natural conversations in noise will be discussed: the balance between holding a conversation with a specific communication partner in a group of people and being able to switch attention if someone else starts to talk will be touched upon.
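As a rough illustration of the compression the tutorial refers to, the sketch below implements a single-band AGC in Python: an envelope follower with attack and release time constants drives a static compression curve that reduces gain as the input level rises. All parameter values are arbitrary examples chosen for illustration; real hearing aids apply this kind of processing independently in many frequency channels, with gains and compression ratios prescribed from the listener's audiogram.

```python
# Minimal single-band AGC sketch (illustrative only; parameter values are
# arbitrary examples, not a hearing-aid prescription).
import numpy as np

def simple_agc(x, fs, threshold_db=-40.0, ratio=3.0, gain_db=20.0,
               attack_ms=5.0, release_ms=50.0):
    """Apply level-dependent gain to a mono signal x (floats in -1..1)."""
    attack = np.exp(-1.0 / (fs * attack_ms / 1000.0))    # fast smoothing
    release = np.exp(-1.0 / (fs * release_ms / 1000.0))  # slow smoothing

    env = 0.0
    y = np.zeros_like(x)
    for n, sample in enumerate(x):
        # Envelope follower: react quickly to rising levels, slowly to falling
        level = abs(sample)
        coeff = attack if level > env else release
        env = coeff * env + (1.0 - coeff) * level

        level_db = 20.0 * np.log10(max(env, 1e-6))
        # Static curve: linear below threshold, slope 1/ratio above it
        if level_db > threshold_db:
            out_db = threshold_db + (level_db - threshold_db) / ratio
        else:
            out_db = level_db
        # Prescribed linear gain plus the compressive correction
        y[n] = sample * 10.0 ** ((gain_db + out_db - level_db) / 20.0)
    return y
```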

Bio

Karolina Smeds has a background in Engineering Physics and Audiology. Her PhD work focused on loudness aspects of hearing-aid fitting, combining clinical and theoretical perspectives on the topic. For 15 years, Karolina led an external research group, ORCA Europe in Stockholm, Sweden, fully funded by the Danish hearing-aid manufacturer Widex A/S, now WS Audiology, where she still works. In recent years the group has focused on "real-life hearing": investigations of people's auditory reality, and the development and evaluation of outcome measures of hearing-aid fitting success, both in the laboratory and in the field, that are indicative of real-life performance and preference. More recently, the group has moved into health psychology and investigations of spoken conversations. At the University of Nottingham, Karolina continues to work on auditory reality, outcome measures that produce ecologically valid results, and the analysis of spoken conversations, primarily in collaboration with the Scottish Section of the Hearing Sciences group.

James M. Kates
University of Colorado

Objective intelligibility and quality measures of hearing-aid processed speech

[YouTube] [Slides]

Objective intelligibility and quality measures of hearing-aid processed speech (James M. Kates and Kathryn H. Arehart)

Synopsis

Signal degradations, such as additive noise and nonlinear distortion, can reduce the intelligibility and quality of a speech signal. Predicting intelligibility and quality for hearing aids is especially difficult since these devices may contain intentional nonlinear distortion designed to make speech more audible to a hearing-impaired listener. This speech processing often takes the form of time-varying multichannel gain adjustments. Intelligibility and quality metrics used for hearing aids and hearing-impaired listeners must therefore consider the trade-offs between audibility and distortion introduced by hearing-aid speech envelope modifications. This presentation uses the Hearing Aid Speech Perception Index (HASPI) and the Hearing Aid Speech Quality Index (HASQI) to predict intelligibility and quality, respectively. These indices incorporate a model of the auditory periphery that can be adjusted to reflect hearing loss. They have been trained on intelligibility scores and quality ratings from both normal-hearing and hearing-impaired listeners for a wide variety of signal and processing conditions. The basics of the metrics are explained, and the metrics are then used to analyze the effects of additive noise on speech, to evaluate noise suppression algorithms, and to measure differences among commercial hearing aids.
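To make the intrusive (reference-versus-processed) structure of such metrics concrete, here is a deliberately simplified toy in Python that correlates the temporal envelope of a processed signal with that of the clean reference. This toy is not HASPI or HASQI: the published indices pass both signals through an auditory model fitted to the listener's audiogram, compare envelope and fine-structure information across many frequency bands, and map the result through a function fitted to listener data.

```python
# Toy envelope-correlation similarity (NOT HASPI/HASQI): illustrates only the
# idea of comparing a processed signal against a clean reference.
import numpy as np
from scipy.signal import butter, filtfilt, hilbert

def envelope(x, fs, cutoff_hz=32.0):
    """Smoothed magnitude envelope (a crude stand-in for an auditory front end)."""
    b, a = butter(2, cutoff_hz / (fs / 2), btype="low")
    return filtfilt(b, a, np.abs(hilbert(x)))

def toy_envelope_similarity(reference, processed, fs):
    """Pearson correlation between reference and processed envelopes."""
    e_ref = envelope(reference, fs)
    e_proc = envelope(processed, fs)
    e_ref -= e_ref.mean()
    e_proc -= e_proc.mean()
    denom = np.linalg.norm(e_ref) * np.linalg.norm(e_proc) + 1e-12
    return float(np.dot(e_ref, e_proc) / denom)

if __name__ == "__main__":
    fs = 16000
    t = np.arange(fs) / fs
    clean = np.sin(2 * np.pi * 4 * t) * np.sin(2 * np.pi * 440 * t)  # 4 Hz modulated tone
    noisy = clean + 0.5 * np.random.default_rng(0).standard_normal(len(t))
    print(toy_envelope_similarity(clean, clean, fs))  # ~1.0
    print(toy_envelope_similarity(clean, noisy, fs))  # lower: noise corrupts the envelope
```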

Bio

James (Jim) M. Kates holds Bachelor’s and Master’s degrees in Electrical Engineering and the professional degree of Electrical Engineer, all from MIT. He is currently a scholar in residence in the Department of Speech, Language, and Hearing Sciences at the University of Colorado, Boulder. His primary area of research is signal processing for hearing aids. His research includes developing mathematical models of auditory processing and using those models to predict speech intelligibility, speech quality, and music quality for signals processed through hearing aids. He has also conducted research in binaural hearing, particularly on how hearing-aid signals can be modified to improve spatial awareness. He retired from hearing-aid manufacturer GN ReSound in 2012, where he held the position of Research Fellow. He is a Fellow of the Acoustical Society of America and a Fellow of the Audio Engineering Society, and he received the Samuel F. Lybarger career achievement award from the American Academy of Audiology in 2015 and the Peak Performance Award from the Colorado Academy of Audiology in 2017.

Invited talks

Barry M. Gibbs
University of Liverpool

An acoustician’s experience of using a hearing aid

[YouTube] [Slides]

Abstract

This is a personal account of the experience of wearing a hearing aid to control tinnitus. It is a description by a non-expert who, however, comes from a career in engineering acoustics as both a researcher and a teacher. The tinnitus has been of long duration (45 years), is high-frequency and broadband in character, and is confined to the right ear. With the onset of presbycusis, the tinnitus became progressively louder, again only in the right ear. On the recommendation of the NHS, a hearing aid was fitted three years ago, which, after some adjustments, suppressed the tinnitus quite well. A year later, a purchased digital hearing aid provided more control of both the volume and the frequency content. However, the use of these devices has compromised not only my binaural perception, which I might be able to explain, but also my speech perception and classical music appreciation, which members of the audience might be able to explain.

Bio

Professor Barry M. Gibbs is Honorary Professor in the Acoustics Research Unit of the University of Liverpool School of Architecture. His main research interest is structure-borne sound in buildings and other structures. He has been awarded about 20 major grants on this and other topics, which have supported over 25 postgraduate and postdoctoral appointments. He has authored and co-authored over 290 journal and conference papers and was founding Editor of the journal Building Acoustics, now into its third decade. He is a Fellow of the Institute of Acoustics, of the Acoustical Society of America, and of the International Institute of Acoustics and Vibration, and was President of the International Institute of Acoustics and Vibration from 2002 to 2004. In 2015, he received the Institute of Acoustics R W Stephens Medal. He was President of the Institute of Acoustics from 2018 to 2020. As Past President, he will be Conference President of Internoise 2022, to be held in Glasgow on 21-24 August 2022.

Christine Evers
University of Southampton

Machine listening in dynamic environments

[YouTube] [Slides]

Abstract

Audio signals encapsulate vital cues about our surrounding environments. However, in everyday environments, audio signals are adversely affected by ambient noise, reverberation, and interference from multiple competing sources. Algorithms for acoustic source localization and tracking are therefore required to enable machines (e.g., robots, smart assistants, hearables) to focus on and interact with sound sources of interest. The LOCATA challenge provides an open-access dataset of audio recordings and an accompanying software toolbox that enable researchers to objectively evaluate and benchmark their algorithms against the state of the art. This talk will provide an overview of the LOCATA dataset and challenge results, discuss practical insights gained, and explore open opportunities as well as future directions.

Bio

Christine Evers is a Lecturer in the School of Electronics and Computer Science at the University of Southampton. Her research focuses on Bayesian learning for machine listening, with a particular emphasis on robot audition, and sits at the intersection of robotics, machine learning, acoustics, and statistical signal processing. She is a Co-I on the Trustworthy Autonomous Systems Hub, and the cohort lead as well as the theme lead for 'Embedded AI' on the UKRI Centre for Doctoral Training in Machine Intelligence for Nano-Electronic Devices and Systems (MINDS). Christine is a Senior Member of the IEEE, an elected member of the IEEE Signal Processing Society (SPS) Technical Committee on Audio and Acoustic Signal Processing, a member of the IEEE SPS Challenges and Data Collections committee, and serves as an associate editor for IEEE Transactions on Audio, Speech & Language Processing as well as the EURASIP Journal on Audio, Speech, and Music Processing.

Challenge papers: Beamforming approaches

ELO-SPHERES consortium system description [Paper] [YouTube]
Alastair H. Moore1, Sina Hafezi1, Rebecca Vos1, Mike Brookes1, Patrick A. Naylor1, Mark Huckvale2, Stuart Rosen2, Tim Green2, Gaston Hilkhuysen2 (1Imperial College London; 2University College London)
BUT system for the first Clarity enhancement challenge [Paper] [YouTube] [Slides]
Kateřina Žmolíková, Jan "Honza" Černocký (Brno University of Technology)
Listening with Googlears: Low-latency neural multiframe beamforming and equalization for hearing aids [Paper] [YouTube] [Slides]
Samuel Yang, Scott Wisdom, Chet Gnegy, Richard F. Lyon, Sagar Savla (Google)
Combining binaural LCMP beamforming and deep multi-frame filtering for joint dereverberation and interferer reduction in the Clarity-2021 challenge [Paper] [YouTube] [Slides]
Marvin Tammen, Henri Gode, Hendrik Kayser, Eike Nustede, Nils Westhausen, Jörn Anemüller, Simon Doclo (University of Oldenburg)

Challenge papers: Non-beamforming approaches

A two-stage end-to-end system for speech-in-noise hearing aid processing [Paper] [YouTube] [Slides]
Zehai Tu, Jisi Zhang, Ning Ma, Jon Barker (Department of Computer Science, University of Sheffield)
A cascaded speech enhancement for hearing aids in noisy-reverberant conditions [Paper] [YouTube] [Slides]
Xi Chen1,2, Yupeng Shi2, Wei Xiao2, Meng Wang2, Tingzhao Wu2, Shi-dong Shang2, Nengheng Zheng1, Qinglin Meng3 (1Shenzhen University; 2Tencent; 3South China University of Technology)
Binaural speech enhancement based on deep attention layers [Paper] [YouTube] [Slides]
Tom Gajecki1, Waldo Nogueira2 (1Hannover Medical School; 2Medical University Hannover, Cluster of Excellence Hearing4all)
Hearing aid speech enhancement using U-Net convolutional neural networks [Paper] [YouTube] [Slides]
Paul Kendrick (Music Tribe)

General papers

Towards intelligibility oriented audio-visual speech enhancement [Paper] [YouTube] [Slides]
Tassadaq Hussain, Mandar Gogate, Kia K. Dashtipour, Amir Hussain (Edinburgh Napier University)
Progressive learning for speech enhancement based on non-negative matrix factorisation and deep neural network [Paper] [YouTube] [Slides]
Wenbo Wang, Houguang Liu, Jianhua Yang, Songyong Liu, Shanguo Yang (China University of Mining and Technology)