Programme
The workshop is now over. Papers, videos and slides for the talks and keynotes are available via the links in the schedule below. There is also a YouTube Playlist for all talks.
The workshop was run over two days: a half-day of tutorials on the 16th was followed by the main workshop event on the 17th. All times below are UTC+1, i.e., local time in the UK.
Tutorials - 16th September
15:00 | Welcome
15:10 | Tutorial 1 | Hearing loss and hearing-aid signal processing (Karolina Smeds, ORCA Europe, WS Audiology)
15:50 | Tutorial 2 | Objective intelligibility and quality measures of hearing-aid processed speech (James M. Kates, University of Colorado)
16:30 | Break
16:40 | Tutorial 3 | The Clarity Challenge: Materials and methods (Clarity Team) | [YouTube]
17:20 | Discussion
18:00 | Close
Workshop - 17th September
14:00 | Welcome | [YouTube]
14:10 | Invited Talk: An acoustician’s experience of using a hearing aid (Barry Gibbs, University of Liverpool)
14:40 | The Clarity Challenge Overview / Results | [YouTube]
15:10 | Break
15:20 | Challenge papers: Beamforming approaches
16:45 | Break
16:55 | Challenge papers: Non-beamforming approaches
18:15 | General papers
18:35 | Break
18:45 | Future Perspectives Session
18:50 | Invited Talk: Machine listening in dynamic environments (Christine Evers, University of Southampton)
19:30 | Future Challenges - Clarity Prediction Challenge and the 2nd Enhancement Challenge | [YouTube]
19:50 | Closing Discussion
20:00 | Close
Tutorials
Hearing loss and hearing-aid signal processing
Synopsis
Hearing loss leads to several unwanted effects. Loss of audibility for soft sounds is one effect, but even when amplification is used to restore audibility for soft sounds, many suprathreshold deficits remain. The most common type of hearing loss is cochlear hearing loss, in which hair cells or nerve synapses in the cochlea are damaged; ageing and noise exposure are its most common causes. This type of hearing loss is associated with atypical loudness perception and difficulties in noisy situations: background noise masks speech, for instance, to a greater degree than it does for a person with healthy hair cells, which is why listening to speech in noisy backgrounds is such an important topic to work on. A brief introduction to signal processing in hearing aids will be presented. With frequency-specific amplification and compression (automatic gain control, AGC), hearing aids usually do a good job of compensating for reduced audibility and for atypical suprathreshold loudness perception. Compensating for the increased masking effect is more difficult, however, and some example strategies will be presented. Finally, natural conversations in noise will be discussed, touching on the balance between being able to hold a conversation with a specific communication partner in a group of people and being able to switch attention if someone else starts to talk.
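As background to the tutorial, the following is a minimal Python sketch of the frequency-specific amplification and compression (AGC) idea described above. The band edges, gains, compression threshold, ratio, and time constants are illustrative assumptions only, not values from the tutorial or from any fitting prescription.

```python
import numpy as np
from scipy.signal import butter, sosfilt

def compress_band(x, fs, gain_db, threshold_db=-40.0, ratio=3.0,
                  attack_ms=5.0, release_ms=50.0):
    """Level-dependent gain (WDRC) for one band: soft sounds receive the
    full prescribed gain, louder sounds progressively less."""
    att = np.exp(-1.0 / (fs * attack_ms * 1e-3))
    rel = np.exp(-1.0 / (fs * release_ms * 1e-3))
    env = np.empty_like(x)
    e = 1e-6
    for n, v in enumerate(np.abs(x)):   # simple attack/release envelope follower
        c = att if v > e else rel
        e = c * e + (1.0 - c) * v
        env[n] = e
    level_db = 20.0 * np.log10(env + 1e-12)
    excess = np.maximum(level_db - threshold_db, 0.0)
    gain = gain_db - excess * (1.0 - 1.0 / ratio)   # static compression rule
    return x * 10.0 ** (gain / 20.0)

def multiband_wdrc(x, fs):
    """Three-band compressor applied to a mono float signal. Band edges and
    per-band gains are assumed here (a sloping high-frequency loss)."""
    bands = [(100.0, 750.0), (750.0, 2500.0), (2500.0, 6000.0)]  # Hz, assumed
    gains = [10.0, 20.0, 30.0]                                   # dB, assumed
    out = np.zeros_like(x)
    for (lo, hi), g in zip(bands, gains):
        sos = butter(4, [lo, hi], btype="bandpass", fs=fs, output="sos")
        out += compress_band(sosfilt(sos, x), fs, gain_db=g)
    return out
```

In a real hearing aid the per-band gains would come from a fitting prescription and the processing would run frame-by-frame at low latency; this sketch only shows the core idea of more gain for soft sounds than for loud ones, per frequency band.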
Bio
Karolina Smeds has a background in Engineering Physics and Audiology. Her PhD work focused on loudness aspects of hearing-aid fitting, combining clinical and theoretical perspectives on the topic. For 15 years, Karolina led an external research group, ORCA Europe in Stockholm, Sweden, fully funded by the Danish hearing-aid manufacturer Widex A/S, now WS Audiology, where she still works. Recently the research group has focused on investigations of “real-life hearing”: people’s auditory reality, and the development and evaluation of outcome measures for hearing-aid fitting success, both in the laboratory and in the field, that are indicative of real-life performance and preference. More recently, the group has moved into the field of health psychology and into investigations of spoken conversations. At the University of Nottingham, Karolina is continuing to work on auditory reality, outcome measures that can produce ecologically valid results, and analysis of spoken conversations, primarily in collaboration with the Scottish Section of the Hearing Sciences group.
Objective intelligibility and quality measures of hearing-aid processed speech (James M. Kates and Kathryn H. Arehart)
Synopsis
Signal degradations, such as additive noise and nonlinear distortion, can reduce the intelligibility and quality of a speech signal. Predicting intelligibility and quality for hearing aids is especially difficult since these devices may contain intentional nonlinear distortion designed to make speech more audible to a hearing-impaired listener. This speech processing often takes the form of time-varying multichannel gain adjustments. Intelligibility and quality metrics used for hearing aids and hearing-impaired listeners must therefore consider the trade-offs between audibility and distortion introduced by hearing-aid speech envelope modifications. This presentation uses the Hearing Aid Speech Perception Index (HASPI) and the Hearing Aid Speech Quality Index (HASQI) to predict intelligibility and quality, respectively. These indices incorporate a model of the auditory periphery that can be adjusted to reflect hearing loss. They have been trained on intelligibility scores and quality ratings from both normal-hearing and hearing-impaired listeners for a wide variety of signal and processing conditions. The basics of the metrics are explained, and the metrics are then used to analyze the effects of additive noise on speech, to evaluate noise suppression algorithms, and to measure differences among commercial hearing aids.
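HASPI and HASQI themselves involve a detailed auditory-periphery model and mappings trained on listener data, so the sketch below is only a toy illustration of the general intrusive-metric structure they share: filter both signals into bands, extract envelopes, and compare the processed envelopes against the reference. The band centres, filters, and correlation score are simplifications assumed for illustration and do not reproduce the published indices.

```python
import numpy as np
from scipy.signal import butter, sosfilt, hilbert

def envelope_bands(x, fs, centers=(250, 500, 1000, 2000, 4000)):
    """Smoothed Hilbert envelopes in a few octave-wide bands: a crude
    stand-in for the auditory filterbank in a real metric (fs >= ~12 kHz)."""
    lp = butter(2, 32.0, btype="low", fs=fs, output="sos")  # keep slow modulations
    envs = []
    for fc in centers:
        sos = butter(4, [fc / np.sqrt(2), fc * np.sqrt(2)],
                     btype="bandpass", fs=fs, output="sos")
        env = np.abs(hilbert(sosfilt(sos, x)))
        envs.append(sosfilt(lp, env))
    return np.array(envs)

def toy_intrusive_score(reference, processed, fs):
    """Average per-band correlation between reference and processed
    envelopes; higher suggests better-preserved speech modulations."""
    scores = []
    for er, ep in zip(envelope_bands(reference, fs),
                      envelope_bands(processed, fs)):
        er, ep = er - er.mean(), ep - ep.mean()
        denom = np.sqrt((er**2).sum() * (ep**2).sum()) + 1e-12
        scores.append(float((er * ep).sum() / denom))
    return float(np.mean(scores))
```

The published indices additionally adjust the auditory front end to reflect a listener’s hearing loss and map the envelope comparisons through fits trained on intelligibility scores and quality ratings, as the synopsis describes; the toy above only shows why envelope fidelity, rather than waveform fidelity, is the quantity compared.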
Bio
James (Jim) M. Kates holds Bachelor’s and Master’s degrees in Electrical Engineering and the professional degree of Electrical Engineer, all from MIT. He is currently a scholar in residence in the Department of Speech, Language and Hearing Sciences at the University of Colorado, Boulder. His primary area of research is signal processing for hearing aids. His research includes developing mathematical models of auditory processing and using those models to predict speech intelligibility, speech quality, and music quality for signals processed through hearing aids. He has also conducted research in binaural hearing, particularly on how hearing-aid signals can be modified to improve spatial awareness. He retired from hearing-aid manufacturer GN ReSound in 2012, where he held the position of Research Fellow. He is a Fellow of the Acoustical Society of America and a Fellow of the Audio Engineering Society, and he received the Samuel F. Lybarger career achievement award from the American Academy of Audiology in 2015 and the Peak Performance Award from the Colorado Academy of Audiology in 2017.
Invited talks
An acoustician’s experience of using a hearing aid
Abstract
This is a personal account of the experience of wearing a hearing aid to control tinnitus. It is a description by a non-expert who, however, comes from a career in engineering acoustics, both as a researcher and a teacher. The tinnitus has been of long duration (45 years), is high-frequency and broadband in character, and is confined to the right ear. With the onset of presbycusis, the tinnitus became progressively louder, again only in the right ear. On the recommendation of the NHS, a hearing aid was fitted three years ago which, after some adjustments, suppressed the tinnitus quite well. A year later, a purchased digital hearing aid provided more control of both the volume and the frequency content. However, the use of these devices has compromised my binaural perception, which I might be able to explain, but also speech perception and classical music appreciation, which members of the audience might be able to explain.
Bio
Professor Barry M. Gibbs is Honorary Professor in the Acoustics Research Unit of the University of Liverpool School of Architecture. His main research interest is structure-borne sound in buildings and other structures. He has been awarded about 20 major grants on this and other topics, which have supported over 25 postgraduate and postdoctoral appointments. He has authored and co-authored over 290 journal and conference papers and was founding Editor of the journal Building Acoustics, now in its third decade. He is a Fellow of the Institute of Acoustics, of the Acoustical Society of America, and of the International Institute of Acoustics and Vibration, and was President of the International Institute of Acoustics and Vibration in 2002-2004. In 2015, he received the Institute of Acoustics R W Stephens Medal. He was President of the Institute of Acoustics for the period 2018-20. As Past President, he will be Conference President of Internoise 2022, to be held in Glasgow on 21-24 August 2022.
Machine listening in dynamic environments
Abstract
Audio signals encapsulate vital cues about our surrounding environments. However, in everyday environments, audio signals are adversely affected by ambient noise, reverberation, and interference from multiple, competing sources. Algorithms for acoustic source localization and tracking are therefore required to enable machines (e.g., robots, smart assistants, hearables) to focus on and interact with sound sources of interest. The LOCATA challenge provides an open-access dataset of audio recordings and an accompanying software toolbox that enable researchers to objectively evaluate and benchmark their algorithms against the state of the art. This talk will provide an overview of the LOCATA dataset and challenge results. We will discuss practical insights gained and will explore open opportunities as well as future directions.
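As a concrete example of the kind of localization primitive benchmarked in challenges like LOCATA, here is a minimal GCC-PHAT time-delay estimator for a single microphone pair. This is the classic textbook method, not part of the LOCATA toolbox; the function name and the interpolation factor are choices made here for illustration.

```python
import numpy as np

def gcc_phat(sig, ref, fs, max_tau=None, interp=16):
    """Estimate the time delay of arrival (TDOA) between two microphone
    signals using the phase transform (GCC-PHAT) weighting."""
    n = sig.shape[0] + ref.shape[0]
    R = np.fft.rfft(sig, n=n) * np.conj(np.fft.rfft(ref, n=n))
    R /= np.abs(R) + 1e-12                 # PHAT: discard magnitude, keep phase
    cc = np.fft.irfft(R, n=interp * n)     # upsampled cross-correlation
    max_shift = interp * n // 2
    if max_tau is not None:                # restrict to physically possible lags
        max_shift = min(int(interp * fs * max_tau), max_shift)
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
    return (np.argmax(np.abs(cc)) - max_shift) / float(interp * fs)  # seconds
```

With a known microphone spacing d and speed of sound c, the estimated delay tau maps to a direction of arrival via theta = arcsin(c * tau / d); tracking algorithms of the kind discussed in the talk then smooth such instantaneous estimates over time.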
Bio
Christine Evers is a Lecturer in the School of Electronics and Computer Science at the University of Southampton. Her research focuses on Bayesian learning for machine listening, with a particular emphasis on robot audition, and sits at the intersection of robotics, machine learning, acoustics, and statistical signal processing. She is a Co-I on the Trustworthy Autonomous Systems Hub, and the cohort lead as well as the theme lead for 'Embedded AI' on the UKRI Centre for Doctoral Training in Machine Intelligence for Nano-Electronic Devices and Systems (MINDS). Christine is a Senior Member of the IEEE, an elected member of the IEEE Signal Processing Society (SPS) Technical Committee on Audio and Acoustic Signal Processing, a member of the IEEE SPS Challenges and Data Collections committee, and serves as an associate editor for IEEE Transactions on Audio, Speech & Language Processing as well as the EURASIP Journal on Audio, Speech, and Music Processing.
Challenge papers: Beamforming approaches
ELO-SPHERES consortium system description [Paper] [YouTube] (Imperial College London; University College London)
BUT system for the first Clarity enhancement challenge [Paper] [YouTube] [Slides] (Brno University of Technology)
Listening with Googlears: Low-latency neural multiframe beamforming and equalization for hearing aids [Paper] [YouTube] [Slides] (Google)
Combining binaural LCMP beamforming and deep multi-frame filtering for joint dereverberation and interferer reduction in the Clarity-2021 challenge [Paper] [YouTube] [Slides] (University of Oldenburg)
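Several of the systems above build on minimum-variance (MVDR/LCMP-style) beamforming. As shared background rather than a description of any one entry, here is a minimal per-frequency-bin MVDR sketch; the noise covariance estimate and the steering vector below are placeholder assumptions.

```python
import numpy as np

def mvdr_weights(noise_cov, steering):
    """MVDR weights for one frequency bin: w = R_n^{-1} d / (d^H R_n^{-1} d).
    Minimises output noise power subject to a distortionless response in
    the target direction."""
    r_inv_d = np.linalg.solve(noise_cov, steering)
    return r_inv_d / (steering.conj() @ r_inv_d)

# Illustrative use on one frequency bin of a 4-microphone STFT:
rng = np.random.default_rng(0)
frames = rng.standard_normal((4, 200)) + 1j * rng.standard_normal((4, 200))
R_n = frames @ frames.conj().T / frames.shape[1]  # noise covariance estimate
d = np.ones(4, dtype=complex)                     # steering vector (broadside)
w = mvdr_weights(R_n, d)
enhanced_bin = w.conj() @ frames                  # beamformed frames, shape (200,)
```

The challenge entries differ in how they obtain the covariance and steering quantities (e.g., from neural masks or multi-frame models) and in the additional constraints they impose, as their papers describe.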
Challenge papers: Non-beamforming approaches
A two-stage end-to-end system for speech-in-noise hearing aid processing [Paper] [YouTube] [Slides] (Department of Computer Science, University of Sheffield)
A cascaded speech enhancement for hearing aids in noisy-reverberant conditions [Paper] [YouTube] [Slides] (Shenzhen University; Tencent; South China University of Technology)
Binaural speech enhancement based on deep attention layers [Paper] [YouTube] [Slides] (Hannover Medical School; Medical University Hannover, Cluster of Excellence Hearing4all)
Hearing aid speech enhancement using U-Net convolutional neural networks [Paper] [YouTube] [Slides] (Music Tribe)
General papers
Towards intelligibility oriented audio-visual speech enhancement [Paper] [YouTube] [Slides] (Edinburgh Napier University)
Progressing learning for speech enhancement based on non-negative matrix factorisation and deep neural network [Paper] [YouTube] [Slides] (China University of Mining and Technology)