About Clarity

Why ‘the Clarity Project’? In 1984, a study by Hagerman and Gabrielsson showed that, for a large majority of hearing-aid users, good sound quality was the most important property of hearing aids, with clarity being the most important sound-quality factor.

In this project, we will be organising a series of machine learning challenges for advancing hearing-aid signal processing and speech-in-noise perception modelling. To facilitate these challenges, we will generate open-access datasets, models and infrastructure, including:

  • open-source tools for generating realistic training materials for different listening scenarios;
  • baseline models of hearing impairment;
  • baseline models of hearing-device speech processing;
  • baseline models of speech perception;
  • databases of speech perception in noise for hearing-impaired listeners.

Over 5 years we will deliver three challenge rounds. In round one, speech will occur in the context of a ‘living room’, i.e., a person speaking in a moderately reverberant room with minimal background noise.

We expect round one to open in October 2020, with a closing date in June 2021 and results in October 2021.

Background

One in six people in the UK has some level of hearing loss, and this number is certain to increase as the population ages. Yet only 40% of people who could benefit from hearing aids have them, and most people who have the devices don’t use them regularly. A major reason for this low uptake and use is the perception that hearing aids perform poorly.

A critical problem for hearing aids, even the most sophisticated devices, is speech in noise. A hearing-aid wearer may have difficulty conversing with family or friends while the television is on, or hearing public announcements at the train station. Such difficulties can lead to social isolation, and thereby reduce emotional and physical well-being. Consequently, how hearing aids process speech in noise is crucial.

Our approach is inspired by the latest developments in automatic speech recognition and speech synthesis, two areas in which public competitions have led to rapid advancements in technology. We want to encourage more researchers to consider how their skills and technology could benefit the millions of people with hearing impairments.

Round one: the challenges

The two challenges of round one will feature

  • A real or simulated living room;
  • One source of speech;
  • A range of reverberation times (low to moderate);
  • Real or simulated domestic noise backgrounds, e.g., radio, television, air conditioning.

Transcriptions will be provided for the supplied audio signals. A generative tool will also be supplied so that entrants can build a large database of audio signals for training machine learning models.
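The interface of this tool has not yet been published. Purely as an illustration, the sketch below shows the general recipe such a tool might implement: convolving dry speech with a room impulse response and mixing in domestic noise at a chosen signal-to-noise ratio. All function and variable names here are hypothetical.

```python
import numpy as np
from scipy.signal import fftconvolve

def make_scene(speech, rir, noise, snr_db):
    """Mix one training scene: reverberant speech plus noise at a target SNR.

    speech : dry speech samples (1-D array)
    rir    : room impulse response for the simulated living room
    noise  : domestic noise recording (e.g. radio, television)
    snr_db : desired speech-to-noise ratio in decibels
    """
    # Convolve the dry speech with the room impulse response,
    # trimming the tail so the scene matches the speech length.
    reverberant = fftconvolve(speech, rir)[: len(speech)]

    # Tile or truncate the noise to the same length.
    noise = np.resize(noise, reverberant.shape)

    # Scale the noise so the mixture has the requested SNR.
    speech_power = np.mean(reverberant ** 2)
    noise_power = np.mean(noise ** 2) + 1e-12
    gain = np.sqrt(speech_power / (noise_power * 10 ** (snr_db / 10)))

    return reverberant + gain * noise
```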

The round will comprise two challenges:

  • Hearing aid signal processing;
  • Perception models: hearing loss and/or speech intelligibility.

Participants can submit to one challenge or both. Within the perception challenge, participants can submit one or more models.

Entrants to the processing challenge will be required to provide

  • Processed signals with associated information about the signals (speech material, noise sources, reverberation time, etc.);
  • System information;
  • Documentation.

Entrants to the perception challenge will be required to provide

  • Intelligibility scores and associated information about the signals and/or
  • Signals processed by the hearing loss model with associated information;
  • System information;
  • Documentation.

Entrants are also encouraged to provide their models.

Round one: evaluation

Entries to the processing challenge will be evaluated as follows:

  • Initially, entries will be ranked on the basis of objective speech intelligibility assessments, subject to a threshold speech quality requirement (a sketch of this kind of ranking follows this list);
  • Subsequently, a subset of the entries will be ranked on the basis of real speech intelligibility scores from our listener panel.
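The objective metric for this ranking has not been specified. As an illustration only, the sketch below ranks entries with short-time objective intelligibility (STOI, via the pystoi package) after applying a quality threshold; the entry format, the quality score, and the choice of STOI are all assumptions.

```python
from pystoi import stoi  # short-time objective intelligibility

def rank_entries(entries, clean, fs, quality_threshold):
    """Rank processed signals by objective intelligibility.

    `entries` maps an entry name to (processed_signal, quality_score);
    the quality score is a stand-in for whatever quality metric applies.
    Entries below the quality threshold are excluded before ranking.
    """
    scores = {}
    for name, (processed, quality) in entries.items():
        if quality < quality_threshold:
            continue  # fails the threshold speech quality requirement
        scores[name] = stoi(clean, processed, fs)

    # Higher objective intelligibility ranks higher.
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
```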

Entries to the perception challenge will be evaluated according to how well they predict real intelligibility scores from our panel of listeners.
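The evaluation statistic is not stated here. As one plausible illustration, agreement between predicted and measured intelligibility is often summarised by root-mean-square error and correlation, as in the sketch below.

```python
import numpy as np

def prediction_accuracy(predicted, measured):
    """Summarise how well model predictions match listener scores.

    Returns the root-mean-square error (in the same units as the
    scores, e.g. percent words correct) and the Pearson correlation.
    """
    predicted = np.asarray(predicted, dtype=float)
    measured = np.asarray(measured, dtype=float)

    rmse = np.sqrt(np.mean((predicted - measured) ** 2))
    corr = np.corrcoef(predicted, measured)[0, 1]
    return rmse, corr
```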

Who we are…