Launch of CEC3

· 2 min read
Jon Barker
Clarity Team Member

We are pleased to announce the launch of the 3rd Clarity Enhancement Challenge (CEC3).

The challenge follows on from the success of the 2nd Clarity Enhancement Challenge (CEC2) and is about improving the performance of hearing aids for speech-in-noise. The challenge extends CEC2 in three separate directions, which have been presented as three different tasks.

  • Task 1: Real ambisonic room impulse responses (🔥 LIVE 🔥)
  • Task 2: Real hearing aid signals (🔥 LIVE 🔥)
  • Task 3: Real dynamic backgrounds (launching 1st May)

Participants are welcome to submit to one or more tasks. We are particularly interested in systems that handle all three cases with little or no redesign/retraining.

The website has been fully updated to provide you with all the information you will need to participate. The necessary data and software are available for download.

The schedule for the challenge is as follows:

  • 2nd April 2024: Launch of Task 1 and Task 2 with training and development data; initial tools.
  • 1st May 2024: Launch of Task 3.
  • 25th July 2024: Evaluation data released
  • 2nd Sept 2024: 1st round submission for evaluation by objective measure
  • 15th Sept 2024: 2nd round submission deadline for listening tests (Task 2 and 3)
  • Sept-Nov 2024: Listening test evaluation period.
  • Dec 2024: Results announced at a Clarity Challenge Workshop (Details TBD); prizes awarded.

If you have any questions please do not hesitate to contact us at claritychallengecontact@gmail.com. If you wish to be kept informed, please sign up to our Google group. If you are considering participating, please complete the registration form on the registration page. Registration is free and carries no obligation to participate, but will help us to keep you informed of any changes to the challenge.

CPC2 eval data released

· One min read
Jon Barker
Clarity Team Member

The CPC2 evaluation data has now been released.

The data is available for download as a single 478 MB file, clarity_CPC2_data.test.v1_0.tgz. The evaluation data should be untarred into the same root as the training data. Further details can be found on the challenge website.
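
If you prefer to script the extraction, here is a minimal Python sketch (the destination path is illustrative; point it at whatever root directory holds your training data):

```python
import tarfile

# Unpack the evaluation archive into the same root that already
# contains the CPC2 training data (replace "." with your data root).
with tarfile.open("clarity_CPC2_data.test.v1_0.tgz", "r:gz") as tar:
    tar.extractall(path=".")
```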

The data consists of the hearing aid algorithm output signals, clean reference signals, listener audiograms, and head rotation information. Listener responses are not provided for the evaluation data but will be made available after the submission window has closed.

For details on how to prepare your submission please see the instructions on the website.

If you have any questions please feel free to post them on this forum.

The submission window will close on the 31st of July.

Good luck!

Clarity-2023 Workshop @ Interspeech, Dublin

· 3 min read
Jon Barker
Clarity Team Member

We are pleased to announce the 4th ISCA Clarity Workshop on Machine Learning Challenges for Hearing Aids (Clarity-2023).

The event will be a one-day workshop held as an ISCA satellite event to Interspeech 2023 in Dublin, Ireland.

For registration and programme details please visit the workshop website:

https://claritychallenge.github.io/clarity2023-workshop/

IMPORTANT DATES

  • 2nd June 2023 - Workshop Submission Deadline (Regular Papers)
  • 31st July 2023 - Workshop Submission Deadline (Clarity Challenge Papers)
  • 5th August 2023 - Registration closes
  • 19th August - Workshop / Clarity Challenge results announced

About

One of the biggest challenges for hearing-impaired listeners is understanding speech in the presence of background noise. Everyday social noise levels can have a devastating impact on speech intelligibility. The inability to communicate effectively can lead to social withdrawal and isolation. Disabling hearing impairment affects 360 million people worldwide, with that number increasing because of the ageing population. Unfortunately, current hearing aid technology is often ineffective in noisy situations. Although amplification can restore audibility, it does not compensate fully for the effects of hearing loss.

The Clarity workshops are designed to stimulate a two-way conversation between the speech research community and hearing aid developers. Hearing aid developers, who are not typically represented at Interspeech, will have an opportunity to present the challenges of their industry to the speech community; the speech community will be able to present and discuss potentially transformative approaches to speech in noise processing in the presence of hearing researchers and industry experts.

Topics

Any work related to the challenges of hearing aid signal processing will be considered. Relevant topics include:

  • Binaural technology for speech enhancement and source separation
  • Multi-microphone processing technology
  • Real-time approaches to speech enhancement
  • Statistical model-driven approaches to hearing aid processing
  • Audio quality & intelligibility assessment for hearing aid and cochlear implant users
  • Efficient and effective integration of psychoacoustic testing in machine learning
  • Machine learning for diverse target listeners
  • Machine learning models of hearing impairment

The 2nd Clarity Prediction Challenge

Clarity-2023 will also host the 2nd Clarity Prediction Challenge, which addresses the problem of developing new intrusive and non-intrusive approaches to hearing-aid speech intelligibility prediction. The challenge will launch on 1st March; if you are interested in participating, please sign up to our Google group for further announcements.

Keynote Talks

  • Prof Fei Chen, SUSTech, China
  • Prof DeLiang Wang, Ohio State University, US

Organisers

  • Michael Akeroyd, University of Nottingham
  • Jon Barker, University of Sheffield
  • Trevor Cox, University of Salford
  • Fei Chen, Southern University of Science and Technology, China
  • John Culling, University of Cardiff
  • Simone Graetzer, University of Salford
  • Andrew Hines, University College Dublin

For further information

To be kept up to date please join our Clarity Challenge Google group. If you have questions, please contact us directly using the contact details found here.

Funded by the Engineering and Physical Sciences Research Council (EPSRC), UK

Supported by RNID (formerly Action on Hearing Loss), Hearing Industry Research Consortium, Amazon TTS Research

Announcing the 2nd Clarity Prediction Challenge (CPC2)

· 2 min read
Jon Barker
Clarity Team Member
Trevor Cox
Clarity Team Member

The 2nd Clarity Prediction Challenge - Register Now

To allow the development of better hearing aids, we need ways to evaluate the speech intelligibility of audio signals automatically. We need a prediction model that takes the audio produced by a hearing aid and the listener's characteristics (e.g. audiogram) and estimates the speech intelligibility score that the listener would achieve in a listening test.

Last year we ran the CPC1 Challenge to develop such models. The challenge was presented at an online workshop and a special session of Interspeech 2022. We are now running the 2nd round of this challenge (CPC2), which builds on the first by using more complex signals and a larger set of listening test data for training and evaluating the prediction systems.

The outputs of the new challenge will be presented at an ISCA workshop that is being run as a satellite event to Interspeech 2023 in Dublin on 19th August 2023.

Full details can be found on the Challenge Website.

Register now to take part

If you are interested in participating please register now via the online registration form.

Important Dates

  • March - Launch of challenge, release of training data + baseline system.
  • 1st July - Release of evaluation data and opening of submission window.
  • 31st July - Submission deadline.
  • 19th August - ISCA Clarity 2023 workshop @ Interspeech
  • 19th September - Deadline for submission of finalised Workshop papers

What will be provided

  • Audio produced by a variety of (simulated) hearing aids for speech-in-noise;
  • The corresponding clean reference signals (the original speech);
  • Characteristics of the listeners (pure tone audiograms, etc.);
  • The measured speech intelligibility scores from listening tests, where hearing-impaired listeners were asked to say what they heard after listening to the hearing aid processed signals;
  • Software tools, including a baseline system based on HASPI scores.

For further information

To be kept up to date please join our Clarity Challenge Google group. If you have questions, please contact us directly using the contact details found here.

ICASSP 2023 evaluation data released

· One min read
Jon Barker
Clarity Team Member
Trevor Cox
Clarity Team Member

We are pleased to announce that the evaluation dataset for the ICASSP Clarity Challenge is now available for download.

https://www.myairbridge.com/en/#!/folder/EkthOZZeBW33aaDBWSDadTgpOkbgaFxO

For instructions on preparing your submission please visit:

https://claritychallenge.org/docs/icassp2023/taking_part/icassp2023_submission

If you have not yet registered it is not too late to do so. Please use the form at the link below and we will then send you a Team ID and a personalised upload link for your submission.

https://claritychallenge.org/docs/icassp2023/taking_part/icassp2023_registration

Note, we have extended the deadline for submission until Friday 10th February so that teams have a full week to process the signals.

The remaining schedule is as follows:

  • 2nd Feb 2023: Release of evaluation data.
  • 10th Feb 2023: Teams submit processed signals and technical reports.
  • 14th Feb 2023: Results released. Top 5 ranked teams invited to submit papers to ICASSP-2023
  • 20th Feb 2023: Invited papers submitted to ICASSP-2023
  • 4-9th June 2023: Overview paper and invited papers presented at dedicated ICASSP session

Announcement of ICASSP 2023 Grand Challenge

· One min read
Clarity Team Member

We are pleased to announce that registration for the ICASSP 2023 Clarity Grand Challenge is now open.

To register please complete the simple Google form found on the registration page.

The remaining important dates for the challenge are as follows:

  • 28th Nov 2022: Challenge launch: Release training/dev data; tools; baseline; rules & documentation.
  • 2nd Feb 2023: Release of evaluation data.
  • 10th Feb 2023: Teams submit processed signals and technical reports.
  • 14th Feb 2023: Results released. Top 5 ranked teams invited to submit papers to ICASSP-2023
  • 20th Feb 2023: Invited papers submitted to ICASSP-2023
  • 4-9th June 2023: Overview paper and invited papers presented at dedicated ICASSP session

The challenge training data, dev data and initial tools are now fully available from the GitHub repository.

If you have any questions please do not hesitate to contact us at claritychallengecontact@gmail.com.

CPC1 results and prizes

· One min read
Jon Barker
Clarity Team Member

The 1st Clarity Prediction Challenge is now complete. Thank you to all who took part!

The full results can be found on the Clarity-2022 workshop website where you will also find links to system papers and the overview presentation.

Many of the systems have led to successful Interspeech 2022 papers and will be contributing to the Interspeech 2022 special session on Speech Intelligibility Prediction for Hearing-Impaired Listeners. We hope to see many of you in Korea!

In the meantime, please be sure to check out the ongoing 2nd Clarity Enhancement Challenge. The deadline for submitting enhanced signals is 1st September 2022, so there is still time to participate. To register a team please use the form here.

CEC2 registration open

· One min read
Jon Barker
Clarity Team Member

We are pleased to announce that registration for the 2nd Clarity Enhancement Challenge (CEC2) is now open.

To register please complete the simple Google form found on the registration page.

The remaining important dates for the challenge are as follows:

  • 25th July 2022: Evaluation data released
  • 1st Sept 2022: 1st round submission deadline for evaluation by objective measure
  • 15th Sept 2022: 2nd round submission deadline for listening tests
  • Sept-Nov 2022: Listening test evaluation period.
  • 2nd Dec 2022: Results announced at a Clarity Challenge Workshop; prizes awarded.

The challenge training data, dev data and initial tools are now fully available from the GitHub repository.

If you have any questions please do not hesitate to contact us at claritychallengecontact@gmail.com.

Release of CEC2 baseline

· One min read
Jon Barker
Clarity Team Member

We are pleased to announce the release of the 2nd Clarity Enhancement Challenge (CEC2) baseline system code.

The baseline code has been released in the latest commit to the Clarity GitHub repository.

The baseline system performs NAL-R amplification according to the audiogram of the target listener, followed by a simple gain control, and outputs the signals in 16-bit stereo WAV format. The system has been kept deliberately simple, with no microphone array processing or attempt at noise cancellation.
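
For anyone wanting a feel for what this kind of baseline does, here is a rough, illustrative numpy/scipy sketch of NAL-R-style linear amplification. It is not the challenge code (that is in the GitHub repository): the audiogram, the filter length and the mapping of prescription gains onto an FIR filter are all assumptions made for this sketch.

```python
import numpy as np
from scipy.signal import firwin2, lfilter

# Hypothetical audiogram: hearing thresholds (dB HL) at standard frequencies.
audiogram_freqs = np.array([250, 500, 1000, 2000, 4000, 6000])
audiogram_hl = np.array([20, 30, 40, 50, 60, 65])

# NAL-R prescription: gain(f) = 0.05*(H500 + H1000 + H2000) + 0.31*H(f) + k(f)
k = np.array([-17, -8, 1, -1, -2, -2])      # frequency-dependent constants
x = 0.05 * (audiogram_hl[1] + audiogram_hl[2] + audiogram_hl[3])
gains_db = np.maximum(x + 0.31 * audiogram_hl + k, 0)

def nalr_filter(fs=44100, ntaps=221):
    """Linear-phase FIR approximating the prescribed NAL-R gains."""
    f = np.concatenate(([0.0], audiogram_freqs / (fs / 2), [1.0]))
    a = 10 ** (np.concatenate(([gains_db[0]], gains_db, [gains_db[-1]])) / 20)
    return firwin2(ntaps, f, a)

def enhance(signal, fs=44100):
    """NAL-R-style amplification followed by a crude output gain control."""
    out = lfilter(nalr_filter(fs), 1.0, signal)
    peak = np.max(np.abs(out))
    return out / peak if peak > 1.0 else out   # keep within 16-bit WAV range
```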

HASPI scores for the dev set have been measured. The scores are as follows.

System          HASPI
Unprocessed     0.1615
NAL-R baseline  0.2493

See here for further details.

If you have any problems using the baseline code please do not hesitate to contact us at claritychallengecontact@gmail.com, or post questions on the Google group.

Launch of CEC2

· One min read
Jon Barker
Clarity Team Member

We are pleased to announce the launch of the 2nd Clarity Enhancement Challenge (CEC2).

The website has been fully updated to provide you with all the information you will need to participate in the challenge.

The schedule for the challenge is as follows:

  • 13th April 2022: Release of training and development data; initial tools.
  • 30th April 2022: Release of full toolset and baseline system.
  • 1st May 2022: Registration for challenge entrants opens.
  • 25th July 2022: Evaluation data released
  • 1st Sept 2022: 1st round submission deadline for evaluation by objective measure
  • 15th Sept 2022: 2nd round submission deadline for listening tests
  • Sept-Nov 2022: Listening test evaluation period.
  • 2nd Dec 2022: Results announced at a Clarity Challenge Workshop; prizes awarded.

The challenge training, dev data and initial tools will be available from 13th April. In the meantime, please visit the CEC2 Intro page to learn more about the task.

If you have any questions please do not hesitate to contact us at claritychallengecontact@gmail.com.

Live events in January

· 2 min read
Lara Harris
Clarity Team Member

The Clarity team are hosting two live sessions this month related to the Prediction Challenge. Everyone is welcome to attend, whether or not you have registered to participate in the challenge or are still considering signing up.

The presentations will be very similar to the webinar in November. These events are intended as a chance for people in different time zones to attend live and ask the team questions.

Hosting is via Microsoft Teams. You can join from your browser without needing to install Teams, but if you join from a mobile device you may need to install the Teams app.

Webinar - Challenge Overview

Friday 14th January

9:00 GMT | 17:00 CST (GMT+8)

Click here to join the webinar

An introduction to the aims of the challenge and some background to the problem of speech intelligibility prediction for hearing aids:

  • Welcome, introduction to Clarity.
  • Speech intelligibility models: overview and why they are needed.
  • Hearing impairment and speech intelligibility prediction.
  • The prediction challenge - details and how you can sign up to participate.
  • Audience questions / discussion.

The presentations will be recorded and made available online shortly after the event. The Q&A discussion will not be recorded.

You are welcome to join slightly later if you are only interested in joining for the Q&A section (presentations should finish around 9:40 GMT).

Live Q&A session

Monday 17th January

17:00 GMT | 12:00 EST (GMT-5) | 9:00 PST (GMT-8)

Click here to join the Q&A

A chance to ask the team questions about the Clarity Prediction Challenge - for anyone who could not attend the webinar on Friday 14th due to time zone differences.

Please note there will be no presentations in this session. The talks from Friday’s webinar will be uploaded to the Clarity project YouTube channel later in the day so you are invited to watch those before joining this live Q&A.

Introduction Webinar - Recording Available

· One min read
Lara Harris
Clarity Team Member

The Clarity team recently hosted a webinar to introduce the Prediction Challenge. The recording is now available to view online.

The slides are available to download:

1 Welcome and Overview

2 Speech Intelligibility Models

3 Hearing Impairment and SI Prediction

4 Clarity Prediction Challenge Details

Note that we did not record the Q&A session at the end, but if you have questions about taking part in the challenge you can contact us at claritychallengecontact@gmail.com

Welcome to CPC1

· One min read
Trevor Cox
Clarity Team Member

Welcome to the new Clarity CPC1 site for the first prediction challenge launching in autumn 2021. Feel free to look around. At the moment we're still doing listening tests and preparing the data, so the download links don't work. If anything is unclear or you've got questions, please contact us through the Google group.

CEC1 submissions received

· One min read
Jon Barker
Clarity Team Member

The CEC1 submission deadline has now passed. Thank you to all the teams who sent us signals.

Please remember to submit your finalised system descriptions by June 22nd to the Clarity workshop following the instructions provided on the workshop website.

We are currently busy evaluating the submissions using the MBSTOI metric. We will be contacting teams on the 22nd with details of how to prepare signals for the listening panel evaluation.

If you have been working on the challenge but missed the submission deadline then please do get in contact. We will still be happy to receive your signals and system descriptions. Although late entries will not be eligible for the official challenge ranking, we will be happy to compute the eval set MBSTOI score for you and may even be able to arrange listening test evaluation through our panel.

For any questions please contact us at claritychallengecontact@gmail.com or by posting to the Clarity challenge google group.

CEC1 eval data released

· 2 min read
Jon Barker
Clarity Team Member

The evaluation dataset is now available to download from the myairbridge download site. The evaluation data filename is clarity_CEC1_data.scenes_eval.v1_1.tgz.

Full details of how to prepare your submission are now available on this site. Please read them carefully.

Registration: Teams must register via the Google form on the How To Submit page of this site. (Please complete this even if you have already completed a pre-registration form). Only one person from each team should register. Only those who have registered will be eligible to proceed to the evaluation. Once you have registered you will receive a confirmation email, a team ID and a link to a Google Drive to which you can upload your signals.

Submission deadline: The deadline for submission is the 15th June.

The submission consists of two components:

i) a technical document of up to 2 pages describing the system/model and any external data and pre-existing tools, software and models used. This should be prepared as a Clarity-2021 workshop abstract and submitted to the workshop.

ii) the set of processed signals that we will evaluate using the MBSTOI metric. Details of how to name and package your signals for upload can be found on the How To Submit page.

Listening Tests: Teams that do well in the MBSTOI evaluation will be notified on 22nd June and invited to submit further signals for the second stage Listening Test evaluation.

For any questions please contact us at claritychallengecontact@gmail.com or by posting to the Clarity challenge google group.

Baseline speech intelligibility model in round one

· 4 min read
Simone Graetzer
Clarity Team Member

Some comments on signal alignment and level-insensitivity

Our baseline binaural speech intelligibility measure in round one is the Modified Binaural Short-Time Objective Intelligibility measure, or MBSTOI. This short post outlines the importance of correcting for any delays that your hearing aid processing algorithm introduces into the audio signals, so that MBSTOI can estimate speech intelligibility accurately. It also discusses the importance of considering the audibility of signals before evaluation with MBSTOI.
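
As an illustration of the alignment step, the processing delay can be estimated by cross-correlation against the clean reference and trimmed off before scoring. This is a generic sketch, not the official evaluation code:

```python
import numpy as np
from scipy.signal import correlate, correlation_lags

def align_to_reference(reference, processed):
    """Estimate the delay the hearing aid processing introduced and remove
    it, so that MBSTOI compares time-aligned signals."""
    xcorr = correlate(processed, reference, mode="full")
    lags = correlation_lags(len(processed), len(reference), mode="full")
    delay = lags[np.argmax(xcorr)]        # >0 means processed lags reference
    if delay > 0:
        processed = processed[delay:]
    elif delay < 0:
        reference = reference[-delay:]
    n = min(len(reference), len(processed))
    return reference[:n], processed[:n]
```

For binaural signals the same delay should normally be removed from both channels, so that interaural timing cues are not disturbed.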

Evaluation

In stage one, entries will be ranked according to the average MBSTOI score across all samples in the evaluation test set. In the second stage, entries will be evaluated by the listening panel. There will be prizes for both stages. See this page for more information.

Latency, computation time and real-time operation

· 3 min read
Trevor Cox
Clarity Team Member

An explanation of the time and computational limits for the first round of the enhancement challenge.

The 1st Clarity Enhancement Challenge

For a hearing aid to work well for users, the processing needs to be quick. The output of the hearing aid should be produced with a delay of less than about 10 ms. Many audio processing techniques are non-causal, i.e., the output of the system depends on samples from the future. Such processing is useless for hearing aids and therefore our rules include a restriction on the use of future samples.

The rules state the following:

  • Systems must be causal; the output at time t must not use any information from input samples more than 5 ms into the future (i.e., no information from input samples >t+5ms).
  • There is no limit on computational cost.
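
As a concrete reading of the first rule, the sketch below processes a signal block by block and never reads more than 5 ms past the samples it is currently producing. The frame length and the `enhance_frame` function are hypothetical stand-ins for your own processing:

```python
import numpy as np

FS = 44100                                # assumed sample rate
MAX_LOOKAHEAD = int(0.005 * FS)           # 5 ms of future samples

def process_causally(x, enhance_frame, frame_len=220):
    """Produce output frame by frame; the samples visible when computing
    each frame extend at most MAX_LOOKAHEAD beyond the frame's end.
    enhance_frame is hypothetical and returns an array the same length
    as its input."""
    y = np.zeros_like(x)
    for start in range(0, len(x), frame_len):
        end = min(start + frame_len, len(x))
        visible = x[: min(end + MAX_LOOKAHEAD, len(x))]  # past + 5 ms future
        y[start:end] = enhance_frame(visible)[start:end]
    return y
```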

Clarity Challenge pre-announcement

· 3 min read
Trevor Cox
Clarity Team Member

Although age-related hearing loss affects 40% of 55 to 74 year-olds, the majority of adults who would benefit from hearing aids don’t use them. A key reason is simply that hearing aids don’t provide enough benefit.

Picking out speech from background noise is a critical problem even for the most sophisticated devices. The purpose of the Clarity Challenges is to catalyse new work to radically improve the speech intelligibility provided by hearing aids.

The series of challenges will consider increasingly complex listening scenarios. The first round, launching in January 2021, will focus on speech in indoor environments in the presence of a single interferer. It will begin with a challenge involving improving hearing aid processing. Future challenges on how to model speech-in-noise perception will be launched at a later date.

One approach to our enhancement challenge

· 4 min read
Trevor Cox
Clarity Team Member

Improving hearing aid processing using DNNs: a suggested approach to overcoming the non-differentiable loss function.

The aim of our Enhancement Challenge is to get people producing new algorithms for processing speech signals through hearing aids. We expect most entries to replace the classic hearing aid processing of Dynamic Range Compressors (DRCs) with deep neural networks (DNN) (although all approaches are welcome!). The first round of the challenge is going to be all about improving speech intelligibility.

Setting up a DNN structure and training regime for the task is not as straightforward as it might first appear. Figure 1 shows an example of a naive training regime. An audio example of Speech in Noise (SPIN) is randomly created (audio sample generation, bottom left), and a listener is randomly selected with particular hearing loss characteristics (random artificial listener generation, top left). The DNN Enhancement model (represented by the bright yellow box) then produces improved speech in noise. (Audio signals in pink are two-channel, left and right because this is for binaural hearing aids.)

Figure 1

Next, the improved speech in noise is passed to the Prediction Model in the lime green box, and this gives an estimation of the Speech Intelligibility (SI). Our baseline system will include algorithms for this. We’ve already blogged about the Hearing Loss Simulation. Our current thinking is that the intelligibility model will use a binaural form of the Short-Time Objective Intelligibility Index (STOI) [1]. The dashed line going back to the enhancement model shows that the DNN will be updated based on the reciprocal of the Speech Intelligibility (SI) score. By minimising (1/SI), the enhancement model will be maximising intelligibility.
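
A minimal PyTorch sketch of this loop is below. Everything here is a stand-in: the one-layer "enhancement network", the toy differentiable intelligibility proxy and the random batch exist only to make the shape of the idea concrete. A real entry would use a proper DNN and a differentiable STOI-like predictor:

```python
import torch
import torch.nn as nn

class EnhancementNet(nn.Module):
    """Stand-in enhancement model: one conv layer over (batch, 2, samples)."""
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv1d(2, 2, kernel_size=65, padding=32)

    def forward(self, spin):
        return self.conv(spin)            # two channels: left and right

class DifferentiableSI(nn.Module):
    """Toy intelligibility proxy in (0, 1]; a real system would apply a
    hearing loss simulation and a STOI-like measure here."""
    def forward(self, enhanced, clean):
        num = (enhanced * clean).sum(dim=(1, 2))
        den = (enhanced.pow(2).sum(dim=(1, 2)).sqrt()
               * clean.pow(2).sum(dim=(1, 2)).sqrt() + 1e-8)
        return (num / den + 1) / 2        # map cosine similarity into (0, 1]

enhancer, predictor = EnhancementNet(), DifferentiableSI()
optimizer = torch.optim.Adam(enhancer.parameters(), lr=1e-4)

spin = torch.randn(4, 2, 16000)           # dummy speech-in-noise batch
clean = torch.randn(4, 2, 16000)          # dummy clean reference batch
si = predictor(enhancer(spin), clean)     # estimated SI per batch item
loss = (1.0 / si.clamp(min=1e-3)).mean()  # minimising 1/SI maximises SI
optimizer.zero_grad(); loss.backward(); optimizer.step()
```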

The speech-in-noise problem part two

· 5 min read
Simone Graetzer
Clarity Team Member
Trevor Cox
Clarity Team Member

How hearing aids address the problem of speech-in-noise in noisy and quieter places. We’ll also discuss what machine learning techniques are often used for noise reduction, and some promising strategies for hearing aids.

In a previous blog, we set out the problem of using hearing aids to pick out speech in noisy places. When the signal-to-noise ratio (SNR) is low, hearing aids can only do so much to improve the intelligibility of the speech.

A solitary hearing aid has various ways of addressing everyday constant noises such as cars, vacuum cleaners and fans. The aids work best when the noise is not too intrusive and the SNR is relatively high. Problems arise when the noise is high (i.e., the SNR is low), because then the hearing aid processing can distort the sound too much. While hearing aids might have limited success in improving intelligibility in certain cases, they can still make the noise less annoying (e.g., Brons et al., 2014).

Using multiple microphones on each hearing aid can help in noisy conditions. The sound from the microphones is combined in a way that boosts the speech relative to the noise. This technology can be put into larger hearing aids, where there is enough spacing between the front and rear microphones.

One of the reasons why our brains are really good at picking out speech from the hubbub of a restaurant is that they compare and contrast the sounds from both ears. Our hearing is binaural. Similarly, if you have hearing aids in both ears, they work better if they collaborate on reducing the noise.

Crucial to how our brains locate sound and pick out speech in noise are timing and level cues that come from comparing the sound at both ears. When sound comes from the side:

  • interaural time differences occur because the sound arrives at one ear earlier than the other.
  • interaural level differences occur because the sound has to bend around the head to reach the furthest ear.

Binaural hearing aids communicate wirelessly and use noise reduction strategies that preserve these interaural time and level difference cues (e.g., Van den Bogaert et al., 2009). This allows the listener’s brain to better locate the speech and boost this compared to the noise.
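
To give a feel for the size of these timing cues, the classic Woodworth approximation estimates the interaural time difference from the head radius and the speed of sound (nominal values assumed below):

```python
import numpy as np

HEAD_RADIUS = 0.0875    # metres, nominal adult head
SPEED_OF_SOUND = 343.0  # metres per second, in air at ~20 C

def woodworth_itd(azimuth_deg):
    """Interaural time difference (seconds) for a distant source at the
    given azimuth (0 = straight ahead, 90 = directly to one side)."""
    theta = np.radians(azimuth_deg)
    return (HEAD_RADIUS / SPEED_OF_SOUND) * (theta + np.sin(theta))

print(f"{woodworth_itd(90) * 1e6:.0f} microseconds")   # roughly 660 us
```

Cues this small are easy for noise reduction to corrupt, which is why binaural algorithms take care to preserve them.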

Hearing loss simulation

· 4 min read
Trevor Cox
Clarity Team Member
Simone Graetzer
Clarity Team Member

What our hearing loss algorithms simulate, with audio examples to illustrate hearing loss.

Our challenge entrants are going to use machine learning to develop better processing of speech in noise (SPIN) for hearing aids. For a machine learning algorithm to learn new ways of processing audio for the hearing impaired, it needs to estimate how the sound will be degraded by any hearing loss. Hence, we need an algorithm to simulate hearing loss for each of our listeners. The diagram below shows our draft baseline system that was detailed in a previous blog. The hearing loss simulation is part of the prediction model. The Enhancement Model to the left is effectively the hearing aid and the Prediction Model to the right is estimating how someone will perceive the intelligibility of the speech in noise.

The draft baseline system (where SPIN is speech in noise, DRC is Dynamic Range Compression, HL is Hearing Loss, SI is Speech Intelligibility and L & R are Left and Right).

There are different causes of hearing loss, but we’re concentrating on the most common type that happens when you age (presbycusis). RNID (formerly Action on Hearing Loss) estimate that more than 40% of people over the age of 50 have a hearing loss, and this rises to 70% of people who are older than 70.

The aspects of hearing loss we’ve decided to simulate are:

  1. The loss of ability to sense the quietest sounds (increase in absolute threshold).
  2. The abnormally rapid growth in perceived loudness as an audible sound increases in level (loudness recruitment) (Moore et al. 1996).
  3. How the ear has a poorer ability to discriminate the frequency of sounds (impaired frequency selectivity).
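
As a rough illustration of points 1 and 2 (and emphatically not our actual simulator), raised thresholds and recruitment can be mimicked in a single frequency band by expanding the signal envelope below a "catch-up" level. The band edges, threshold and catch-up level below are all illustrative:

```python
import numpy as np
from scipy.signal import butter, sosfilt, hilbert

def recruitment_band(x, fs, band=(1000, 2000), hl_db=50, catch_up_db=100):
    """Crude one-band hearing loss sketch: a tone at the impaired threshold
    (hl_db) maps down to the normal threshold (0 dB), levels at catch_up_db
    are left unchanged, and levels in between are expanded."""
    sos = butter(4, band, btype="bandpass", fs=fs, output="sos")
    band_sig = sosfilt(sos, x)
    env = np.abs(hilbert(band_sig)) + 1e-8          # amplitude envelope
    env_db = 20 * np.log10(env) + catch_up_db       # full scale ~= catch-up SPL
    slope = catch_up_db / (catch_up_db - hl_db)     # expansion ratio > 1
    out_db = catch_up_db + slope * (env_db - catch_up_db)
    gain = 10 ** ((out_db - env_db) / 20)           # per-sample attenuation
    return band_sig * gain
```

A real simulator runs many such bands across the audiogram and also smears energy across frequency to model impaired frequency selectivity (point 3).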

Sounds for round one

· 4 min read
Trevor Cox
Clarity Team Member

We’ll be challenging our contestants to find innovative ways of making speech more audible for hearing impaired listeners when there is noise getting in the way. But what noises should we consider? To aid us in choosing sounds and situations that are relevant to people with hearing aids, we held a focus group.

We wanted to know about

  • Everyday background noises that make having a conversation difficult.
  • The characteristics of speech after it has been processed by a hearing-aid that hearing aid listeners would value.

A total of eight patients (four males, four females) attended the meeting, six of whom were recruited from the Nottingham Biomedical Research Centre’s patient and public involvement contact list. Two attendees were recruited from a local lip reading class organised by the Nottinghamshire Deaf Society. The range of hearing loss within the group was from mild to severe. They all regularly use bilateral hearing aids.

Our focus was on the living room because that is the scenario for round one of the challenges.


The speech-in-noise problem

· 4 min read
Simone Graetzer
Clarity Team Member
Trevor Cox
Clarity Team Member

People often have problems understanding speech in noise, and this is one of the main deficits of hearing aids that our machine learning challenges will address.

It’s common for us to hear sounds coming simultaneously from different sources. Our brains then need to separate out what we want to hear (the target speaker) from the other sounds. This is especially difficult when the competing sounds are speech. This has the quaint name The Cocktail Party Problem (Cherry, 1953). We don’t go to many cocktail parties, but we encounter lots of situations where the Cocktail Party Problem is important. Hearing a conversation in a busy restaurant, trying to understand a loved one while the television is on, or hearing the radio in the kitchen when the kettle is boiling are just a few examples.

Difficulty in picking out speech in noise is really common if you have a hearing loss. Indeed, it’s often when people have problems doing this that they realise they have a hearing loss.

“Hearing aids don’t work when there is a lot of background noise. This is when you need them to work.”

-- Statement from a hearing aid wearer (Kochkin, 2000)

Hearing aids are the most common form of treatment for hearing loss. However, surveys indicate that at least 40% of hearing aids are never or rarely used (Knudsen et al., 2010). A major reason for this is dissatisfaction with performance. Even the best hearing aids perform poorly for speech in noise. This is particularly the case when there are many people talking at the same time, and when the amount of noise is relatively high (i.e., the signal-to-noise ratio (SNR) is low). As hearing ability worsens with age, the ability to understand speech in background noise also reduces (e.g., Akeroyd, 2008).

Why use machine learning challenges for hearing aids?

· 3 min read
Trevor Cox
Clarity Team Member

The Clarity Project is based around the idea that machine learning challenges could improve hearing aid signal processing. After all, this has happened in other areas, such as automatic speech recognition (ASR) in the presence of noise. The improvements in ASR have happened because of:

  • Machine learning (ML) at scale – big data and raw GPU power.
  • Benchmarking – research has developed around community-organised evaluations or challenges.
  • Collaboration – enabled by these challenges, allowing work across communities such as signal processing, acoustic modelling, language modelling and machine learning.

We’re hoping that these three mechanisms can drive improvements in hearing aids.

Components of a challenge

There needs to be a common task based on a target application scenario to allow communities to gain from benchmarking and collaboration. The Clarity project’s first enhancement challenge will be about hearing speech from a single talker in a typical living room, where there is one source of noise and a little reverberation.

We’re currently working on developing simulation tools to allow us to generate our living room data. The room acoustics will be simulated using RAVEN and the Hearing Device Head-related Transfer Functions will come from Denk’s work. We’re working on getting better, more ecologically valid speech than is often used in speech intelligibility work.

Entrants are then given training data and development (dev) test data along with a baseline system that represents the current state-of-the-art. You can find a post and video on the current thinking on the baseline here. We’re still working on the rules stipulating what is and what is not allowed (for example, will entrants be allowed to use data from outside the challenge).

Clarity’s first enhancement challenge is focussed on maximising the speech intelligibility (SI) score. We will evaluate this first through a prediction model that is based on a hearing loss simulation and an objective metric for speech intelligibility. Simulation has been hugely important for generating training data in the CHiME challenges and so we intend to use that approach in Clarity. But results from simulated test sets cannot be trusted and hence a second evaluation will come through perceptual tests on hearing impaired subjects. However, one of our current problems is that we can’t bring listeners into our labs because of COVID-19.

We’ll actually be running two challenges in roughly parallel, because we’re also going to task the community to improve our prediction model for speech intelligibility.

We’re running a series of challenges over five years. What other scenarios should we consider? What speech? What noise? What environment? Please comment below.

Acknowledgements

Much of this text is based on Jon Barker’s 2020 SPIN keynote.

The baseline

· One min read
Trevor Cox
Clarity Team Member

An overview of the current state of the baseline we’re developing for the machine learning challenges

The baseline

We’re currently developing the baseline processing that challenge entrants will need. This takes a random listener and a random audio sample of speech in noise (SPIN) and passes that through a simulated hearing aid (the Enhancement Model). This improves the speech in noise. We then have an algorithm (the Prediction Model) to estimate the Speech Intelligibility that the listener would perceive (SI score). This score can then be used to drive machine learning to improve the hearing aid.

A talk through the baseline model we’re developing.

The first machine learning challenge is to improve the enhancement model, in other words, to produce a better processing algorithm for the hearing aid. The second challenge is to improve the prediction model using perceptual data we’ll provide.

Welcome

· One min read
Jon Barker
Clarity Team Member

Welcome to the new Clarity blog. We will be using this blog to post regular updates about our Challenges and Workshop, as well as posts discussing the tools and techniques that we are using in our baseline systems.