
Baseline System

A baseline intrusive intelligibility prediction system has been provided to help you get started.

Overview of the Baseline System

The baseline uses the HASPI model [1] to make predictions. HASPI is an intrusive measure that takes three inputs:

  1. A processed signal,
  2. A reference signal,
  3. A listener audiogram.

Since the exact audiograms are not provided, the baseline uses a standard audiogram for each hearing impairment severity level (e.g., mild, moderate, moderately severe).

HASPI outputs an intelligibility score between 0 and 1. This score is then passed through a logistic function, whose parameters have been optimized on the training data to minimize the RMSE between the predicted and measured intelligibility scores. The output of the logistic function is the final sentence intelligibility prediction.
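
The mapping step described above can be sketched as follows. This is a minimal illustration and not the baseline's actual code: it assumes a two-parameter logistic scaled to the range 0 to 100, and it uses a plain grid search in place of whatever optimizer the baseline uses. The names `logistic`, `rmse`, and `fit_logistic`, as well as the example data, are hypothetical.

```python
import math


def logistic(x, slope, offset):
    """Map a HASPI score x in [0, 1] to a prediction in [0, 100] (assumed scale)."""
    return 100.0 / (1.0 + math.exp(-(slope * x + offset)))


def rmse(predicted, measured):
    """Root-mean-square error between two equal-length sequences."""
    n = len(predicted)
    return math.sqrt(sum((p - m) ** 2 for p, m in zip(predicted, measured)) / n)


def fit_logistic(haspi_scores, measured, step=0.25):
    """Grid-search the slope in [0, 20] and offset in [-10, 10] that minimize RMSE.

    Returns (slope, offset, rmse). The search ranges are illustrative assumptions.
    """
    best = None
    n_steps = int(20 / step) + 1
    for i in range(n_steps):
        slope = i * step
        for j in range(n_steps):
            offset = -10.0 + j * step
            preds = [logistic(x, slope, offset) for x in haspi_scores]
            err = rmse(preds, measured)
            if best is None or err < best[0]:
                best = (err, slope, offset)
    return best[1], best[2], best[0]


# Made-up training pairs: raw HASPI scores and measured intelligibility (%).
haspi = [0.1, 0.3, 0.5, 0.7, 0.9]
measured = [5.0, 20.0, 55.0, 85.0, 95.0]
slope, offset, err = fit_logistic(haspi, measured)
print(f"slope={slope}, offset={offset}, training RMSE={err:.2f}")
```

Once fitted on the training data, the same `slope` and `offset` would be applied unchanged to HASPI scores from unseen signals.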

How to Use the Baseline System

The baseline system is included in the pyclarity Python package (version 0.7.1 or later), which is available on GitHub. The relevant scripts are located in the recipes/cpc3/baseline directory. To use the baseline system:

  1. Download the Code: Clone or download the repository from GitHub.
  2. Follow the Instructions: Refer to the README file in the recipes/cpc3/baseline directory for detailed steps to run the baseline on the CPC3 dataset.
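
Before running the recipe, it may help to confirm that the installed pyclarity meets the minimum version stated above. The following is a small sketch using only the Python standard library; the helper names `parse_version` and `pyclarity_is_recent_enough` are hypothetical, and the version comparison is deliberately simplified (it looks only at the leading numeric components, so a pre-release such as 0.7.1rc1 counts as 0.7.1).

```python
import re
from importlib.metadata import PackageNotFoundError, version

MINIMUM = (0, 7, 1)  # minimum pyclarity version stated in the text


def parse_version(text):
    """Return the leading numeric components, e.g. '0.7.1.dev3' -> (0, 7, 1)."""
    match = re.match(r"\d+(?:\.\d+)*", text)
    if match is None:
        raise ValueError(f"unrecognised version string: {text!r}")
    return tuple(int(part) for part in match.group(0).split("."))


def pyclarity_is_recent_enough():
    """True if pyclarity is installed and at least MINIMUM, otherwise False."""
    try:
        installed = version("pyclarity")
    except PackageNotFoundError:
        return False
    return parse_version(installed) >= MINIMUM


if pyclarity_is_recent_enough():
    print("pyclarity is installed and recent enough for the CPC3 baseline.")
else:
    print("pyclarity is missing or older than 0.7.1.")
```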

Baseline Performance

The baseline system achieves the following performance on the development set:

Metric       Value
RMSE         28.00
Correlation  0.72
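
For comparing your own predictions against these figures, the two metrics can be computed as in the sketch below. It assumes that "Correlation" means the Pearson correlation coefficient, which the text does not state explicitly, and the function names and sample values are illustrative only.

```python
import math


def rmse(predicted, measured):
    """Root-mean-square error between predicted and measured scores."""
    n = len(predicted)
    return math.sqrt(sum((p - m) ** 2 for p, m in zip(predicted, measured)) / n)


def pearson_correlation(predicted, measured):
    """Pearson correlation coefficient (assumed to be the reported correlation)."""
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_m = sum(measured) / n
    cov = sum((p - mean_p) * (m - mean_m) for p, m in zip(predicted, measured))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_m = sum((m - mean_m) ** 2 for m in measured)
    return cov / math.sqrt(var_p * var_m)


# Made-up predictions and measured intelligibility scores, for illustration.
predicted = [12.0, 48.0, 75.0, 90.0]
measured = [20.0, 40.0, 80.0, 100.0]
print(f"RMSE: {rmse(predicted, measured):.2f}")
print(f"Correlation: {pearson_correlation(predicted, measured):.2f}")
```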

This baseline system is not intended to be state-of-the-art but serves as a starting point for participants. We encourage you to use it as a reference and build upon it to develop your own systems.

Submitting Results

The baseline system's results for the development set are published on the Eval.AI leaderboard. To submit and evaluate your own predictions for the development set, refer to the instructions on the CPC3 Development Set Leaderboard.

References

  1. Kates, J.M. and Arehart, K.H., 2021. The hearing-aid speech perception index (HASPI) version 2. Speech Communication, 131, pp.35-46.