Obtaining the CPC3 dataset

The challenge data is published on Zenodo and available for download. The Zenodo record contains the following:

  • The challenge training set data packaged as a single 7.5 GB file, clarity_CPC3_data.v1_1.tar.gz.
  • The development set data packaged as a single 752 MB file, clarity_CPC3_data.dev.v1_0.tar.gz.
  • The evaluation set data packaged as a single 6.1 GB file, clarity_CPC3_data.eval.v1_0.tar.gz.
  • Development and Evaluation set labels packaged as a small 515 KB file, clarity_CPC3_data.labels.tar.gz. The evaluation set labels were not made available to entrants during the challenge but have been released subsequently to allow self-evaluation.
  • A small 20 MB demo dataset for preview purposes, clarity_CPC3_demo_data.v1_0.tar.gz.

All packages should be unpacked under the same root.
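
As a minimal sketch of unpacking everything under one root, the snippet below extracts every `.tar.gz` archive found in a download directory into a single target directory using Python's standard `tarfile` module. The function name `unpack_all` and the directory names used in the usage note are illustrative assumptions, not part of the Clarity tooling; the baseline repository's own instructions take precedence.

```python
import tarfile
from pathlib import Path


def unpack_all(archive_dir, root):
    """Extract every .tar.gz archive in archive_dir into the same root directory.

    Hypothetical helper for illustration; not part of the Clarity code base.
    Returns the archive file names in the order they were unpacked.
    """
    root = Path(root)
    root.mkdir(parents=True, exist_ok=True)
    unpacked = []
    # Sort so the unpacking order is deterministic across runs.
    for archive in sorted(Path(archive_dir).glob("*.tar.gz")):
        with tarfile.open(archive, "r:gz") as tar:
            # All archives share one root, so their directory trees merge.
            tar.extractall(path=root)
        unpacked.append(archive.name)
    return unpacked
```

For example, with the downloaded archives placed in a directory named `downloads`, calling `unpack_all("downloads", "clarity_data")` would merge the contents of all packages under `clarity_data`. Both directory names are placeholders.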

The baseline code is available in a GitHub repository that contains code for all the Clarity enhancement and prediction challenges. There you will find all the necessary instructions for installing the data and setting up the baseline system, i.e. producing the better-ear HASPI predictions.

The challenge is now closed, but the data remains available for anyone to use. If you use the data, please cite the following paper:

Jon Barker, Michael A Akeroyd, Trevor J. Cox, John F. Culling, Jennifer Firth, Simone Graetzer and Graham Naylor, "The 3rd Clarity Prediction Challenge: A Machine Learning Challenge for Hearing Aid Intelligibility Prediction," ICASSP 2026 - 2026 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Barcelona, Spain, 2026, doi: 10.1109/ICASSP55912.2026.11461465.