Download

Software

All the necessary software tools are available as a single package, pyClarity, which is hosted on GitHub.

Data

To download data, please visit here.

The data is split into two packages:

  • clarity_CEC1_data.train.tgz [192 GB],
  • clarity_CEC1_data.dev_eval_metadata.tgz [163 GB].

Please also download and unpack clarity_CEC1_data.anechoic.v1_3.tgz [11.4 GB], which contains the correct version of the anechoic reference signals. Use them to replace the incorrect anechoic signals in the clarity_CEC1_data.train.tgz and clarity_CEC1_data.dev_eval_metadata.tgz packages.
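The replacement step can be scripted once the packages have been unpacked. The following is a minimal sketch, not an official tool: it assumes that the unpacked anechoic package mirrors the relative paths of the files it is meant to replace, which is not confirmed here and should be checked against the actual archive. The function name `replace_matching_files` and both directory arguments are hypothetical.

```python
import shutil
from pathlib import Path


def replace_matching_files(corrected_root, data_root):
    """Overwrite files under data_root with same-path files from corrected_root.

    Assumption: both trees use the same relative paths. Files in
    corrected_root that have no counterpart in data_root are not copied;
    they are returned in `skipped` so they can be inspected by hand.
    """
    corrected_root = Path(corrected_root)
    data_root = Path(data_root)
    replaced, skipped = [], []
    for src in sorted(corrected_root.rglob("*")):
        if not src.is_file():
            continue
        rel = src.relative_to(corrected_root)
        dst = data_root / rel
        if dst.is_file():
            shutil.copy2(src, dst)  # overwrite the existing signal file
            replaced.append(rel)
        else:
            skipped.append(rel)
    return replaced, skipped
```

A non-empty `skipped` list suggests that the two trees do not line up as assumed, in which case the paths should be adjusted before relying on the result.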

Unpack all packages under the same root directory using:

tar -xvzf <PACKAGE_NAME>
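For users who prefer to script the unpacking, the same step can be sketched with Python's standard `tarfile` module. This is an illustrative equivalent of the `tar` command, not part of pyClarity; the function name `unpack_packages` and the example root directory are assumptions.

```python
import tarfile
from pathlib import Path


def unpack_packages(package_paths, root_dir):
    """Extract each gzip-compressed tar package into the same root directory."""
    root = Path(root_dir)
    root.mkdir(parents=True, exist_ok=True)
    for package in package_paths:
        # "r:gz" reads a gzip-compressed archive, as `tar -xzf` does.
        with tarfile.open(package, "r:gz") as archive:
            archive.extractall(root)
    return root
```

For example, `unpack_packages(["clarity_CEC1_data.train.tgz", "clarity_CEC1_data.dev_eval_metadata.tgz"], "clarity_root")` would place both packages under a hypothetical `clarity_root` directory. As with `tar`, only archives obtained from the official download location should be extracted this way.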

The training data is stored in clarity_CEC1_data.train.tgz with the following structure:

clarity_data
|
└───train
    └───interferers
    |   |   noise 3.9G
    |   |   speech 4.5G
    |
    └───rooms
    |   |   ac 48M
    |   |   brir 46G
    |   |   rpf 379M
    |
    └───scenes 166G
    |
    └───targets 2.8G

The dev and eval data are stored in clarity_CEC1_data.dev_eval_metadata.tgz, which contains:

clarity_data
|
└───dev
|   └───interferers
|   |   |   noise 587M
|   |   |   speech 1.4G
|   |
|   └───rooms
|   |   |   ac 20M
|   |   |   brir 20G
|   |   |   rpf 158M
|   |
|   └───scenes 72G
|   |
|   └───targets 1.3G
|
└───eval
|   └───interferers
|   |   |   noise 675M
|   |   |   speech 1.3G
|   |
|   └───rooms
|   |   |   ac 12M
|   |   |   brir 12G
|   |   |   rpf 95M
|   |
|   └───scenes 58G
|   |
|   └───targets 749M
|
└───eval2/scenes 21G # eval signals processed by the baseline system
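After unpacking, it can be useful to confirm that the expected directories are present. The sketch below checks for the subdirectories listed in the trees above. It is not part of pyClarity, the helper name `missing_dirs` is hypothetical, and it assumes that the noise directory is spelled `noise` on disk and that every split has the same set of subdirectories.

```python
from pathlib import Path

# Subdirectories expected under clarity_data/<split>, taken from the trees above.
# Assumption: the noise interferer directory is named "noise" on disk.
SPLIT_SUBDIRS = (
    "interferers/noise",
    "interferers/speech",
    "rooms/ac",
    "rooms/brir",
    "rooms/rpf",
    "scenes",
    "targets",
)

# Directories that do not follow the per-split pattern.
EXTRA_DIRS = ("eval2/scenes",)


def missing_dirs(root_dir, splits=("train", "dev", "eval")):
    """Return the expected directories (relative to root_dir) that are absent."""
    root = Path(root_dir)
    base = root / "clarity_data"
    expected = [base / split / sub for split in splits for sub in SPLIT_SUBDIRS]
    expected += [base / extra for extra in EXTRA_DIRS]
    return [p.relative_to(root).as_posix() for p in expected if not p.is_dir()]
```

An empty list from `missing_dirs("clarity_root")`, where `clarity_root` is a hypothetical unpacking directory, indicates that all listed directories were found. The check covers directory presence only and says nothing about file contents or sizes.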

For further details and code for running the baseline, see the baseline recipe in the pyClarity repository.