recipes.cad2.task1.ConvTasNet.local.musdb18_dataset module¶
- class recipes.cad2.task1.ConvTasNet.local.musdb18_dataset.Compose(transforms)[source]¶
Bases:
object
Composes several augmentation transforms.
- Parameters:
transforms – List of augmentations to compose.
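As a rough illustration of what composing transforms means, the sketch below chains callables so that each one receives the previous one's output. `ComposeSketch`, `half_gain` and `invert_polarity` are hypothetical names used only for this example, audio is represented as a plain list of floats, and the recipe's own `Compose` may differ in detail.

```python
class ComposeSketch:
    """Hypothetical stand-in: applies each transform in the order given."""

    def __init__(self, transforms):
        self.transforms = transforms

    def __call__(self, audio):
        for transform in self.transforms:
            audio = transform(audio)
        return audio


def half_gain(audio):
    # Scale every sample by 0.5.
    return [0.5 * sample for sample in audio]


def invert_polarity(audio):
    # Flip the sign of every sample.
    return [-sample for sample in audio]


augment = ComposeSketch([half_gain, invert_polarity])
augmented = augment([1.0, -2.0])
```

An empty transform list leaves the input untouched, which matches the idea of a no-op augmentation.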
- class recipes.cad2.task1.ConvTasNet.local.musdb18_dataset.MUSDB18Dataset(root: str, sources=None, targets=None, mix_background=False, suffix='.wav', split='train', subset=None, exclude_tracks=None, segment=None, samples_per_track=1, random_segments=False, random_track_mix=False, source_augmentations=<function MUSDB18Dataset.<lambda>>, sample_rate=44100)[source]¶
Bases:
Dataset
MUSDB18 music separation dataset
The dataset consists of 150 full-length music tracks (~10h duration) of different genres along with their isolated stems:
drums, bass, vocals and others.
Out of the box, Asteroid only supports MUSDB18-HQ, which comes as uncompressed WAV files. To use MUSDB18, please convert it to WAV first.
MUSDB18 HQ: https://zenodo.org/record/3338373
Note
The datasets are hosted on Zenodo and require that users request access, since the tracks can only be used for academic purposes. We manually check these requests.
This dataset assumes music tracks in (sub)folders, where each folder has a fixed number of sources (defaults to 4). For each track, a list of sources and a common suffix can be specified. A linear mix is performed on the fly by summing up the sources.
Because all tracks comprise the exact same set of sources, random track mixing can be used, where sources from different tracks are mixed together.
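The two ideas above, linear mixing and random track mixing, can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the dataset's implementation: stems are plain lists of floats of equal length, and `linear_mix` and `random_track_mix` are hypothetical helper names.

```python
import random


def linear_mix(stems):
    """Sum equally long stems sample by sample to form the mixture."""
    return [sum(samples) for samples in zip(*stems.values())]


def random_track_mix(tracks, sources, rng):
    """Draw each source from an independently chosen track."""
    return {name: rng.choice(tracks)[name] for name in sources}


# Two toy tracks with the same set of sources (assumed data).
tracks = [
    {"vocals": [1.0, 2.0], "drums": [0.5, 0.5]},
    {"vocals": [0.0, 1.0], "drums": [2.0, 2.0]},
]

mixture = linear_mix(tracks[0])
shuffled = random_track_mix(tracks, ["vocals", "drums"], random.Random(0))
shuffled_mixture = linear_mix(shuffled)
```

Random track mixing only works because every track offers the same source names, so any combination still yields a complete set of stems to sum.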
- Folder Structure:
>>> #train/1/vocals.wav ---------|
>>> #train/1/drums.wav ----------+--> input (mix), output[target]
>>> #train/1/bass.wav -----------|
>>> #train/1/other.wav ---------/
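To make the expected layout concrete, the sketch below scans a split folder and keeps only the track folders that contain one file per source. It is an assumption-laden illustration: `find_tracks` is a hypothetical helper, it only looks at direct children of the split folder, and the dataset class may discover tracks differently.

```python
import tempfile
from pathlib import Path


def find_tracks(root, split="train",
                sources=("vocals", "drums", "bass", "other"), suffix=".wav"):
    """Map each complete track folder name to its {source: path} files."""
    tracks = {}
    for folder in sorted(Path(root, split).iterdir()):
        if not folder.is_dir():
            continue
        files = {name: folder / f"{name}{suffix}" for name in sources}
        # Keep the folder only if every expected source file exists.
        if all(path.is_file() for path in files.values()):
            tracks[folder.name] = files
    return tracks


# Build a tiny fake layout: track "1" is complete, track "2" lacks two stems.
with tempfile.TemporaryDirectory() as root:
    layout = {"1": ["vocals", "drums", "bass", "other"], "2": ["vocals", "drums"]}
    for track, names in layout.items():
        folder = Path(root, "train", track)
        folder.mkdir(parents=True)
        for name in names:
            (folder / f"{name}.wav").touch()
    found_names = sorted(find_tracks(root))
```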
- Parameters:
root (str) – Root path of dataset
sources (list of str, optional) – List of source names that compose the mixture. Defaults to the MUSDB18 4-stem scenario: vocals, drums, bass, other.
targets (list or None, optional) – List of source names to be used as targets. If None, a dict with the 4 stems is returned. If e.g. [vocals, drums], a tensor with stacked vocals and drums is returned instead of a dict. Defaults to None.
suffix (str, optional) – Filename suffix, defaults to .wav.
split (str, optional) – Dataset subfolder, defaults to train.
subset (list of str, optional) – Selects a specific list of tracks to be loaded, defaults to None (loads all tracks).
segment (float, optional) – Duration of segments in seconds, defaults to None, which loads the full-length audio tracks.
samples_per_track (int, optional) – Number of samples yielded from each track, can be used to increase dataset size, defaults to 1.
random_segments (boolean, optional) – Enables random offset for track segments.
random_track_mix (boolean) – Enables mixing of random sources from different tracks to assemble the mix.
source_augmentations (list of callable) – List of augmentation function names, defaults to no-op augmentations (input = output).
sample_rate (int, optional) – Sample rate of files in dataset.
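The interplay of `segment`, `random_segments` and `sample_rate` can be pictured with a little arithmetic. The sketch below is only one plausible reading of those parameters: `segment_bounds` is a hypothetical helper, and the way the dataset actually chooses offsets is not stated here.

```python
import random


def segment_bounds(track_duration, segment, sample_rate, random_segments, rng):
    """Return (start, stop) sample indices for one excerpt of a track.

    Durations are in seconds. With segment=None the whole track is used.
    """
    if segment is None:
        return 0, int(track_duration * sample_rate)
    # Latest offset (in seconds) at which a full segment still fits.
    max_offset = max(track_duration - segment, 0.0)
    offset = rng.uniform(0.0, max_offset) if random_segments else 0.0
    start = int(offset * sample_rate)
    return start, start + int(segment * sample_rate)


rng = random.Random(0)
full_track = segment_bounds(10.0, None, 44100, False, rng)
fixed = segment_bounds(10.0, 5.0, 44100, False, rng)
shifted = segment_bounds(10.0, 5.0, 44100, True, rng)
```

At 44100 Hz a 5-second segment spans 220500 samples, whether it starts at the beginning of the track or at a random offset.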
- root¶
Root path of dataset
- Type:
str
- sources¶
List of source names. Defaults to MUSDB18 4 stem scenario: vocals, drums, bass, other.
- Type:
list of str, optional
- suffix¶
Filename suffix, defaults to .wav.
- Type:
str, optional
- split¶
Dataset subfolder, defaults to train.
- Type:
str, optional
- subset¶
Selects a specific list of tracks to be loaded, defaults to None (loads all tracks).
- Type:
list of str, optional
- segment¶
Duration of segments in seconds, defaults to None, which loads the full-length audio tracks.
- Type:
float, optional
- samples_per_track¶
Number of samples yielded from each track, can be used to increase dataset size, defaults to 1.
- Type:
int, optional
- random_segments¶
Enables random offset for track segments.
- Type:
boolean, optional
- random_track_mix¶
Enables mixing of random sources from different tracks to assemble the mix.
- Type:
boolean
- source_augmentations¶
List of augmentation function names, defaults to no-op augmentations (input = output).
- Type:
list of callable
- sample_rate¶
Sample rate of files in dataset.
- Type:
int, optional
- tracks¶
List of track metadata
- Type:
list of Dict
- References
“The 2018 Signal Separation Evaluation Campaign”, Stöter et al., 2018.
Notes
This is a modified version of the MUSDB18 dataset class from Asteroid. It extends the Asteroid version to allow for the targets `vocals` and `background`. The background is the sum of all sources except vocals. This is useful for training models that separate vocals from background.
- dataset_name = 'MUSDB18'¶
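The `background` target described in the Notes, the sum of all sources except vocals, can be sketched as follows. Stems are plain lists of floats here, and `vocals_and_background` is a hypothetical helper name rather than part of the module.

```python
def vocals_and_background(stems):
    """Collapse a dict of stems into 'vocals' and 'background' targets."""
    others = [samples for name, samples in stems.items() if name != "vocals"]
    # Background is the sample-wise sum of every non-vocal stem.
    background = [sum(samples) for samples in zip(*others)]
    return {"vocals": stems["vocals"], "background": background}


# Toy 4-stem track (assumed data).
stems = {
    "vocals": [1.0, 0.0],
    "drums": [0.25, 0.5],
    "bass": [0.25, 0.5],
    "other": [0.5, 1.0],
}
targets = vocals_and_background(stems)
```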