recipes.cad2.task2.ConvTasNet.local package

Submodules

recipes.cad2.task2.ConvTasNet.local.cad2task2_dataloader module

class recipes.cad2.task2.ConvTasNet.local.cad2task2_dataloader.Compose(transforms)[source]

Bases: object

Composes several augmentation transforms.

Parameters:

transforms (list) – list of augmentation transforms to compose.
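
A minimal usage sketch, assuming each transform is a callable that takes and returns an audio tensor (the augment_gain and augment_channelswap functions documented below fit this contract):

    import torch

    from recipes.cad2.task2.ConvTasNet.local.cad2task2_dataloader import (
        Compose,
        augment_channelswap,
        augment_gain,
    )

    # Chain the two augmentation functions documented below into one callable.
    source_augmentations = Compose([augment_gain, augment_channelswap])

    # Apply to a stereo source tensor of shape [channels, samples].
    audio = torch.rand(2, 44100)
    augmented = source_augmentations(audio)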

class recipes.cad2.task2.ConvTasNet.local.cad2task2_dataloader.RebalanceMusicDataset(root_path: ~pathlib.Path | str, music_tracks_file: ~pathlib.Path | str, target: str, samples_per_track: int = 1, segment_length: float | None = 5.0, random_segments=False, random_track_mix=False, split: str = 'train', source_augmentations=<function RebalanceMusicDataset.<lambda>>, sample_rate: int = 44100)[source]

Bases: Dataset

Dataset to process the EnsembleSet and CadenzaWoodwind datasets for the CAD2 Task2 baseline. The dataset is composed of a target source and a random number of accompaniment sources.

Parameters:
  • root_path (str) – Path to the root directory of the dataset

  • music_tracks_file (str) – Path to the json file containing the music tracks

  • target (str) – Target source to be extracted

  • samples_per_track (int) – Number of samples to extract from each track

  • segment_length (float) – Length of the segment to extract

  • random_segments (bool) – If True, extract random segments from the tracks

  • random_track_mix (bool) – If True, mix random accompaniment tracks

  • split (str) – Split of the dataset to use

  • sample_rate (int) – Sample rate of the audio files

accompaniment_tracks: dict
dataset_name = 'EnsembleSet & CadenzaWoodwind'
get_infos()[source]

Get dataset infos (for publishing models).

Returns:

dict, dataset infos with keys dataset, task and licences.

repeated_instruments: list[str]
root_path: Path
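
A minimal construction sketch wrapped in a standard PyTorch DataLoader. The paths and the target instrument name are hypothetical; substitute the real dataset root, tracks file, and target:

    from pathlib import Path

    from torch.utils.data import DataLoader

    from recipes.cad2.task2.ConvTasNet.local.cad2task2_dataloader import (
        RebalanceMusicDataset,
    )

    # Hypothetical paths and target; replace with real values.
    dataset = RebalanceMusicDataset(
        root_path=Path("data/cad2/task2"),
        music_tracks_file=Path("data/cad2/task2/music_tracks.json"),
        target="Bassoon",
        samples_per_track=4,
        segment_length=5.0,
        random_segments=True,
        random_track_mix=True,
        split="train",
        sample_rate=44100,
    )
    loader = DataLoader(dataset, batch_size=4, shuffle=True, num_workers=2)
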
recipes.cad2.task2.ConvTasNet.local.cad2task2_dataloader.augment_channelswap(audio)[source]

Randomly swaps the channels of stereo sources.

recipes.cad2.task2.ConvTasNet.local.cad2task2_dataloader.augment_gain(audio, low=0.25, high=1.25)[source]

Applies a random gain between low and high to each source.
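
A short sketch applying both augmentations directly, assuming they operate on torch tensors of shape [channels, samples]:

    import torch

    from recipes.cad2.task2.ConvTasNet.local.cad2task2_dataloader import (
        augment_channelswap,
        augment_gain,
    )

    audio = torch.rand(2, 44100)  # stereo source: [channels, samples]

    # Scale the source by a random factor drawn from [0.5, 1.5].
    scaled = augment_gain(audio, low=0.5, high=1.5)

    # Randomly swap the left and right channels.
    swapped = augment_channelswap(audio)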

recipes.cad2.task2.ConvTasNet.local.cad2task2_dataloader.get_audio_durations(track_path: Path | str) → float[source]

recipes.cad2.task2.ConvTasNet.local.tasnet module

class recipes.cad2.task2.ConvTasNet.local.tasnet.ChannelwiseLayerNorm(channel_size)[source]

Bases: Module

Channel-wise Layer Normalization (cLN)

forward(y)[source]
Parameters:

y – [M, N, K], M is batch size, N is channel size, K is length

Returns:

[M, N, K]

Return type:

cLN_y

reset_parameters()[source]
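
A shape-check sketch for the documented [M, N, K] convention; a minimal sketch, assuming the module can be imported on its own:

    import torch

    from recipes.cad2.task2.ConvTasNet.local.tasnet import ChannelwiseLayerNorm

    cln = ChannelwiseLayerNorm(channel_size=16)
    y = torch.rand(2, 16, 100)   # [M, N, K]: batch 2, 16 channels, length 100
    out = cln(y)
    assert out.shape == y.shape  # cLN preserves the [M, N, K] shape
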
class recipes.cad2.task2.ConvTasNet.local.tasnet.Chomp1d(chomp_size)[source]

Bases: Module

Ensures the output length is the same as the input length by trimming the trailing padded samples.

forward(x)[source]
Parameters:

x – [M, H, Kpad]

Returns:

[M, H, K]
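
A sketch of the intended use: trim the extra samples that symmetric padding appends after a convolution, so the output length matches the input. It assumes Chomp1d removes chomp_size samples from the end, as the name suggests:

    import torch
    import torch.nn as nn

    from recipes.cad2.task2.ConvTasNet.local.tasnet import Chomp1d

    pad = 2
    conv = nn.Conv1d(8, 8, kernel_size=3, padding=pad)  # length K -> K + 2
    chomp = Chomp1d(chomp_size=pad)

    x = torch.rand(4, 8, 100)  # [M, H, K]
    y = chomp(conv(x))         # conv yields [M, H, Kpad]; chomp restores [M, H, K]
    assert y.shape == x.shape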

class recipes.cad2.task2.ConvTasNet.local.tasnet.ConvTasNetStereo(*args: Any, **kwargs: Any)[source]

Bases: Module, PyTorchModelHubMixin

forward(mixture)[source]
Parameters:

mixture – [M, T], M is batch size, T is #samples

Returns:

[M, C, T]

Return type:

est_source

get_model_args()[source]

Arguments needed to re-instantiate the model.

get_state_dict()[source]

Returns the model's state dict, in case it needs to be modified before sharing the model.

serialize()[source]

Serializes the model and outputs a dictionary.

Returns:

dict, serialized model with keys model_args and state_dict.

valid_length(length)[source]
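
Because the class mixes in PyTorchModelHubMixin, a pretrained checkpoint can be loaded with from_pretrained. The repository id below is hypothetical, and the stereo input layout [batch, channels, samples] is an assumption (the forward docstring abbreviates the shape as [M, T]):

    import torch

    from recipes.cad2.task2.ConvTasNet.local.tasnet import ConvTasNetStereo

    # Hypothetical Hugging Face repo id; substitute a real CAD2 checkpoint.
    model = ConvTasNetStereo.from_pretrained("user/convtasnet-cad2-task2")
    model.eval()

    mixture = torch.rand(1, 2, 44100)  # assumed layout: [batch, channels, samples]
    with torch.no_grad():
        est_sources = model(mixture)   # documented output: [M, C, T]

    # For sharing: a dict with keys "model_args" and "state_dict".
    package = model.serialize()
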
class recipes.cad2.task2.ConvTasNet.local.tasnet.Decoder(N, L, audio_channels)[source]

Bases: Module

forward(mixture_w, est_mask)[source]
Parameters:
  • mixture_w – [M, N, K]

  • est_mask – [M, C, N, K]

Returns:

[M, C, T]

Return type:

est_source

class recipes.cad2.task2.ConvTasNet.local.tasnet.DepthwiseSeparableConv(in_channels, out_channels, kernel_size, stride, padding, dilation, norm_type='gLN', causal=False)[source]

Bases: Module

forward(x)[source]
Parameters:

x – [M, H, K]

Returns:

[M, B, K]

Return type:

result

class recipes.cad2.task2.ConvTasNet.local.tasnet.Encoder(L, N, audio_channels)[source]

Bases: Module

Estimation of the nonnegative mixture weight by a 1-D conv layer.

forward(mixture)[source]
Parameters:

mixture – [M, T], M is batch size, T is #samples

Returns:

[M, N, K], where K = (T-L)/(L/2)+1 = 2T/L-1

Return type:

mixture_w
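
A worked instance of the K formula: with stride L/2 (50% overlap), K = (T-L)/(L/2) + 1 = 2T/L - 1. For T = 32000 and L = 20, K = (32000 - 20)/10 + 1 = 3199 = 2*32000/20 - 1. A minimal shape check, assuming a stereo input layout [batch, audio_channels, samples]:

    import torch

    from recipes.cad2.task2.ConvTasNet.local.tasnet import Encoder

    enc = Encoder(L=20, N=64, audio_channels=2)
    mixture = torch.rand(1, 2, 32000)  # assumed layout: [batch, channels, samples]
    w = enc(mixture)                   # expected: [M, N, K] = [1, 64, 3199]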

class recipes.cad2.task2.ConvTasNet.local.tasnet.GlobalLayerNorm(channel_size)[source]

Bases: Module

Global Layer Normalization (gLN)

forward(y)[source]
Parameters:

y – [M, N, K], M is batch size, N is channel size, K is length

Returns:

[M, N, K]

Return type:

gLN_y

reset_parameters()[source]

class recipes.cad2.task2.ConvTasNet.local.tasnet.TemporalBlock(in_channels, out_channels, kernel_size, stride, padding, dilation, norm_type='gLN', causal=False)[source]

Bases: Module

forward(x)[source]
Parameters:

x – [M, B, K]

Returns:

[M, B, K]

class recipes.cad2.task2.ConvTasNet.local.tasnet.TemporalConvNet(N, B, H, P, X, R, C, norm_type='gLN', causal=False, mask_nonlinear='relu')[source]

Bases: Module

forward(mixture_w)[source]

Keeps the same API as TasNet.

Parameters:

mixture_w – [M, N, K], M is batch size

Returns:

[M, C, N, K]

Return type:

est_mask
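
A shape sketch with deliberately small hyperparameters. Reading the parameter names as in the Conv-TasNet paper (N encoder filters, B bottleneck channels, H hidden channels, P kernel size, X blocks per repeat, R repeats, C sources) is an assumption based on the standard implementation:

    import torch

    from recipes.cad2.task2.ConvTasNet.local.tasnet import TemporalConvNet

    separator = TemporalConvNet(N=16, B=8, H=16, P=3, X=2, R=1, C=2)
    mixture_w = torch.rand(1, 16, 100)  # [M, N, K]
    est_mask = separator(mixture_w)     # expected: [M, C, N, K] = [1, 2, 16, 100]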

recipes.cad2.task2.ConvTasNet.local.tasnet.chose_norm(norm_type, channel_size)[source]

The input to the normalization layer is (M, C, K), where M is the batch size, C is the channel size, and K is the sequence length.
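
A short sketch, assuming norm_type selects among the normalization classes above by name:

    from recipes.cad2.task2.ConvTasNet.local.tasnet import chose_norm

    gln = chose_norm("gLN", channel_size=16)  # GlobalLayerNorm
    cln = chose_norm("cLN", channel_size=16)  # ChannelwiseLayerNorm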

recipes.cad2.task2.ConvTasNet.local.tasnet.overlap_and_add(signal, frame_step)[source]
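
A worked shape example, assuming the usual overlap-add contract: frames of length frame_length spaced frame_step apart reconstruct a signal of length (frames - 1) * frame_step + frame_length:

    import torch

    from recipes.cad2.task2.ConvTasNet.local.tasnet import overlap_and_add

    frames = torch.rand(1, 10, 20)  # [..., frames, frame_length]
    signal = overlap_and_add(frames, 10)
    assert signal.shape[-1] == (10 - 1) * 10 + 20  # 110 samples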

Module contents
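
The names documented below are re-exported at the package level, so they can be imported directly:

    from recipes.cad2.task2.ConvTasNet.local import (
        Compose,
        ConvTasNetStereo,
        RebalanceMusicDataset,
        augment_channelswap,
        augment_gain,
        overlap_and_add,
    )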

class recipes.cad2.task2.ConvTasNet.local.Compose(transforms)[source]

Bases: object

Composes several augmentation transforms.

Parameters:

transforms (list) – list of augmentation transforms to compose.

class recipes.cad2.task2.ConvTasNet.local.ConvTasNetStereo(*args: Any, **kwargs: Any)[source]

Bases: Module, PyTorchModelHubMixin

forward(mixture)[source]
Parameters:

mixture – [M, T], M is batch size, T is #samples

Returns:

[M, C, T]

Return type:

est_source

get_model_args()[source]

Arguments needed to re-instantiate the model.

get_state_dict()[source]

Returns the model's state dict, in case it needs to be modified before sharing the model.

serialize()[source]

Serializes the model and outputs a dictionary.

Returns:

dict, serialized model with keys model_args and state_dict.

valid_length(length)[source]

class recipes.cad2.task2.ConvTasNet.local.RebalanceMusicDataset(root_path: ~pathlib.Path | str, music_tracks_file: ~pathlib.Path | str, target: str, samples_per_track: int = 1, segment_length: float | None = 5.0, random_segments=False, random_track_mix=False, split: str = 'train', source_augmentations=<function RebalanceMusicDataset.<lambda>>, sample_rate: int = 44100)[source]

Bases: Dataset

Dataset to process the EnsembleSet and CadenzaWoodwind datasets for the CAD2 Task2 baseline. The dataset is composed of a target source and a random number of accompaniment sources.

Parameters:
  • root_path (str) – Path to the root directory of the dataset

  • music_tracks_file (str) – Path to the json file containing the music tracks

  • target (str) – Target source to be extracted

  • samples_per_track (int) – Number of samples to extract from each track

  • segment_length (float) – Length of the segment to extract

  • random_segments (bool) – If True, extract random segments from the tracks

  • random_track_mix (bool) – If True, mix random accompaniment tracks

  • split (str) – Split of the dataset to use

  • sample_rate (int) – Sample rate of the audio files

accompaniment_tracks: dict
dataset_name = 'EnsembleSet & CadenzaWoodwind'
get_infos()[source]

Get dataset infos (for publishing models).

Returns:

dict, dataset infos with keys dataset, task and licences.

repeated_instruments: list[str]
root_path: Path
recipes.cad2.task2.ConvTasNet.local.augment_channelswap(audio)[source]

Randomly swaps the channels of stereo sources.

recipes.cad2.task2.ConvTasNet.local.augment_gain(audio, low=0.25, high=1.25)[source]

Applies a random gain between low and high to each source.

recipes.cad2.task2.ConvTasNet.local.overlap_and_add(signal, frame_step)[source]