enhance
Baseline enhancement for CAD2 task1.
- recipes.cad2.task1.baseline.enhance.downmix_signal(vocals: ndarray, accompaniment: ndarray, beta: float) ndarray [source]
Downmix the vocals and accompaniment to stereo.
- Parameters:
vocals (np.ndarray) – Vocal signal.
accompaniment (np.ndarray) – Accompaniment signal.
beta (float) – Downmix parameter.
- Returns:
Downmixed signal.
- Return type:
np.ndarray
Notes
When beta is 0, the downmix is the accompaniment. When beta is 1, the downmix is the vocals.
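Consistent with the notes above, the behaviour can be sketched as a linear crossfade between the two stems. This is a hypothetical reimplementation, not the baseline's code; the baseline's exact weighting law may differ.

```python
import numpy as np

def downmix_signal(vocals: np.ndarray, accompaniment: np.ndarray, beta: float) -> np.ndarray:
    # Hypothetical sketch as a linear crossfade:
    # beta = 0 -> accompaniment only, beta = 1 -> vocals only,
    # matching the Notes above. The baseline may use a different law
    # (e.g. an energy-preserving one).
    return beta * vocals + (1.0 - beta) * accompaniment

vocals = np.ones((2, 4))             # stereo, 4 samples
accompaniment = np.zeros((2, 4))
half = downmix_signal(vocals, accompaniment, 0.5)  # equal blend of both stems
```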
- recipes.cad2.task1.baseline.enhance.enhance(config: DictConfig) None [source]
Run the music enhancement. The system decomposes the music into vocals and accompaniment. The vocals are then enhanced according to the alpha values. Finally, the music is amplified according to the hearing loss and downmixed to stereo.
- Parameters:
config (DictConfig) – Configuration options for enhancing music.
- recipes.cad2.task1.baseline.enhance.get_device(device: str) tuple [source]
Get the Torch device.
- Parameters:
device (str) – device type, e.g. “cpu”, “gpu0”, “gpu1”, etc.
- Returns:
A tuple of the torch.device() appropriate to the hardware available, and a str giving the device type selected, e.g. “cpu”, “cuda”.
- Return type:
tuple
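A pure-Python sketch of the selection logic implied by the docstring. The real function returns a torch.device plus a str; plain strings are used here so the example runs without torch, and the exact mapping and fallback behaviour are assumptions.

```python
def get_device(device: str):
    # Hypothetical sketch: map "cpu" to the CPU and "gpuN" to CUDA device N.
    # The real function wraps the first element in torch.device() and may
    # also consult torch.cuda.is_available(); that logic is assumed here.
    if device == "cpu":
        return "cpu", "cpu"
    if device.startswith("gpu"):
        index = device[3:] or "0"    # "gpu1" -> CUDA device 1
        return f"cuda:{index}", "cuda"
    raise ValueError(f"Unsupported device: {device}")
```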
- recipes.cad2.task1.baseline.enhance.load_separation_model(causality: str, device: device) ConvTasNetStereo [source]
Load the separation model.
- Parameters:
causality (str) – Causality of the model (causal or noncausal).
device (torch.device) – Device to load the model on.
- Returns:
Separation model.
- Return type:
ConvTasNetStereo
- recipes.cad2.task1.baseline.enhance.separate_sources(model: Module, mix: Tensor | ndarray, sample_rate: int, segment: float = 10.0, overlap: float = 0.1, number_sources: int = 4, device: device | str | None = None)[source]
Apply the model to a given mixture. Fades are applied to each segment and the segments are overlap-added, so the model can be run segment by segment.
- Parameters:
model (torch.nn.Module) – model to use for separation
mix (torch.Tensor) – mixture to separate, shape (batch, channels, time)
sample_rate (int) – sampling rate of the mixture
segment (float) – segment length in seconds
overlap (float) – overlap between segments, between 0 and 1
number_sources (int) – number of sources to separate
device (torch.device, str, or None) – if provided, device on which to execute the computation, otherwise mix.device is assumed. When device is different from mix.device, only local computations will be on device, while the entire tracks will be stored on mix.device.
- Returns:
estimated sources
- Return type:
torch.Tensor
Based on https://pytorch.org/audio/main/tutorials/hybrid_demucs_tutorial.html
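The chunked overlap-add scheme can be illustrated with a simplified numpy sketch (assuming overlap <= 0.5). The baseline operates on torch tensors and uses torchaudio's Fade transform; linear ramps stand in for it here, and the number_sources and device handling are omitted for brevity.

```python
import numpy as np

def separate_sources_sketch(model, mix, sample_rate, segment=10.0, overlap=0.1):
    # Simplified sketch of chunked separation with linear cross-fades.
    # mix: (channels, time); model maps a chunk (channels, n) to
    # estimates of shape (sources, channels, n).
    channels, length = mix.shape
    chunk = int(segment * sample_rate)
    fade_len = int(chunk * overlap)      # overlapped samples per boundary
    hop = chunk - fade_len               # step between chunk starts
    out, start = None, 0
    while True:
        end = min(start + chunk, length)
        est = model(mix[:, start:end])   # (sources, channels, end - start)
        if out is None:
            out = np.zeros((est.shape[0], channels, length))
        win = np.ones(end - start)
        if start > 0 and fade_len:       # fade in over the left overlap
            win[:fade_len] = np.linspace(0.0, 1.0, fade_len)
        if end < length and fade_len:    # fade out over the right overlap
            win[-fade_len:] = np.linspace(1.0, 0.0, fade_len)
        out[:, :, start:end] += est * win
        if end == length:
            return out
        start += hop

# With an identity "model" (one source = the mix itself), the
# overlap-add windows sum to one, so the input is reconstructed exactly.
rng = np.random.default_rng(0)
mix = rng.standard_normal((2, 37))
identity = lambda m: m[np.newaxis]
estimates = separate_sources_sketch(identity, mix, sample_rate=1, segment=10.0, overlap=0.5)
```

The complementary linear ramps (0 to 1 on the incoming chunk, 1 to 0 on the outgoing one) sum to unity across each overlap, which is what makes the segment-wise processing seamless.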