enhance

Run the dummy enhancement.

recipes.cad_icassp_2024.baseline.enhance.decompose_signal(model: Module, model_sample_rate: int, signal: ndarray, signal_sample_rate: int, device: device, sources_list: list[str], listener: Listener, normalise: bool = True) → dict[str, ndarray]

Decompose signal into 8 stems.

The listener is ignored by the baseline system, as it does not perform personalised decomposition.
Instead, it performs a standard music decomposition using a pre-trained model trained on the MUSDB18 dataset.

Parameters:

model (torch.nn.Module) – Torch model.
model_sample_rate (int) – Sample rate of the model.
signal (ndarray) – Signal to be decomposed.
signal_sample_rate (int) – Sample rate of the input signal.
device (torch.device) – Torch device to use for processing.
sources_list (list) – List of strings used to index dictionary.
listener (Listener) – Listener object. Ignored by the baseline system.
normalise (bool) – Whether to normalise the signal.

recipes.cad_icassp_2024.baseline.enhance.enhance(config: DictConfig) → None

Run the music enhancement. The system decomposes the music into vocal, drums, bass, and other stems. Then, the NAL-R prescription procedure is applied to each stem.

Parameters:

config (dict) – Dictionary of configuration options for enhancing music.

Returns 8 stems for each song:

left channel vocal, drums, bass, and other stems
right channel vocal, drums, bass, and other stems
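
The 8-stem layout described above can be illustrated with a minimal sketch. It assumes each separated source is a stereo array of shape (samples, 2); the helper name split_into_mono_stems and the "left_"/"right_" key prefixes are hypothetical and are not taken from the recipe itself.

```python
import numpy as np


def split_into_mono_stems(stereo_stems):
    """Split stereo stems of shape (samples, 2) into left and right mono stems.

    Hypothetical helper: the key naming ("left_<source>", "right_<source>")
    is an assumption made for illustration only.
    """
    mono_stems = {}
    for channel_index, side in enumerate(("left", "right")):
        for source, audio in stereo_stems.items():
            # Column 0 is treated as the left channel, column 1 as the right.
            mono_stems[f"{side}_{source}"] = audio[:, channel_index]
    return mono_stems


# Four stereo stems filled with random samples stand in for real separated audio.
sources = ["vocals", "drums", "bass", "other"]
rng = np.random.default_rng(0)
stereo_stems = {name: rng.uniform(-1.0, 1.0, size=(1000, 2)) for name in sources}

mono_stems = split_into_mono_stems(stereo_stems)
print(sorted(mono_stems))
```

Four sources times two channels gives the eight mono stems per song listed above.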

recipes.cad_icassp_2024.baseline.enhance.process_remix_for_listener(signal: ndarray, enhancer: NALR, compressor: Compressor, listener: Listener, apply_compressor: bool = False) → ndarray

Process the stems from sources.

Parameters:

signal (ndarray) – Remixed signal to process
enhancer (NALR) – NAL-R prescription hearing aid
compressor (Compressor) – Compressor
listener (Listener) – Listener object
apply_compressor (bool) – Whether to apply the compressor

Returns:

Processed signal.

Return type:

ndarray

recipes.cad_icassp_2024.baseline.enhance.save_flac_signal(signal: ndarray, filename: Path, signal_sample_rate, output_sample_rate, do_clip_signal: bool = False, do_soft_clip: bool = False, do_scale_signal: bool = False) → None

Function to save output signals.

The output signal will be resampled to output_sample_rate.
The output signal will be clipped to [-1, 1] if do_clip_signal is True, and soft clipped if do_soft_clip is also True. Note that if do_clip_signal is False, do_soft_clip will be ignored.
The output signal will be scaled to [-1, 1] if do_scale_signal is True. If the signal is scaled, the scale factor will be saved in a TXT file. Note that if do_clip_signal is True, do_scale_signal will be ignored.
The output signal will be saved as a FLAC file.

Parameters:

signal (np.ndarray) – Signal to save
filename (Path) – Path to save signal
signal_sample_rate (int) – Sample rate of the input signal
output_sample_rate (int) – Sample rate of the output signal
do_clip_signal (bool) – Whether to clip signal
do_soft_clip (bool) – Whether to apply soft clipping
do_scale_signal (bool) – Whether to scale signal
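
The precedence between the three flags can be sketched as follows. This is not the library's implementation: the function name prepare_for_saving is hypothetical, tanh is assumed as the soft-clipping curve, peak normalisation is assumed as the scaling method, and resampling and FLAC writing are left out.

```python
import numpy as np


def prepare_for_saving(signal, do_clip_signal=False, do_soft_clip=False,
                       do_scale_signal=False):
    """Return (processed_signal, scale_factor) following the documented precedence.

    Hypothetical helper for illustration. scale_factor is None unless the
    signal was scaled.
    """
    if do_clip_signal:
        # Clipping takes priority: do_scale_signal is ignored on this branch.
        if do_soft_clip:
            # Assumption: tanh is used here as a generic soft clipper.
            return np.tanh(signal), None
        return np.clip(signal, -1.0, 1.0), None

    # do_soft_clip has no effect when do_clip_signal is False.
    if do_scale_signal:
        # Assumption: scale by the absolute peak so the result lies in [-1, 1].
        peak = float(np.max(np.abs(signal)))
        if peak == 0.0:
            return signal, 1.0  # silent signal, nothing to scale
        return signal / peak, peak

    return signal, None


signal = np.array([-2.0, 0.5, 1.5])
clipped, clip_factor = prepare_for_saving(signal, do_clip_signal=True, do_scale_signal=True)
scaled, scale_factor = prepare_for_saving(signal, do_scale_signal=True)
untouched, _ = prepare_for_saving(signal, do_soft_clip=True)
```

In this sketch the returned scale factor is what would be written to the accompanying TXT file, so that the original level can be restored later.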