enhance
Run the dummy enhancement.
- recipes.cad_icassp_2024.baseline.enhance.decompose_signal(model: Module, model_sample_rate: int, signal: ndarray, signal_sample_rate: int, device: device, sources_list: list[str], listener: Listener, normalise: bool = True) dict[str, ndarray] [source]
Decompose signal into 8 stems.
The listener is ignored by the baseline system because it does not perform personalised decomposition.
Instead, it performs a standard music decomposition using a pre-trained model trained on the MUSDB18 dataset.
- Parameters:
model (torch.nn.Module) – Torch model.
model_sample_rate (int) – Sample rate of the model.
signal (ndarray) – Signal to be decomposed.
signal_sample_rate (int) – Sample rate of the signal.
device (torch.device) – Torch device to use for processing.
sources_list (list) – List of strings used to index dictionary.
listener (Listener) – Listener object (ignored by the baseline system).
normalise (bool) – Whether to normalise the signal.
- recipes.cad_icassp_2024.baseline.enhance.enhance(config: DictConfig) None [source]
Run the music enhancement. The system decomposes the music into vocal, drums, bass, and other stems. Then, the NAL-R prescription procedure is applied to each stem.
- Parameters:
config (dict) – Dictionary of configuration options for enhancing music.
- Returns 8 stems for each song:
left channel vocal, drums, bass, and other stems
right channel vocal, drums, bass, and other stems
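The eight-stem layout described above can be sketched as follows. This is an illustration only, not the recipe's code: the helper name `split_stereo_stems`, the `left_<source>` / `right_<source>` key pattern, and the use of plain Python lists in place of ndarrays are all assumptions made to keep the example self-contained.

```python
def split_stereo_stems(stereo_stems):
    """Split stereo stems into one mono stem per channel.

    stereo_stems maps a source name to a list of (left, right) sample
    pairs. The output key pattern is an assumption for illustration.
    """
    mono_stems = {}
    for source, frames in stereo_stems.items():
        mono_stems[f"left_{source}"] = [left for left, _ in frames]
        mono_stems[f"right_{source}"] = [right for _, right in frames]
    return mono_stems


# Hypothetical input: four separated sources, two stereo frames each.
stereo = {
    "vocals": [(0.1, 0.2), (0.3, 0.4)],
    "drums": [(0.0, 0.0), (0.5, -0.5)],
    "bass": [(1.0, -1.0), (0.2, 0.2)],
    "other": [(0.0, 0.1), (0.0, 0.1)],
}
stems = split_stereo_stems(stereo)
print(sorted(stems))  # four sources x two channels = eight keys
```

Keeping each channel as its own dictionary entry means a per-stem step, such as the NAL-R prescription, can be applied to one entry at a time.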
- recipes.cad_icassp_2024.baseline.enhance.process_remix_for_listener(signal: ndarray, enhancer: NALR, compressor: Compressor, listener: Listener, apply_compressor: bool = False) ndarray [source]
Process the stems from sources.
- Parameters:
signal (ndarray) – Signal to process
enhancer (NALR) – NAL-R prescription hearing aid
compressor (Compressor) – Compressor
listener (Listener) – Listener object
apply_compressor (bool) – Whether to apply the compressor
- Returns:
Processed signal.
- Return type:
ndarray
- recipes.cad_icassp_2024.baseline.enhance.save_flac_signal(signal: ndarray, filename: Path, signal_sample_rate, output_sample_rate, do_clip_signal: bool = False, do_soft_clip: bool = False, do_scale_signal: bool = False) None [source]
Function to save output signals.
The output signal will be resampled to output_sample_rate.
- The output signal will be clipped to [-1, 1] if do_clip_signal is True, and soft clipped if do_soft_clip is True. Note that if do_clip_signal is False, do_soft_clip will be ignored. Note that if do_clip_signal is True, do_scale_signal will be ignored.
- The output signal will be scaled to [-1, 1] if do_scale_signal is True. If the signal is scaled, the scale factor will be saved in a TXT file. Note that if do_clip_signal is True, do_scale_signal will be ignored.
The output signal will be saved as a FLAC file.
- Parameters:
signal (np.ndarray) – Signal to save
filename (Path) – Path to save signal
signal_sample_rate (int) – Sample rate of the input signal
output_sample_rate (int) – Sample rate of the output signal
do_clip_signal (bool) – Whether to clip signal
do_soft_clip (bool) – Whether to apply soft clipping
do_scale_signal (bool) – Whether to scale signal
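The precedence between the three flags can be sketched as below. This is a minimal illustration under stated assumptions, not the function's actual implementation: the helper name `prepare_for_flac` is hypothetical, `tanh` is assumed as the soft-clipping curve, scaling is assumed to be division by the peak absolute value, and plain Python lists stand in for ndarrays. Resampling and the FLAC and TXT file writes are left out.

```python
import math


def prepare_for_flac(samples, do_clip_signal=False, do_soft_clip=False,
                     do_scale_signal=False):
    """Return (processed_samples, scale_factor or None)."""
    if do_clip_signal:
        # Clipping takes priority: do_scale_signal is ignored on this path.
        if do_soft_clip:
            # Assumption: tanh used as the soft-clipping curve.
            samples = [math.tanh(x) for x in samples]
        clipped = [max(-1.0, min(1.0, x)) for x in samples]
        return clipped, None

    # do_soft_clip has no effect once do_clip_signal is False.
    if do_scale_signal:
        # Assumption: peak normalisation into [-1, 1].
        peak = max((abs(x) for x in samples), default=0.0)
        if peak == 0.0:
            return list(samples), 1.0  # silent or empty input: nothing to scale
        # The factor is returned so the caller can store it alongside the audio.
        return [x / peak for x in samples], peak

    return list(samples), None


signal = [0.5, -2.0]
print(prepare_for_flac(signal, do_clip_signal=True))
print(prepare_for_flac(signal, do_scale_signal=True))
print(prepare_for_flac(signal, do_clip_signal=True, do_scale_signal=True))
```

Returning the scale factor alongside the samples mirrors the documented behaviour of saving it next to the audio, so that the original level can be recovered later.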