enhance

Run the dummy enhancement.

recipes.cad_icassp_2024.baseline.enhance.decompose_signal(model: Module, model_sample_rate: int, signal: ndarray, signal_sample_rate: int, device: device, sources_list: list[str], listener: Listener, normalise: bool = True) dict[str, ndarray][source]

Decompose signal into 8 stems.

The listener is ignored by the baseline system, as it does not perform personalised decomposition. Instead, it performs a standard music decomposition using a pre-trained model trained on the MUSDB18 dataset.

Parameters:
  • model (torch.nn.Module) – Torch model.

  • model_sample_rate (int) – Sample rate of the model.

  • signal (ndarray) – Signal to be decomposed.

  • signal_sample_rate (int) – Sample rate of the input signal.

  • device (torch.device) – Torch device to use for processing.

  • sources_list (list) – List of strings used to index dictionary.

  • listener (Listener) – Listener object. Ignored by the baseline.

  • normalise (bool) – Whether to normalise the signal. Defaults to True.
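
The way a separation output maps onto the 8-stem dictionary can be sketched as follows. This is a minimal illustration, not the baseline's code: it assumes the separated sources arrive as an array of shape (n_sources, 2, n_samples) and that the dictionary keys take the form left_<name> and right_<name>. The helper name stems_from_sources, the array layout, the key naming, and the source names in the usage lines are all assumptions.

```python
import numpy as np


def stems_from_sources(
    sources: np.ndarray, sources_list: list[str]
) -> dict[str, np.ndarray]:
    """Hypothetical helper: split stereo sources into per-ear stems.

    Assumes `sources` has shape (n_sources, 2, n_samples), with channel 0
    as the left ear and channel 1 as the right ear.
    """
    if sources.shape[0] != len(sources_list):
        raise ValueError("sources_list must name every separated source")

    stems: dict[str, np.ndarray] = {}
    for source, name in zip(sources, sources_list):
        stems[f"left_{name}"] = source[0]   # assumed key format
        stems[f"right_{name}"] = source[1]  # assumed key format
    return stems


# Four stereo sources give 4 x 2 = 8 stems.
sources = np.zeros((4, 2, 16))
stems = stems_from_sources(sources, ["drums", "bass", "other", "vocals"])
```

With four entries in sources_list, the result holds one left and one right stem per source, which is where the count of 8 comes from.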

recipes.cad_icassp_2024.baseline.enhance.enhance(config: DictConfig) None[source]

Run the music enhancement. The system decomposes the music into vocal, drums, bass, and other stems. Then, the NAL-R prescription procedure is applied to each stem.

Parameters:
  • config (dict) – Dictionary of configuration options for enhancing music.

Produces 8 stems for each song:
  • left channel vocal, drums, bass, and other stems

  • right channel vocal, drums, bass, and other stems
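
The per-stem step can be pictured with the sketch below, which routes each of the 8 stems to a left-ear or right-ear processing function. It is only an illustration under stated assumptions: the keys are assumed to start with left_ or right_, the helper process_stems is hypothetical, and the two flat-gain lambdas are placeholders standing in for a NAL-R prescription, not an implementation of it.

```python
from typing import Callable

import numpy as np

Processor = Callable[[np.ndarray], np.ndarray]


def process_stems(
    stems: dict[str, np.ndarray],
    process_left: Processor,
    process_right: Processor,
) -> dict[str, np.ndarray]:
    """Hypothetical helper: apply an ear-specific function to every stem."""
    processed: dict[str, np.ndarray] = {}
    for key, stem in stems.items():
        if key.startswith("left_"):      # assumed key format
            processed[key] = process_left(stem)
        elif key.startswith("right_"):   # assumed key format
            processed[key] = process_right(stem)
        else:
            raise ValueError(f"Unexpected stem key: {key}")
    return processed


# Placeholder per-ear gains; a real system would apply a prescription
# derived from the listener's left and right audiograms instead.
example_stems = {
    f"{side}_{name}": np.ones(4)
    for side in ("left", "right")
    for name in ("vocals", "drums", "bass", "other")
}
processed = process_stems(
    example_stems,
    process_left=lambda x: 2.0 * x,
    process_right=lambda x: 0.5 * x,
)
```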

recipes.cad_icassp_2024.baseline.enhance.process_remix_for_listener(signal: ndarray, enhancer: NALR, compressor: Compressor, listener: Listener, apply_compressor: bool = False) ndarray[source]

Process the stems from sources.

Parameters:
  • signal (ndarray) – Signal to process

  • enhancer (NALR) – NAL-R prescription hearing aid

  • compressor (Compressor) – Compressor

  • listener (Listener) – Listener object

  • apply_compressor (bool) – Whether to apply the compressor

Returns:

Processed signal.

Return type:

ndarray

recipes.cad_icassp_2024.baseline.enhance.save_flac_signal(signal: ndarray, filename: Path, signal_sample_rate, output_sample_rate, do_clip_signal: bool = False, do_soft_clip: bool = False, do_scale_signal: bool = False) None[source]

Function to save output signals.

  • The output signal will be resampled to output_sample_rate.

  • The output signal will be clipped to [-1, 1] if do_clip_signal is True,

    using soft clipping if do_soft_clip is True. If do_clip_signal is False, do_soft_clip is ignored.

  • The output signal will be scaled to [-1, 1] if do_scale_signal is True.

    If the signal is scaled, the scale factor is saved in a TXT file. If do_clip_signal is True, do_scale_signal is ignored.

  • The output signal will be saved as a FLAC file.

Parameters:
  • signal (np.ndarray) – Signal to save

  • filename (Path) – Path to save signal

  • signal_sample_rate (int) – Sample rate of the input signal

  • output_sample_rate (int) – Sample rate of the output signal

  • do_clip_signal (bool) – Whether to clip signal

  • do_soft_clip (bool) – Whether to apply soft clipping

  • do_scale_signal (bool) – Whether to scale signal
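
The precedence between the three flags can be sketched as below. This is not the library's implementation: prepare_for_saving is a hypothetical helper, np.tanh is an assumed stand-in for whatever soft-clipping function the recipe uses, and peak normalisation is an assumed reading of "scaled to [-1, 1]". Resampling and FLAC writing are left out.

```python
from __future__ import annotations

import numpy as np


def prepare_for_saving(
    signal: np.ndarray,
    do_clip_signal: bool = False,
    do_soft_clip: bool = False,
    do_scale_signal: bool = False,
) -> tuple[np.ndarray, float | None]:
    """Hypothetical helper: return (signal, scale_factor or None)."""
    if do_clip_signal:
        # Clipping takes precedence, so do_scale_signal is ignored here.
        if do_soft_clip:
            return np.tanh(signal), None  # assumed soft-clip function
        return np.clip(signal, -1.0, 1.0), None

    # do_soft_clip has no effect when do_clip_signal is False.
    if do_scale_signal:
        peak = float(np.max(np.abs(signal)))
        if peak > 0.0:
            return signal / peak, peak  # assumed peak normalisation
        return signal, 1.0

    return signal, None


loud = np.array([2.0, -4.0, 0.5])
clipped, clip_factor = prepare_for_saving(
    loud, do_clip_signal=True, do_scale_signal=True
)
scaled, scale_factor = prepare_for_saving(loud, do_scale_signal=True)
untouched, _ = prepare_for_saving(loud, do_soft_clip=True)
```

In this sketch the scale factor is returned rather than written to disk; the documented function saves it to a TXT file alongside the FLAC output.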