recipes.cad1.task1.baseline.enhance module

Run the dummy enhancement.

recipes.cad1.task1.baseline.enhance.apply_baseline_ha(enhancer: NALR, compressor: Compressor, signal: ndarray, audiogram: Audiogram, apply_compressor: bool = False) → ndarray [source]

Apply NAL-R prescription hearing aid to a signal.

Parameters:
  • enhancer – A NALR object that enhances the signal.

  • compressor – A Compressor object that compresses the signal.

  • signal – An ndarray representing the audio signal.

  • audiogram – An Audiogram object representing the listener’s audiogram.

  • apply_compressor – A boolean indicating whether to include the compressor.

Returns:

An ndarray representing the processed signal.
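
A minimal usage sketch, assuming pyclarity's NALR, Compressor and Audiogram classes; the constructor arguments and audiogram values below are illustrative assumptions, not the baseline's exact configuration.

    # Illustrative sketch only: NALR/Compressor settings and audiogram values
    # are assumptions, not the baseline configuration.
    import numpy as np
    from clarity.enhancer.compressor import Compressor
    from clarity.enhancer.nalr import NALR
    from clarity.utils.audiogram import Audiogram

    from recipes.cad1.task1.baseline.enhance import apply_baseline_ha

    sample_rate = 44100
    enhancer = NALR(nfir=220, sample_rate=sample_rate)  # assumed settings
    compressor = Compressor(fs=sample_rate)             # assumed settings
    audiogram = Audiogram(
        levels=np.array([30, 35, 40, 45, 50, 55, 60, 65]),
        frequencies=np.array([250, 500, 1000, 2000, 3000, 4000, 6000, 8000]),
    )
    signal = np.random.randn(sample_rate)  # 1 s of noise as a stand-in for one channel

    processed = apply_baseline_ha(enhancer, compressor, signal, audiogram)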

recipes.cad1.task1.baseline.enhance.decompose_signal(model: Module, model_sample_rate: int, signal: ndarray, signal_sample_rate: int, device: device, sources_list: list[str], listener: Listener, normalise: bool = True) → dict[str, ndarray] [source]

Decompose signal into 8 stems.

The left and right audiograms are ignored by the baseline system as it is not performing personalised decomposition. Instead, it performs a standard music decomposition using the HDEMUCS model trained on the MUSDB18 dataset.

Parameters:
  • model (torch.nn.Module) – Torch model.

  • model_sample_rate (int) – Sample rate of the model.

  • signal (ndarray) – Signal to be decomposed.

  • signal_sample_rate (int) – Sample rate of the input signal.

  • device (torch.device) – Torch device to use for processing.

  • sources_list (list) – List of strings used to index dictionary.

  • listener (Listener) – Listener object.

  • normalise (bool) – Whether to normalise the signal.

Returns:

Dictionary of the separated stems.

Return type:

dict[str, ndarray]
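
A sketch of a typical call, assuming the torchaudio HDEMUCS_HIGH_MUSDB_PLUS pipeline; the Listener construction, the signal shape and the source order shown are assumptions.

    import numpy as np
    import torch
    from torchaudio.pipelines import HDEMUCS_HIGH_MUSDB_PLUS

    from clarity.utils.audiogram import Audiogram, Listener  # assumed import path
    from recipes.cad1.task1.baseline.enhance import decompose_signal

    bundle = HDEMUCS_HIGH_MUSDB_PLUS
    model = bundle.get_model()
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    model.to(device)

    # Audiograms are ignored by the baseline decomposition, but a Listener is required.
    audiogram = Audiogram(
        levels=np.array([30, 35, 40, 45, 50, 55, 60, 65]),
        frequencies=np.array([250, 500, 1000, 2000, 3000, 4000, 6000, 8000]),
    )
    listener = Listener(audiogram_left=audiogram, audiogram_right=audiogram)

    signal = np.random.randn(2, 10 * 44100)  # 10 s stereo placeholder (shape assumed)
    stems = decompose_signal(
        model=model,
        model_sample_rate=bundle.sample_rate,
        signal=signal,
        signal_sample_rate=44100,
        device=device,
        sources_list=["drums", "bass", "other", "vocals"],
        listener=listener,
        normalise=True,
    )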

recipes.cad1.task1.baseline.enhance.enhance(config: DictConfig) → None [source]

Run the music enhancement. The system decomposes the music into vocal, drums, bass, and other stems, then applies the NAL-R prescription procedure to each stem.

Parameters:

config (DictConfig) – Dictionary of configuration options for enhancing music.

Returns 8 stems for each song:
  • left channel vocal, drums, bass, and other stems

  • right channel vocal, drums, bass, and other stems
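
The function is normally driven by Hydra from the command line; the sketch below, with an assumed config path, shows an equivalent programmatic call (loading the YAML directly may not resolve Hydra defaults or interpolations).

    from omegaconf import OmegaConf

    from recipes.cad1.task1.baseline.enhance import enhance

    # Assumed location of the baseline recipe configuration.
    config = OmegaConf.load("recipes/cad1/task1/baseline/config.yaml")
    enhance(config)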

recipes.cad1.task1.baseline.enhance.get_device(device: str) → tuple [source]

Get the Torch device.

Parameters:

device (str) – device type, e.g. “cpu”, “gpu0”, “gpu1”, etc.

Returns:

The torch.device appropriate to the available hardware, and the device type selected as a string, e.g. “cpu”, “cuda”.

Return type:

tuple[torch.device, str]
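
A quick usage sketch:

    import torch

    from recipes.cad1.task1.baseline.enhance import get_device

    device, device_type = get_device("cpu")
    assert isinstance(device, torch.device)
    assert device_type == "cpu"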

recipes.cad1.task1.baseline.enhance.map_to_dict(sources: ndarray, sources_list: list[str]) → dict [source]

Map sources to a dictionary separating audio into left and right channels.

Parameters:
  • sources (ndarray) – Signal to be mapped to dictionary.

  • sources_list (list) – List of strings used to index dictionary.

Returns:

A dictionary of separated source audio split into channels.

Return type:

dict
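
A toy sketch; the input shape and the exact stem key naming are assumptions about the baseline's conventions.

    import numpy as np

    from recipes.cad1.task1.baseline.enhance import map_to_dict

    sources = np.random.randn(4, 2, 44100)  # assumed shape: (n_sources, channels, samples)
    stems = map_to_dict(sources, ["drums", "bass", "other", "vocals"])
    # Expect eight entries, one per source and channel
    # (exact key names, e.g. "left_drums"/"right_drums", are an assumption).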

recipes.cad1.task1.baseline.enhance.process_stems_for_listener(stems: dict, enhancer: NALR, compressor: Compressor, listener: Listener, apply_compressor: bool = False) → dict [source]

Process the stems from sources.

Parameters:
  • stems (dict) – Dictionary of stems

  • enhancer (NALR) – NAL-R prescription hearing aid

  • compressor (Compressor) – Compressor

  • listener (Listener) – Listener object.

  • apply_compressor (bool) – Whether to apply the compressor

Returns:

Dictionary of processed stems

Return type:

dict
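
A minimal sketch, assuming pyclarity's NALR, Compressor and Listener APIs; constructor arguments and stem key names are assumptions rather than the baseline's exact configuration.

    import numpy as np

    from clarity.enhancer.compressor import Compressor
    from clarity.enhancer.nalr import NALR
    from clarity.utils.audiogram import Audiogram, Listener

    from recipes.cad1.task1.baseline.enhance import process_stems_for_listener

    sample_rate = 44100
    audiogram = Audiogram(
        levels=np.array([30, 35, 40, 45, 50, 55, 60, 65]),
        frequencies=np.array([250, 500, 1000, 2000, 3000, 4000, 6000, 8000]),
    )
    listener = Listener(audiogram_left=audiogram, audiogram_right=audiogram)
    enhancer = NALR(nfir=220, sample_rate=sample_rate)  # assumed settings
    compressor = Compressor(fs=sample_rate)             # assumed settings

    # Stand-in stems; in the pipeline these come from decompose_signal.
    stems = {
        f"{side}_{source}": np.random.randn(sample_rate)
        for side in ("left", "right")
        for source in ("drums", "bass", "other", "vocals")
    }
    processed_stems = process_stems_for_listener(stems, enhancer, compressor, listener)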

recipes.cad1.task1.baseline.enhance.remix_signal(stems: dict) → ndarray [source]

Remix the eight stems into a stereo signal.

Parameters:

stems (dict) – Dictionary of stems

Returns:

Remixed signal

Return type:

ndarray
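
A toy sketch of the remix step; the stem key names and shapes are assumptions about the dictionary produced upstream.

    import numpy as np

    from recipes.cad1.task1.baseline.enhance import remix_signal

    n_samples = 44100
    stems = {
        f"{side}_{source}": np.random.randn(n_samples)
        for side in ("left", "right")
        for source in ("drums", "bass", "other", "vocals")
    }
    remixed = remix_signal(stems)  # expected: a stereo ndarray combining the eight stems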

recipes.cad1.task1.baseline.enhance.save_flac_signal(signal: ndarray, filename: Path, signal_sample_rate: int, output_sample_rate: int, do_clip_signal: bool = False, do_soft_clip: bool = False, do_scale_signal: bool = False) → None [source]

Save an output signal to a FLAC file.

  • The output signal will be resampled to output_sample_rate.

  • The output signal will be clipped to [-1, 1] if do_clip_signal is True,

    using soft clipping if do_soft_clip is also True. If do_clip_signal is False, do_soft_clip is ignored.

  • The output signal will be scaled to [-1, 1] if do_scale_signal is True.

    If the signal is scaled, the scale factor will be saved in a TXT file. If do_clip_signal is True, do_scale_signal is ignored.

  • The output signal will be saved as a FLAC file.

Parameters:
  • signal (np.ndarray) – Signal to save

  • filename (Path) – Path to save signal

  • signal_sample_rate (int) – Sample rate of the input signal

  • output_sample_rate (int) – Sample rate of the output signal

  • do_clip_signal (bool) – Whether to clip signal

  • do_soft_clip (bool) – Whether to apply soft clipping

  • do_scale_signal (bool) – Whether to scale signal
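
A sketch of saving a processed signal; the file path, sample rates and signal shape are illustrative assumptions.

    from pathlib import Path

    import numpy as np

    from recipes.cad1.task1.baseline.enhance import save_flac_signal

    signal = 0.5 * np.random.randn(44100, 2)  # 1 s of stereo noise (shape assumed)
    save_flac_signal(
        signal=signal,
        filename=Path("remixed_signal.flac"),
        signal_sample_rate=44100,
        output_sample_rate=32000,
        do_clip_signal=True,
        do_soft_clip=True,
    )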

recipes.cad1.task1.baseline.enhance.separate_sources(model: torch.nn.Module, mix: torch.Tensor | ndarray, sample_rate: int, segment: float = 10.0, overlap: float = 0.1, device: torch.device | str | None = None)[source]

Apply the model to a given mixture. The mixture is processed in overlapping segments, a fade is applied at the segment boundaries, and the separated segments are added back together to reconstruct the full-length output.

Parameters:
  • model (torch.nn.Module) – model to use for separation

  • mix (torch.Tensor) – mixture to separate, shape (batch, channels, time)

  • sample_rate (int) – sampling rate of the mixture

  • segment (float) – segment length in seconds

  • overlap (float) – overlap between segments, between 0 and 1

  • device (torch.device, str, or None) – if provided, device on which to execute the computation, otherwise mix.device is assumed. When device is different from mix.device, only local computations will be on device, while the entire tracks will be stored on mix.device.

Returns:

estimated sources

Return type:

torch.Tensor

Based on https://pytorch.org/audio/main/tutorials/hybrid_demucs_tutorial.html
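
A sketch of calling separate_sources directly with the torchaudio HDEMUCS pipeline (the bundle name is an assumption; in the baseline this function is invoked via decompose_signal).

    import torch
    from torchaudio.pipelines import HDEMUCS_HIGH_MUSDB_PLUS

    from recipes.cad1.task1.baseline.enhance import separate_sources

    bundle = HDEMUCS_HIGH_MUSDB_PLUS
    model = bundle.get_model()

    mix = torch.randn(1, 2, 30 * bundle.sample_rate)  # (batch, channels, time), 30 s placeholder
    sources = separate_sources(model, mix, bundle.sample_rate, segment=10.0, overlap=0.1)
    # sources: estimated stems, typically shaped (batch, n_sources, channels, time)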