recipes.cad1.task1.baseline.enhance module

Run the dummy enhancement.

recipes.cad1.task1.baseline.enhance.apply_baseline_ha(enhancer: NALR, compressor: Compressor, signal: ndarray, audiogram: Audiogram, apply_compressor: bool = False) → ndarray [source]

Apply NAL-R prescription hearing aid to a signal.

Parameters:
  • enhancer – A NALR object that enhances the signal.

  • compressor – A Compressor object that compresses the signal.

  • signal – An ndarray representing the audio signal.

  • audiogram – An Audiogram object representing the listener’s audiogram.

  • apply_compressor – A boolean indicating whether to include the compressor.

Returns:

An ndarray representing the processed signal.
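
A minimal usage sketch, assuming pyclarity's NALR, Compressor and Audiogram classes; the constructor arguments and audiogram values below are illustrative assumptions, not the baseline's exact configuration.

    # Illustrative sketch only: NALR/Compressor settings and audiogram values
    # are assumptions, not the baseline configuration.
    import numpy as np
    from clarity.enhancer.compressor import Compressor
    from clarity.enhancer.nalr import NALR
    from clarity.utils.audiogram import Audiogram

    from recipes.cad1.task1.baseline.enhance import apply_baseline_ha

    sample_rate = 44100
    enhancer = NALR(nfir=220, sample_rate=sample_rate)  # assumed settings
    compressor = Compressor(fs=sample_rate)             # assumed settings
    audiogram = Audiogram(
        levels=np.array([30, 35, 40, 45, 50, 55, 60, 65]),
        frequencies=np.array([250, 500, 1000, 2000, 3000, 4000, 6000, 8000]),
    )
    signal = np.random.randn(sample_rate)  # 1 s of noise as a stand-in for one channel

    processed = apply_baseline_ha(enhancer, compressor, signal, audiogram)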

recipes.cad1.task1.baseline.enhance.decompose_signal(model: Module, model_sample_rate: int, signal: ndarray, signal_sample_rate: int, device: device, sources_list: list[str], listener: Listener, normalise: bool = True) → dict[str, ndarray] [source]

Decompose signal into 8 stems.

The left and right audiograms are ignored by the baseline system as it is not performing personalised decomposition. Instead, it performs a standard music decomposition using the HDEMUCS model trained on the MUSDB18 dataset.

Parameters:
  • model (torch.nn.Module) – Torch model.

  • model_sample_rate (int) – Sample rate of the model.

  • signal (ndarray) – Signal to be decomposed.

  • signal_sample_rate (int) – Sample rate of the input signal.

  • device (torch.device) – Torch device to use for processing.

  • sources_list (list) – List of strings used to index dictionary.

  • listener (Listener) – Listener object.

  • normalise (bool) – Whether to normalise the signal.

Returns:

Dictionary of the separated stems.

Return type:

dict[str, ndarray]
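
A sketch of a typical call, assuming the torchaudio HDEMUCS_HIGH_MUSDB_PLUS pipeline; the Listener construction, the signal shape and the source order shown are assumptions.

    import numpy as np
    import torch
    from torchaudio.pipelines import HDEMUCS_HIGH_MUSDB_PLUS

    from clarity.utils.audiogram import Audiogram, Listener  # assumed import path
    from recipes.cad1.task1.baseline.enhance import decompose_signal

    bundle = HDEMUCS_HIGH_MUSDB_PLUS
    model = bundle.get_model()
    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
    model.to(device)

    # Audiograms are ignored by the baseline decomposition, but a Listener is required.
    audiogram = Audiogram(
        levels=np.array([30, 35, 40, 45, 50, 55, 60, 65]),
        frequencies=np.array([250, 500, 1000, 2000, 3000, 4000, 6000, 8000]),
    )
    listener = Listener(audiogram_left=audiogram, audiogram_right=audiogram)

    signal = np.random.randn(2, 10 * 44100)  # 10 s stereo placeholder (shape assumed)
    stems = decompose_signal(
        model=model,
        model_sample_rate=bundle.sample_rate,
        signal=signal,
        signal_sample_rate=44100,
        device=device,
        sources_list=["drums", "bass", "other", "vocals"],
        listener=listener,
        normalise=True,
    )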

recipes.cad1.task1.baseline.enhance.enhance(config: DictConfig) → None [source]

Run the music enhancement. The system decomposes the music into vocal, drums, bass, and other stems, then applies the NAL-R prescription procedure to each stem.

Parameters:

config (DictConfig) – Dictionary of configuration options for enhancing music.

Returns 8 stems for each song:
  • left channel vocal, drums, bass, and other stems

  • right channel vocal, drums, bass, and other stems
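
The function is normally driven by Hydra from the command line; the sketch below, with an assumed config path, shows an equivalent programmatic call (loading the YAML directly may not resolve Hydra defaults or interpolations).

    from omegaconf import OmegaConf

    from recipes.cad1.task1.baseline.enhance import enhance

    # Assumed location of the baseline recipe configuration.
    config = OmegaConf.load("recipes/cad1/task1/baseline/config.yaml")
    enhance(config)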

recipes.cad1.task1.baseline.enhance.get_device(device: str) → tuple [source]

Get the Torch device.

Parameters:

device (str) – device type, e.g. “cpu”, “gpu0”, “gpu1”, etc.

Returns:

The torch.device appropriate to the available hardware, and the device type selected as a string, e.g. “cpu”, “cuda”.

Return type:

tuple[torch.device, str]
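
A quick usage sketch:

    import torch

    from recipes.cad1.task1.baseline.enhance import get_device

    device, device_type = get_device("cpu")
    assert isinstance(device, torch.device)
    assert device_type == "cpu"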

recipes.cad1.task1.baseline.enhance.map_to_dict(sources: ndarray, sources_list: list[str]) → dict [source]

Map sources to a dictionary separating audio into left and right channels.

Parameters:
  • sources (ndarray) – Signal to be mapped to dictionary.

  • sources_list (list) – List of strings used to index dictionary.

Returns:

A dictionary of separated source audio split into channels.

Return type:

dict
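
A toy sketch; the input shape and the exact stem key naming are assumptions about the baseline's conventions.

    import numpy as np

    from recipes.cad1.task1.baseline.enhance import map_to_dict

    sources = np.random.randn(4, 2, 44100)  # assumed shape: (n_sources, channels, samples)
    stems = map_to_dict(sources, ["drums", "bass", "other", "vocals"])
    # Expect eight entries, one per source and channel
    # (exact key names, e.g. "left_drums"/"right_drums", are an assumption).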

recipes.cad1.task1.baseline.enhance.process_stems_for_listener(stems: dict, enhancer: NALR, compressor: Compressor, listener: Listener, apply_compressor: bool = False) → dict [source]

Process the stems from sources.

Parameters:
  • stems (dict) – Dictionary of stems

  • enhancer (NALR) – NAL-R prescription hearing aid

  • compressor (Compressor) – Compressor

  • listener (Listener) – Listener object.

  • apply_compressor (bool) – Whether to apply the compressor

Returns:

Dictionary of processed stems

Return type:

dict
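
A minimal sketch, assuming pyclarity's NALR, Compressor and Listener APIs; constructor arguments and stem key names are assumptions rather than the baseline's exact configuration.

    import numpy as np

    from clarity.enhancer.compressor import Compressor
    from clarity.enhancer.nalr import NALR
    from clarity.utils.audiogram import Audiogram, Listener

    from recipes.cad1.task1.baseline.enhance import process_stems_for_listener

    sample_rate = 44100
    audiogram = Audiogram(
        levels=np.array([30, 35, 40, 45, 50, 55, 60, 65]),
        frequencies=np.array([250, 500, 1000, 2000, 3000, 4000, 6000, 8000]),
    )
    listener = Listener(audiogram_left=audiogram, audiogram_right=audiogram)
    enhancer = NALR(nfir=220, sample_rate=sample_rate)  # assumed settings
    compressor = Compressor(fs=sample_rate)             # assumed settings

    # Stand-in stems; in the pipeline these come from decompose_signal.
    stems = {
        f"{side}_{source}": np.random.randn(sample_rate)
        for side in ("left", "right")
        for source in ("drums", "bass", "other", "vocals")
    }
    processed_stems = process_stems_for_listener(stems, enhancer, compressor, listener)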

recipes.cad1.task1.baseline.enhance.remix_signal(stems: dict) → ndarray [source]

Remix the eight stems into a stereo signal.

Parameters:

stems (dict) – Dictionary of stems

Returns:

Remixed signal

Return type:

ndarray
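
A toy sketch of the remix step; the stem key names and shapes are assumptions about the dictionary produced upstream.

    import numpy as np

    from recipes.cad1.task1.baseline.enhance import remix_signal

    n_samples = 44100
    stems = {
        f"{side}_{source}": np.random.randn(n_samples)
        for side in ("left", "right")
        for source in ("drums", "bass", "other", "vocals")
    }
    remixed = remix_signal(stems)  # expected: a stereo ndarray combining the eight stems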

recipes.cad1.task1.baseline.enhance.save_flac_signal(signal: ndarray, filename: Path, signal_sample_rate: int, output_sample_rate: int, do_clip_signal: bool = False, do_soft_clip: bool = False, do_scale_signal: bool = False) → None [source]

Save an output signal to a FLAC file.

  • The output signal will be resampled to output_sample_rate.

  • The output signal will be clipped to [-1, 1] if do_clip_signal is True,

    using soft clipping if do_soft_clip is also True. If do_clip_signal is False, do_soft_clip is ignored.

  • The output signal will be scaled to [-1, 1] if do_scale_signal is True.

    If the signal is scaled, the scale factor will be saved in a TXT file. If do_clip_signal is True, do_scale_signal is ignored.

  • The output signal will be saved as a FLAC file.

Parameters:
  • signal (np.ndarray) – Signal to save

  • filename (Path) – Path to save signal

  • signal_sample_rate (int) – Sample rate of the input signal

  • output_sample_rate (int) – Sample rate of the output signal

  • do_clip_signal (bool) – Whether to clip signal

  • do_soft_clip (bool) – Whether to apply soft clipping

  • do_scale_signal (bool) – Whether to scale signal
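
A sketch of saving a processed signal; the file path, sample rates and signal shape are illustrative assumptions.

    from pathlib import Path

    import numpy as np

    from recipes.cad1.task1.baseline.enhance import save_flac_signal

    signal = 0.5 * np.random.randn(44100, 2)  # 1 s of stereo noise (shape assumed)
    save_flac_signal(
        signal=signal,
        filename=Path("remixed_signal.flac"),
        signal_sample_rate=44100,
        output_sample_rate=32000,
        do_clip_signal=True,
        do_soft_clip=True,
    )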

recipes.cad1.task1.baseline.enhance.separate_sources(model: torch.nn.Module, mix: torch.Tensor | ndarray, sample_rate: int, segment: float = 10.0, overlap: float = 0.1, device: torch.device | str | None = None)[source]

Apply the model to a given mixture. The mixture is processed in overlapping segments, a fade is applied at the segment boundaries, and the separated segments are added back together to reconstruct the full-length output.

Parameters:
  • model (torch.nn.Module) – model to use for separation

  • mix (torch.Tensor) – mixture to separate, shape (batch, channels, time)

  • sample_rate (int) – sampling rate of the mixture

  • segment (float) – segment length in seconds

  • overlap (float) – overlap between segments, between 0 and 1

  • device (torch.device, str, or None) – if provided, device on which to execute the computation, otherwise mix.device is assumed. When device is different from mix.device, only local computations will be on device, while the entire tracks will be stored on mix.device.

Returns:

estimated sources

Return type:

torch.Tensor

Based on https://pytorch.org/audio/main/tutorials/hybrid_demucs_tutorial.html
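
A sketch of calling separate_sources directly with the torchaudio HDEMUCS pipeline (the bundle name is an assumption; in the baseline this function is invoked via decompose_signal).

    import torch
    from torchaudio.pipelines import HDEMUCS_HIGH_MUSDB_PLUS

    from recipes.cad1.task1.baseline.enhance import separate_sources

    bundle = HDEMUCS_HIGH_MUSDB_PLUS
    model = bundle.get_model()

    mix = torch.randn(1, 2, 30 * bundle.sample_rate)  # (batch, channels, time), 30 s placeholder
    sources = separate_sources(model, mix, bundle.sample_rate, segment=10.0, overlap=0.1)
    # sources: estimated stems, typically shaped (batch, n_sources, channels, time)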