clarity.utils package¶
Subpackages¶
- clarity.utils.car_noise_simulator package
- Submodules
- clarity.utils.car_noise_simulator.carnoise_parameters_generator module
- clarity.utils.car_noise_simulator.carnoise_signal_generator module
CarNoiseSignalGenerator
CarNoiseSignalGenerator.FINAl_MULTIPLIER
CarNoiseSignalGenerator.REFERENCE_CONSTANT_DB
CarNoiseSignalGenerator.apply_commonness()
CarNoiseSignalGenerator.generate_car_noise()
CarNoiseSignalGenerator.generate_engine_noise()
CarNoiseSignalGenerator.generate_source_noise()
CarNoiseSignalGenerator.get_bump_params()
CarNoiseSignalGenerator.get_engine_params()
- Module contents
Submodules¶
clarity.utils.audiogram module¶
Dataclass to represent a monaural audiogram
- class clarity.utils.audiogram.Audiogram(levels: ~numpy.ndarray, frequencies: ~numpy.ndarray = <factory>)[source]¶
Bases:
object
Dataclass to represent an audiogram.
- levels¶
The audiometric levels in dB HL
- Type:
ndarray
- frequencies¶
The frequencies at which the levels are measured
- Type:
ndarray
- frequencies: ndarray¶
- has_frequencies(frequencies: ndarray) bool [source]¶
Check if the audiogram has the given frequencies.
- Parameters:
frequencies (ndarray) – The frequencies to check
- Returns:
True if the audiogram has the given frequencies
- Return type:
bool
- levels: ndarray¶
- resample(new_frequencies: ndarray, linear_frequency: bool = False) Audiogram [source]¶
Resample the audiogram to a new set of frequencies.
Interpolates linearly on a (by default) log frequency axis. If linear_frequency is set to True then interpolation is done on a linear frequency axis.
- Parameters:
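new_frequencies (ndarray) – The new frequencies to resample to
linear_frequency (bool) – If True, interpolate on a linear frequency axis (default: False)
- Returns:
New audiogram with resampled frequencies
- Return type:
Audiogram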
- property severity: str¶
Categorise HL severity level for the audiogram.
Note that this categorisation is different from that of the British Society of Audiology, which recommends descriptors mild, moderate, severe and profound for average hearing threshold levels at 250, 500, 1000, 2000 and 4000 Hz of 21-40 dB HL, 41-70 dB HL, 71-95 dB HL and > 95 dB HL, respectively (BSA Pure-tone air-conduction and bone-conduction threshold audiometry with and without masking 2018).
- Returns:
str – severity level, one of SEVERE, MODERATE, MILD, NOTHING
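The interpolation described for resample() can be sketched with NumPy alone. This is a minimal illustration, not the library's implementation: the helper name interpolate_levels and the example audiogram values are hypothetical, and np.interp is assumed as the interpolator (it needs the measured frequencies in increasing order and holds the edge values outside the measured range).

```python
import numpy as np


def interpolate_levels(levels, frequencies, new_frequencies, linear_frequency=False):
    """Hypothetical helper: interpolate audiogram levels onto new frequencies."""
    levels = np.asarray(levels, dtype=float)
    frequencies = np.asarray(frequencies, dtype=float)
    new_frequencies = np.asarray(new_frequencies, dtype=float)
    if linear_frequency:
        # Linear interpolation directly on the frequency axis (Hz).
        return np.interp(new_frequencies, frequencies, levels)
    # Linear interpolation on a log frequency axis (the documented default).
    return np.interp(np.log(new_frequencies), np.log(frequencies), levels)


# Made-up audiogram: levels in dB HL at four frequencies in Hz.
frequencies = np.array([250.0, 500.0, 1000.0, 2000.0])
levels = np.array([10.0, 20.0, 30.0, 40.0])

# The geometric mean of 500 and 1000 Hz is halfway between them on a log axis.
midpoint = np.sqrt(500.0 * 1000.0)
log_axis = interpolate_levels(levels, frequencies, [midpoint])
linear_axis = interpolate_levels(levels, frequencies, [midpoint], linear_frequency=True)
```

On the log axis the interpolated level sits exactly halfway between the 500 Hz and 1000 Hz levels, whereas on the linear axis the same frequency is closer to 500 Hz and the level is correspondingly lower.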
- class clarity.utils.audiogram.Listener(audiogram_left: Audiogram, audiogram_right: Audiogram, id: str = '')[source]¶
Bases:
object
Dataclass to represent a Listener.
The listener is currently defined by their left and right ear audiogram. In later versions, this may be extended to include further audiometric data.
The class provides methods for reading metadata files which will also include some basic validation.
- id¶
The ID of the listener
- Type:
str
- static from_dict(listener_dict: dict) Listener [source]¶
Create a Listener from a dict.
The dict structure and fields are based on those used in the Clarity metadata files.
- Parameters:
listener_dict (dict) – The listener dict
- Returns:
The listener
- Return type:
Listener
- id: str = ''¶
- static load_listener_dict(filename: Path) dict[str, Listener] [source]¶
Read a Clarity Listener dict file.
The standard Clarity metadata files present listeners as a dictionary of listeners, keyed by listener ID.
- Parameters:
filename (Path) – The path to the listener dict file
- Returns:
A dict of listeners keyed by id
- Return type:
dict[str, Listener]
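As a rough illustration of a dictionary of listeners keyed by listener ID, the sketch below writes and reads such a file with the standard library only. The field names ("name", "audiogram_cfs", "audiogram_levels_l", "audiogram_levels_r"), the listener ID and the length check are assumptions made for this example; this page does not specify the file schema or the validation that the library performs.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical listener file content. All field names are assumptions.
listeners = {
    "L0001": {
        "name": "L0001",
        "audiogram_cfs": [250, 500, 1000, 2000, 4000, 8000],
        "audiogram_levels_l": [10, 15, 20, 30, 40, 50],
        "audiogram_levels_r": [15, 15, 25, 35, 45, 55],
    }
}


def load_listener_records(filename):
    """Hypothetical loader: read the dict and check each ear has one level per frequency."""
    with open(filename, encoding="utf-8") as fp:
        records = json.load(fp)
    for listener_id, record in records.items():
        n_frequencies = len(record["audiogram_cfs"])
        for key in ("audiogram_levels_l", "audiogram_levels_r"):
            if len(record[key]) != n_frequencies:
                raise ValueError(f"{listener_id}: {key} does not match audiogram_cfs")
    return records


with tempfile.TemporaryDirectory() as tmp_dir:
    path = Path(tmp_dir) / "listeners.json"
    path.write_text(json.dumps(listeners), encoding="utf-8")
    loaded = load_listener_records(path)
```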
clarity.utils.file_io module¶
File I/O functions.
- clarity.utils.file_io.read_jsonl(filename: str) list [source]¶
Read a jsonl file into a list of dictionaries.
- clarity.utils.file_io.read_signal(filename: str | Path, sample_rate: float = 0.0, offset: int | float = 0, n_samples: int = -1, n_channels: int = 0, offset_is_samples: bool = False, allow_resample: bool = True) ndarray [source]¶
Read a wav format audio file and return as numpy array of floats.
The returned value will be a numpy array of floats of shape (n_samples, n_channels) for n_channels >= 2, and shape (n_samples,) for n_channels = 1.
If n_samples is set to a value other than -1, at most that number of samples will be read, stopping early if the end of the file is reached. An ‘offset’ can be set to start reading from a specified sample or time (in seconds).
The expected number of channels can be specified. If the file has a different number of channels, an error will be raised.
The expected sample rate can be specified. If the file has a different sample rate, the file will be resampled to the expected sample rate, unless ‘allow_resample’ is set to False, in which case an error will be raised.
- Parameters:
filename (str|Path) – Name of file to read
sample_rate (float) – The expected sample rate (default: 0.0 = any rate OK)
offset (int | float, optional) – Offset in samples or seconds (from start). Default is 0.
n_samples (int) – Number of samples to read (default: -1 = read to the end of the file)
n_channels (int) – expected number of channels (default: 0 = any number OK)
offset_is_samples (bool) – whether the offset is measured in samples (True) or seconds (False) (default: False)
allow_resample (bool) – allow resampling if sample rate is different from expected rate. Else raise error (default: True)
- Returns:
audio signal
- Return type:
np.ndarray
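The way offset, offset_is_samples and n_samples interact can be sketched on an in-memory array. This is an illustration under assumptions, not the library's code: select_samples is a hypothetical helper, and converting a time offset with int(offset * sample_rate) is assumed rather than documented.

```python
import numpy as np


def select_samples(signal, sample_rate, offset=0, n_samples=-1, offset_is_samples=False):
    """Hypothetical helper: pick the samples that a read with these options would cover."""
    # An offset in seconds is converted to a sample index (assumed to truncate).
    start = int(offset) if offset_is_samples else int(offset * sample_rate)
    # -1 means "read to the end"; slicing stops at the end of the data anyway.
    stop = None if n_samples == -1 else start + n_samples
    return signal[start:stop]


sample_rate = 16000
signal = np.arange(2 * sample_rate, dtype=float)  # two seconds of mono "audio"

by_time = select_samples(signal, sample_rate, offset=0.5, n_samples=4)
by_index = select_samples(signal, sample_rate, offset=8000, n_samples=4, offset_is_samples=True)
# Asking for more samples than remain returns only what is left.
tail = select_samples(signal, sample_rate, offset=31998, n_samples=100, offset_is_samples=True)
```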
- clarity.utils.file_io.write_jsonl(filename: str, records: list) None [source]¶
Write a list of dictionaries to a jsonl file.
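A jsonl file holds one JSON object per line. The round trip described for read_jsonl and write_jsonl can be sketched with the standard library as follows. The function names are hypothetical stand-ins, and the choice to overwrite the file (rather than append to it) is an assumption of this sketch, not a statement about the library.

```python
import json
import tempfile
from pathlib import Path


def write_jsonl_sketch(filename, records):
    """Write each dictionary as one JSON object per line (overwrites the file)."""
    with open(filename, "w", encoding="utf-8") as fp:
        for record in records:
            fp.write(json.dumps(record) + "\n")


def read_jsonl_sketch(filename):
    """Read a jsonl file back into a list of dictionaries, skipping blank lines."""
    with open(filename, encoding="utf-8") as fp:
        return [json.loads(line) for line in fp if line.strip()]


records = [{"scene": "S0001", "score": 0.75}, {"scene": "S0002", "score": 0.5}]

with tempfile.TemporaryDirectory() as tmp_dir:
    path = Path(tmp_dir) / "results.jsonl"
    write_jsonl_sketch(path, records)
    n_lines = len(path.read_text(encoding="utf-8").splitlines())
    round_trip = read_jsonl_sketch(path)
```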
- clarity.utils.file_io.write_signal(filename: str | Path, signal: ndarray, sample_rate: float, floating_point: bool = True, strict: bool = False) None [source]¶
Write a signal as fixed or floating point wav file.
Signals are passed as numpy arrays of floats of shape (n_samples, n_channels) for n_channels >= 1 or (n_samples,) for n_channels = 1.
Signals are floating point in the range [-1.0, 1.0) but can be written as a wav file with either 16 bit integers or floating point. In the former, the signals are scaled to map to the range -32768 to 32767 and clipped if necessary.
NB: setting ‘strict’ to True will raise error on overflow. This would be a more natural default but it would break existing code that did not check for overflow.
- Parameters:
filename (str|Path) – name of file to write to.
signal (ndarray) – signal to write.
sample_rate (float) – sampling frequency.
floating_point (bool) – write as floating point, else as 16 bit ints (default: True).
strict (bool) – raise error if signal out of range for int16 (default: False).
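The 16 bit integer path, including the effect of ‘strict’, can be sketched as below. This is a sketch under assumptions rather than the library's code: to_int16 is a hypothetical helper, and scaling by 32768 with truncation on conversion is assumed.

```python
import numpy as np


def to_int16(signal, strict=False):
    """Hypothetical helper: map floats in [-1.0, 1.0) to int16, clipping on overflow."""
    scaled = np.asarray(signal, dtype=float) * 32768.0  # assumed scale factor
    out_of_range = (scaled < -32768) | (scaled > 32767)
    if strict and out_of_range.any():
        raise ValueError("signal out of range for int16")
    return np.clip(scaled, -32768, 32767).astype(np.int16)


# 1.0 lies just outside [-1.0, 1.0), so it is clipped to 32767 when strict is False.
pcm = to_int16(np.array([-1.0, 0.0, 0.5, 1.0]))
```

With strict=True the same input raises ValueError instead of being clipped silently.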
clarity.utils.flac_encoder module¶
Class for encoding and decoding audio signals using flac compression.
- class clarity.utils.flac_encoder.FileDecoder(input_file: Path, output_file: Path | None = None)[source]¶
Bases:
FileDecoder
- process() tuple[ndarray, int] [source]¶
Overwritten version of the process method from the pyflac decoder. Original process returns stereo signals in float64 format.
In this version, the data is returned using the original number of channels and in int16 format.
- Returns:
A tuple of the decoded numpy audio array, and the sample rate of the audio data.
- Return type:
(tuple)
- Raises:
DecoderProcessException – if any fatal read, write, or memory allocation error occurred (meaning decoding must stop)
- class clarity.utils.flac_encoder.FlacEncoder(compression_level: int = 5)[source]¶
Bases:
object
Class for encoding and decoding audio signals using FLAC.
It uses the pyflac library to encode and decode the audio data, and offers convenient methods for doing so.
- static decode(input_filename: Path | str) tuple[np.ndarray, float] [source]¶
Method to decode a flac file to wav audio data.
It uses the pyflac library to decode the flac file.
- Parameters:
input_filename (pathlib.Path | str) – Path to the input FLAC file.
- Returns:
The raw audio data and its sample rate.
- Return type:
tuple[np.ndarray, float]
- Raises:
FileNotFoundError – If the flac file to decode does not exist.
- encode(signal: np.ndarray, sample_rate: int, output_file: str | Path | None = None) bytes [source]¶
Method to encode the audio data using FLAC compressor.
It creates a WavEncoder object and uses it to encode the audio data.
- Parameters:
signal (np.ndarray) – The raw audio data to be compressed.
sample_rate (int) – The sample rate of the audio data.
output_file (str | Path) – Path to where to save the output FLAC file. If not specified, a temporary file will be created.
- Returns:
The FLAC encoded audio signal.
- Return type:
(bytes)
- Raises:
ValueError – If the audio signal is not in np.int16 format.
- class clarity.utils.flac_encoder.WavEncoder(signal: np.ndarray, sample_rate: int, output_file: str | Path | None = None, compression_level: int = 5, blocksize: int = 0, streamable_subset: bool = True, verify: bool = False)[source]¶
Bases:
_Encoder
Class offers an adaptation of the pyflac.encoder.FileEncoder to work directly with WAV signals as input.
- clarity.utils.flac_encoder.read_flac_signal(filename: Path) tuple[ndarray, float] [source]¶
Read a FLAC signal and return it as a numpy array
- Parameters:
filename (Path) – The path to the FLAC file to read.
- Returns:
signal (np.ndarray) – The decoded signal.
sample_rate (float) – The sample rate of the signal.
- Return type:
tuple[np.ndarray, float]
- clarity.utils.flac_encoder.save_flac_signal(signal: np.ndarray, filename: Path, signal_sample_rate: int, output_sample_rate: int | None = None, do_clip_signal: bool = False, do_soft_clip: bool = False, do_scale_signal: bool = False) None [source]¶
Function to save output signals.
- The output signal will be resampled to output_sample_rate. If output_sample_rate is None, the output signal will have the same sample rate as the input signal.
- The output signal will be clipped to [-1, 1] if do_clip_signal is True, and soft clipped if do_soft_clip is also True. Note that if do_clip_signal is False, do_soft_clip will be ignored. Note that if do_clip_signal is True, do_scale_signal will be ignored.
- The output signal will be scaled to [-1, 1] if do_scale_signal is True. If the signal is scaled, the scale factor will be saved in a TXT file. Note that if do_clip_signal is True, do_scale_signal will be ignored.
The output signal will be saved as a FLAC file.
- Parameters:
signal (np.ndarray) – Signal to save
filename (Path) – Path to save signal
signal_sample_rate (int) – Sample rate of the input signal
output_sample_rate (int | None) – Sample rate of the output signal (default: None = same as the input signal)
do_clip_signal (bool) – Whether to clip signal
do_soft_clip (bool) – Whether to apply soft clipping
do_scale_signal (bool) – Whether to scale signal
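The precedence between the three flags can be summarised in a short sketch. It only illustrates the decision logic described above and is not the library's implementation: prepare_signal is a hypothetical helper, np.tanh stands in for soft clipping, and peak normalisation stands in for scaling; both are assumptions.

```python
import numpy as np


def prepare_signal(signal, do_clip_signal=False, do_soft_clip=False, do_scale_signal=False):
    """Hypothetical helper: apply clipping or scaling with the documented precedence."""
    signal = np.asarray(signal, dtype=float)
    scale_factor = None
    if do_clip_signal:
        # Clipping takes precedence: do_scale_signal is ignored on this branch.
        if do_soft_clip:
            signal = np.tanh(signal)  # assumed soft clipper
        else:
            signal = np.clip(signal, -1.0, 1.0)
    elif do_scale_signal:
        # Assumed scaling: divide by the peak and report the factor used.
        peak = float(np.max(np.abs(signal)))
        if peak > 0:
            scale_factor = peak
            signal = signal / peak
    # do_soft_clip on its own has no effect, because do_clip_signal is False.
    return signal, scale_factor


loud = np.array([0.5, -2.0, 1.5])
clipped, clip_factor = prepare_signal(loud, do_clip_signal=True, do_scale_signal=True)
scaled, scale_factor = prepare_signal(loud, do_scale_signal=True)
untouched, _ = prepare_signal(loud, do_soft_clip=True)
```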
clarity.utils.results_support module¶
Dataclass to save challenge results to a CSV file.
- class clarity.utils.results_support.ResultsFile(file_name: str | Path, header_columns: list[str], append_results: bool = False)[source]¶
Bases:
object
A utility class for writing results to a CSV file.
- file_name¶
The name of the file to write results to.
- Type:
str | Path
- header_columns¶
The columns to write to the CSV file.
- Type:
list[str]
- append_results¶
Whether to append results to an existing file. If False, a new file will be created and the header row will be written. Defaults to False.
- Type:
bool
- add_result(row_values: dict[str, str | float])[source]¶
Add a result to the CSV file.
- Parameters:
row_values (dict[str, str | float]) – The values to write to the CSV file.
- append_results: bool = False¶
- file_name: str | Path¶
- header_columns: list[str]¶
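The behaviour described for ResultsFile (write a header unless appending, then add one row per result) can be approximated with the standard csv module. ResultsFileSketch below is a hypothetical stand-in written for illustration; the column names and values are made up, and details such as how the real class orders or formats values are assumptions.

```python
import csv
import tempfile
from pathlib import Path


class ResultsFileSketch:
    """Hypothetical stand-in: append rows of results to a CSV file."""

    def __init__(self, file_name, header_columns, append_results=False):
        self.file_name = Path(file_name)
        self.header_columns = header_columns
        if not append_results:
            # Start a new file and write the header row.
            with open(self.file_name, "w", newline="", encoding="utf-8") as fp:
                csv.writer(fp).writerow(header_columns)

    def add_result(self, row_values):
        # Write the values in header order, one row per call.
        with open(self.file_name, "a", newline="", encoding="utf-8") as fp:
            csv.writer(fp).writerow([row_values[col] for col in self.header_columns])


columns = ["scene", "listener", "score"]
with tempfile.TemporaryDirectory() as tmp_dir:
    path = Path(tmp_dir) / "scores.csv"
    results = ResultsFileSketch(path, columns)
    results.add_result({"scene": "S0001", "listener": "L0001", "score": 0.75})
    # Reopening with append_results=True keeps the existing header and rows.
    resumed = ResultsFileSketch(path, columns, append_results=True)
    resumed.add_result({"scene": "S0002", "listener": "L0001", "score": 0.5})
    with open(path, newline="", encoding="utf-8") as fp:
        rows = list(csv.reader(fp))
```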
clarity.utils.signal_processing module¶
Signal processing utilities.
- clarity.utils.signal_processing.clip_signal(signal: numpy.ndarray, soft_clip: bool = False) tuple[numpy.ndarray, int] [source]¶
Clip the signal.
- Parameters:
signal (np.ndarray) – Signal to be clipped.
soft_clip (bool) – Whether to use soft clipping.
- Returns:
signal (np.ndarray) – Clipped signal.
n_clipped (int) – Number of samples clipped.
- Return type:
tuple[np.ndarray, int]
- clarity.utils.signal_processing.compute_rms(signal: numpy.ndarray) float [source]¶
Compute RMS of signal
- Parameters:
signal – Signal to compute RMS of.
- Returns:
RMS of the signal.
- Return type:
float
- clarity.utils.signal_processing.denormalize_signals(sources: numpy.ndarray, ref: numpy.ndarray) numpy.ndarray [source]¶
Scale signals back to the original scale.
- Parameters:
sources (ndarray) – Source to be scaled.
ref (ndarray) – Original sources to be used for reverting scaling.
- Returns:
Signal rescaled back to its original.
- Return type:
ndarray
- clarity.utils.signal_processing.normalize_signal(signal: numpy.ndarray) tuple[numpy.ndarray, numpy.ndarray] [source]¶
Standardize the signal to have zero mean and unit variance.
- Parameters:
signal – The signal to be standardized.
- Returns:
The standardized signal and the reference signal.
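Standardising and then reverting the scaling form a round trip, which the sketch below illustrates with NumPy. It is not the library's code: standardize and restore_scale are hypothetical names, and using the input signal itself as the reference is an assumption made to keep the example small.

```python
import numpy as np


def standardize(signal):
    """Hypothetical helper: zero mean, unit variance, plus the reference needed to undo it."""
    signal = np.asarray(signal, dtype=float)
    reference = signal.copy()  # assumed reference: the unscaled input
    return (signal - reference.mean()) / reference.std(), reference


def restore_scale(sources, reference):
    """Hypothetical helper: undo the standardisation using the reference statistics."""
    return sources * reference.std() + reference.mean()


rng = np.random.default_rng(0)
original = 0.1 * rng.standard_normal(1000) + 0.02
standardized, reference = standardize(original)
restored = restore_scale(standardized, reference)
```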
- clarity.utils.signal_processing.resample(signal: numpy.ndarray, sample_rate: float, new_sample_rate: float, method: str = 'soxr') numpy.ndarray [source]¶
Resample a signal to a new sample rate.
This is a simple wrapper around soxr and scipy.signal.resample with the resampling expressed in terms of input and output sampling rates.
It also ensures that for multichannel signals, resampling is in the time domain, i.e. down the columns.
- Parameters:
signal – The signal to be resampled.
sample_rate – The original sample rate.
new_sample_rate – The new sample rate.
method – The resampling approach to use (default: 'soxr').
- Returns:
The resampled signal.
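Two points from the description can be shown without soxr or SciPy: the output length follows from the ratio of the two sample rates, and a multichannel signal of shape (n_samples, n_channels) is resampled down its columns. The sketch below swaps in plain linear interpolation (np.interp) purely for illustration. It applies no anti-aliasing filter, so it is not a substitute for either of the methods the function wraps, and resample_linear is a hypothetical name.

```python
import numpy as np


def resample_linear(signal, sample_rate, new_sample_rate):
    """Hypothetical stand-in: linear-interpolation resampling along the time axis (axis 0)."""
    signal = np.asarray(signal, dtype=float)
    n_in = signal.shape[0]
    n_out = int(round(n_in * new_sample_rate / sample_rate))
    old_times = np.arange(n_in) / sample_rate
    new_times = np.arange(n_out) / new_sample_rate
    if signal.ndim == 1:
        return np.interp(new_times, old_times, signal)
    # Resample each channel (column) independently.
    columns = [np.interp(new_times, old_times, signal[:, ch]) for ch in range(signal.shape[1])]
    return np.stack(columns, axis=1)


# 10 ms of stereo audio at 48 kHz: a rising ramp on the left, a falling ramp on the right.
times = np.arange(480) / 48000
stereo = np.stack([times, -times], axis=1)
resampled = resample_linear(stereo, sample_rate=48000, new_sample_rate=16000)
```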
clarity.utils.source_separation_support module¶
Module that contains functions for source separation.
- clarity.utils.source_separation_support.get_device(device: str) tuple [source]¶
Get the Torch device.
- Parameters:
device (str) – device type, e.g. “cpu”, “gpu0”, “gpu1”, etc.
- Returns:
torch.device – the torch.device() appropriate to the hardware available.
str – the device type selected, e.g. “cpu”, “cuda”.
- Return type:
tuple
- clarity.utils.source_separation_support.separate_sources(model: torch.nn.Module, mix: torch.Tensor, sample_rate: int, segment: float = 10.0, overlap: float = 0.1, device: torch.device | str | None = None)[source]¶
Apply model to a given mixture. The mixture is processed segment by segment: each segment's output is faded and the overlapping segments are added together.
- Parameters:
model (torch.nn.Module) – model to use for separation
mix (torch.Tensor) – mixture to separate, shape (batch, channels, time)
sample_rate (int) – sampling rate of the mixture
segment (float) – segment length in seconds
overlap (float) – overlap between segments, between 0 and 1
device (torch.device, str, or None) – if provided, device on which to execute the computation, otherwise mix.device is assumed. When device is different from mix.device, only local computations will be on device, while the entire tracks will be stored on mix.device.
- Returns:
estimated sources
- Return type:
torch.Tensor
Based on https://pytorch.org/audio/main/tutorials/hybrid_demucs_tutorial.html
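The segment-by-segment idea (process overlapping chunks, fade their edges, and add them together) can be sketched with NumPy instead of Torch. This is an illustration under assumptions, not the code of separate_sources: apply_in_segments is a hypothetical helper, the "model" is any callable that returns an array of the same shape as its input (the real function returns separate sources), and linear cross-fades are assumed.

```python
import numpy as np


def apply_in_segments(apply_model, mix, sample_rate, segment=10.0, overlap=0.1):
    """Hypothetical helper: run apply_model on overlapping chunks and cross-fade the results.

    mix has time on its last axis, e.g. shape (batch, channels, time).
    """
    length = mix.shape[-1]
    hop = int(sample_rate * segment)  # distance between chunk starts
    if hop < 1:
        raise ValueError("segment is too short for this sample rate")
    overlap_len = int(hop * overlap)  # samples shared by neighbouring chunks
    ramp = np.linspace(0.0, 1.0, overlap_len, endpoint=False)
    output = np.zeros(mix.shape, dtype=float)
    start = 0
    while start < length:
        end = min(start + hop + overlap_len, length)
        is_last = end == length
        chunk_out = np.asarray(apply_model(mix[..., start:end]), dtype=float)
        window = np.ones(end - start)
        if start > 0:
            window[:overlap_len] = ramp  # fade in over the shared region
        if not is_last:
            window[end - start - overlap_len:] = 1.0 - ramp  # matching fade out
        output[..., start:end] += chunk_out * window
        if is_last:
            break
        start += hop
    return output


rng = np.random.default_rng(0)
mix = rng.standard_normal((1, 2, 1000))
# With an identity "model" the cross-faded chunks should add back up to the input.
identity_out = apply_in_segments(lambda x: x, mix, sample_rate=100, segment=1.0, overlap=0.1)
gain_out = apply_in_segments(lambda x: 2.0 * x, mix, sample_rate=100, segment=1.0, overlap=0.1)
```

Because the fade-out of one chunk and the fade-in of the next sum to one at every sample of the shared region, a model that leaves its input unchanged reproduces the mixture, which is a convenient sanity check for this kind of overlap-add scheme.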