recipes.cad_icassp_2026.baseline package¶

Submodules¶

recipes.cad_icassp_2026.baseline.compute_stoi module¶

Compute the STOI scores.

recipes.cad_icassp_2026.baseline.compute_stoi.compute_single_stoi(reference: ndarray, processed: ndarray, fsamp: int, stoi_fsamp: int = 10000) → float[source]¶

Compute the STOI score between a reference and processed signal.

Parameters:

reference (np.ndarray) – Reference signal.
processed (np.ndarray) – Processed signal.
fsamp (int) – Sampling frequency.
stoi_fsamp (int) – Sampling frequency for STOI computation. Default is 10000 Hz.

Returns:

STOI score.

Return type:

float

recipes.cad_icassp_2026.baseline.compute_stoi.compute_stoi_for_signal(cfg: DictConfig, record: dict, data_root: str, estimated_vocals: ndarray) → float[source]¶

Compute the STOI score for a given signal.

Parameters:

cfg (DictConfig) – configuration object
record (dict) – the metadata dict for the signal
data_root (str) – root path to the dataset
estimated_vocals (np.ndarray) – estimated vocals signal

Returns:

STOI score.

Return type:

float

recipes.cad_icassp_2026.baseline.compute_stoi.run_compute_stoi(cfg: DictConfig) → None[source]¶: Run the STOI score computation.

recipes.cad_icassp_2026.baseline.compute_whisper module¶

recipes.cad_icassp_2026.baseline.evaluate module¶

Evaluate the predictions against the ground truth correctness values.

recipes.cad_icassp_2026.baseline.evaluate.compute_scores(predictions, labels) → dict[source]¶: Compute the scores for the predictions

recipes.cad_icassp_2026.baseline.evaluate.evaluate(cfg: DictConfig) → None[source]¶: Evaluate the predictions against the ground truth correctness values

recipes.cad_icassp_2026.baseline.evaluate.kt_score(x: ndarray, y: ndarray) → float[source]¶: Compute the Kendall’s tau correlation between two arrays

recipes.cad_icassp_2026.baseline.evaluate.ncc_score(x: ndarray, y: ndarray) → float[source]¶: Compute the normalized cross correlation between two arrays

recipes.cad_icassp_2026.baseline.evaluate.rmse_score(x: ndarray, y: ndarray) → float[source]¶: Compute the root mean squared error between two arrays

recipes.cad_icassp_2026.baseline.evaluate.std_err(x: ndarray, y: ndarray) → float[source]¶: Compute the standard error between two arrays
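The one-line descriptions above do not give the exact definitions, so the following is only a minimal NumPy sketch of what the four metrics could look like. It assumes that the normalized cross correlation is Pearson's r, that the standard error is the standard deviation of the differences divided by the square root of the sample count, and that Kendall's tau is the simple tau-a variant without tie correction. The function names `rmse`, `ncc`, `kendall_tau_a` and `std_err_sketch` are hypothetical and are not the library's implementations.

```python
import numpy as np


def rmse(x: np.ndarray, y: np.ndarray) -> float:
    """Root mean squared error between two arrays."""
    return float(np.sqrt(np.mean((x - y) ** 2)))


def ncc(x: np.ndarray, y: np.ndarray) -> float:
    """Normalized cross correlation, assumed here to be Pearson's r."""
    xc, yc = x - x.mean(), y - y.mean()
    return float(np.sum(xc * yc) / np.sqrt(np.sum(xc**2) * np.sum(yc**2)))


def kendall_tau_a(x: np.ndarray, y: np.ndarray) -> float:
    """Kendall's tau-a: (concordant - discordant) / number of pairs.

    O(n^2) and without tie correction (assumption for this sketch).
    """
    n = len(x)
    dx = np.sign(x[:, None] - x[None, :])
    dy = np.sign(y[:, None] - y[None, :])
    # The full matrix counts every unordered pair twice, hence n * (n - 1).
    return float(np.sum(dx * dy) / (n * (n - 1)))


def std_err_sketch(x: np.ndarray, y: np.ndarray) -> float:
    """Standard error of the differences x - y (assumed definition)."""
    return float(np.std(x - y) / np.sqrt(len(x)))


# Illustrative values only, not real challenge data.
predictions = np.array([10.0, 40.0, 55.0, 90.0])
labels = np.array([20.0, 30.0, 60.0, 100.0])

scores = {
    "RMSE": rmse(predictions, labels),
    "Std": std_err_sketch(predictions, labels),
    "NCC": ncc(predictions, labels),
    "KT": kendall_tau_a(predictions, labels),
}
print(scores)
```

Because the two example arrays are ranked identically, the tau value is 1 even though the RMSE is not zero, which shows why a rank correlation and an error measure are reported side by side.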

recipes.cad_icassp_2026.baseline.predict module¶

Make intelligibility predictions from baseline scores.

recipes.cad_icassp_2026.baseline.predict.predict_dev(cfg: DictConfig)[source]¶

Predict intelligibility for baselines.

Set config.baseline to `stoi`, `whisper_mixture`, or `whisper_vocals`, depending on which baseline you want to run.

recipes.cad_icassp_2026.baseline.shared_predict_utils module¶

Shared utilities for STOI baseline prediction experiments.

class recipes.cad_icassp_2026.baseline.shared_predict_utils.LogisticModel[source]¶

Bases: object

Class to represent a logistic mapping.

Fits a logistic mapping from input values x to output values y.

fit(x, y)[source]¶: Fit a mapping from x values to y values.

params: np.ndarray | None = None¶

predict(x)[source]¶

Predict y values given x.

Raises: TypeError – If the predict() method is called before fit().
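To make the fit/predict contract concrete, here is a minimal sketch of a logistic mapping with the same interface. It is not the library's `LogisticModel`: the class name `LogitLineModel` is hypothetical, the two-parameter form y = 1 / (1 + exp(-(a*x + b))) is an assumption, and the fit uses a straight-line least-squares fit in logit space rather than whatever optimizer the real class uses. It also assumes that y is scaled to the range 0 to 1.

```python
import numpy as np


class LogitLineModel:
    """Hypothetical stand-in for a logistic mapping from x to y."""

    params = None  # set by fit(); holds slope a and intercept b

    def fit(self, x, y, eps: float = 1e-6) -> None:
        """Fit y = 1 / (1 + exp(-(a*x + b))) by least squares in logit space."""
        x = np.asarray(x, dtype=float)
        # Clip y away from 0 and 1 so that the logit stays finite.
        y = np.clip(np.asarray(y, dtype=float), eps, 1.0 - eps)
        logit = np.log(y / (1.0 - y))
        # A degree-1 polynomial fit returns [slope, intercept].
        self.params = np.polyfit(x, logit, deg=1)

    def predict(self, x):
        """Predict y values given x; fails if fit() has not been called."""
        if self.params is None:
            raise TypeError("predict() called before fit()")
        a, b = self.params
        return 1.0 / (1.0 + np.exp(-(a * np.asarray(x, dtype=float) + b)))


# Synthetic data generated from a known logistic curve (a = 4, b = -2).
x = np.linspace(0.0, 1.0, 11)
y = 1.0 / (1.0 + np.exp(-(4.0 * x - 2.0)))

model = LogitLineModel()
model.fit(x, y)
y_hat = model.predict(x)
```

The logit-space fit is chosen here only because it is deterministic and needs nothing beyond NumPy; on noisy data with values at exactly 0 or 1, a nonlinear least-squares fit would usually behave better.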

recipes.cad_icassp_2026.baseline.shared_predict_utils.estimate_vocals(signal: ndarray, sample_rate: int, model: Module, device: str = 'cpu') → ndarray[source]¶

Estimate vocals from the input signal using the pre-trained model.

Parameters:

signal (torch.Tensor | np.ndarray) – Input audio signal.
sample_rate (int) – Sample rate of the input signal.
model (torch.nn.Module) – Pre-trained source separation model.
device (str) – Device to run the model on (‘cpu’ or ‘cuda’).

Returns:

Estimated vocals.

Return type:

np.ndarray

recipes.cad_icassp_2026.baseline.shared_predict_utils.input_align(reference: ndarray, processed: ndarray, fsamp: float = 10000) → tuple[ndarray, ndarray][source]¶: Align the processed signal to the reference signal. Based on the code in evaluator.haspi.eb, adapted to support a variable sampling rate.
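As a rough illustration of what aligning a processed signal to a reference involves, the sketch below estimates the delay from the peak of the cross-correlation and shifts the processed signal accordingly. It is a simplified stand-in, not a reproduction of `input_align`: the helper name `align_by_xcorr` is hypothetical, it ignores the sampling rate, and it omits any further steps the library version may perform.

```python
import numpy as np


def align_by_xcorr(reference: np.ndarray, processed: np.ndarray):
    """Shift `processed` so that it lines up with `reference` (hypothetical helper)."""
    # Work on a common length.
    n = min(len(reference), len(processed))
    ref, proc = reference[:n], processed[:n]

    # Full cross-correlation; output index n - 1 corresponds to zero lag.
    xcorr = np.correlate(proc, ref, mode="full")
    delay = int(np.argmax(np.abs(xcorr))) - (n - 1)

    # Undo the delay, padding the vacated samples with zeros.
    aligned = np.zeros(n)
    if delay >= 0:
        aligned[: n - delay] = proc[delay:]
    else:
        aligned[-delay:] = proc[: n + delay]
    return ref, aligned


# Synthetic check: delay a noise signal by 5 samples, then realign it.
rng = np.random.default_rng(0)
reference = rng.standard_normal(256)
processed = np.concatenate([np.zeros(5), reference[:-5]])

ref_out, aligned = align_by_xcorr(reference, processed)
```

After alignment, the first 251 samples of `aligned` match the reference; the last 5 are zero padding because the delayed copy never contained them.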

recipes.cad_icassp_2026.baseline.shared_predict_utils.load_dataset_with_score(cfg, split: str) → pandas.DataFrame[source]¶

Load dataset and add prediction scores.

Parameters:

cfg (DictConfig) – Configuration object.
split (str) – Dataset split to load (‘train’ or ‘valid’)

Returns:

DataFrame containing dataset records with added scores.

Return type:

pd.DataFrame

recipes.cad_icassp_2026.baseline.shared_predict_utils.load_mixture(dataroot: Path, record: dict, cfg: DictConfig) → tuple[ndarray, float][source]¶

Load the mixture signal for a given record.

Parameters:

dataroot (Path) – Root path to the dataset.
record (dict) – Record containing signal metadata.
cfg (DictConfig) – Configuration object.

Returns:

Mixture signal and its sample rate.

Return type:

tuple[np.ndarray, int]

recipes.cad_icassp_2026.baseline.shared_predict_utils.load_vocals(dataroot: Path, record: dict, cfg: DictConfig, separation_model, device='cpu') → ndarray[source]¶

Load or compute estimated vocals for a given record.

Parameters:

dataroot (Path) – Root path to the dataset.
record (dict) – Record containing signal metadata.
cfg (DictConfig) – Configuration object.
separation_model – Pre-trained source separation model.
device (str) – Device to run the model on (‘cpu’ or ‘cuda’).

Returns:

Estimated vocals signal.

Return type:

np.ndarray
