recipes.cad_icassp_2026.baseline package
recipes.cad_icassp_2026.baseline.compute_stoi module
Compute the STOI scores.
-
recipes.cad_icassp_2026.baseline.compute_stoi.compute_single_stoi(reference: ndarray, processed: ndarray, fsamp: int, stoi_fsamp: int = 10000) → float[source]
Compute the STOI score between a reference and processed signal.
- Parameters:
reference (np.ndarray) – Reference signal.
processed (np.ndarray) – Processed signal.
fsamp (int) – Sampling frequency.
stoi_fsamp (int) – Sampling frequency for STOI computation. Default is 10000 Hz.
- Returns:
STOI score.
- Return type:
float
-
recipes.cad_icassp_2026.baseline.compute_stoi.compute_stoi_for_signal(cfg: DictConfig, record: dict, data_root: str, estimated_vocals: ndarray) → float[source]
Compute the STOI score for a given signal.
- Parameters:
cfg (DictConfig) – Configuration object.
record (dict) – Metadata dict for the signal.
data_root (str) – Root path to the dataset.
estimated_vocals (np.ndarray) – Estimated vocals signal.
- Returns:
STOI score.
- Return type:
float
-
recipes.cad_icassp_2026.baseline.compute_stoi.run_compute_stoi(cfg: DictConfig) → None[source]
Run the STOI score computation.
recipes.cad_icassp_2026.baseline.compute_whisper module
recipes.cad_icassp_2026.baseline.evaluate module
Evaluate the predictions against the ground truth correctness values.
-
recipes.cad_icassp_2026.baseline.evaluate.compute_scores(predictions, labels) → dict[source]
Compute the scores for the predictions.
-
recipes.cad_icassp_2026.baseline.evaluate.evaluate(cfg: DictConfig) → None[source]
Evaluate the predictions against the ground truth correctness values.
-
recipes.cad_icassp_2026.baseline.evaluate.kt_score(x: ndarray, y: ndarray) → float[source]
Compute the Kendall’s tau correlation between two arrays.
-
recipes.cad_icassp_2026.baseline.evaluate.ncc_score(x: ndarray, y: ndarray) → float[source]
Compute the normalized cross correlation between two arrays.
-
recipes.cad_icassp_2026.baseline.evaluate.rmse_score(x: ndarray, y: ndarray) → float[source]
Compute the root mean squared error between two arrays.
-
recipes.cad_icassp_2026.baseline.evaluate.std_err(x: ndarray, y: ndarray) → float[source]
Compute the standard error between two arrays.
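The four metrics above can be illustrated with a minimal NumPy sketch. The function names below are hypothetical stand-ins, not the recipe's own code. Two points are assumptions: the standard error is taken here as the standard deviation of the prediction error divided by the square root of the sample count, and Kendall's tau is computed as the simple tau-a variant, which ignores tie corrections. The recipe's implementation may differ on both.

```python
import numpy as np


def rmse(x: np.ndarray, y: np.ndarray) -> float:
    """Root mean squared error between two arrays."""
    return float(np.sqrt(np.mean((x - y) ** 2)))


def ncc(x: np.ndarray, y: np.ndarray) -> float:
    """Normalized cross correlation (Pearson correlation) between two arrays."""
    xc = x - x.mean()
    yc = y - y.mean()
    return float(np.sum(xc * yc) / np.sqrt(np.sum(xc**2) * np.sum(yc**2)))


def kendall_tau_a(x: np.ndarray, y: np.ndarray) -> float:
    """Kendall's tau-a: (concordant - discordant) / number of pairs."""
    n = len(x)
    sign_x = np.sign(x[:, None] - x[None, :])
    sign_y = np.sign(y[:, None] - y[None, :])
    # Summing over all ordered pairs counts each unordered pair twice,
    # which is why the denominator is n * (n - 1) rather than n * (n - 1) / 2.
    return float(np.sum(sign_x * sign_y) / (n * (n - 1)))


def std_err_of_difference(x: np.ndarray, y: np.ndarray) -> float:
    """Assumed definition: std of the error divided by sqrt(n)."""
    return float(np.std(x - y) / np.sqrt(len(x)))


# Made-up example values on a 0-100 correctness scale.
predictions = np.array([10.0, 40.0, 60.0, 90.0])
labels = np.array([20.0, 30.0, 70.0, 80.0])

scores = {
    "RMSE": rmse(predictions, labels),
    "NCC": ncc(predictions, labels),
    "KT": kendall_tau_a(predictions, labels),
    "Std": std_err_of_difference(predictions, labels),
}
```

For data with many tied values, a tie-corrected variant such as tau-b would generally be preferable to the tau-a shown here.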
recipes.cad_icassp_2026.baseline.predict module
Make intelligibility predictions from baseline scores (STOI or Whisper).
-
recipes.cad_icassp_2026.baseline.predict.predict_dev(cfg: DictConfig)[source]
Predict intelligibility for baselines.
Set config.baseline to `stoi`, `whisper_mixture`, or `whisper_vocals`, depending on which baseline you want to run.
recipes.cad_icassp_2026.baseline.shared_predict_utils module
Shared utilities for STOI baseline prediction experiments.
-
class recipes.cad_icassp_2026.baseline.shared_predict_utils.LogisticModel[source]
Bases: object
Class to represent a logistic mapping.
Fits a logistic mapping from input values x to output values y.
-
fit(x, y)[source]
Fit a mapping from x values to y values.
-
params: np.ndarray | None = None
-
predict(x)[source]
Predict y values given x.
- Raises:
TypeError – If the predict() method is called before fit().
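As a rough illustration of the fit/predict contract described above, here is a minimal sketch of a logistic mapping. The class `LogisticSketch` is hypothetical and is not the recipe's `LogisticModel`. It assumes a two-parameter logistic curve (midpoint and slope) with outputs on a 0 to 100 scale, and it fits by a coarse grid search to stay NumPy-only; the actual class may use a different parameterization and optimizer.

```python
import numpy as np


class LogisticSketch:
    """Hypothetical stand-in for a logistic mapping from x to y."""

    params = None  # becomes np.array([midpoint, slope]) after fit()

    @staticmethod
    def _curve(x, midpoint, slope):
        # Assumed output range: 0 to 100 (percentage correct).
        return 100.0 / (1.0 + np.exp(-slope * (x - midpoint)))

    def fit(self, x, y):
        """Pick the (midpoint, slope) pair with the lowest squared error."""
        x = np.asarray(x, dtype=float)
        y = np.asarray(y, dtype=float)
        best = None
        for midpoint in np.linspace(x.min(), x.max(), 101):
            for slope in np.linspace(0.5, 30.0, 60):
                err = np.sum((self._curve(x, midpoint, slope) - y) ** 2)
                if best is None or err < best[0]:
                    best = (err, midpoint, slope)
        self.params = np.array(best[1:])

    def predict(self, x):
        """Map x values through the fitted curve."""
        if self.params is None:
            raise TypeError("predict() called before fit()")
        return self._curve(np.asarray(x, dtype=float), *self.params)


# Synthetic data generated from a known curve (midpoint 0.5, slope 10).
x_train = np.linspace(0.0, 1.0, 21)
y_train = 100.0 / (1.0 + np.exp(-10.0 * (x_train - 0.5)))

model = LogisticSketch()
model.fit(x_train, y_train)
midpoint_prediction = float(model.predict([0.5])[0])
```

A grid search is used only to keep the sketch dependency-free; a nonlinear least-squares routine would normally be a better choice for real data.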
-
recipes.cad_icassp_2026.baseline.shared_predict_utils.estimate_vocals(signal: ndarray, sample_rate: int, model: Module, device: str = 'cpu') → ndarray[source]
Estimate vocals from the input signal using the pre-trained model.
- Parameters:
signal (torch.Tensor | np.ndarray) – Input audio signal.
sample_rate (int) – Sample rate of the input signal.
model (torch.nn.Module) – Pre-trained source separation model.
device (str) – Device to run the model on (‘cpu’ or ‘cuda’).
- Returns:
Estimated vocals.
- Return type:
np.ndarray
-
recipes.cad_icassp_2026.baseline.shared_predict_utils.input_align(reference: ndarray, processed: ndarray, fsamp: float = 10000) → tuple[ndarray, ndarray][source]
Align the processed signal to the reference signal.
Based on the code in evaluator.haspi.eb, adapted for a variable sampling rate.
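The general idea of aligning a processed signal to a reference can be sketched with a cross-correlation delay estimate. The function `align_sketch` below is hypothetical and much simpler than the HASPI-derived routine: it assumes both signals already share a sampling rate, and it omits any resampling, search-window limits, or silence trimming that the real function may perform.

```python
import numpy as np


def align_sketch(reference, processed):
    """Shift `processed` so that it lines up with `reference`.

    Returns the truncated reference, the aligned processed signal,
    and the estimated delay in samples (positive if processed lags).
    """
    n = min(len(reference), len(processed))
    ref = np.asarray(reference[:n], dtype=float)
    proc = np.asarray(processed[:n], dtype=float)

    # Full cross-correlation; index 0 corresponds to lag -(n - 1).
    xcorr = np.correlate(proc, ref, mode="full")
    lags = np.arange(-(n - 1), n)
    delay = int(lags[np.argmax(np.abs(xcorr))])

    aligned = np.zeros(n)
    if delay >= 0:
        # Processed is late: drop its first `delay` samples, zero-pad the end.
        aligned[: n - delay] = proc[delay:]
    else:
        # Processed is early: zero-pad the start, drop its trailing samples.
        aligned[-delay:] = proc[: n + delay]
    return ref, aligned, delay


# Synthetic check: delay a noise signal by 5 samples and recover it.
rng = np.random.default_rng(0)
reference = rng.standard_normal(200)
processed = np.concatenate([np.zeros(5), reference[:-5]])

ref, aligned, delay = align_sketch(reference, processed)
```

Note that `np.correlate` has quadratic cost in the signal length, so for full-length audio an FFT-based correlation would be the more practical choice.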
-
recipes.cad_icassp_2026.baseline.shared_predict_utils.load_dataset_with_score(cfg, split: str) → pandas.DataFrame[source]
Load dataset and add prediction scores.
- Parameters:
cfg – Configuration object.
split (str) – Dataset split to load.
- Returns:
DataFrame containing dataset records with added scores.
- Return type:
pd.DataFrame
-
recipes.cad_icassp_2026.baseline.shared_predict_utils.load_mixture(dataroot: Path, record: dict, cfg: DictConfig) → tuple[ndarray, float][source]
Load the mixture signal for a given record.
- Parameters:
dataroot (Path) – Root path to the dataset.
record (dict) – Record containing signal metadata.
cfg (DictConfig) – Configuration object.
- Returns:
Mixture signal and its sample rate.
- Return type:
tuple[np.ndarray, int]
-
recipes.cad_icassp_2026.baseline.shared_predict_utils.load_vocals(dataroot: Path, record: dict, cfg: DictConfig, separation_model, device='cpu') → ndarray[source]
Load or compute estimated vocals for a given record.
- Parameters:
dataroot (Path) – Root path to the dataset.
record (dict) – Record containing signal metadata.
cfg (DictConfig) – Configuration object.
separation_model – Pre-trained source separation model.
device (str) – Device to run the model on (‘cpu’ or ‘cuda’).
- Returns:
Estimated vocals signal.
- Return type:
np.ndarray
recipes.cad_icassp_2026.baseline.transcription_scorer module