clarity.evaluator.haspi.eb module

Module for HASPI, HASQI, HAAQI EBs

clarity.evaluator.haspi.eb.ave_covary2(signal_cross_covariance: np.ndarray, reference_signal_mean_square: np.ndarray, threshold_db: float, lp_filter_order: ndarray | None = None, freq_cutoff: ndarray | None = None) tuple[float, ndarray][source]

Compute the average cross-covariance between the reference and processed signals in each auditory band.

The silent time-frequency tiles are removed from consideration. The cross-covariance is computed for each segment in each frequency band. The values are weighted by 1 for inclusion or 0 if the tile is below threshold. The sum of the covariance values across time and frequency are then divided by the total number of tiles above threshold. The calculation is a modification of Tan et al.[1]_ . The cross-covariance is also output with a frequency weighting that reflects the loss of IHC synchronization at high frequencies Johnson[2]_.

Parameters:
  • signal_cross_covariance (np.array) – [nchan,nseg] of cross-covariance values

  • reference_signal_mean_square (np.array) – [nchan,nseg] of reference signal MS values

  • () (threshold_db) – threshold in dB SL to include segment ave over freq in average

  • lp_filter (list) – LP filter order

  • freq_cutoff (list) – Cutoff frequencies in Hz

Returns:

cross-covariance in segments averaged over time and

frequency

ihc_sync_covariance (): cross-covariance array, 6 different weightings for loss
of IHC synchronization at high frequencies:
LP Filter Order Cutoff Freq, kHz

1 1.5 3 2.0 5 2.5, 3.0, 3.5, 4.0

Return type:

average_covariance ()

References:

Updates:

James M. Kates, 28 August 2012. Adjusted for BM vibration in dB SL, 30 October 2012. Threshold for including time-freq tile modified, 30 January 2013. Version for different sync loss, 15 February 2013. Translated from MATLAB to Python by Gerardo Roa Dabike, September 2022.

clarity.evaluator.haspi.eb.bandwidth_adjust(control: ndarray, bandwidth_min: float, bandwidth_max: float, level1: float) float[source]

Compute the increase in auditory filter bandwidth in response to high signal levels. The RMS of the control signal, a scalar, is used to set the bandwidth for the entire signal.

Parameters:
  • () (level1) – envelope output in the control filter band

  • () – auditory filter bandwidth computed for the loss (or NH)

  • () – auditory filter bandwidth at maximum OHC damage

  • () – RMS=1 corresponds to Level1 dB SPL

Returns:

filter bandwidth increased for high signal levels

Return type:

bandwidth ()

Updates: James M. Kates, 21 June 2011. Translated from MATLAB to Python by Zuzanna Podwinska, March 2022.

clarity.evaluator.haspi.eb.basilar_membrane_add_noise(reference: ndarray, threshold: int, level1: float) ndarray[source]

Apply the IHC attenuation to the BM motion and to add a low-level Gaussian noise to give the auditory threshold.

Parameters:
  • () (level1) – BM motion to be attenuated

  • () – additive noise level in dB re:auditory threshold

  • () – an input having RMS=1 corresponds to Level1 dB SPL

Returns:

Attenuated signal with threshold noise added

Updates:

James M. Kates, 19 June 2012. Just additive noise, 2 Oct 2012. Translated from MATLAB to Python by Zuzanna Podwinska, March 2022.

clarity.evaluator.haspi.eb.bm_covary(reference_basilar_membrane: ndarray, processed_basilar_membrane: ndarray, segment_size: int, sample_rate: float) tuple[ndarray, ndarray, ndarray][source]

Compute the cross-covariance (normalized cross-correlation) between the reference and processed signals in each auditory band. The signals are divided into segments having 50% overlap.

Parameters:
  • () (segment_size) – Basilar Membrane movement, reference signal

  • () – Basilar Membrane movement, processed signal

  • () – signal segment size, msec

  • freq_sample (int) – sampling rate in Hz

Returns:

[nchan,nseg] of cross-covariance values reference_mean_square (np.array) : [nchan,nseg] of MS input signal energy values processed_mean_square (np.array) : [nchan,nseg] of MS processed signal energy

values

Return type:

signal_cross_covariance (np.array)

Updates:

James M. Kates, 28 August 2012. Output amplitude adjustment added, 30 october 2012. Translated from MATLAB to Python by Gerardo Roa Dabike, September 2022.

clarity.evaluator.haspi.eb.center_frequency(nchan: int, shift: float | None = None, low_freq: int = 80, high_freq: int = 8000, ear_q: float = 9.26449, min_bw: float = 24.7) ndarray[source]

Compute the Equivalent Rectangular Bandwidth_[1] frequency spacing for the gammatone filter bank. The equation comes from Malcolm Slaney[2].

Parameters:
  • nchan (int) – number of filters in the filter bank

  • low_freq (int) – Low Frequency level.

  • high_freq (int) – High Frequency level.

  • () (shift) – optional frequency shift of the filter bank specified as a fractional shift in distance along the BM. A positive shift is an increase in frequency (basal shift), and negative is a decrease in frequency (apical shift). The total length of the BM is normalized to 1. The frequency-to-distance map is from D.D. Greenwood[3].

  • ear_q (float)

  • min_bw (float)

Returns:

References: .. [1] Moore BCJ, Glasberg BR (1983) Suggested formulae for calculating

auditory-filter bandwidths and excitation patterns. J Acoustical Soc America 74:750-753. Available at <https://doi.org/10.1121/1.389861>

Updates: James M. Kates, 25 January 2007. Frequency shift added 22 August 2008. Lower and upper frequencies fixed at 80 and 8000 Hz, 19 June 2012. Translated from MATLAB to Python by Zuzanna Podwinska, March 2022.

clarity.evaluator.haspi.eb.convert_rms_to_sl(reference: ndarray, control: ndarray, attenuated_ohc: ndarray | float, threshold_low: ndarray | int, compression_ratio: ndarray | int, attenuated_ihc: ndarray | float, level1: float, threshold_high: int = 100, small: float = 1e-30) ndarray[source]

Covert the Root Mean Square average output of the gammatone filter bank into dB SL. The gain is linear below the lower threshold, compressive with a compression ratio of CR:1 between the lower and upper thresholds, and reverts to linear above the upper threshold. The compressor assumes that auditory threshold is 0 dB SPL.

Parameters:
  • () (level1) – analytic signal envelope (magnitude) returned by the

  • bank (gammatone filter)

  • level (RMS average)

  • () – control signal envelope

  • () – OHC attenuation at the input to the compressor

  • () – kneepoint for the low-level linear amplification

  • () – compression ratio

  • () – IHC attenuation at the input to the synapse

  • () – dB reference level: a signal having an RMS value of 1 is assigned to Level1 dB SPL.

  • threshold_high (int)

  • small (float)

Returns:

compressed output in dB above the impaired threshold

Return type:

reference_db ()

Updates:

James M. Kates, 6 August 2007. Version for two-tone suppression, 29 August 2008. Translated from MATLAB to Python by Zuzanna Podwinska, March 2022.

clarity.evaluator.haspi.eb.ear_model(reference: ndarray, reference_freq: float, processed: ndarray, processed_freq: float, hearing_loss: ndarray, itype: int, level1: float, nchan: int = 32, m_delay: int = 1, shift: float | None = None) tuple[ndarray, ndarray, ndarray, ndarray, ndarray, ndarray, float][source]

Function that implements a cochlear model that includes the middle ear, auditory filter bank, Outer Hair Cell (OHC) dynamic-range compression, and Inner Hair Cell (IHC) attenuation.

The inputs are the reference and processed signals that are to be compared. The reference is at the reference intensity (e.g. 65 dB SPL or with NAL-R amplification) and has no other processing. The processed signal is the hearing-aid output, and is assumed to have the same or greater group delay compared to the reference.

The function outputs the envelopes of the signals after OHC compression and IHC loss attenuation.

Parameters:
  • reference (np.ndarray) – reference signal: should be adjusted to 65 dB SPL (itype=0 or 1) or to 65 dB SPL plus NAL-R gain (itype=2)

  • reference_freq (int) – sampling rate for the reference signal, Hz

  • processed (np.ndarray) – processed signal (e.g. hearing-aid output) includes HA gain

  • processed_freq (int) – sampling rate for the processed signal, Hz

  • hearing_loss (np.ndarray) – audiogram giving the hearing loss in dB at 6 audiometric frequencies: [250, 500, 1000, 2000, 4000, 6000] Hz

  • itype (int) –

    purpose for the calculation: 0=intelligibility: reference is normal hearing and must not

    include NAL-R EQ

    1=quality: reference does not include NAL-R EQ 2=quality: reference already has NAL-R EQ applied

  • level1 – level calibration: signal RMS=1 corresponds to Level1 dB SPL

  • nchan (int) – auditory frequency bands

  • m_delay (int) – Compensate for the gammatone group delay.

  • shift (float) – Basal shift of the basilar membrane length

Returns:

envelope for the reference in each band reference_basilar_membrane (): BM motion for the reference in each band processed_db (): envelope for the processed signal in each band processed_basilar_membrane (): BM motion for the processed signal in each band reference_sl (): compressed RMS average reference in each band converted

to dB SL

processed_sl (): compressed RMS average output in each band converted to dB SL freq_sample (): sampling rate in Hz for the model outputs

Return type:

reference_db ()

Updates: James M. Kates, 27 October 2011. Basilar Membrane added 30 Dec 2011. Revised 19 June 2012. Remove match of reference RMS level to processed 29 August 2012. IHC adaptation added 1 October 2012. Basilar Membrane envelope converted to dB SL, 2 Oct 2012. Filterbank group delay corrected, 14 Dec 2012. Translated from MATLAB to Python by Zuzanna Podwinska, March 2022. Updated by Gerardo Roa Dabike, September 2022.

clarity.evaluator.haspi.eb.env_compress_basilar_membrane(envsig: ndarray, bm: ndarray, control: ndarray, attn_ohc: float, threshold_low: float, compression_ratio: float, fsamp: float, level1: float, small: float = 1e-30, threshold_high: int = 100) tuple[ndarray, ndarray][source]

Compute the cochlear compression in one auditory filter band. The gain is linear below the lower threshold, compressive with a compression ratio of CR:1 between the lower and upper thresholds, and reverts to linear above the upper threshold. The compressor assumes that auditory threshold is 0 dB SPL.

Parameters:
  • () (small) – analytic signal envelope (magnitude) returned by the gammatone filter bank

  • () – BM motion output by the filter bank

  • () – analytic control envelope returned by the wide control path filter bank

  • () – OHC attenuation at the input to the compressor

  • () – kneepoint for the low-level linear amplification

  • () – compression ratio

  • () – sampling rate in Hz

  • () – dB reference level: a signal having an RMS value of 1 is assigned to Level1 dB SPL.

  • ()

    ???

  • threshold_high – kneepoint for the high-level linear amplification

Returns:

compressed version of the signal envelope compressed_basilar_membrane (): compressed version of the BM motion

Return type:

compressed_signal ()

Updates: James M. Kates, 19 January 2007. LP filter added 15 Feb 2007 (Ref: Zhang et al., 2001) Version to compress the envelope, 20 Feb 2007. Change in the OHC I/O function, 9 March 2007. Two-tone suppression added 22 August 2008. Translated from MATLAB to Python by Zuzanna Podwinska, March 2022.

clarity.evaluator.haspi.eb.env_smooth(envelopes: np.ndarray, segment_size: int, sample_rate: float) ndarray[source]

Function to smooth the envelope returned by the cochlear model. The envelope is divided into segments having a 50% overlap. Each segment is windowed, summed, and divided by the window sum to produce the average. A raised cosine window is used. The envelope sub-sampling frequency is 2*(1000/segsize).

Parameters:
  • envelopes (np.ndarray) – matrix of envelopes in each of the auditory bands

  • segment_size – averaging segment size in msec

  • freq_sample (int) – input envelope sampling rate in Hz

Returns:

matrix of subsampled windowed averages in each band

Return type:

smooth

Updates:

James M. Kates, 26 January 2007. Final half segment added 27 August 2012. Translated from MATLAB to Python by Gerardo Roa Dabike, September 2022.

clarity.evaluator.haspi.eb.envelope_align(reference: ndarray, output: ndarray, freq_sample: int = 24000, corr_range: int = 100) ndarray[source]

Align the envelope of the processed signal to that of the reference signal.

Parameters:
  • () (output) – envelope or BM motion of the reference signal

  • () – envelope or BM motion of the output signal

  • freq_sample (int) – Frequency sample rate in Hz

  • corr_range (int) – range in msec for the correlation

Returns:

shifted output envelope to match the input

Return type:

y ()

Updates: James M. Kates, 28 October 2011. Absolute value of the cross-correlation peak removed, 22 June 2012. Cross-correlation range reduced, 13 August 2013. Translated from MATLAB to Python by Zuzanna Podwinska, March 2022.

clarity.evaluator.haspi.eb.envelope_sl(reference: ndarray, basilar_membrane: ndarray, attenuated_ihc: float, level1: float, small: float = 1e-30) tuple[ndarray, ndarray][source]

Convert the compressed envelope returned by cochlear_envcomp to dB SL.

Parameters:
  • () (level1) – linear envelope after compression

  • () – linear Basilar Membrane vibration after compression

  • () – IHC attenuation at the input to the synapse

  • () – level in dB SPL corresponding to 1 RMS

  • small (float) –

    ???

Returns:

reference envelope in dB SL _basilar_membrane (): Basilar Membrane vibration with envelope converted to

dB SL

Return type:

_reference ()

Updates: James M. Kates, 20 Feb 07. IHC attenuation added 9 March 2007. Basilar membrane vibration conversion added 2 October 2012. Translated from MATLAB to Python by Zuzanna Podwinska, March 2022.

clarity.evaluator.haspi.eb.gammatone_bandwidth_demodulation(npts: int, tpt: float, center_freq: float, center_freq_cos: np.ndarray, center_freq_sin: np.ndarray) tuple[np.ndarray, np.ndarray][source]
Create the carriers for demodulaton, using the 2d Rotation method from

https://ccrma.stanford.edu/~jos/pasp/Digital_Sinusoid_Generators.html

to generate the sin and cos components. More efficient, perhaps, than calculating the sin and cos at each point in time.

Parameters:
  • () (center_freq_sin) – How many points are needed.

  • () – Phase change (2pi/T) due to each sample time.

  • () – The carrier frequency

  • () – Array to overwrite for the output.

  • () – Array to overwrite for the output.

Returns:

Samples of the carrier frequency in sin phase. coscf (): Samples of the carrier frequency in cos phase.

Return type:

sincf ()

clarity.evaluator.haspi.eb.gammatone_basilar_membrane(reference: ndarray, reference_bandwidth: float, processed: ndarray, processed_bandwidth: float, freq_sample: float, center_freq: float, ear_q: float = 9.26449, min_bandwidth: float = 24.7) tuple[ndarray, ndarray, ndarray, ndarray][source]

4th-order gammatone auditory filter. This implementation is based on the c program published on-line by Ning Ma, U. Sheffield, UK[1]_ that gives an implementation of the Martin Cooke filters[2]_: an impulse-invariant transformation of the gammatone filter. The signal is demodulated down to baseband using a complex exponential, and then passed through a cascade of four one-pole low-pass filters.

This version filters two signals that have the same sampling rate and the same gammatone filter center frequencies. The lengths of the two signals should match; if they don’t, the signals are truncated to the shorter of the two lengths.

Parameters:
  • () (freq_sample) – first sequence to be filtered

  • reference_bandwidth – bandwidth for x relative to that of a normal ear

  • () – second sequence to be filtered

  • () – bandwidth for x relative to that of a normal ear

  • () – sampling rate in Hz

  • center_frequency (int) – filter center frequency in Hz

  • ear_q – (float): ???

  • min_bandwidth (float) –

    ???

Returns:

filter envelope output (modulated down to baseband)

1st signal

reference_basilar_membrane (): Basilar Membrane for the first signal processed_envelope (): filter envelope output (modulated down to baseband)

2nd signal

processed_basilar_membrane (): Basilar Membrane for the second signal

Return type:

reference_envelope ()

References: .. [1] Ma N, Green P, Barker J, Coy A (2007) Exploiting correlogram

structure for robust speech recognition with multiple speech sources. Speech Communication, 49 (12): 874-891. Available at <https://doi.org/10.1016/j.specom.2007.05.003> <https://staffwww.dcs.shef.ac.uk/people/N.Ma/resources/gammatone/>

Updates: James M. Kates, 8 Jan 2007. Vectorized version for efficient MATLAB execution, 4 February 2007. Cosine and sine generation, 29 June 2011. Output sine and cosine sequences, 19 June 2012. Cosine/sine loop speed increased, 9 August 2013. Translated from MATLAB to Python by Zuzanna Podwinska, March 2022.

clarity.evaluator.haspi.eb.group_delay_compensate(reference: ndarray, bandwidths: ndarray, center_freq: ndarray, freq_sample: float, ear_q: float = 9.26449, min_bandwidth: float = 24.7) ndarray[source]

Compensate for the group delay of the gammatone filter bank. The group delay is computed for each filter at its center frequency. The firing rate output of the IHC model is then adjusted so that all outputs have the same group delay.

Parameters:
  • xenv (np.ndarray) – matrix of signal envelopes or BM motion

  • () (freq_sample) – gammatone filter bandwidths adjusted for loss

  • () – center frequencies of the bands

  • () – sampling rate for the input signal in Hz (e.g. 24,000 Hz)

  • ear_q (float)

  • min_bandwidth (float)

Returns:

envelopes or BM motion compensated for the group delay.

Return type:

processed ()

Updates:

James M. Kates, 28 October 2011. Translated from MATLAB to Python by Zuzanna Podwinska, March 2022.

clarity.evaluator.haspi.eb.inner_hair_cell_adaptation(reference_db, reference_basilar_membrane, delta, freq_sample)[source]

Provide inner hair cell (IHC) adaptation. The adaptation is based on an equivalent RC circuit model, and the derivatives are mapped into 1st-order backward differences. Rapid and short-term adaptation are provided. The input is the signal envelope in dB SL, with IHC attenuation already applied to the envelope. The outputs are the envelope in dB SL with adaptation providing overshoot of the long-term output level, and the BM motion is multiplied by a gain vs. time function that reproduces the adaptation. IHC attenuation and additive noise for the equivalent auditory threshold are provided by a subsequent call to eb_BMatten.

Parameters:
  • reference_db (np.ndarray) – signal envelope in one frequency band in dB SL contains OHC compression and IHC attenuation

  • () (delta) – basilar membrane vibration with OHC compression but no IHC attenuation

  • () – overshoot factor = delta x steady-state

  • freq_sample (int) – sampling rate in Hz

Returns:

envelope in dB SL with IHC adaptation output_basilar_membrane (): Basilar Membrane multiplied by the IHC adaptation

gain function

Return type:

output_db ()

Updates: James M. Kates, 1 October 2012. Translated from MATLAB to Python by Zuzanna Podwinska, March 2022.

clarity.evaluator.haspi.eb.input_align(reference: ndarray, processed: ndarray) tuple[ndarray, ndarray][source]

Approximate temporal alignment of the reference and processed output signals. Leading and trailing zeros are then pruned.

The function assumes that the two sequences have the same sampling rate: call eb_Resamp24kHz for each sequence first, then call this function to align the signals.

Arguments: reference (np.ndarray): input reference sequence processed (np.ndarray): hearing-aid output sequence

Returns: reference (np.ndarray): pruned and shifted reference processed (np.ndarray): pruned and shifted hearing-aid output

Updates: James M. Kates, 12 July 2011. Match the length of the processed output to the reference for the purposes of computing the cross-covariance Translated from MATLAB to Python by Zuzanna Podwinska, March 2022.

clarity.evaluator.haspi.eb.loss_parameters(hearing_loss: ndarray, center_freq: ndarray, audiometric_freq: ndarray | None = None) tuple[ndarray, ndarray, ndarray, ndarray, ndarray][source]

Apportion the hearing loss to the outer hair cells (OHC) and the inner hair cells (IHC) and to increase the bandwidth of the cochlear filters in proportion to the OHC fraction of the total loss.

Parameters:
  • hearing_loss (np.ndarray) – hearing loss at the 6 audiometric frequencies

  • center_freq (np.ndarray) – array containing the center frequencies of the gammatone filters arranged from low to high

  • audiometric_freq (list)

Returns:

attenuation in dB for the OHC gammatone filters bandwidth (): OHC filter bandwidth expressed in terms of normal low_knee (): Lower kneepoint for the low-level linear amplification compression_ratio (): Ranges from 1.4:1 at 150 Hz to 3.5:1 at 8 kHz for normal

hearing. Reduced in proportion to the OHC loss to 1:1.

attenuated_ihc (): attenuation in dB for the input to the IHC synapse

Return type:

attenuated_ohc ()

Updates: James M. Kates, 25 January 2007. Version for loss in dB and match of OHC loss to CR, 9 March 2007. Low-frequency extent changed to 80 Hz, 27 Oct 2011. Lower kneepoint set to 30 dB, 19 June 2012. Translated from MATLAB to Python by Zuzanna Podwinska, March 2022.

clarity.evaluator.haspi.eb.mel_cepstrum_correlation(reference: ndarray, distorted: ndarray, threshold: float, addnoise: float) tuple[float, ndarray][source]

Compute the cross-correlations between the input signal time-frequency envelope and the distortion time-frequency envelope.

For each time interval, the log spectrum is fitted with a set of half-cosine basis functions. The spectrum weighted by the basis functions corresponds to Mel Cepstral Coefficients computed in the frequency domain. The amplitude-normalized cross-covariance between the time-varying basis functions for the input and output signals is then computed.

Parameters:
  • () (addnoise) – subsampled input signal envelope in dB SL in each critical band

  • () – subsampled distorted output signal envelope

  • () – threshold in dB SPL to include segment in calculation

  • () – additive Gaussian noise to ensure 0 cross-corr at low levels

Returns:

average cepstral correlation 2-6, input vs output individual_cepstral_correlations : individual cepstral correlations,

input vs output

Return type:

average_cepstral_correlation

Updates:

James M. Kates, 24 October 2006. Difference signal removed for cochlear model, 31 January 2007. Absolute value added 13 May 2011. Changed to loudness criterion for silence threshsold, 28 August 2012. Translated from MATLAB to Python by Gerardo Roa Dabike, September 2022.

clarity.evaluator.haspi.eb.melcor9(reference: ndarray, distorted: ndarray, threshold: float, add_noise: float, segment_size: int, n_cepstral_coef: int = 6) tuple[float, float, float, ndarray][source]

Compute the cross-correlations between the input signal time-frequency envelope and the distortion time-frequency envelope. For each time interval, the log spectrum is fitted with a set of half-cosine basis functions. The spectrum weighted by the basis functions corresponds to mel cepstral coefficients computed in the frequency domain. The amplitude-normalized cross-covariance between the time-varying basis functions for the input and output signals is then computed for each of the 8 modulation frequencies.

Parameters:
  • () (segment_size) – subsampled input signal envelope in dB SL in each critical band

  • () – subsampled distorted output signal envelope

  • () – threshold in dB SPL to include segment in calculation

  • () – additive Gaussian noise to ensure 0 cross-corr at low levels

  • () – segment size in ms used for the envelope LP filter (8 msec)

  • n_cepstral_coef (int) – Number of cepstral coefficients

Returns:

average of the modulation correlations across analysis

frequency bands and modulation frequency bands, basis functions 2 -6

mel_cepstral_low (): average over the four lower mod freq bands, 0 - 20 Hz mel_cepstral_high (): average over the four higher mod freq bands, 20 - 125 Hz mel_cepstral_modulation (): vector of cross-correlations by modulation

frequency, averaged over analysis frequency band

Return type:

mel_cepstral_average ()

Updates:

James M. Kates, 24 October 2006. Difference signal removed for cochlear model, 31 January 2007. Absolute value added 13 May 2011. Changed to loudness criterion for silence threshold, 28 August 2012. Version using envelope modulation filters, 15 July 2014. Modulation frequency vector output added 27 August 2014. Translated from MATLAB to Python by Gerardo Roa Dabike, September 2022.

clarity.evaluator.haspi.eb.melcor9_crosscovmatrix(b: ndarray, nmod: int, nbasis: int, nsamp: int, nfir: int, reference_cep: ndarray, processed_cep: ndarray) ndarray[source]

Compute the cross-covariance matrix.

Parameters:
  • () (ycep) –

    ???

  • ()

    ???

  • ()

    ???

  • ()

    ???

  • ()

    ???

  • ()

    ???

  • ()

    ???

Return type:

cross_covariance_matrix ()

clarity.evaluator.haspi.eb.middle_ear(reference: ndarray, freq_sample: float) ndarray[source]

Design the middle ear filters and process the input through the cascade of filters. The middle ear model is a 2-pole HP filter at 350 Hz in series with a 1-pole LP filter at 5000 Hz. The result is a rough approximation to the equal-loudness contour at threshold.

Arguments: reference (np.ndarray): input signal freq_sample (float): sampling rate in Hz

Returns: xout (): filtered output

Updates: James M. Kates, 18 January 2007. Translated from MATLAB to Python by Zuzanna Podwinska, March 2022.

clarity.evaluator.haspi.eb.resample_24khz(reference_signal: ndarray, reference_freq: float, freq_sample_hz: float = 24000.0) tuple[ndarray, float][source]

Resample the input signal at 24 kHz. The input sampling rate is rounded to the nearest kHz to compute the sampling rate conversion ratio.

Arguments: reference_signal (np.ndarray): input signal reference_freq (int): sampling rate for the input in Hz freq_sample_hz (int): Frequency sample in Hz

Returns: reference_signal_24 signal resampled at kHz (default 24Khz) freq_sample_hz output sampling rate in Hz

Updates James M. Kates, 20 June 2011. Translated from MATLAB to Python by Zuzanna Podwinska, March 2022.

clarity.evaluator.haspi.eb.spectrum_diff(reference_sl: ndarray, processed_sl: ndarray) tuple[ndarray, ndarray, ndarray][source]

Compute changes in the long-term spectrum and spectral slope.

The metric is based on the spectral distortion metric of Moore and Tan[1]_ (JAES, Vol 52, pp 900-914). The log envelopes in dB SL are converted to linear to approximate specific loudness. The outputs are the sum of the absolute differences, the standard deviation of the differences, and the maximum absolute difference. The same three outputs are provided for the normalized spectral difference and for the slope. The output is calibrated so that a processed signal having 0 amplitude produces a value of 1 for the spectrum difference.

Abs diff: weight all deviations uniformly Std diff: weight larger deviations more than smaller deviations Max diff: only weight the largest deviation

Parameters:
  • reference_sl (np.ndarray) – reference signal spectrum in dB SL

  • processed_sl (np.ndarray) – degraded signal spectrum in dB SL

Returns:

[sum abs diff, std dev diff, max diff] spectra dnorm (np.array) : [sum abs diff, std dev diff, max diff] norm spectra dslope (np.array) : [sum abs diff, std dev diff, max diff] slope

Return type:

dloud (np.array)

References: .. [1] Moore BCJ, Tan, CT (2004) Development and Validation of a Method

for Predicting the Perceived Naturalness of Sounds Subjected to Spectral Distortion J Audio Eng Soc 52(9):900-914. Available at. <http://www.aes.org/e-lib/browse.cfm?elib=13018>.

Updates:

James M. Kates, 28 June 2012. Translated from MATLAB to Python by Gerardo Roa Dabike, September 2022.