Compute the average cross-covariance between the reference and processed
signals in each auditory band.
The silent time-frequency tiles are removed from consideration. The
cross-covariance is computed for each segment in each frequency band. The
values are weighted by 1 for inclusion or 0 if the tile is below
threshold. The sum of the covariance values across time and frequency are
then divided by the total number of tiles above threshold. The calculation
is a modification of Tan et al.[1]_ . The cross-covariance is also output
with a frequency weighting that reflects the loss of IHC synchronization at high
frequencies Johnson[2]_.
Parameters:
signal_cross_covariance (np.array) – [nchan,nseg] of cross-covariance values
reference_signal_mean_square (np.array) – [nchan,nseg] of reference signal MS
values
() (threshold_db) – threshold in dB SL to include segment ave over freq in
average
lp_filter (list) – LP filter order
freq_cutoff (list) – Cutoff frequencies in Hz
Returns:
cross-covariance in segments averaged over time and
frequency
ihc_sync_covariance (): cross-covariance array, 6 different weightings for loss
of IHC synchronization at high frequencies:
LP Filter Order Cutoff Freq, kHz
1 1.5
3 2.0
5 2.5, 3.0, 3.5, 4.0
Return type:
average_covariance ()
References:
Updates:
James M. Kates, 28 August 2012.
Adjusted for BM vibration in dB SL, 30 October 2012.
Threshold for including time-freq tile modified, 30 January 2013.
Version for different sync loss, 15 February 2013.
Translated from MATLAB to Python by Gerardo Roa Dabike, September 2022.
Compute the increase in auditory filter bandwidth in response to high signal
levels. The RMS of the control signal, a scalar, is used to set the
bandwidth for the entire signal.
Parameters:
() (level1) – envelope output in the control filter band
() – auditory filter bandwidth computed for the loss (or NH)
() – auditory filter bandwidth at maximum OHC damage
() – RMS=1 corresponds to Level1 dB SPL
Returns:
filter bandwidth increased for high signal levels
Return type:
bandwidth ()
Updates:
James M. Kates, 21 June 2011.
Translated from MATLAB to Python by Zuzanna Podwinska, March 2022.
Compute the cross-covariance (normalized cross-correlation) between the reference
and processed signals in each auditory band. The signals are divided into segments
having 50% overlap.
Parameters:
() (segment_size) – Basilar Membrane movement, reference signal
() – Basilar Membrane movement, processed signal
() – signal segment size, msec
freq_sample (int) – sampling rate in Hz
Returns:
[nchan,nseg] of cross-covariance values
reference_mean_square (np.array) : [nchan,nseg] of MS input signal energy values
processed_mean_square (np.array) : [nchan,nseg] of MS processed signal energy
values
Return type:
signal_cross_covariance (np.array)
Updates:
James M. Kates, 28 August 2012.
Output amplitude adjustment added, 30 october 2012.
Translated from MATLAB to Python by Gerardo Roa Dabike, September 2022.
Compute the Equivalent Rectangular Bandwidth_[1] frequency spacing for the
gammatone filter bank. The equation comes from Malcolm Slaney[2].
Parameters:
nchan (int) – number of filters in the filter bank
low_freq (int) – Low Frequency level.
high_freq (int) – High Frequency level.
() (shift) – optional frequency shift of the filter bank specified as a fractional
shift in distance along the BM. A positive shift is an increase in frequency
(basal shift), and negative is a decrease in frequency (apical shift). The
total length of the BM is normalized to 1. The frequency-to-distance map is
from D.D. Greenwood[3].
ear_q (float)
min_bw (float)
Returns:
References:
.. [1] Moore BCJ, Glasberg BR (1983) Suggested formulae for calculating
auditory-filter bandwidths and excitation patterns. J Acoustical
Soc America 74:750-753. Available at
<https://doi.org/10.1121/1.389861>
Updates:
James M. Kates, 25 January 2007.
Frequency shift added 22 August 2008.
Lower and upper frequencies fixed at 80 and 8000 Hz, 19 June 2012.
Translated from MATLAB to Python by Zuzanna Podwinska, March 2022.
Covert the Root Mean Square average output of the gammatone filter bank
into dB SL. The gain is linear below the lower threshold, compressive
with a compression ratio of CR:1 between the lower and upper thresholds,
and reverts to linear above the upper threshold. The compressor
assumes that auditory threshold is 0 dB SPL.
Parameters:
() (level1) – analytic signal envelope (magnitude) returned by the
bank (gammatone filter)
level (RMS average)
() – control signal envelope
() – OHC attenuation at the input to the compressor
() – kneepoint for the low-level linear amplification
() – compression ratio
() – IHC attenuation at the input to the synapse
() – dB reference level: a signal having an RMS value of 1 is
assigned to Level1 dB SPL.
threshold_high (int)
small (float)
Returns:
compressed output in dB above the impaired threshold
Return type:
reference_db ()
Updates:
James M. Kates, 6 August 2007.
Version for two-tone suppression, 29 August 2008.
Translated from MATLAB to Python by Zuzanna Podwinska, March 2022.
Function that implements a cochlear model that includes the middle ear,
auditory filter bank, Outer Hair Cell (OHC) dynamic-range compression,
and Inner Hair Cell (IHC) attenuation.
The inputs are the reference and processed signals that are to be
compared. The reference is at the reference intensity (e.g. 65 dB SPL
or with NAL-R amplification) and has no other processing. The processed
signal is the hearing-aid output, and is assumed to have the same or
greater group delay compared to the reference.
The function outputs the envelopes of the signals after OHC compression
and IHC loss attenuation.
Parameters:
reference (np.ndarray) – reference signal: should be adjusted to 65 dB SPL
(itype=0 or 1) or to 65 dB SPL plus NAL-R gain (itype=2)
reference_freq (int) – sampling rate for the reference signal, Hz
processed (np.ndarray) – processed signal (e.g. hearing-aid output) includes
HA gain
processed_freq (int) – sampling rate for the processed signal, Hz
hearing_loss (np.ndarray) – audiogram giving the hearing loss in dB at 6
audiometric frequencies: [250, 500, 1000, 2000, 4000, 6000] Hz
itype (int) –
purpose for the calculation:
0=intelligibility: reference is normal hearing and must not
include NAL-R EQ
1=quality: reference does not include NAL-R EQ
2=quality: reference already has NAL-R EQ applied
level1 – level calibration: signal RMS=1 corresponds to Level1 dB SPL
nchan (int) – auditory frequency bands
m_delay (int) – Compensate for the gammatone group delay.
shift (float) – Basal shift of the basilar membrane length
Returns:
envelope for the reference in each band
reference_basilar_membrane (): BM motion for the reference in each band
processed_db (): envelope for the processed signal in each band
processed_basilar_membrane (): BM motion for the processed signal in each band
reference_sl (): compressed RMS average reference in each band converted
to dB SL
processed_sl (): compressed RMS average output in each band converted to dB SL
freq_sample (): sampling rate in Hz for the model outputs
Return type:
reference_db ()
Updates:
James M. Kates, 27 October 2011.
Basilar Membrane added 30 Dec 2011.
Revised 19 June 2012.
Remove match of reference RMS level to processed 29 August 2012.
IHC adaptation added 1 October 2012.
Basilar Membrane envelope converted to dB SL, 2 Oct 2012.
Filterbank group delay corrected, 14 Dec 2012.
Translated from MATLAB to Python by Zuzanna Podwinska, March 2022.
Updated by Gerardo Roa Dabike, September 2022.
Compute the cochlear compression in one auditory filter band. The gain is linear
below the lower threshold, compressive with a compression ratio of CR:1 between the
lower and upper thresholds, and reverts to linear above the upper threshold. The
compressor assumes that auditory threshold is 0 dB SPL.
Parameters:
() (small) – analytic signal envelope (magnitude) returned by the
gammatone filter bank
() – BM motion output by the filter bank
() – analytic control envelope returned by the wide control
path filter bank
() – OHC attenuation at the input to the compressor
() – kneepoint for the low-level linear amplification
() – compression ratio
() – sampling rate in Hz
() – dB reference level: a signal having an RMS value of 1 is
assigned to Level1 dB SPL.
() –
???
threshold_high – kneepoint for the high-level linear amplification
Returns:
compressed version of the signal envelope
compressed_basilar_membrane (): compressed version of the BM motion
Return type:
compressed_signal ()
Updates:
James M. Kates, 19 January 2007.
LP filter added 15 Feb 2007 (Ref: Zhang et al., 2001)
Version to compress the envelope, 20 Feb 2007.
Change in the OHC I/O function, 9 March 2007.
Two-tone suppression added 22 August 2008.
Translated from MATLAB to Python by Zuzanna Podwinska, March 2022.
Function to smooth the envelope returned by the cochlear model. The
envelope is divided into segments having a 50% overlap. Each segment is
windowed, summed, and divided by the window sum to produce the average.
A raised cosine window is used. The envelope sub-sampling frequency is
2*(1000/segsize).
Parameters:
envelopes (np.ndarray) – matrix of envelopes in each of the auditory bands
segment_size – averaging segment size in msec
freq_sample (int) – input envelope sampling rate in Hz
Returns:
matrix of subsampled windowed averages in each band
Return type:
smooth
Updates:
James M. Kates, 26 January 2007.
Final half segment added 27 August 2012.
Translated from MATLAB to Python by Gerardo Roa Dabike, September 2022.
Align the envelope of the processed signal to that of the reference signal.
Parameters:
() (output) – envelope or BM motion of the reference signal
() – envelope or BM motion of the output signal
freq_sample (int) – Frequency sample rate in Hz
corr_range (int) – range in msec for the correlation
Returns:
shifted output envelope to match the input
Return type:
y ()
Updates:
James M. Kates, 28 October 2011.
Absolute value of the cross-correlation peak removed, 22 June 2012.
Cross-correlation range reduced, 13 August 2013.
Translated from MATLAB to Python by Zuzanna Podwinska, March 2022.
Convert the compressed envelope returned by cochlear_envcomp to dB SL.
Parameters:
() (level1) – linear envelope after compression
() – linear Basilar Membrane vibration after compression
() – IHC attenuation at the input to the synapse
() – level in dB SPL corresponding to 1 RMS
small (float) –
???
Returns:
reference envelope in dB SL
_basilar_membrane (): Basilar Membrane vibration with envelope converted to
dB SL
Return type:
_reference ()
Updates:
James M. Kates, 20 Feb 07.
IHC attenuation added 9 March 2007.
Basilar membrane vibration conversion added 2 October 2012.
Translated from MATLAB to Python by Zuzanna Podwinska, March 2022.
4th-order gammatone auditory filter. This implementation is based on the c program
published on-line by Ning Ma, U. Sheffield, UK[1]_ that gives an implementation of
the Martin Cooke filters[2]_: an impulse-invariant transformation of the gammatone
filter. The signal is demodulated down to baseband using a complex exponential,
and then passed through a cascade of four one-pole low-pass filters.
This version filters two signals that have the same sampling rate and the same
gammatone filter center frequencies. The lengths of the two signals should match;
if they don’t, the signals are truncated to the shorter of the two lengths.
Parameters:
() (freq_sample) – first sequence to be filtered
reference_bandwidth – bandwidth for x relative to that of a normal ear
() – second sequence to be filtered
() – bandwidth for x relative to that of a normal ear
() – sampling rate in Hz
center_frequency (int) – filter center frequency in Hz
ear_q – (float): ???
min_bandwidth (float) –
???
Returns:
filter envelope output (modulated down to baseband)
1st signal
reference_basilar_membrane (): Basilar Membrane for the first signal
processed_envelope (): filter envelope output (modulated down to baseband)
2nd signal
processed_basilar_membrane (): Basilar Membrane for the second signal
Return type:
reference_envelope ()
References:
.. [1] Ma N, Green P, Barker J, Coy A (2007) Exploiting correlogram
Updates:
James M. Kates, 8 Jan 2007.
Vectorized version for efficient MATLAB execution, 4 February 2007.
Cosine and sine generation, 29 June 2011.
Output sine and cosine sequences, 19 June 2012.
Cosine/sine loop speed increased, 9 August 2013.
Translated from MATLAB to Python by Zuzanna Podwinska, March 2022.
Compensate for the group delay of the gammatone filter bank. The group
delay is computed for each filter at its center frequency. The firing
rate output of the IHC model is then adjusted so that all outputs have
the same group delay.
Parameters:
xenv (np.ndarray) – matrix of signal envelopes or BM motion
() (freq_sample) – gammatone filter bandwidths adjusted for loss
() – center frequencies of the bands
() – sampling rate for the input signal in Hz (e.g. 24,000 Hz)
ear_q (float)
min_bandwidth (float)
Returns:
envelopes or BM motion compensated for the group delay.
Return type:
processed ()
Updates:
James M. Kates, 28 October 2011.
Translated from MATLAB to Python by Zuzanna Podwinska, March 2022.
Provide inner hair cell (IHC) adaptation. The adaptation is based on an
equivalent RC circuit model, and the derivatives are mapped into
1st-order backward differences. Rapid and short-term adaptation are
provided. The input is the signal envelope in dB SL, with IHC attenuation
already applied to the envelope. The outputs are the envelope in dB SL
with adaptation providing overshoot of the long-term output level, and
the BM motion is multiplied by a gain vs. time function that reproduces
the adaptation. IHC attenuation and additive noise for the equivalent
auditory threshold are provided by a subsequent call to eb_BMatten.
Parameters:
reference_db (np.ndarray) – signal envelope in one frequency band in dB SL
contains OHC compression and IHC attenuation
() (delta) – basilar membrane vibration with OHC compression
but no IHC attenuation
() – overshoot factor = delta x steady-state
freq_sample (int) – sampling rate in Hz
Returns:
envelope in dB SL with IHC adaptation
output_basilar_membrane (): Basilar Membrane multiplied by the IHC adaptation
gain function
Return type:
output_db ()
Updates:
James M. Kates, 1 October 2012.
Translated from MATLAB to Python by Zuzanna Podwinska, March 2022.
Approximate temporal alignment of the reference and processed output
signals. Leading and trailing zeros are then pruned.
The function assumes that the two sequences have the same sampling rate:
call eb_Resamp24kHz for each sequence first, then call this function to
align the signals.
Returns:
reference (np.ndarray): pruned and shifted reference
processed (np.ndarray): pruned and shifted hearing-aid output
Updates:
James M. Kates, 12 July 2011.
Match the length of the processed output to the reference for the
purposes of computing the cross-covariance
Translated from MATLAB to Python by Zuzanna Podwinska, March 2022.
Apportion the hearing loss to the outer hair cells (OHC) and the inner
hair cells (IHC) and to increase the bandwidth of the cochlear filters
in proportion to the OHC fraction of the total loss.
Parameters:
hearing_loss (np.ndarray) – hearing loss at the 6 audiometric frequencies
center_freq (np.ndarray) – array containing the center frequencies of the
gammatone filters arranged from low to high
audiometric_freq (list)
Returns:
attenuation in dB for the OHC gammatone filters
bandwidth (): OHC filter bandwidth expressed in terms of normal
low_knee (): Lower kneepoint for the low-level linear amplification
compression_ratio (): Ranges from 1.4:1 at 150 Hz to 3.5:1 at 8 kHz for normal
hearing. Reduced in proportion to the OHC loss to 1:1.
attenuated_ihc (): attenuation in dB for the input to the IHC synapse
Return type:
attenuated_ohc ()
Updates:
James M. Kates, 25 January 2007.
Version for loss in dB and match of OHC loss to CR, 9 March 2007.
Low-frequency extent changed to 80 Hz, 27 Oct 2011.
Lower kneepoint set to 30 dB, 19 June 2012.
Translated from MATLAB to Python by Zuzanna Podwinska, March 2022.
Compute the cross-correlations between the input signal time-frequency
envelope and the distortion time-frequency envelope.
For each time interval, the log spectrum is fitted with a set of
half-cosine basis functions. The spectrum weighted by the basis
functions corresponds to Mel Cepstral Coefficients computed in the
frequency domain. The amplitude-normalized cross-covariance between
the time-varying basis functions for the input and output signals is
then computed.
Parameters:
() (addnoise) – subsampled input signal envelope in dB SL in each critical band
() – subsampled distorted output signal envelope
() – threshold in dB SPL to include segment in calculation
() – additive Gaussian noise to ensure 0 cross-corr at low levels
Returns:
average cepstral correlation 2-6, input vs output
individual_cepstral_correlations : individual cepstral correlations,
input vs output
Return type:
average_cepstral_correlation
Updates:
James M. Kates, 24 October 2006.
Difference signal removed for cochlear model, 31 January 2007.
Absolute value added 13 May 2011.
Changed to loudness criterion for silence threshsold, 28 August 2012.
Translated from MATLAB to Python by Gerardo Roa Dabike, September 2022.
Compute the cross-correlations between the input signal
time-frequency envelope and the distortion time-frequency envelope. For
each time interval, the log spectrum is fitted with a set of half-cosine
basis functions. The spectrum weighted by the basis functions corresponds
to mel cepstral coefficients computed in the frequency domain. The
amplitude-normalized cross-covariance between the time-varying basis
functions for the input and output signals is then computed for each of
the 8 modulation frequencies.
Parameters:
() (segment_size) – subsampled input signal envelope in dB SL in each critical band
() – subsampled distorted output signal envelope
() – threshold in dB SPL to include segment in calculation
() – additive Gaussian noise to ensure 0 cross-corr at low levels
() – segment size in ms used for the envelope LP filter (8 msec)
n_cepstral_coef (int) – Number of cepstral coefficients
Returns:
average of the modulation correlations across analysis
frequency bands and modulation frequency bands, basis functions 2 -6
mel_cepstral_low (): average over the four lower mod freq bands, 0 - 20 Hz
mel_cepstral_high (): average over the four higher mod freq bands, 20 - 125 Hz
mel_cepstral_modulation (): vector of cross-correlations by modulation
frequency, averaged over analysis frequency band
Return type:
mel_cepstral_average ()
Updates:
James M. Kates, 24 October 2006.
Difference signal removed for cochlear model, 31 January 2007.
Absolute value added 13 May 2011.
Changed to loudness criterion for silence threshold, 28 August 2012.
Version using envelope modulation filters, 15 July 2014.
Modulation frequency vector output added 27 August 2014.
Translated from MATLAB to Python by Gerardo Roa Dabike, September 2022.
Design the middle ear filters and process the input through the
cascade of filters. The middle ear model is a 2-pole HP filter
at 350 Hz in series with a 1-pole LP filter at 5000 Hz. The
result is a rough approximation to the equal-loudness contour
at threshold.
Arguments:
reference (np.ndarray): input signal
freq_sample (float): sampling rate in Hz
Returns:
xout (): filtered output
Updates:
James M. Kates, 18 January 2007.
Translated from MATLAB to Python by Zuzanna Podwinska, March 2022.
Resample the input signal at 24 kHz. The input sampling rate is
rounded to the nearest kHz to compute the sampling rate conversion
ratio.
Arguments:
reference_signal (np.ndarray): input signal
reference_freq (int): sampling rate for the input in Hz
freq_sample_hz (int): Frequency sample in Hz
Returns:
reference_signal_24 signal resampled at kHz (default 24Khz)
freq_sample_hz output sampling rate in Hz
Updates
James M. Kates, 20 June 2011.
Translated from MATLAB to Python by Zuzanna Podwinska, March 2022.
Compute changes in the long-term spectrum and spectral slope.
The metric is based on the spectral distortion metric of Moore and Tan[1]_
(JAES, Vol 52, pp 900-914). The log envelopes in dB SL are converted to
linear to approximate specific loudness. The outputs are the sum of the
absolute differences, the standard deviation of the differences, and the
maximum absolute difference. The same three outputs are provided for the
normalized spectral difference and for the slope. The output is
calibrated so that a processed signal having 0 amplitude produces a
value of 1 for the spectrum difference.
Abs diff: weight all deviations uniformly
Std diff: weight larger deviations more than smaller deviations
Max diff: only weight the largest deviation
Parameters:
reference_sl (np.ndarray) – reference signal spectrum in dB SL
processed_sl (np.ndarray) – degraded signal spectrum in dB SL
Returns:
[sum abs diff, std dev diff, max diff] spectra
dnorm (np.array) : [sum abs diff, std dev diff, max diff] norm spectra
dslope (np.array) : [sum abs diff, std dev diff, max diff] slope
Return type:
dloud (np.array)
References:
.. [1] Moore BCJ, Tan, CT (2004) Development and Validation of a Method