Short-Time Objective Intelligibility (STOI) Metric

This repository contains an implementation of the STOI metric¹, an intrusive objective measure used to predict speech intelligibility in noisy environments. The STOI metric is widely used in evaluating the effectiveness of hearing aid algorithms, speech enhancement systems, and machine learning-based intelligibility predictors.

This implementation was developed as part of a B.Sc. thesis, and the STOI-derived d-matrices were later used as inputs for neural networks.

Overview of the STOI Metric

The STOI metric is a computationally efficient way to predict speech intelligibility based on the correlation of short-time temporal envelopes of clean and noisy speech.

It follows these main steps:

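1. Preprocessing
   - Converts audio to mono and resamples it to 10 kHz.
   - Removes silent frames whose energy falls below a threshold relative to the loudest frame of the clean signal (40 dB in the original algorithm).
2. Time-Frequency Analysis
   - Computes the Short-Time Fourier Transform (STFT) of the clean and noisy signals.
   - Groups STFT bins into one-third octave bands (approximating the frequency resolution of human hearing).
3. Short-Time Segmentation
   - Divides the band envelopes into overlapping segments of 30 frames (about 384 ms in the original algorithm).
   - Normalizes each noisy segment to the energy of the corresponding clean segment, then clips it.
4. D-Matrix Computation
   - Calculates the correlation coefficient between the clean and noisy envelopes for each band and segment.
   - Stores these correlations in a structured d-matrix, with one value per band and segment.
5. Final STOI Score
   - Averages all correlation values in the d-matrix to obtain the final STOI intelligibility score.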
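
The last three steps can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the code in this repository: it assumes the one-third octave band envelopes have already been computed as non-negative arrays of shape (bands, frames), and it uses the segment length of 30 frames and the clipping bound of -15 dB from the original paper. The names `d_matrix`, `stoi_score`, `n_frames`, and `beta_db` are hypothetical and chosen only for this example.

```python
import numpy as np

EPS = np.finfo(float).eps  # guards against division by zero


def d_matrix(clean_bands, noisy_bands, n_frames=30, beta_db=-15.0):
    """Correlate clean and noisy band envelopes segment by segment.

    Both inputs are assumed to be non-negative arrays of shape
    (bands, frames). Returns an array of shape (bands, segments).
    """
    x_all = np.asarray(clean_bands, dtype=float)
    y_all = np.asarray(noisy_bands, dtype=float)
    if x_all.shape != y_all.shape:
        raise ValueError("clean and noisy envelopes must have the same shape")
    n_bands, total_frames = x_all.shape
    if total_frames < n_frames:
        raise ValueError("signal is shorter than one segment")

    # Upper bound for the normalized noisy envelope, relative to the clean one.
    clip_factor = 1.0 + 10.0 ** (-beta_db / 20.0)

    n_segments = total_frames - n_frames + 1
    d = np.empty((n_bands, n_segments))
    for m in range(n_segments):
        x = x_all[:, m:m + n_frames]
        y = y_all[:, m:m + n_frames]

        # Scale the noisy segment to the energy of the clean segment.
        x_norm = np.linalg.norm(x, axis=1, keepdims=True)
        y_norm = np.linalg.norm(y, axis=1, keepdims=True)
        y_scaled = y * (x_norm / (y_norm + EPS))

        # Clip so that a single badly degraded frame cannot dominate.
        y_clipped = np.minimum(y_scaled, clip_factor * x)

        # Correlation coefficient per band for this segment.
        xc = x - x.mean(axis=1, keepdims=True)
        yc = y_clipped - y_clipped.mean(axis=1, keepdims=True)
        numerator = (xc * yc).sum(axis=1)
        denominator = np.linalg.norm(xc, axis=1) * np.linalg.norm(yc, axis=1)
        d[:, m] = numerator / (denominator + EPS)
    return d


def stoi_score(d):
    """Average all intermediate correlations into a single score."""
    return float(np.mean(d))
```

Calling `stoi_score(d_matrix(clean_bands, noisy_bands))` gives the final score. Returning the full d-matrix separately from the average keeps the intermediate values available, which matters if they are later used as inputs for another model.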

Experimental Results

The metric was tested on the CPC1 dataset from the first Clarity Prediction Challenge, which contains noisy speech signals and the corresponding intelligibility scores obtained from listening tests with human listeners.

Installation

Clone the repository:

git clone https://github.com/George-P-1/stoi_Metric.git
cd stoi_Metric

Prerequisites

The required dependencies are listed in requirements.txt. To install them, run:

pip install -r requirements.txt

Further Work

Use neural networks to improve predictions (see neural networks project).

Acknowledgements

The dataset used to test the STOI metric was provided by The Clarity Project. The pystoi Python implementation was used to validate this work.

¹ Cees H. Taal, Richard C. Hendriks, Richard Heusdens, and Jesper Jensen. “An Algorithm for Intelligibility Prediction of Time-Frequency Weighted Noisy Speech”. In: IEEE Transactions on Audio, Speech, and Language Processing 19.7 (Sept. 2011), pp. 2125–2136. doi: 10.1109/TASL.2011.2114881.
