This repository was archived by the owner on Feb 25, 2025. It is now read-only.

andreaco/SpokenDigitClassification

Spoken Digit Classification

  • Description
    Implement a classifier able to predict which digit is pronounced in a short audio excerpt.
  • Input
    The dataset is the Free Spoken Digit Dataset (FSDD). The audio files are in the recordings folder and follow a fixed naming format; see the README distributed with the dataset for details. Classification results must be reported as a confusion matrix and, optionally, with other metrics of your choice.
  • Output
    • a brief presentation of your work (max 5 minutes) that will be given to the class
    • a more detailed report (max 8 pages), to be delivered by May 17th, that illustrates and explains every step of your classification system and presents and comments on the results.
    • a link to a repository containing the code (e.g. on GitHub) with minimal comments.
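
The file naming mentioned under Input can be turned into labels directly. The sketch below assumes the convention described in the FSDD README, `{digit}_{speaker}_{index}.wav` (for example `7_jackson_32.wav`); the `Recording` type and the `parse_fsdd_filename` helper are hypothetical names introduced for illustration and are not taken from this repository.

```python
import re
from typing import NamedTuple


class Recording(NamedTuple):
    """Metadata encoded in one FSDD file name (hypothetical helper type)."""
    digit: int
    speaker: str
    index: int


# Assumed pattern: one digit, a speaker name without underscores, a take index.
_FSDD_NAME = re.compile(r"^(\d)_([^_]+)_(\d+)\.wav$")


def parse_fsdd_filename(filename: str) -> Recording:
    """Split a file name such as '7_jackson_32.wav' into its three fields."""
    match = _FSDD_NAME.match(filename)
    if match is None:
        raise ValueError(f"not an FSDD-style file name: {filename!r}")
    digit, speaker, index = match.groups()
    return Recording(int(digit), speaker, int(index))
```

Calling `parse_fsdd_filename("7_jackson_32.wav")` yields the label `7`, the speaker `jackson`, and the take index `32`; any name that does not match the assumed pattern raises `ValueError` rather than being silently mislabeled.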

Tasks:

  • Preprocessing
  • Dataset split
  • Feature extraction
  • Feature selection
  • Classification
  • Performance evaluation
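
As a rough illustration of the dataset split and performance evaluation steps, the sketch below uses only the standard library. The split rule (takes with a low index go to the test set) is an assumption modeled on the split suggested in the FSDD README, not necessarily the one used in this project, and the function names are hypothetical.

```python
def split_by_index(recordings, n_test_takes=5):
    """Split (digit, speaker, index) tuples into train and test lists.

    Assumption: takes whose index is below n_test_takes form the test set.
    """
    train = [r for r in recordings if r[2] >= n_test_takes]
    test = [r for r in recordings if r[2] < n_test_takes]
    return train, test


def confusion_matrix(y_true, y_pred, n_classes=10):
    """Rows are true digits, columns are predicted digits."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for true_label, predicted_label in zip(y_true, y_pred):
        matrix[true_label][predicted_label] += 1
    return matrix


def accuracy(matrix):
    """Fraction of samples on the diagonal of the confusion matrix."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total if total else 0.0
```

The diagonal of the matrix counts correct predictions, so off-diagonal cells show which digits are confused with each other; accuracy is only one of the optional metrics that can be derived from it.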

Features

Mel-frequency cepstral coefficients (MFCC)
Linear Predictive Coding
Phoneme detection
Hidden Markov Models
Image processing on spectrogram
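
Of the features listed above, Linear Predictive Coding is compact enough to sketch in plain Python. The example below uses the autocorrelation method with the Levinson-Durbin recursion, a common textbook formulation; it is not necessarily how this project computes LPC, and the function names are hypothetical. In practice the frame would be a short windowed segment of a recording.

```python
def autocorrelation(frame, max_lag):
    """Autocorrelation r[0..max_lag] of one frame of samples."""
    n = len(frame)
    return [
        sum(frame[i] * frame[i + lag] for i in range(n - lag))
        for lag in range(max_lag + 1)
    ]


def lpc(frame, order):
    """Return predictor coefficients a[1..order] with x[n] ~ sum_k a[k] * x[n-k].

    Solves the autocorrelation normal equations via Levinson-Durbin.
    """
    r = autocorrelation(frame, order)
    a = [0.0] * (order + 1)  # a[0] is unused padding
    error = r[0]
    for i in range(1, order + 1):
        if error <= 0.0:
            break  # silent or perfectly predictable frame: stop early
        # Reflection coefficient for this order.
        k = (r[i] - sum(a[j] * r[i - j] for j in range(1, i))) / error
        updated = a[:]
        updated[i] = k
        for j in range(1, i):
            updated[j] = a[j] - k * a[i - j]
        a = updated
        error *= 1.0 - k * k
    return a[1:]
```

For a decaying exponential such as `[0.5 ** n for n in range(50)]`, an order-1 predictor recovers a coefficient close to 0.5, which is a quick sanity check before applying the function to real audio frames.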

About

A classifier for prediction of spoken digits from short audio excerpts
