
Trains a CNN that takes audio clips and produces predictions of what the audio is based on a set of IAB labels.


theozhangg/Audio_to_IAB

 
 


IAB label classification fine-tuned on pretrained audio neural networks (PANNs)

IAB label classification is the task of assigning audio clips to IAB labels. We trained this model on the AudioSet dataset; more information on how we created the dataset can be found here. In this codebase, we fine-tune PANNs, following how this model was fine-tuned, to build an audio clip classification system.

View our full report on how we created the dataset and the model here

View our datasets and our AudioSet-to-IAB label mappings (for both 5 and 20 labels) here

We also expanded this model to use 20 IAB labels here

View the notebook here if you would like to run it yourself

1. Requirements

Python 3.6 + PyTorch 1.0

2. Then simply run:

$ ./runme.sh

Or run the commands in runme.sh line by line. The commands include:

(1) Modify the dataset and workspace paths

(2) Extract features

(3) Train model

Model

A 14-layer CNN from PANNs is fine-tuned. We use 10-fold cross-validation for IAB label classification. That is, in each fold, 900 audio clips are used for training and the remaining 100 audio clips are used for validation.
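The 900/100 split can be illustrated with a minimal sketch. It assumes 1,000 clips in total (implied by 900 + 100) that are indexed 0 to 999 and shuffled once with a fixed seed. The function name `k_fold_splits` and its parameters are hypothetical, and the fold assignment in this repository's code may be done differently.

```python
import random


def k_fold_splits(num_clips, num_folds=10, seed=0):
    """Yield (fold, train_indices, val_indices) for each fold.

    Hypothetical helper for illustration only; not taken from this repository.
    """
    indices = list(range(num_clips))
    # Shuffle once so every fold is drawn from the same fixed permutation.
    random.Random(seed).shuffle(indices)
    fold_size = num_clips // num_folds
    for fold in range(num_folds):
        start = fold * fold_size
        # The last fold absorbs any remainder when num_clips is not divisible.
        end = start + fold_size if fold < num_folds - 1 else num_clips
        val_indices = indices[start:end]
        train_indices = indices[:start] + indices[end:]
        yield fold, train_indices, val_indices


# Assumed total of 1,000 clips, giving 900 training and 100 validation clips per fold.
splits = list(k_fold_splits(1000))
for fold, train_indices, val_indices in splits:
    print(f"fold {fold}: {len(train_indices)} train / {len(val_indices)} validation clips")
```

Each clip appears in the validation set of exactly one fold, so averaging the validation metric over the 10 folds uses every clip once for evaluation.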

Citation

[1] Kong, Qiuqiang, Yin Cao, Turab Iqbal, Yuxuan Wang, Wenwu Wang, and Mark D. Plumbley. "PANNs: Large-scale pretrained audio neural networks for audio pattern recognition." arXiv preprint arXiv:1912.10211 (2019).
