RCT-Random-Consistency-Training

Welcome to RCT (Random Consistency Training)! This is the official implementation of RCT, which has been accepted by INTERSPEECH 2022.

Paper 🤩 | Issues 😅 | Lab 🙉 | Contact 😘

Introduction

[Figure: The structure of RCT]

RCT: Random Consistency Training

RCT is a semi-supervised training scheme for Sound Event Detection (SED), but we believe it can be generalized to other semi-supervised learning tasks as well!

RCT is built for SED on top of the baseline model of the DCASE 2021 challenge; please refer to [1] and [2] for more details about the code architecture. The model is implemented with PyTorch Lightning. If you are not familiar with its workflow, you can focus on: 1. training_step() in sed_trainer_rct.py; 2. the RandAugment class in rand_augm_agg.py, to understand RCT.
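To give a rough idea of what happens inside training_step(), here is a minimal Python sketch of a consistency-training step. It is an illustration only, not the repo's actual code: sed_model, random_augment, consis_weight, and the batch layout are hypothetical placeholders, and the real implementation in sed_trainer_rct.py contains additional components.

import torch
import torch.nn.functional as F

def rct_training_step(sed_model, random_augment, labeled_batch, unlabeled_batch, consis_weight=1.0):
    # Hypothetical sketch of a consistency-training step (not the repo's actual code).
    x_lab, y_lab = labeled_batch

    # Supervised loss on the labeled clips (the model is assumed to output probabilities).
    sup_loss = F.binary_cross_entropy(sed_model(x_lab), y_lab)

    # Randomly augment the unlabeled clips.
    x_aug = random_augment(unlabeled_batch)

    # Self-consistency loss: predictions on the augmented view should stay
    # close to the (detached) predictions on the original view.
    with torch.no_grad():
        target = sed_model(unlabeled_batch)
    consis_loss = F.mse_loss(sed_model(x_aug), target)

    return sup_loss + consis_weight * consis_loss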

Training

The training/validation data comes from the DCASE2021 Task 4 DESED dataset. Downloading DESED is quite tedious and not all of the data is publicly accessible. You can ask the DCASE committee for help in obtaining the full dataset. Note that your test results may differ if your validation set is incomplete.

To train the model, please first get the baseline architecture of DCASE2021 Task 4:

git clone git@github.com:DCASE-REPO/DESED_task.git

Don't forget to configure your environment according to their requirements.

After completing the above setup, you can add the code of this repo to the baseline repo:

git clone git@github.com:Audio-WestlakeU/RCT-Random-Consistency-Training.git

Before training, DO NOT forget to change the dataset paths in recipes/dcase2021_task4_baseline/confs/sed_rct.yaml to YOUR_PATH_TO_DESED. Then, run:

python train_sed_rct.py

If you want to customize your training, you can modify the configuration file recipes/dcase2021_task4_baseline/confs/sed_rct.yaml. We provide our own implementations of several data augmentations, including SpecAug [3], FilterAug [4], pitch shift, and time shift.
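As a hedged illustration of how a pool of random augmentations could be assembled and sampled, consider the toy functions below. They are not the repo's rand_augm_agg.py; all names and parameters are made up for the example.

import random
import torch

def spec_mask(spec, max_width=20):
    # SpecAugment-style masking: zero out a random band of time frames.
    # spec: (batch, n_mels, time) log-mel spectrogram.
    width = random.randint(1, max_width)
    start = random.randint(0, max(spec.shape[-1] - width, 0))
    spec = spec.clone()
    spec[..., start:start + width] = 0.0
    return spec

def time_shift(spec, max_shift=16):
    # Circularly shift the spectrogram along the time axis.
    shift = random.randint(-max_shift, max_shift)
    return torch.roll(spec, shifts=shift, dims=-1)

AUGMENTATIONS = [spec_mask, time_shift]

def random_augment(spec):
    # Pick one augmentation at random per call, in the spirit of RandAugment.
    return random.choice(AUGMENTATIONS)(spec)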

The proposed self-consistency loss is enabled via a trigger in sed_rct.yaml:

augs:    
    consis: True
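For reference, here is a minimal sketch of how this trigger could gate the consistency term, assuming the YAML is loaded with PyYAML into a dict; the helper below is hypothetical and only mirrors the idea, not the repo's code.

import yaml

# Load the training configuration used by train_sed_rct.py.
with open("recipes/dcase2021_task4_baseline/confs/sed_rct.yaml") as f:
    config = yaml.safe_load(f)

def total_loss(sup_loss, consis_loss, consis_weight=1.0):
    # Only add the self-consistency term when augs.consis is set to True.
    if config["augs"]["consis"]:
        return sup_loss + consis_weight * consis_loss
    return sup_loss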

Of course, we encourage other data augmentations to be added and tested with RCT.

Results

Across 7 trials, a single RCT model scores around 40.12% PSDS_1 and 61.39% PSDS_2. You may get higher or lower results depending on your choice of random seed. The results of 3 trials are given below:

Trial num.  Seed  PSDS_1   PSDS_2
1           42    39.69%   61.59%
2           1     40.49%   62.67%
3           2     39.50%   60.03%

Reference

[1] DESED Dataset: https://github.com/turpaultn/DESED

[2] DCASE2021 Task4 baseline: https://github.com/DCASE-REPO/DESED_task

[3] SpecAug: https://arxiv.org/pdf/1904.08779

[4] FilterAug: https://github.com/frednam93/FilterAugSED
