RCT-Random-Consistency-Training

Welcome to RCT (Random Consistency Training)! This is the official implementation of RCT, which has been accepted by INTERSPEECH 2022.

Paper 🤩 | Issues 😅 | Lab 🙉 | Contact 😘

Introduction

[Figure: The structure of RCT]

RCT: Random Consistency Training

RCT is a semi-supervised training scheme for Sound Event Detection (SED), but we believe it can be generalized to other semi-supervised learning tasks as well!

RCT is built for SED on top of the baseline model of the DCASE 2021 challenge; please refer to [1] and [2] for more details about the code architecture. The model is implemented with PyTorch Lightning. If you are not familiar with its workflow, you can focus on: 1. training_step() in sed_trainer_rct.py; 2. the RandAugment class in rand_augm_agg.py, to understand RCT.
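To give a rough idea of what happens inside training_step(), here is a minimal Python sketch of a consistency-training step. It is an illustration only, not the repo's actual code: sed_model, random_augment, consis_weight, and the batch layout are hypothetical placeholders, and the real implementation in sed_trainer_rct.py contains additional components.

import torch
import torch.nn.functional as F

def rct_training_step(sed_model, random_augment, labeled_batch, unlabeled_batch, consis_weight=1.0):
    # Hypothetical sketch of a consistency-training step (not the repo's actual code).
    x_lab, y_lab = labeled_batch

    # Supervised loss on the labeled clips (the model is assumed to output probabilities).
    sup_loss = F.binary_cross_entropy(sed_model(x_lab), y_lab)

    # Randomly augment the unlabeled clips.
    x_aug = random_augment(unlabeled_batch)

    # Self-consistency loss: predictions on the augmented view should stay
    # close to the (detached) predictions on the original view.
    with torch.no_grad():
        target = sed_model(unlabeled_batch)
    consis_loss = F.mse_loss(sed_model(x_aug), target)

    return sup_loss + consis_weight * consis_loss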

Training

The training/validation data comes from the DCASE2021 Task 4 DESED dataset. Downloading DESED is quite tedious and not all of the data is publicly accessible. You can ask the DCASE committee for help in obtaining the full dataset. Note that your test results may differ if your validation set is incomplete.

To train the model, please first get the baseline architecture of DCASE2021 Task 4:

git clone git@github.com:DCASE-REPO/DESED_task.git

Don't forget to configure your environment according to their requirements.

After completing the above setup, you can add the code of this repo to the baseline repo:

git clone git@github.com:Audio-WestlakeU/RCT-Random-Consistency-Training.git

Before training, DO NOT forget to change the dataset paths in recipes/dcase2021_task4_baseline/confs/sed_rct.yaml to YOUR_PATH_TO_DESED. Then, run:

python train_sed_rct.py

If you want to customize your training, you can modify the configuration file recipes/dcase2021_task4_baseline/confs/sed_rct.yaml. We provide our own implementations of several data augmentations, including SpecAug [3], FilterAug [4], pitch shift, and time shift.
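As a hedged illustration of how a pool of random augmentations could be assembled and sampled, consider the toy functions below. They are not the repo's rand_augm_agg.py; all names and parameters are made up for the example.

import random
import torch

def spec_mask(spec, max_width=20):
    # SpecAugment-style masking: zero out a random band of time frames.
    # spec: (batch, n_mels, time) log-mel spectrogram.
    width = random.randint(1, max_width)
    start = random.randint(0, max(spec.shape[-1] - width, 0))
    spec = spec.clone()
    spec[..., start:start + width] = 0.0
    return spec

def time_shift(spec, max_shift=16):
    # Circularly shift the spectrogram along the time axis.
    shift = random.randint(-max_shift, max_shift)
    return torch.roll(spec, shifts=shift, dims=-1)

AUGMENTATIONS = [spec_mask, time_shift]

def random_augment(spec):
    # Pick one augmentation at random per call, in the spirit of RandAugment.
    return random.choice(AUGMENTATIONS)(spec)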

The proposed self-consistency loss is enabled via a trigger in sed_rct.yaml:

augs:    
    consis: True
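For reference, here is a minimal sketch of how this trigger could gate the consistency term, assuming the YAML is loaded with PyYAML into a dict; the helper below is hypothetical and only mirrors the idea, not the repo's code.

import yaml

# Load the training configuration used by train_sed_rct.py.
with open("recipes/dcase2021_task4_baseline/confs/sed_rct.yaml") as f:
    config = yaml.safe_load(f)

def total_loss(sup_loss, consis_loss, consis_weight=1.0):
    # Only add the self-consistency term when augs.consis is set to True.
    if config["augs"]["consis"]:
        return sup_loss + consis_weight * consis_loss
    return sup_loss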

Of course, we encourage other data augmentations to be added and tested with RCT.

Results

Across 7 trials, a single RCT model scores around 40.12% PSDS_1 and 61.39% PSDS_2. You may get higher or lower results depending on your choice of random seed. The results of 3 trials are given below:

Trial num.  Seed  PSDS_1   PSDS_2
1           42    39.69%   61.59%
2           1     40.49%   62.67%
3           2     39.50%   60.03%

Reference

[1] DESED Dataset: https://github.com/turpaultn/DESED

[2] DCASE2021 Task4 baseline: https://github.com/DCASE-REPO/DESED_task

[3] SpecAug: https://arxiv.org/pdf/1904.08779

[4] FilterAug: https://github.com/frednam93/FilterAugSED
