A PyTorch implementation of our paper "TENET: A Time-reversal Enhancement Network for noise-robust ASR".
- For dependencies, see requirements.txt
Usage: ./inference.sh <noisy.scp> <cpt-dir> <dump-dir>
options: --ref-scp (scp of the clean counterparts; if given, PESQ/SI-SNR metrics are calculated)
         --remove-wav (false)
         --model (TENET)
         --gpu (0)
         --fs (16000)
         --nj (1)
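The format of the list files passed to inference.sh is not described here. As a minimal sketch, assuming they follow the Kaldi-style `<utt-id> <path>` convention used by the SETK toolkit (an assumption, not confirmed by this repository), a noisy.scp could be generated from a directory of WAV files as follows. The function name `make_scp` and the example paths are hypothetical.

```shell
# Hypothetical helper: print one "<utt-id> <absolute-path>" line per WAV file.
# Assumes Kaldi-style scp lists; check the data loader in this repo before relying on it.
make_scp() {
    local wav_dir="$1"
    local f
    for f in "$wav_dir"/*.wav; do
        [ -e "$f" ] || continue   # skip when the glob matches nothing
        # utterance id = file name without the .wav extension
        printf '%s %s\n' "$(basename "$f" .wav)" \
            "$(cd "$(dirname "$f")" && pwd)/$(basename "$f")"
    done | sort
}

# Example (paths are placeholders):
#   make_scp data/noisy > noisy.scp
#   ./inference.sh noisy.scp exp/voicebank dump/enhanced
```

File names containing spaces would break the two-column layout, so this sketch assumes plain utterance names.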
- Download the pretrained wav2vec model and put it in pretrain/wav2vec_large.pt
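Because the checkpoint is expected at a fixed path, a quick check before launching a run can save time. The sketch below only tests that a non-empty file exists at that path; the function name `check_pretrained` is hypothetical, and no download location is assumed.

```shell
# Hypothetical pre-flight check: succeed only if the checkpoint exists and is non-empty.
check_pretrained() {
    local ckpt="${1:-pretrain/wav2vec_large.pt}"
    if [ -s "$ckpt" ]; then
        echo "found: $ckpt"
    else
        echo "missing or empty: $ckpt" >&2
        return 1
    fi
}

# Example:
#   check_pretrained && ./train.sh exp/voicebank TENET freqdpt_base
```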
- Configure training settings and model hyperparameters in nnet/conf.py.
- Full experiment command:
Usage: ./train.sh <cpt-dir> <model> <exp-id>
e.g.: ./train.sh exp/voicebank TENET freqdpt_base
      (<model> is one of TENET, TCN-skip, DCCRN, PFPL, DPTNet)
options: --resume (path/to/best.pt.tar)
         --gpu (0)
         --epochs (100)
         --batch-size (4)
         --cache-size (10)
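To compare the listed models under the same settings, the training command can be repeated once per model. The loop below is a sketch that only prints the commands (a dry run) and executes nothing. The function name `print_train_cmds` and the `<exp-id>-<model>` naming are hypothetical, the option values are the defaults listed for train.sh, and the assumption that options precede the positional arguments should be checked against train.sh.

```shell
# Hypothetical dry run: print one train.sh command per model, without executing anything.
print_train_cmds() {
    local cpt_dir="$1" exp_id="$2"
    local model
    for model in TENET TCN-skip DCCRN PFPL DPTNet; do
        # Assumption: options are given before the positional arguments.
        # The "<exp-id>-<model>" suffix is an illustrative way to keep runs apart.
        echo "./train.sh --gpu 0 --epochs 100 --batch-size 4 $cpt_dir $model $exp_id-$model"
    done
}

# Example:
#   print_train_cmds exp/voicebank freqdpt_base
```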
This repository contains code from:
- ConvTasNet - https://github.com/funcwj/conv-tasnet
- DPTNet - https://github.com/asteroid-team/asteroid
- DCCRN - https://github.com/huyanxin/DeepComplexCRN
- PFPL - https://github.com/aleXiehta/PhoneFortifiedPerceptualLoss
- SETK toolkit - https://github.com/funcwj/setk
If you find this repository useful, please cite the following paper:
@inproceedings{chao2021tenet,
title = {TENET: A Time-reversal Enhancement Network for noise-robust ASR},
author = {Fu-An Chao and Shao-Wei Fan Jiang and Bi-Cheng Yan
and Jeih-weih Hung and Berlin Chen},
booktitle = {2021 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU)},
year = {2021},
}