Microphone-Array-Generalization-for-Multichannel-Narrowband-Deep-Speech-Enhancement

This repository is the official PyTorch implementation of "Microphone Array Generalization for Multichannel Narrowband Deep Speech Enhancement", accepted at Interspeech 2021.

Introduction

Our work addresses the problem of microphone array generalization for deep-learning-based end-to-end multichannel speech enhancement. We aim to train a single network that can perform well on unseen microphone arrays. The goal is to make the network learn the universal information for speech enhancement that is available for any array geometry, rather than the characteristics of one dedicated array. To this end, a single network is trained using data recorded by various virtual microphone arrays of different geometries, simulated with the RIR Generator [1] and simulated diffuse noise [2]. We design three variants of our recently proposed NarrowBand Deep Filtering (NBDF) network [3] to cope with an arbitrary number of microphones.
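As a rough illustration of what "various virtual microphone arrays of different geometries" could look like in practice, the following minimal sketch draws random circular arrays inside a shoebox room. Everything in it is an assumption made for illustration: the function names, the circular layout, the room size, and the ranges for microphone count and radius are hypothetical and are not taken from this repository or the paper. The resulting coordinates are the kind of input an RIR simulator such as [1] would consume.

```python
import math
import random


def sample_circular_array(rng, num_mics, radius, center):
    """Return num_mics (x, y, z) positions evenly spaced on a horizontal circle."""
    offset = rng.uniform(0.0, 2.0 * math.pi)  # random rotation of the array
    cx, cy, cz = center
    return [
        (
            cx + radius * math.cos(offset + 2.0 * math.pi * m / num_mics),
            cy + radius * math.sin(offset + 2.0 * math.pi * m / num_mics),
            cz,
        )
        for m in range(num_mics)
    ]


def sample_virtual_array(rng, room=(6.0, 5.0, 3.0), margin=0.5):
    """Draw one hypothetical virtual array inside a shoebox room (metres)."""
    num_mics = rng.randint(2, 8)      # assumed range of microphone counts
    radius = rng.uniform(0.03, 0.10)  # assumed array radius in metres
    # Keep the array centre away from the walls so every microphone stays inside.
    center = tuple(rng.uniform(margin, size - margin) for size in room)
    return sample_circular_array(rng, num_mics, radius, center)


rng = random.Random(0)  # fixed seed so the draw is reproducible
arrays = [sample_virtual_array(rng) for _ in range(4)]
for mics in arrays:
    print(len(mics), "microphones, first at", tuple(round(c, 2) for c in mics[0]))
```

Each call yields a different microphone count and placement, so a training set built this way exposes the network to many geometries rather than a single fixed array.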

Key Features

- Simulated_RIR_Generator
- Network
  - original NBDF (CP-NBDF)
  - CC-NBDF
  - PW-NBDF
- Train
- Inference
- Evaluation
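The networks listed above are narrowband: as described in [3], each frequency bin of the multichannel STFT is processed as its own sequence by a shared network. The sketch below shows one possible way to lay out such per-frequency sequences. It is written in plain Python for readability, the function name `to_narrowband_sequences` and the interleaved real/imaginary ordering are assumptions for illustration, and it is not the code in `NB_Dataset.py`.

```python
def to_narrowband_sequences(stft):
    """Rearrange stft[channel][freq][frame] (complex) into per-frequency sequences.

    Returns seqs[freq][frame] = [Re(ch0), Im(ch0), Re(ch1), Im(ch1), ...].
    """
    num_ch = len(stft)
    num_freq = len(stft[0])
    num_frames = len(stft[0][0])
    seqs = []
    for f in range(num_freq):
        seq = []
        for t in range(num_frames):
            feat = []
            for c in range(num_ch):
                # Concatenate real and imaginary parts of every channel.
                feat.extend([stft[c][f][t].real, stft[c][f][t].imag])
            seq.append(feat)
        seqs.append(seq)
    return seqs


# Toy STFT with 2 channels, 3 frequency bins, 4 frames.
stft = [
    [[complex(c + 1, f * 10 + t) for t in range(4)] for f in range(3)]
    for c in range(2)
]
seqs = to_narrowband_sequences(stft)
print(len(seqs), len(seqs[0]), len(seqs[0][0]))  # 3 4 4
```

In this layout the feature size per frame is twice the number of channels, so it changes with the array. That is why a network with a fixed input size cannot be applied directly to arrays with different microphone counts.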

Get started

(1) Clone:

$ git clone https://github.com/Audio-WestlakeU/Microphone-Array-Generalization-for-Multichannel-Narrowband-Deep-Speech-Enhancement-.git

(2) Requirements:

$ pip install -r requirements.txt

The RIR Generator [1], the coherent multichannel noise generator [2], and the wind noise simulator [4] are also required.

Reference

[1] E. A. Habets, “Room impulse response generator,” Technische Universiteit Eindhoven, Tech. Rep., vol. 2, no. 2.4, p. 1, 2006.

[2] E. A. Habets, I. Cohen, and S. Gannot, “Generating nonstationary multisensor signals under a spatial coherence constraint,” The Journal of the Acoustical Society of America, vol. 124, no. 5, pp. 2911–2917, 2008.

[3] X. Li and R. Horaud, “Narrow-band deep filtering for multichannel speech enhancement,” arXiv preprint arXiv:1911.10791, 2019.

[4] D. Mirabilii and E. A. Habets, “Simulating multi-channel wind noise based on the Corcos model,” in 2018 16th International Workshop on Acoustic Signal Enhancement (IWAENC). IEEE, 2018, pp. 560–564.
