A PyTorch-based Speech Toolkit
Updated May 14, 2024 - Python
Reading list for research topics in multimodal machine learning
Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding
Foundation Architecture for (M)LLMs
WaveNet vocoder
PyTorch implementation of convolutional neural network-based text-to-speech synthesis models
Multilingual Automatic Speech Recognition with word-level timestamps and confidence
A curated list of awesome Speaker Diarization papers, libraries, datasets, and other resources.
💎 A list of accessible speech corpora for ASR, TTS, and other Speech Technologies
SincNet is a neural architecture for efficiently processing raw audio samples.
Open source audio annotation tool for humans
AI powered speech denoising and enhancement
General Speech Restoration
A tutorial for Speech Enhancement researchers and practitioners. The purpose of this repo is to organize the world’s resources for speech enhancement and make them universally accessible and useful.
A neural network for end-to-end speech denoising
TensorFlow 2.x implementation of the DTLN real-time speech denoising model, with TF-Lite, ONNX, and real-time audio processing support.
This is the main repository of open-sourced speech technology by Huawei Noah's Ark Lab.
PyTorch implementation of "FullSubNet: A Full-Band and Sub-Band Fusion Model for Real-Time Single-Channel Speech Enhancement."
Speech recognition toolkit for the Arduino