Ready-to-use Multilingual Text-To-Speech (TTS) package.

qanastek/EasyTTS

PyPI version GitHub Issues Contributions welcome License: MIT Downloads

EasyTTS is an open-source and ready-to-use Multilingual Text-To-Speech (TTS) package.

The goal is to simplify the use of state-of-the-art text-to-speech models for a variety of languages (French, English, ...).

⚠️ EasyTTS is currently in beta. ⚠️

Quick installation

EasyTTS is constantly evolving; new features, tutorials, and documentation will appear over time. Install it from PyPI to use the standard library quickly, or install it locally if you want to run experiments or customize the toolkit. EasyTTS supports both CPU and GPU computation. Note that CUDA must be properly installed to use GPUs.
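
Before relying on the GPU, it can help to check whether CUDA is actually visible from Python. The sketch below assumes a PyTorch backend, which this README does not state; `pick_device` is a hypothetical helper and not part of EasyTTS. It falls back to the CPU when PyTorch is missing or no GPU is detected.

```python
def pick_device():
    """Return "cuda" if PyTorch is installed and sees a GPU, else "cpu"."""
    try:
        # Assumption: the backend is PyTorch; the README does not say so.
        import torch
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


print(pick_device())
```

The README does not show a device argument on the `TTS` class, so this check is only a diagnostic for the environment.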

Anaconda setup

conda create --name EasyTTS python=3.7 -y
conda activate EasyTTS
pip install git+https://github.com/repodiac/german_transliterate

More information on managing environments with Anaconda can be found in the conda cheat sheet.

Install via PyPI

Once you have created your Python environment (Python 3.7+) you can simply type:

pip install EasyTTS
pip install git+https://github.com/repodiac/german_transliterate

Install with GitHub

Once you have created your Python environment (Python 3.7+) you can simply type:

git clone https://github.com/qanastek/EasyTTS.git
cd EasyTTS
pip install -r requirements.txt
pip install --editable .

Because the package is installed with the --editable flag, any modification made to the EasyTTS source is picked up automatically, without reinstalling.
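
To confirm that Python resolves the package to your cloned directory rather than to a copy in site-packages, you can inspect where the import system finds it. This is a minimal sketch using only the standard library; `package_location` is a hypothetical helper name, not part of EasyTTS.

```python
import importlib.util


def package_location(name):
    """Return the file path a top-level package is imported from, or None if it is not installed."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


# With an editable install, this path should point inside the cloned repository.
print(package_location("EasyTTS"))
```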

Example Usage

import soundfile as sf
from EasyTTS.inference.TTS import TTS

tts = TTS(lang="fr") # Instantiate the model for your language
audio = tts.predict(text="Bonjour à tous") # Make a prediction

sf.write('./audio_pip.wav', audio, 22050, "PCM_16") # Save output in .WAV file
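
The example above saves the output as 16-bit PCM at 22,050 Hz. To illustrate what that encoding involves, here is a sketch that writes the same kind of file with only the standard library. It assumes the predicted audio is a mono sequence of floats in [-1.0, 1.0], which this README does not state; `write_pcm16` is a hypothetical helper and not part of EasyTTS or soundfile.

```python
import struct
import wave

SAMPLE_RATE = 22050  # same rate as passed to sf.write in the example above


def write_pcm16(path, samples, sample_rate=SAMPLE_RATE):
    """Write mono float samples (assumed in [-1.0, 1.0]) as a 16-bit PCM WAV file.

    Returns the clip duration in seconds.
    """
    # Clip to the valid range, then scale to signed 16-bit integers.
    ints = [int(max(-1.0, min(1.0, float(s))) * 32767) for s in samples]
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)            # mono
        wav.setsampwidth(2)            # 2 bytes = 16 bits per sample
        wav.setframerate(sample_rate)
        wav.writeframes(struct.pack("<%dh" % len(ints), *ints))
    return len(ints) / sample_rate
```

In practice `sf.write` handles this conversion for you; the sketch only makes the sample rate and bit depth explicit.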

Audio Samples

| Sentence | Language | Audio File |
| --- | --- | --- |
| Comme le capitaine prononçait ces mots, un éclair illumina les ondes de l'Atlantique, puis une détonation se fit entendre et deux boulets ramés balayèrent le pont de l'Alcyon. | FR | audio_fr.wav |
| We shall not flag or fail. We shall go on to the end... we shall never surrender. | EN | audio_en.wav |

Model architectures

  1. Tacotron 2 (from Google Research & University of California, Berkeley) released with the paper NATURAL TTS SYNTHESIS BY CONDITIONING WAVENET ON MEL SPECTROGRAM PREDICTIONS, by Jonathan Shen, Ruoming Pang, Ron J. Weiss, Mike Schuster, Navdeep Jaitly, Zongheng Yang, Zhifeng Chen, Yu Zhang, Yuxuan Wang, RJ Skerry-Ryan, Rif A. Saurous, Yannis Agiomyrgiannakis and Yonghui Wu.

Datasets used

  1. SynPaFlex (from IRISA, LLF (Laboratoire de Linguistique Formelle de Nantes), LIUM (Le Mans Université) and ATILF (Analyse et Traitement Informatique de la Langue Française)) released with the paper SynPaFlex-Corpus: An Expressive French Audiobooks Corpus Dedicated to Expressive Speech Synthesis, by Aghilas Sini, Damien Lolive, Gaëlle Vidal, Marie Tahon and Élisabeth Delais-Roussarie.

Build PyPI package

Build: python setup.py sdist bdist_wheel

Upload: twine upload dist/*
