Support for modern TTS models for various languages #153

Open
snakers4 opened this issue Apr 2, 2021 · 0 comments

Proposal

Consider trying the Silero TTS models. They are published under an open license for non-commercial / personal use. The models are available at https://github.com/snakers4/silero-models#text-to-speech (accompanying article: https://habr.com/ru/post/549482/).

Most importantly, our TTS models run decently on a single CPU thread / core and depend almost exclusively on PyTorch.

Just let me repost some of the benchmarks here:

  • RTF (Real Time Factor): the time the synthesis takes divided by the duration of the generated audio;

  • RTS (Real Time Speed) = 1 / RTF: how many times faster than real time the synthesis runs;
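
As a quick illustration of the two definitions above, here is a minimal sketch in Python. The function names and the example timings are made up for illustration and are not taken from the benchmark.

```python
def real_time_factor(synthesis_seconds: float, audio_seconds: float) -> float:
    """RTF: synthesis time divided by the duration of the produced audio."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return synthesis_seconds / audio_seconds


def real_time_speed(rtf: float) -> float:
    """RTS: the inverse of RTF."""
    if rtf <= 0:
        raise ValueError("rtf must be positive")
    return 1.0 / rtf


# Hypothetical numbers: 3.5 s spent synthesizing 5 s of audio.
rtf = real_time_factor(3.5, 5.0)
rts = real_time_speed(rtf)
print(f"RTF={rtf:.2f}, RTS={rts:.2f}")
```

An RTF below 1 (equivalently, an RTS above 1) means the synthesis runs faster than real time.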

We benchmarked the models on two devices using PyTorch 1.8 utils:

  • CPU - Intel i7-6800K CPU @ 3.40GHz;

  • GPU - 1080 Ti;

  • When measuring CPU performance, we also limited the number of threads used;
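
For anyone who wants to reproduce this kind of measurement, a rough sketch of a timing harness could look like the following. It is not the script used for the numbers in this issue: `measure_rtf` and `fake_synthesize` are hypothetical names, plain `time.perf_counter` stands in for the PyTorch benchmarking utils, and the stand-in synthesizer only emits silence so the example stays dependency-free.

```python
import time
from typing import Callable, List, Sequence


def measure_rtf(
    synthesize: Callable[[Sequence[str]], List[List[float]]],
    batch: Sequence[str],
    sample_rate: int,
    repeats: int = 3,
) -> float:
    """Return the best-of-`repeats` RTF for one batch of texts."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        waveforms = synthesize(batch)
        elapsed = time.perf_counter() - start
        # Total audio duration produced for the whole batch, in seconds.
        audio_seconds = sum(len(w) for w in waveforms) / sample_rate
        best = min(best, elapsed / audio_seconds)
    return best


def fake_synthesize(batch: Sequence[str]) -> List[List[float]]:
    # Hypothetical stand-in for a real TTS model:
    # 0.1 s of silence per input text at 16 kHz.
    return [[0.0] * 1600 for _ in batch]


# With a real PyTorch model, the CPU thread count could be capped
# beforehand with torch.set_num_threads(1); omitted here on purpose.
rtf = measure_rtf(fake_synthesize, ["hello", "world"], sample_rate=16000)
print(f"RTF={rtf:.6f}")
```

Batching is handled by passing several texts at once, so the RTF is computed over the total audio duration of the batch.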

For the 16 kHz models we got the following metrics:

| BatchSize | Device        | RTF   | RTS   |
| --------- | ------------- | ----- | ----- |
| 1         | CPU 1 thread  | 0.7   | 1.4   |
| 1         | CPU 2 threads | 0.4   | 2.3   |
| 1         | CPU 4 threads | 0.3   | 3.1   |
| 4         | CPU 1 thread  | 0.5   | 2.0   |
| 4         | CPU 2 threads | 0.3   | 3.2   |
| 4         | CPU 4 threads | 0.2   | 4.9   |
| ---       | -----------   | ---   | ---   |
| 1         | GPU           | 0.06  | 16.9  |
| 4         | GPU           | 0.02  | 51.7  |
| 8         | GPU           | 0.01  | 79.4  |
| 16        | GPU           | 0.008 | 122.9 |
| 32        | GPU           | 0.006 | 161.2 |

For the 8 kHz models we got the following metrics:

| BatchSize | Device        | RTF   | RTS   |
| --------- | ------------- | ----- | ----- |
| 1         | CPU 1 thread  | 0.5   | 1.9   |
| 1         | CPU 2 threads | 0.3   | 3.0   |
| 1         | CPU 4 threads | 0.2   | 4.2   |
| 4         | CPU 1 thread  | 0.4   | 2.8   |
| 4         | CPU 2 threads | 0.2   | 4.4   |
| 4         | CPU 4 threads | 0.1   | 6.6   |
| ---       | -----------   | ---   | ---   |
| 1         | GPU           | 0.06  | 17.5  |
| 4         | GPU           | 0.02  | 55.0  |
| 8         | GPU           | 0.01  | 92.1  |
| 16        | GPU           | 0.007 | 147.7 |
| 32        | GPU           | 0.004 | 227.5 |