|
1 |
| -# tortoise.cpp: GGML implementation of tortoise-tts, under construction |
| 1 | +# tortoise.cpp: GGML implementation of tortoise-tts (Ready for testing!) |
2 | 2 |
|
3 | 3 | 
|
4 | 4 |
|
5 |
| -Implementation status: |
6 |
| - |
7 |
| -Tokenization seems to work, but doesn't exactly match the tokenization tortoise-tts performs, needs work. |
8 |
| - |
9 |
| -Voice latent is hardcoded for now. |
10 |
| - |
11 |
| -Text embedding/ text position embedding reconstruction complete. |
12 |
| - |
13 |
| -Mel embedding reconstruction complete. |
14 |
| - |
15 |
| -Autoregressive model(gpt-2) reconstruction complete. |
16 |
| - |
17 |
| -Contrastive Language-Voice Pretrained Transformer reconstruction pending. |
18 |
| - |
19 |
| -Diffusion model reconstruction pending. |
| 5 | +# Compiling |
| 6 | +For now, CUDA and CPU only. To compile: |
20 | 7 |
|
| 8 | +## Compile for CPU |
| 9 | +```` |
| 10 | +mkdir build |
| 11 | +cd build |
| 12 | +cmake .. |
| 13 | +make |
| 14 | +```` |
21 | 15 |
|
22 |
| -# Compiling |
23 |
| -For now, cuda only. To compile: |
| 16 | +## Compile for CUDA |
24 | 17 | ````
|
25 | 18 | mkdir build
|
26 | 19 | cd build
|
27 |
| -cmake .. |
| 20 | +cmake .. -DGGML_CUBLAS=ON |
28 | 21 | make
|
29 | 22 | ````
|
30 | 23 | This is tested with Ubuntu 22.04, CUDA 12.0, and a GTX 1070 Ti.
|
31 | 24 |
|
32 |
| - |
33 | 25 | # Running
|
34 |
| -You will need to place `ggml-model.bin` and `ggml-diffusion-model.bin` in the models directory to run tortoise.cpp. You can generate the model yourself following the instructions in this tortoise-tts reverse engineering fork here https://github.com/balisujohn/tortoise-reverse-engineering, or download it here https://huggingface.co/balisujohn/tortoise-ggml. |
35 |
| - |
| 26 | +You will need to place `ggml-model.bin`, `ggml-vocoder-model.bin` and `ggml-diffusion-model.bin` in the models directory to run tortoise.cpp. You can download them here https://huggingface.co/balisujohn/tortoise-ggml. I will release scripts for generating these files from tortoise-tts. |
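Before running, it can help to confirm that all three model files are in place. The following is a minimal sketch, not part of tortoise.cpp: the `check_models` helper is a hypothetical name, and it assumes the models directory is `../models` relative to the build directory, as the default voice path suggests.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with tortoise.cpp): report which of the
# required model files are missing from the given directory.
check_models() {
    dir="$1"
    missing=0
    for f in ggml-model.bin ggml-vocoder-model.bin ggml-diffusion-model.bin; do
        if [ ! -f "$dir/$f" ]; then
            echo "missing: $dir/$f"
            missing=1
        fi
    done
    # Exit status 0 when every file is present, 1 otherwise.
    return "$missing"
}

# Assumption: this is run from the build directory, so models live in ../models.
check_models ../models || echo "download the missing files before running ./tortoise"
```

The function prints one line per missing file and returns a non-zero status if any are absent, so it can gate a run script.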
36 | 27 |
|
37 | 28 | From the build directory, run:
|
38 | 29 | ````
|
39 |
| -./bin/tortoise |
| 30 | +./tortoise |
| 31 | +```` |
| 32 | +Here's an example that should work out of the box:
| 33 | +```` |
| 34 | +./tortoise --message "based... dr freeman?" --voice "../models/mouse.bin" --seed 0 --output "based?.wav" |
| 35 | +```` |
| 36 | +All command-line arguments are optional:
| 37 | + |
| 38 | +```` |
| 39 | +arguments: |
| 40 | + --message Specifies the message to generate, lowercase letters, spaces, and punctuation only. (default: "this is a test message." ) |
| 41 | + --voice Specifies the path to the voice file to use to determine the speaker's voice. (default: "../models/mol.bin" ) |
| 42 | + --output Specifies the path where the generated wav file will be saved. (default: "./output.wav") |
| 43 | + --seed Specifies the seed for pseudorandom number generation, used in autoregressive sampling and diffusion sampling (default: system time seed)
40 | 44 | ````
|
41 | 45 |
|
42 | 46 |
|
43 | 47 | # Contributing
|
44 |
| -If you want to contribute, please make an issue stating what you want to work on. I'll make a discord to manage contributors if there is a lot of interest. You can email me questions at \<mylastname\>u\<myfirstname\>@gmail.com. I am happy to help get people get started with contributing! |
| 48 | +If you want to contribute, please make an issue stating what you want to work on. DM me on Twitter if you want a link to join the dev Discord, or if you have questions. I am happy to help people get started with contributing!
45 | 49 |
|
46 | 50 | I am also making available a fork of tortoise-tts which has my reverse engineering annotations, and also the export script for the autoregressive model.
|
47 | 51 |
|
|