TTS #67

Open
yukiarimo opened this issue Jul 29, 2024 · 8 comments

Comments

@yukiarimo

Hello. Do you know how to turn this: https://github.com/nivibilla/build-nanogpt into TTS instead of audio-to-audio?

@Momnadar1

Momnadar1 commented Aug 15, 2024

Hey @yukiarimo, I am trying to do that too. Is there any progress on your side? I have made some progress on audio-to-audio:

  • At first the output was just noise.
  • Then the noise was reduced.
  • Now there is no noise, but it sounds like bird voices, I guess.
  • I am working on the next upgrade, so I might post here about it.

If you are interested in working on it with me, let me know.

Thanks.

@Momnadar1

@yukiarimo
Author

Gonna try it out! But how is that “without tokenizer”?

@Momnadar1

I think you are talking about audio-to-audio, so for that I built my own tokenizer hehe :'D

@Momnadar1

So, the concept behind the tokenizer is batches of data. Take the combined audio (say 50MB for now) and convert it to a mel spectrogram, encode the mel spectrogram into a sequence of integers, and decode the sequence of integers back into the mel spectrogram. The mel spectrogram values are scaled and quantized to a range of integers, and the encoding and decoding steps map between these integers and the mel spectrogram values.

Put more generally: at, say, second 1 of the audio we have some encoded mel spectrogram data, just like we had for text:

input: print(encode("hii there"))
output: [46, 47, 47, 1, 58, 46, 43, 56, 43]
input: print(decode(encode("hii there")))
output: hii there
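
The scale-and-quantize idea described above could be sketched roughly as follows. This is only an illustration under assumptions, not the code from the thread: the names `encode` and `decode`, the 256 quantization levels, the -80 to 0 dB value range, and the frame-by-frame token order are all hypothetical choices, and a random array stands in for a real mel spectrogram.

```python
import numpy as np

N_LEVELS = 256  # assumed number of integer tokens; not stated in the thread


def encode(mel, lo, hi, n_levels=N_LEVELS):
    """Scale mel values from [lo, hi] to [0, n_levels - 1] and round to ints."""
    scaled = (np.clip(mel, lo, hi) - lo) / (hi - lo)
    tokens = np.rint(scaled * (n_levels - 1)).astype(np.int64)
    # Transpose first so that the tokens of one time frame stay contiguous.
    return tokens.T.flatten()


def decode(tokens, lo, hi, n_mels, n_levels=N_LEVELS):
    """Map integer tokens back to approximate mel values of shape (n_mels, frames)."""
    values = tokens.astype(np.float64) / (n_levels - 1) * (hi - lo) + lo
    return values.reshape(-1, n_mels).T


# Stand-in for a real log-mel spectrogram (80 mel bands x 100 frames, in dB).
rng = np.random.default_rng(0)
mel = rng.uniform(-80.0, 0.0, size=(80, 100))

tokens = encode(mel, lo=-80.0, hi=0.0)
restored = decode(tokens, lo=-80.0, hi=0.0, n_mels=80)
max_error = np.abs(restored - mel).max()
```

Unlike the text example, this round trip is lossy: `restored` matches `mel` only up to half a quantization step, so more levels mean a larger vocabulary but a closer reconstruction.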

Let me know if you can contribute on top of this, thanks.

@yukiarimo
Author

@Momnadar1

I will send you the Colab link for this, where it's working for me. Thanks.

@Momnadar1

Hi @yukiarimo, here is the link: https://colab.research.google.com/drive/1NHFi8y1GCIUR4Nv0yguGVwOk2q0-JOEu?usp=sharing.

But take a look at the train and test loss plots in https://github.com/tttzof351/SimpleTransfromerTTS. They show that it takes nearly 400K iterations to generate good results.

If you still have issues, just let me know.

Thanks,
