Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sound cloning #11

Open
darkman111a opened this issue May 3, 2023 · 0 comments
Open

Sound cloning #11

darkman111a opened this issue May 3, 2023 · 0 comments

Comments

@darkman111a
Copy link

I'm looking to build on your research. I understand this isn't the scope of your project. Just curious for an interesting. Just wanted the thoughts of the creators. I want to retrain and repuprose and expertiment with this for expressive TTS instead of generic . I'm somewhat new to working with these models.

OBJECTIVES ->

  1. retrain on a more dynamic dataset
  2. synthetic dataset -> speech/text[w/special utterences]{real speech/lofi speech from 'BARK'}, speech w/synthetic audio envirnoments generated by 'tango'/text[I have a rather large dataset]

EXPECTATATIOS ->

  1. most expressive hybrid TTS[TTS with semantic conditioned background environments]

QUESTIONS ->

  1. what are your thought on approaching voice cloning with this style of architecture? I figure I should approach like inpainting?
  2. If possible, wouldn't it clone any artifact contained in the speech audio?

CLOSING THOUGHTS ->
I'm opening to sharing my results with you guys privately. Appreciate your contribution to the community.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant