Hello. Thank you, RVC team, for the great app! The problem I'm running into is that my voices sometimes sound like they just came back from the dentist, lacking articulation on consonants and vowels. This isn't so much the case with real-time / microphone recording, but when I'm overdubbing a video or podcast (especially if the original recording was made on a camera microphone or similar), it has a hard time picking up some articulations.
My source samples are clean (so no issues there), and I try to de-reverb and denoise my target audio.
1. Is there anything I can do to improve this?
2. Is there any documentation that explains what the settings do? I have a rough idea, but it would be great to read some documentation.
Ultimately, I was wondering whether anyone would be interested in building a TTS-style app / add-on on commission. I don't know how much this would cost, but if it's only a couple of hundred bucks, I can cover the expense completely. Here is the vision for the app:
It would be a TTS interface, except that when you upload your target audio it would process it in the voice you specified. From there it would give you a text representation of your audio with timecodes, so you can go in and change words. This way you can edit any gibberish or fix mistakes like "It was the year 1995" when it should be "It was the year 1999".
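To make the idea a bit more concrete, here is a rough Python sketch of what I mean by a time-coded transcript that you can edit word by word. Everything in it (`Word`, `replace_word`, `format_transcript`, the timings) is made up for illustration and is not part of RVC or any existing tool. It only covers the text-editing side, not the speech recognition or the re-synthesis.

```python
from dataclasses import dataclass, replace
from typing import List


@dataclass(frozen=True)
class Word:
    """One transcribed word with its position in the target audio (hypothetical structure)."""
    text: str
    start: float  # seconds from the start of the audio
    end: float    # seconds from the start of the audio


def replace_word(transcript: List[Word], index: int, new_text: str) -> List[Word]:
    """Return a copy of the transcript with one word's text changed.

    The timecodes are left untouched so the corrected word still lines up
    with the original cadence.
    """
    if not 0 <= index < len(transcript):
        raise IndexError(f"no word at index {index}")
    return [
        replace(word, text=new_text) if i == index else word
        for i, word in enumerate(transcript)
    ]


def format_transcript(transcript: List[Word]) -> str:
    """Render the transcript as one line per word, prefixed with its timecode."""
    return "\n".join(
        f"[{word.start:06.2f}-{word.end:06.2f}] {word.text}" for word in transcript
    )


# Example with invented timings: fix "1995" to "1999".
transcript = [
    Word("It", 0.00, 0.15),
    Word("was", 0.15, 0.35),
    Word("the", 0.35, 0.50),
    Word("year", 0.50, 0.85),
    Word("1995", 0.85, 1.90),
]
edited = replace_word(transcript, 4, "1999")
print(format_transcript(edited))
```

The app would then only need to re-synthesize the changed word (or the sentence around it) in the chosen voice and fit it into the same time slot.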
Another feature I would like to see is for the app to completely ignore any accents. For example, if the target audio has an accent, the app would use the natural cadence and accent of the original voice instead. I know there is a slider for this, but any improvement in this area would be helpful!
Final question:
Does anyone know what play.ht is using for their source code? Their TTS and voice cloning are AMAZING. I would really like to see something like that, except with the ability to overdub voice-overs so they fit the cadence of the target audio and sync with the lips in the video.
Thank you again!