[FastPitch] Why do you hierarchically predict the variance features (pitch and energy)? #1357
Comments
Hello 😌. I hope you're well and that you are having a good day.
Sorry 😅 I don't know how it happened and sorry for that. I was trying to
build my own model for my data for my local language and I faced issues. I
don't know how I did what you said.
Can you please 🥺 tell me how I can use FastPitch to build my own model in
Colab or another notebook?
I have issues with the base setup (Docker and the NGC container) in Colab.
How can I solve this?
On Thu, 5 Oct 2023, 09:09 Changjin Han wrote:
Thank you always for sharing your thoughtful code.
As we can see in the FastPitch code, you add the pitch embedding to the encoder
output before passing it to the energy predictor.
https://github.com/NVIDIA/DeepLearningExamples/blob/da7e1a701bd44885c5537afa7974be391f82401e/PyTorch/SpeechSynthesis/FastPitch/fastpitch/model.py#L300
Why did you choose hierarchical variance feature prediction instead of
parallel prediction as in FastSpeech 2 (paper version)?
Are there any performance advantages?
Thank you always for sharing your thoughtful code.
As we can see in the FastPitch code, you add the pitch embedding to the encoder output before passing it to the energy predictor.
https://github.com/NVIDIA/DeepLearningExamples/blob/da7e1a701bd44885c5537afa7974be391f82401e/PyTorch/SpeechSynthesis/FastPitch/fastpitch/model.py#L300
Why did you choose hierarchical variance feature prediction instead of parallel prediction as in FastSpeech 2 (paper version)?
Are there any performance advantages?
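To make the two orderings concrete, here is a toy sketch of the data flow in plain Python. It is not the FastPitch or FastSpeech 2 code: `predict`, `embed`, `add`, and the weights are hypothetical stand-ins, and adding the energy embedding back into the hidden state at the end is an assumption made for illustration. The only point is that in the hierarchical variant the energy predictor sees the pitch-conditioned hidden state, while in the parallel variant both predictors read the same encoder output.

```python
# Toy illustration only: the "predictors" and "embeddings" below are
# hypothetical stand-ins, not the FastPitch or FastSpeech 2 modules.

def predict(frames, weight):
    """Stand-in predictor: one scalar per frame (weighted sum of features)."""
    return [weight * sum(frame) for frame in frames]

def embed(values, dim):
    """Stand-in embedding: broadcast each scalar to a dim-sized vector."""
    return [[v] * dim for v in values]

def add(a, b):
    """Element-wise sum of two sequences of equal-sized vectors."""
    return [[x + y for x, y in zip(fa, fb)] for fa, fb in zip(a, b)]

def hierarchical(enc_out):
    # Ordering described in this issue: pitch is predicted first, then
    # energy is predicted from the encoder output plus the pitch embedding.
    dim = len(enc_out[0])
    pitch = predict(enc_out, 0.5)
    hidden = add(enc_out, embed(pitch, dim))
    energy = predict(hidden, 0.25)
    hidden = add(hidden, embed(energy, dim))  # assumed for illustration
    return pitch, energy, hidden

def parallel(enc_out):
    # Parallel alternative: both predictors read the same encoder output.
    dim = len(enc_out[0])
    pitch = predict(enc_out, 0.5)
    energy = predict(enc_out, 0.25)
    hidden = add(add(enc_out, embed(pitch, dim)), embed(energy, dim))
    return pitch, energy, hidden

enc_out = [[1.0, 2.0], [0.0, 4.0]]  # two frames, two features each
h_pitch, h_energy, _ = hierarchical(enc_out)
p_pitch, p_energy, _ = parallel(enc_out)
```

With the same stand-in predictors, both variants produce identical pitch values, but the energy values differ because only the hierarchical energy predictor is conditioned on pitch. Whether that conditioning improves quality in the real model is exactly what this issue asks, and the sketch does not answer it.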