Audio - Lip Misalignment | 音频嘴形未对齐 #234

Open
Qifeng-Wu99 opened this issue Nov 16, 2023 · 2 comments

Comments

@Qifeng-Wu99

Appreciate the fantastic job done by the authors.

While trying to reproduce the results in the demo released by the authors, I trained the syncnet, the audio-to-motion generator, the postnet, and RAD-NeRF from scratch, using the hyperparameters released with the code.

However, the lip alignment with the audio in my results is not satisfactory compared with what the authors obtained.

I wonder whether there is some trick to tuning or refining the hyperparameters to achieve better results.

Thanks in advance.


@yerfor
Owner

yerfor commented Nov 16, 2023

Hi Qifeng, I suspect the problem lies in selecting an appropriate postnet checkpoint. Maybe you can refer to this doc and this figure. Also, we plan to release GeneFace++ in Feb. 2024, which should handle this well, so the postnet checkpoint no longer has to be hand-picked.
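For readers who want to make the checkpoint choice less manual, here is a minimal sketch. It is not the authors' procedure. It assumes that you have logged some validation lip-sync metric (lower is better) for each saved postnet checkpoint. The function name `pick_checkpoint` and the numbers in `logged` are hypothetical and only illustrate the idea of comparing checkpoints instead of taking the last one.

```python
from typing import Dict


def pick_checkpoint(val_metric_by_step: Dict[int, float]) -> int:
    """Return the training step whose validation metric is lowest.

    Assumption: the metric is a loss-like value where lower means
    better lip sync. Ties are resolved in favour of the earliest step.
    """
    if not val_metric_by_step:
        raise ValueError("no checkpoints to choose from")
    # Iterate steps in ascending order so that min() keeps the earliest
    # step when two checkpoints share the same metric value.
    return min(sorted(val_metric_by_step), key=lambda step: val_metric_by_step[step])


# Hypothetical values, not measurements from GeneFace.
logged = {2000: 0.92, 4000: 0.71, 6000: 0.64, 8000: 0.69, 10000: 0.80}
best_step = pick_checkpoint(logged)
print(best_step)
```

A numeric minimum is only a starting point; it is still worth checking the candidate checkpoints visually, as the doc and figure mentioned above suggest.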

@Theweekfoolish229

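Hello, have you found a solution to this alignment problem? When I trained the syncnet, audio2motion generator, postnet, and RAD-NeRF on other videos, I also found that the lip shapes produced by the trained model were misaligned. I am wondering whether a different HuBERT model should be used for each language when extracting features with HuBERT.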
