Evaluate the quantitative performance on the Dataset #18

Open
Shedima opened this issue Jul 10, 2022 · 7 comments

@Shedima

Shedima commented Jul 10, 2022

How would you evaluate the quantitative performance of your model on the genea_challenge_2020 dataset? I only found the code for evaluation on the TED dataset.

@UttaranB127
Owner

The method generate_gestures_by_dataset in processor_v2.py generates gestures for the GENEA dataset in addition to TED. We computed the quantitative metrics for GENEA by hand, following the code inside generate_gestures in processor_v2.py.

@Shedima
Author

Shedima commented Jul 11, 2022

Do you mean that I should use generate_gestures_by_dataset to generate a sequence of poses and then compute the quantitative metrics manually, following the code in generate_gestures?

@UttaranB127
Owner

Yes

Shedima closed this as completed Jul 11, 2022
Shedima reopened this Aug 4, 2022
@Shedima
Author

Shedima commented Aug 4, 2022

Hello, I followed the method described above to evaluate on the GENEA dataset and found that the FGD values are very high. Following the source code, I also rendered the corresponding video and found that the generated poses are completely different from the ground-truth poses. I suspect this is the reason for the high FGD. Could you please provide the complete procedure for evaluating on the GENEA dataset?
[Screenshot from 2022-08-04 20-09-22]
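As a reference point for the FGD values discussed here: FGD is generally described as a Fréchet distance between Gaussians fitted to feature vectors of real and generated gestures. The sketch below shows only that distance, using NumPy. It is not the code from processor_v2.py; the function name frechet_distance and the (samples, feature_dim) input layout are assumptions made for illustration, and the feature extractor that would produce these vectors is omitted.

```python
import numpy as np


def frechet_distance(real_feats, gen_feats):
    """Frechet distance between Gaussians fitted to two feature sets.

    Hypothetical helper: both inputs are assumed to be arrays of shape
    (num_samples, feature_dim) produced by some feature extractor.
    """
    real_feats = np.asarray(real_feats, dtype=np.float64)
    gen_feats = np.asarray(gen_feats, dtype=np.float64)

    # Fit a Gaussian (mean and covariance) to each feature set.
    mu_r = real_feats.mean(axis=0)
    mu_g = gen_feats.mean(axis=0)
    cov_r = np.atleast_2d(np.cov(real_feats, rowvar=False))
    cov_g = np.atleast_2d(np.cov(gen_feats, rowvar=False))

    # Tr((cov_r cov_g)^(1/2)) equals the sum of the square roots of the
    # eigenvalues of cov_r @ cov_g. Tiny negative or imaginary parts that
    # come from round-off are discarded.
    eigvals = np.linalg.eigvals(cov_r @ cov_g)
    trace_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()

    diff = mu_r - mu_g
    return float(diff @ diff + np.trace(cov_r) + np.trace(cov_g) - 2.0 * trace_sqrt)
```

A quick sanity check under these assumptions is to pass the same feature set twice: the result should be close to zero. A large value on real versus generated features then points to a genuine mismatch in the poses (or in their preprocessing) rather than to the distance computation.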

@UttaranB127
Owner

One thing I notice is that the arms in the GT seem to be vertically inverted (along the y-axis). I think the evaluation applies a vertical flip to both the GT and the predicted poses, but the flip might not be required for the GT. Could you try evaluating and visualizing after inverting the y-axis values of the GT back? Apart from that, we used the same error terms to evaluate on GENEA as on the TED dataset.
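A minimal sketch of the suggested check, assuming the GT poses are available as an array whose last axis holds (x, y, z) coordinates and that the inversion is a sign flip of y. Both are assumptions: the helper name flip_y, the array layout, and the y index are hypothetical and need to be matched to the pose format actually used in processor_v2.py.

```python
import numpy as np


def flip_y(poses, y_index=1):
    """Return a copy of the poses with the y coordinate negated.

    Hypothetical helper: `poses` is assumed to have shape (..., 3), with
    the y coordinate stored at `y_index` on the last axis.
    """
    poses = np.asarray(poses, dtype=np.float64)
    flipped = poses.copy()          # leave the caller's array untouched
    flipped[..., y_index] *= -1.0   # mirror along the vertical axis
    return flipped


# Example: one frame with two joints, layout (frames, joints, xyz).
gt = np.array([[[0.1, 0.5, 0.0],
                [0.2, -0.3, 0.1]]])
gt_flipped = flip_y(gt)
```

If the data is root-relative, negating y mirrors the skeleton about the root. If the poses are stored as direction vectors or in another normalized form, the flip may need to be applied before that conversion, so it is worth confirming visually on a single frame before recomputing the metrics.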

@Shedima
Author

Shedima commented Aug 5, 2022

I also found that the arms in the GT are inverted vertically. I tried reversing the y-axis values, but the FGD is still very high. Even though I am following the evaluation and visualization in the source code you provided, I cannot reproduce the GENEA results reported in your paper. Could you provide the source code you used to evaluate on the GENEA dataset?

@UttaranB127
Owner

My apologies, but I currently don't have access to that evaluation code, and I am not sure how soon I will be able to retrieve it. Meanwhile, you can report the higher numbers if you followed the same evaluation methodology as for the TED dataset.
