
Discrepancy in Reproduced F1 Scores Compared to Published Results #539

Open
jed-ho opened this issue May 9, 2024 · 1 comment
Labels
question Further information is requested

jed-ho commented May 9, 2024

  • Orion version: 0.6.0
  • Python version: 3.10.12
  • Operating System: Ubuntu 22.04
  • Dependencies installed using make install-develop

Description

I am attempting to reproduce the results of the research paper AER: Auto-Encoder with Regression for Time Series Anomaly Detection. I ran benchmark.py and obtained per-signal results, but the F1 scores show a significant discrepancy compared to those reported in the paper.
Could you please help investigate this discrepancy? Any guidance on whether I might be missing a step or misinterpreting the results would be greatly appreciated.

What I Did

  1. Run benchmark.py and obtain the results for each signal.
  2. Compare these results with those in Orion/benchmark/results/0.6.0.csv; the values in my results do match those in the file.
  3. Calculate the average F1 scores across signals from my results (see the sketch after this list).
  4. Compare the average F1 scores with leaderboard.xlsx. There are minor differences between my results and the leaderboard.xlsx.
  5. Compare both sets of results with the F1 scores published in the paper; they exhibit a significant discrepancy.
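
A rough sketch of how I do the comparison in step 2 and the averaging in step 3; the path of my own output and the column names ("pipeline", "dataset", "signal", "f1") are assumptions about my local setup, not the exact schema of 0.6.0.csv:

```python
# Assumed: column names ("pipeline", "dataset", "signal", "f1") and the path of
# my own benchmark output -- adjust to the actual schema of the CSVs.
import pandas as pd

mine = pd.read_csv("my_benchmark_results.csv")                # output of benchmark.py (assumed path)
published = pd.read_csv("Orion/benchmark/results/0.6.0.csv")  # published per-signal results

# Step 2: the per-signal values should match the published file.
merged = mine.merge(
    published,
    on=["pipeline", "dataset", "signal"],
    suffixes=("_mine", "_published"),
)
print((merged["f1_mine"] - merged["f1_published"]).abs().max())

# Step 3: average the per-signal F1 scores for each pipeline/dataset pair.
my_avg = mine.groupby(["pipeline", "dataset"])["f1"].mean()
print(my_avg)
```
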
@sarahmish (Collaborator) commented:

Hi @jed-ho - thank you for using Orion!

After running benchmark.py, you can use get_f1_scores in results.py to get the overview F1 scores and write_results to obtain the leaderboard.xlsx.
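
Roughly, that flow would look like the sketch below; the exact signatures of get_f1_scores and write_results (and the import path for results.py) are assumptions here, so check the module itself for the real arguments:

```python
# Minimal sketch, assuming get_f1_scores takes the per-signal benchmark results
# and write_results takes the overview scores plus an output path; the real
# signatures in results.py may differ.
import pandas as pd

from results import get_f1_scores, write_results  # assumed import path

results = pd.read_csv("my_benchmark_results.csv")  # output of benchmark.py (assumed filename)

f1_scores = get_f1_scores(results)                 # overview F1 scores
write_results(f1_scores, "leaderboard.xlsx")       # produces leaderboard.xlsx
```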

In Orion, we publish the benchmark with every release to help navigate the changes that happen due to external factors such as dependency changes and package updates. Your results should be consistent with the latest published benchmark results.

Hope this answers your questions!

@sarahmish added the question (Further information is requested) label on May 13, 2024