
Discrepancy in Reproduced F1 Scores Compared to Published Results #539

Open
jed-ho opened this issue May 9, 2024 · 1 comment
Labels
question Further information is requested

jed-ho commented May 9, 2024

  • Orion version: 0.6.0
  • Python version: 3.10.12
  • Operating System: Ubuntu 22.04
  • Dependencies installed using make install-develop

Description

I am attempting to reproduce the results of the research paper AER: Auto-Encoder with Regression for Time Series Anomaly Detection. I ran benchmark.py and obtained per-signal results, but the F1 scores show a significant discrepancy compared to those reported in the paper.
Could you please help investigate this discrepancy? Any guidance on whether I might be missing a step or misinterpreting the results would be greatly appreciated.

What I Did

  1. Run benchmark.py and obtain the results for each signal.
  2. Compare these results with those in Orion/benchmark/results/0.6.0.csv; the values in my results do match those in the file.
  3. Calculate the average F1 scores across signals from my results (see the sketch after this list).
  4. Compare the average F1 scores with leaderboard.xlsx. There are minor differences between my results and the leaderboard.xlsx.
  5. Compare both sets of results with the F1 scores published in the paper; they exhibit a significant discrepancy.
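
A rough sketch of how I do the comparison in step 2 and the averaging in step 3; the path of my own output and the column names ("pipeline", "dataset", "signal", "f1") are assumptions about my local setup, not the exact schema of 0.6.0.csv:

```python
# Assumed: column names ("pipeline", "dataset", "signal", "f1") and the path of
# my own benchmark output -- adjust to the actual schema of the CSVs.
import pandas as pd

mine = pd.read_csv("my_benchmark_results.csv")                # output of benchmark.py (assumed path)
published = pd.read_csv("Orion/benchmark/results/0.6.0.csv")  # published per-signal results

# Step 2: the per-signal values should match the published file.
merged = mine.merge(
    published,
    on=["pipeline", "dataset", "signal"],
    suffixes=("_mine", "_published"),
)
print((merged["f1_mine"] - merged["f1_published"]).abs().max())

# Step 3: average the per-signal F1 scores for each pipeline/dataset pair.
my_avg = mine.groupby(["pipeline", "dataset"])["f1"].mean()
print(my_avg)
```
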
@sarahmish (Collaborator) commented:

Hi @jed-ho - thank you for using Orion!

After running benchmark.py, you can use get_f1_scores in results.py to get the overview F1 scores and write_results to obtain the leaderboard.xlsx.
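
Roughly, that flow would look like the sketch below; the exact signatures of get_f1_scores and write_results (and the import path for results.py) are assumptions here, so check the module itself for the real arguments:

```python
# Minimal sketch, assuming get_f1_scores takes the per-signal benchmark results
# and write_results takes the overview scores plus an output path; the real
# signatures in results.py may differ.
import pandas as pd

from results import get_f1_scores, write_results  # assumed import path

results = pd.read_csv("my_benchmark_results.csv")  # output of benchmark.py (assumed filename)

f1_scores = get_f1_scores(results)                 # overview F1 scores
write_results(f1_scores, "leaderboard.xlsx")       # produces leaderboard.xlsx
```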

In Orion, we publish the benchmark with every release to help navigate the changes that happen due to external factors such as dependency changes and package updates. Your results should be consistent with the latest published benchmark results.

Hope this answers your questions!

@sarahmish added the question (Further information is requested) label on May 13, 2024