[TadGAN benchmark] F1-Score is very low. #486

Open
gunnha opened this issue Nov 20, 2023 · 3 comments
gunnha commented Nov 20, 2023

What I Did

  • I ran the TadGAN paper benchmark on the NASA dataset.
  • For example, I want to check whether the MSL result comes close to the 0.623 F1 score reported in the paper.
import pandas as pd

from orion import Orion
from orion.data import load_anomalies, load_signal
from orion.evaluation import contextual_f1_score

# train_55 holds the names of the 55 MSL channels (e.g. 'P-11', 'D-15', 'M-7')

# Collect the known (ground-truth) anomalies for every channel
known_anomalies = pd.DataFrame()
for signal in train_55:
    df = load_anomalies(signal)
    known_anomalies = pd.concat([known_anomalies, df], axis=0)

# Merge the train/test signals of all channels into single DataFrames
X_train_msl = pd.DataFrame()
X_test_msl = pd.DataFrame()
for signal in train_55:
    train_df = load_signal(f'multivariate/{signal}-train')
    test_df = load_signal(f'multivariate/{signal}-test')

    X_train_msl = pd.concat([X_train_msl, train_df], axis=0)
    X_test_msl = pd.concat([X_test_msl, test_df], axis=0)

hyperparameters = {
    "mlstars.custom.timeseries_preprocessing.time_segments_aggregate#1": {
        "time_column": "timestamp",
        "interval": 21600,
        "method": "mean"
    },
    "orion.primitives.tadgan.TadGAN#1": {
        "epochs": 70
    },
    "orion.primitives.tadgan.score_anomalies#1": {
        "rec_error_type": "dtw",
        "comb": "mult"
    }
}

orion = Orion(
    pipeline='tadgan',
    hyperparameters=hyperparameters
)

orion.fit(X_train_msl)
anomalies = orion.detect(X_test_msl)

contextual_f1_score(known_anomalies, anomalies, X_test_msl)

Question

  • The F1 score is very low. What could be the reason?
  • Is this approach not valid?
@gunnha changed the title from "How do I set up for Variation?" to "[TadGAN] How do I set up for Variation?" Nov 20, 2023
@gunnha closed this as completed Nov 20, 2023
@gunnha reopened this Nov 20, 2023
@gunnha closed this as completed Nov 20, 2023
sarahmish (Collaborator) commented:

Hi @gunnha – did you find the answers you're looking for?

gunnha commented Nov 21, 2023

> Hi @gunnha – did you find the answers you're looking for?

I thought I had found the answer at first, so I closed the issue.
I'm still looking into it...

I found that the variation changes can be handled as in the code above.
The only remaining concern is the F1 score.

There are 55 channels in the MSL data (e.g. P-11, D-15, M-7).
I merged all of these channels into one DataFrame.
Using the code above, the F1 score is approximately 0.1.

Second, I also computed the F1 score by fitting on individual channels: some channels came out as nan, while others came out close to the results in the paper.
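
For reference, the per-channel experiment was roughly the following (a sketch only; it reuses the imports, the train_55 channel list, and the hyperparameters dict from the code above):

per_channel_f1 = {}
for signal in train_55:
    # load one channel and its labeled anomalies
    train_df = load_signal(f'multivariate/{signal}-train')
    test_df = load_signal(f'multivariate/{signal}-test')
    known = load_anomalies(signal)

    # fit and evaluate a separate TadGAN pipeline per channel
    orion = Orion(pipeline='tadgan', hyperparameters=hyperparameters)
    orion.fit(train_df)
    anomalies = orion.detect(test_df)

    # some channels returned nan here
    per_channel_f1[signal] = contextual_f1_score(known, anomalies, test_df)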

  • The code in the first comment handles the 55 channels by merging the signals named in the train_55 list into a single DataFrame.

Is there any other good way to do this?
@sarahmish Thank you for your interest.

  • How can I load the anomalies for other benchmark datasets (e.g. Yahoo S5 A1) with load_anomalies?

@gunnha reopened this Nov 22, 2023
@gunnha changed the title from "[TadGAN] How do I set up for Variation?" to "[TadGAN] F1-Score is very low." Nov 22, 2023
@gunnha changed the title from "[TadGAN] F1-Score is very low." to "[TadGAN benchmark] F1-Score is very low." Nov 22, 2023
sarahmish (Collaborator) commented:

@gunnha to reproduce the results in the paper, we use the benchmark function provided in Orion. We do a couple of things differently there:

  1. We process each signal on its own (no concatenation).
  2. We use the weighted=False option for the evaluation metrics.
  3. We aggregate the results at the dataset level.

As for the Yahoo data, you need to request access directly from their website to obtain it.

The code for reproducing the benchmark can be found in benchmark.py, and to aggregate the results refer to results.py.
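
As a rough illustration of those three points only (this is not the actual benchmark.py code, and it reuses the imports, the train_55 channel list, and the hyperparameters dict from earlier in the thread), a simplified per-signal evaluation could look like this:

import numpy as np

f1_scores = []
for signal in train_55:
    train_df = load_signal(f'multivariate/{signal}-train')
    test_df = load_signal(f'multivariate/{signal}-test')
    known = load_anomalies(signal)

    # 1. process each signal on its own (no concatenation)
    orion = Orion(pipeline='tadgan', hyperparameters=hyperparameters)
    orion.fit(train_df)
    anomalies = orion.detect(test_df)

    # 2. use weighted=False for the contextual metric
    f1_scores.append(contextual_f1_score(known, anomalies, test_df, weighted=False))

# 3. aggregate at the dataset level (the exact aggregation in results.py may differ)
msl_f1 = np.nanmean(f1_scores)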

Let me know if you have any further questions!
