
Training setting for reproducing corner detection model #120

Open
yusukeSekikawa opened this issue Jun 10, 2024 · 9 comments

yusukeSekikawa commented Jun 10, 2024

I was training the corner detection model and ran into the following issues.

  • I am training the corner detection model with train_corner_detection.py using the default settings (no arguments other than root_dir and dataset_path).
  • I can see the detected corners on the training dataset (stored in the /video folder).
  • However, when I evaluate with demo_corner_detection.py, the model does not detect any keypoints.
  • The provided pre-trained model (/core_ml/python/models/corner_detection_10_heatmaps.ckpt) works fine.

Can you share the settings used to produce the pre-trained model?
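For reference, a rough sketch of the training command described above, with placeholder paths; whether root_dir and dataset_path are passed positionally or as flags depends on the script's argument parsing:

```
python train_corner_detection.py /path/to/logs /path/to/coco/images
```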

In a similar vein, I have a few questions.

  • In the "Long-lived keypoint ..." paper, training runs for 30 epochs. However, the number of iterations per epoch is capped by --limit_train_batches, so the number of samples seen in each epoch depends on the batch size (see the sketch after this list).
  • In the paper, the event data used for training is generated with random noise, but randomize_noises is turned off by default.
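To illustrate the first point, a minimal sketch, assuming an illustrative cap of 2000 batches (not a confirmed default): with --limit_train_batches fixed, the samples seen per epoch scale linearly with the batch size.

```python
# Samples seen per epoch when limit_train_batches caps the iteration count.
limit_train_batches = 2000  # illustrative value, not a confirmed default

for batch_size in (4, 16):
    samples_per_epoch = limit_train_batches * batch_size
    print(f"batch_size={batch_size}: {samples_per_epoch} samples per epoch")
```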
@yusukeSekikawa yusukeSekikawa changed the title Training setting for e2v model Training setting for corner detection model Jun 11, 2024
@yusukeSekikawa yusukeSekikawa changed the title Training setting for corner detection model Training setting for reproducing corner detection model Jun 11, 2024

jngt commented Jun 12, 2024

I'm facing the same problem.

@lbristiel-psee lbristiel-psee self-assigned this Jun 12, 2024
lbristiel-psee (Collaborator) commented:

Hello,

let me first address your first point about reproducing the pre-trained model.
Could you give some information on the following:

  • which dataset did you use for training?
  • when you evaluated the model with demo_corner_detection.py, did you change any parameters, or did you use the default ones?

Thanks,
Laurent for Prophesee Support


yusukeSekikawa commented Jun 12, 2024

Thank you for the reply.

  • We use MS-COCO, as described in the "Long-Lived Accurate Keypoint..." paper.
    For training we use the default settings except for the batch size: we use 16 instead of 4 and train for 8 epochs, which I believe is equivalent to training for 30 epochs at the default batch size of 4, because limit_train_batches caps the number of iterations per epoch (see the sketch after this list).

  • We use the default settings for evaluation with demo_corner_detection.py, on the chessboard dataset downloaded from http://prophesee.ai/hvga-atis-corner-dataset.

  • With the same settings, the pre-trained model works fine.
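The equivalence claimed in the first bullet, spelled out under the assumption that limit_train_batches caps every epoch at the same number of optimizer steps regardless of batch size (illustrative cap of 2000; the true value is not known to us):

```python
# Total samples seen, assuming a fixed per-epoch cap on the number of batches.
limit_train_batches = 2000  # illustrative, not a confirmed default

ours = 16 * 8 * limit_train_batches    # batch_size 16 for 8 epochs  -> 256000 samples
paper = 4 * 30 * limit_train_batches   # batch_size 4 for 30 epochs  -> 240000 samples
print(ours, paper)                     # comparable totals
```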

I appreciate your help.

saladair commented:

Hi, I am facing a similar issue with the e2v model.
I attempted to reproduce the pre-trained model (e2v.ckpt) using MS-COCO data with the default settings of train_event_to_video.py, but I could not reproduce its results at evaluation time.
We use the same chessboard dataset from http://prophesee.ai/hvga-atis-corner-dataset.
Our trained model outputs intensity images, but their quality is worse than the pre-trained model's.

Please help us reproduce the pre-trained model (dataset, options, etc.).
I really appreciate any help you can provide.

lbristiel-psee (Collaborator) commented:

Hello @yusukeSekikawa and @saladair,
indeed, the training scripts used with their default parameters do not reproduce the pre-trained models we share. Our main suggestion is to follow the indications in the papers (same number of epochs, data augmentation, etc.) to get closer to what we did to produce those models.
Hope this helps,
Laurent for Prophesee Support


yusukeSekikawa commented Jun 13, 2024

Hello @lbristiel-psee

Thank you for the feedback.
I want to reproduce the pre-trained model, NOT the results in the paper.
So I would appreciate it if you could share the command used to run train_corner_detection.py (and train_event_to_video.py for @saladair). I want to know the value of each option, e.g. --lr, --epochs, --precision, that your team used when training the pre-trained model shipped in the SDK.
(The paper describes the values of some parameters, but it is difficult to know all the parameters I need to specify for training. So we are trying to reproduce the pre-trained model in the SDK, not the results in the paper.)

If sharing the training command is difficult, it would also be helpful if you could share the major parameters:

  • Dataset for training. Is the model trained entirely on simulated events from MS-COCO (or do I need another dataset for fine-tuning)?
  • batch_size (I could not find it in the paper)
  • limit_train_batches (I could not find it in the paper)
  • epochs (the paper says 30, but I suspect the effective value depends on batch_size and limit_train_batches)
  • height, width (I could not find them in the paper)
  • min_frames_per_video, max_frames_per_video (I could not find them in the paper)

In case an NDA is needed to share the values of these options, please let me know ([email protected]).

Many thanks.

yusukeSekikawa (Author) commented:

I found the hyperparameters stored in "corner_detection_10_heatmaps.ckpt".

It looks like the pre-trained model provided with the SDK was resumed from another checkpoint: "/home/pchiberre/prophesee/data/logs/testing_train/checkpoints/epoch=65-step=131999.ckpt".

Can you share the hyperparameters used for training "epoch=65-step=131999.ckpt"?

```python
import torch

# Load the checkpoint shipped with the SDK and print its stored hyperparameters.
# map_location="cpu" lets this run on a machine without a GPU.
checkpoint = torch.load("corner_detection_10_heatmaps.ckpt", map_location="cpu")
print(checkpoint["hyper_parameters"])
```

which prints:

```python
{'root_dir': '/home/pchiberre/prophesee/data/logs/testing_train',
 'dataset_path': '/mnt/hdd1/coco/images/',
 'lr': 0.0007, 'epochs': 100, 'demo_iter': 10, 'precision': 16,
 'accumulate_grad_batches': 1, 'batch_size': 2,
 'demo_every': 1, 'val_every': 1, 'save_every': 1,
 'just_test': False, 'cpu': False, 'resume': False,
 'checkpoint': '/home/pchiberre/prophesee/data/logs/testing_train/checkpoints/epoch=65-step=131999.ckpt',
 'mask_loss_no_events_yet': False,
 'limit_train_batches': 2000, 'limit_val_batches': 100,
 'data_device': 'cuda:0', 'event_volume_depth': 10, 'cin': 10, 'cout': 10,
 'height': 360, 'width': 480, 'num_tbins': 10,
 'min_frames_per_video': 200, 'max_frames_per_video': 5000,
 'number_of_heatmaps': 10, 'num_workers': 2, 'randomize_noises': True}
```
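If these are indeed the settings behind the shipped model, a hypothetical command line reconstructed from this dump would look like the sketch below; the flag names assume each hyperparameter is exposed as a --<name> option, which is not confirmed:

```
python train_corner_detection.py /path/to/logs /mnt/hdd1/coco/images/ \
    --lr 0.0007 --epochs 100 --batch_size 2 --precision 16 \
    --limit_train_batches 2000 --limit_val_batches 100 \
    --height 360 --width 480 --num_tbins 10 --event_volume_depth 10 \
    --min_frames_per_video 200 --max_frames_per_video 5000 \
    --number_of_heatmaps 10 --randomize_noises
```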

Thank you in advance.

lbristiel-psee (Collaborator) commented:

Sorry, but we are not able to share more information at the moment beyond what we have already published (research papers, training scripts, and pre-trained models). We are trying to gather more data on the topic and will share it when available. In the meantime, the main idea is to follow what is specified in the papers (even if that is not the full picture), since those pre-trained models were built while writing those papers, and to fine-tune by adjusting the parameters yourself.

I will keep you updated when I have some news.

Best,
Laurent for Prophesee Support

yusukeSekikawa (Author) commented:

I appreciate your help. We will wait for the updates.
