Code for the paper "Deep Attention Based Semi-Supervised 2D-Pose Estimation for Surgical Instruments"

mertkayhan/SSL-2D-Pose

Deep Attention Based Semi-Supervised 2D-Pose Estimation

This repository contains the code for the paper "Deep Attention Based Semi-Supervised 2D-Pose Estimation for Surgical Instruments".

Figure: Example test set results, shown as videos (left: EndoVis dataset, right: RMIT dataset).

Abstract

For many practical problems and applications, it is not feasible to create a large, accurately labeled dataset, which restricts the application of deep learning in many areas. Semi-supervised learning algorithms aim to improve performance by also leveraging unlabeled data. This is particularly valuable for the 2D-pose estimation task, where data labeling requires substantial time and is subject to noise. This work investigates whether semi-supervised learning techniques can reach a performance level that justifies using them during training. To this end, a lightweight network architecture is introduced, and the mean teacher, virtual adversarial training, and pseudo-labeling algorithms are evaluated on 2D-pose estimation for surgical instruments. To make the pseudo-labeling algorithm applicable, we propose a novel confidence measure, total variation. Experimental results show that semi-supervised learning drastically improves performance on unseen geometries while maintaining high accuracy on seen geometries. On the RMIT benchmark, our lightweight architecture outperforms the state of the art with supervised learning. On the EndoVis benchmark, the pseudo-labeling algorithm improves on the supervised baseline and achieves new state-of-the-art performance.
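
The abstract names total variation as the confidence measure for pseudo-labeling but does not define it here. As a rough illustration only, the sketch below computes one common (anisotropic) form of total variation for a 2D heatmap: the sum of absolute differences between horizontally and vertically adjacent pixels. The function name `total_variation` is hypothetical, plain nested lists are used to avoid dependencies, and how the paper turns this quantity into an accept/reject decision for pseudo-labels is not covered.

```python
def total_variation(heatmap):
    """Anisotropic total variation of a 2D heatmap given as a list of rows.

    Assumption: this is a generic textbook definition, not necessarily the
    exact formulation used in the paper.
    """
    rows = len(heatmap)
    cols = len(heatmap[0]) if rows else 0
    tv = 0.0
    for r in range(rows):
        for c in range(cols):
            # Difference to the pixel below, if there is one.
            if r + 1 < rows:
                tv += abs(heatmap[r + 1][c] - heatmap[r][c])
            # Difference to the pixel on the right, if there is one.
            if c + 1 < cols:
                tv += abs(heatmap[r][c + 1] - heatmap[r][c])
    return tv


# A flat heatmap has no variation; a single isolated peak has four unit jumps.
flat = [[0.5, 0.5], [0.5, 0.5]]
peak = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
print(total_variation(flat), total_variation(peak))
```

A smooth single-peak heatmap and a noisy multi-peak one give different values under this measure, which is the kind of property a confidence score over predicted heatmaps could exploit.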

Running the Code

  • EndoVis labels can be found here.
  • Scripts that were used to generate target heatmaps are provided in the utils folder.
  • For the RMIT dataset, training and test images are expected in $ROOT/training/image and $ROOT/test/image, respectively.
  • For the RMIT dataset, training and test labels are expected in $ROOT/training/label and $ROOT/test/label, respectively.
  • For the EndoVis dataset, training and test data are expected in $ROOT/labelled_train and $ROOT/labelled_test, respectively.
  • For the EndoVis dataset, pseudo-labels, training labels, and test labels are expected in $ROOT/pseudo_labels, $ROOT/training_labels, and $ROOT/test_labels, respectively.
  • Models can be evaluated with evaluate.py after training. (The evaluation script needs post-processing labels, which should be saved in $ROOT/training_labels_postprocessing and $ROOT/test_labels_postprocessing, respectively.)
  • Training can be performed with train.py:
python train.py \
--batch_size 5 \
--gpu_id 0 \
--root <data-folder> \
--use_vat <toggle-vat> \
--use_mean_teacher <toggle-mean-teacher> \
--use_pseudo_labels <pseudo-labeling> \
--dataset <RMIT|ENDOVIS>
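
Before starting a run, it can help to confirm that the data root matches the folder layout listed above. The following is a minimal sketch and not part of the repository: the helper `missing_dirs` and the constants are hypothetical names, the folder names are copied from the list above, and it assumes every listed folder is required (for example, `pseudo_labels` may only matter when pseudo-labeling is enabled) and that the post-processing folders apply to whichever dataset is being evaluated.

```python
from pathlib import Path

# Sub-folders of the data root named in this README, per dataset.
EXPECTED_DIRS = {
    "RMIT": ["training/image", "test/image", "training/label", "test/label"],
    "ENDOVIS": [
        "labelled_train",
        "labelled_test",
        "pseudo_labels",
        "training_labels",
        "test_labels",
    ],
}

# Folders the README mentions only in connection with evaluate.py.
POSTPROCESSING_DIRS = [
    "training_labels_postprocessing",
    "test_labels_postprocessing",
]


def missing_dirs(root, dataset, for_evaluation=False):
    """Return the expected sub-folders that do not exist under root."""
    expected = list(EXPECTED_DIRS[dataset])
    if for_evaluation:
        expected += POSTPROCESSING_DIRS
    root = Path(root)
    return [d for d in expected if not (root / d).is_dir()]


if __name__ == "__main__":
    # "data" is a placeholder for the folder passed to --root.
    for name in missing_dirs("data", "RMIT"):
        print("missing:", name)
```

An empty result means all expected folders are present; the check looks only at folder names, not at their contents.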

Citation

Please cite the following article if you use this code:

@article{kayhan2019deep,
  title={Deep Attention Based Semi-Supervised 2D-Pose Estimation for Surgical Instruments},
  author={Kayhan, Mert and K{\"o}p{\"u}kl{\"u}, Okan and Sarhan, Mhd Hasan and Yigitsoy, Mehmet and Eslami, Abouzar and Rigoll, Gerhard},
  journal={arXiv preprint arXiv:1912.04618},
  year={2019}
}
