Fred199683/S2VT

 
 

Video Captioning using S2VT

Requirements

To run the code and reproduce the results, install the following packages first. The whole project was developed with Python 3.6.

  • PyTorch
  • Caffe
  • NumPy
  • cv2
  • imageio
  • scikit-image
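
Before running anything, it can help to confirm that each package above is importable. The snippet below is a minimal sketch using only the standard library; the helper name find_missing is hypothetical and not part of this repository. Note that the import name can differ from the package name (PyTorch is imported as torch, scikit-image as skimage).

```python
import importlib.util

# Import names for the packages listed above. The keys are display names,
# the values are the module names used in an import statement.
REQUIRED_MODULES = {
    "PyTorch": "torch",
    "Caffe": "caffe",
    "NumPy": "numpy",
    "cv2": "cv2",
    "imageio": "imageio",
    "scikit-image": "skimage",
}


def find_missing(modules=REQUIRED_MODULES):
    """Return the display names of packages whose module cannot be found.

    Hypothetical helper for illustration; not part of this repository.
    """
    missing = []
    for name, module in modules.items():
        # find_spec returns None when a top-level module is not installed.
        if importlib.util.find_spec(module) is None:
            missing.append(name)
    return missing


if __name__ == "__main__":
    missing = find_missing()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are importable.")
```

The check only looks the modules up without importing them, so it stays fast even when heavy packages such as PyTorch are installed.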

Running instructions

1. Install all the packages listed in the 'Requirements' section.
2. Download the MSVD dataset to Data/YouTubeClips.
3. Change the paths in the Python files to point to directories in your workspace.
4. Run extract_feats.py to extract the RGB features of the videos.
5. Run train.py to train the model.
6. Run test.py to generate captions for the test videos.

Alternatively, you can extract features from a video and generate captions directly with test_from_video.py.
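
Since the paths in the scripts have to be edited by hand, it may be convenient to check the workspace layout first. The following is a rough sketch, assuming the two data directories this README names (Data/YouTubeClips and Data/Features_VGG) live under a single workspace root; the function names workspace_paths and missing_dirs are hypothetical and do not exist in the repository.

```python
from pathlib import Path


def workspace_paths(root="."):
    """Build the data directories named in this README under a workspace root.

    Hypothetical helper; the repository scripts define their own paths.
    """
    root = Path(root)
    return {
        "videos": root / "Data" / "YouTubeClips",
        "features": root / "Data" / "Features_VGG",
    }


def missing_dirs(paths):
    """Return the keys whose directory does not exist yet."""
    return [key for key, path in paths.items() if not path.is_dir()]


if __name__ == "__main__":
    paths = workspace_paths(".")
    for key in missing_dirs(paths):
        print("Missing directory for", key + ":", paths[key])
```

Running it from the workspace root before extract_feats.py or train.py should flag a missing dataset or feature directory early, rather than partway through a run.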

DataSet

You can download the MSVD dataset here
You can download the extracted video features from Features_VGG and unzip them to "Data/Features_VGG".

Result

The val.json file is the ground truth for the test dataset, and result.json is the generated result. We use the metrics from WangLei (https://github.com/wangleihitcs/CaptionMetrics).
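
Before scoring, a quick sanity check is to confirm that both files cover the same videos. The sketch below ASSUMES each file holds a top-level JSON object keyed by video id. This README does not document the file format, so verify it against the actual files first. The helpers load_keys and coverage are hypothetical names.

```python
import json


def load_keys(path):
    """Load a JSON file and return its top-level keys as a set.

    ASSUMPTION: the file holds a JSON object keyed by video id.
    """
    with open(path, encoding="utf-8") as f:
        data = json.load(f)
    if not isinstance(data, dict):
        raise ValueError("{} does not contain a top-level JSON object".format(path))
    return set(data)


def coverage(truth_path, result_path):
    """Return (ids missing from the results, ids not in the ground truth)."""
    truth = load_keys(truth_path)
    result = load_keys(result_path)
    return sorted(truth - result), sorted(result - truth)


if __name__ == "__main__":
    missing, extra = coverage("val.json", "result.json")
    print("Videos without a generated caption:", len(missing))
    print("Generated captions without ground truth:", len(extra))
```

If either list is non-empty, the metric scores may be computed over a different set of videos than intended.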

Acknowledgement

Some code is copied from vijayvee (https://github.com/vijayvee/video-captioning).

About

S2VT pytorch implementation
