Speech transcription from YouTube and MP4 videos with Whisper

gamallo/transcription-video

transcription-video

Transcription of videos, including YouTube videos, with Whisper (OpenAI), based on this tutorial.

This repository contains two different scripts:

  • transcript_youtube transcribes YouTube videos: you only provide the link and the language.
  • transcript_mp4 transcribes your own videos in MP4 format.

INSTALL

Install ffmpeg, yt-dlp, and Whisper:

pip install yt-dlp openai-whisper==20231106 openai
sudo apt install -y ffmpeg
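
To confirm the installation before running the scripts, a quick check along these lines may help. It is a minimal sketch and not part of this repository: the helper name `missing_tools` is hypothetical, and it only verifies that the command-line tools can be found on your PATH.

```python
import shutil


def missing_tools(tools=("ffmpeg", "yt-dlp", "whisper")):
    """Return the tools from `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing tools:", ", ".join(missing))
    else:
        print("All required tools were found.")
```

An empty result means the three commands are reachable; it does not check their versions.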

How to use

1. Transcription of a YouTube video

You have to specify the link and the language (en, pt, es, ...):

sh transcript_youtube.sh <link_youtube> <language>

For instance:

sh transcript_youtube.sh "https://www.youtube.com/watch?v=AJhkLwMvgrg" pt
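
For readers curious about what the download step might look like, the sketch below builds a yt-dlp command that fetches a video as an MP4 file. The function names, the output file name, and the choice of `-f mp4` are assumptions made for illustration; the options actually used by transcript_youtube.sh may differ.

```python
import subprocess


def build_download_command(url, output="video.mp4"):
    """Build a yt-dlp command that saves `url` to `output` (hypothetical helper).

    -f mp4 asks yt-dlp for a format with the mp4 extension;
    -o sets the output file name.
    """
    return ["yt-dlp", "-f", "mp4", "-o", output, url]


def download(url, output="video.mp4"):
    """Run yt-dlp and return the path of the downloaded file."""
    # The command is passed as a list, so no shell is involved and
    # characters such as '?' or '&' in the URL need no quoting.
    subprocess.run(build_download_command(url, output), check=True)
    return output
```

Calling `download("https://www.youtube.com/watch?v=AJhkLwMvgrg")` would then leave the video in `video.mp4`, ready for transcription.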

2. Transcription of an MP4 video

You have to specify the path to your file and the language (en, pt, es, ...):

sh transcript_mp4.sh <your_file.mp4> <language>

OUTPUT

The output is a file named after the input file, followed by _srt.mp4. It contains the subtitles extracted with Whisper, aligned with the speech.
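
To illustrate how time-aligned subtitles can be derived from a transcription, here is a minimal sketch that converts Whisper-style segments into SRT text. It assumes each segment is a dictionary with `start`, `end` (in seconds) and `text` keys, as found in the `segments` list returned by Whisper's `transcribe`. The helper names are hypothetical, and the way this repository actually builds its subtitles may differ.

```python
def srt_timestamp(seconds):
    """Format a time in seconds as the SRT timestamp HH:MM:SS,mmm."""
    millis = int(round(seconds * 1000))
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def segments_to_srt(segments):
    """Turn a list of {start, end, text} segments into SRT-formatted text."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        start = srt_timestamp(seg["start"])
        end = srt_timestamp(seg["end"])
        # Each SRT cue: running number, time range, then the subtitle text.
        blocks.append(f"{index}\n{start} --> {end}\n{seg['text'].strip()}\n")
    # Cues are separated by a blank line.
    return "\n".join(blocks)
```

The resulting text can be saved as a `.srt` file and attached to or burned into the video with ffmpeg.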

Whisper model

You can choose the Whisper model in the script transcript.py.

The medium model is set by default, but you can also choose small or large.
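
As a rough illustration of what selecting a model involves, the sketch below loads a Whisper model by name and transcribes a file in a given language. It is not the content of transcript.py: the names `ALLOWED_MODELS`, `pick_model` and `transcribe` are hypothetical, and the list of models is limited to the three mentioned in this README.

```python
ALLOWED_MODELS = ("small", "medium", "large")


def pick_model(name="medium"):
    """Return `name` if it is one of the models named in this README."""
    if name not in ALLOWED_MODELS:
        raise ValueError(
            f"Unknown model {name!r}; choose one of {', '.join(ALLOWED_MODELS)}"
        )
    return name


def transcribe(path, language, model_name="medium"):
    """Transcribe the file at `path`, spoken in `language` (e.g. 'pt')."""
    # Imported here so the helpers above work even without Whisper installed.
    import whisper

    model = whisper.load_model(pick_model(model_name))
    # The result is a dict that includes the full "text" and timed "segments".
    return model.transcribe(path, language=language)
```

Larger models are generally more accurate but slower and need more memory, so small can be a reasonable choice on modest hardware.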
