
milahu/srtgen

 
 


srtgen

Generate subtitles for a video file, using the paid Google Cloud Speech-to-Text API.

This program requires a Google account and an API key. To get a key, create a project on Google Cloud: https://console.cloud.google.com/projectcreate

usage

$ ./srtgen.py 
usage
  srtgen.py --apikey path/to/keyfile.json path/to/input-video.mp4

environment variables
  GOOGLE_APPLICATION_CREDENTIALS=path/to/keyfile.json srtgen.py path/to/input-video.mp4

keyfile
  This program requires a Google account and an API key
  https://console.cloud.google.com/projectcreate

subtitle is written to stdout and output/xxxxxx-input-video.mp4/output_file.srt
where xxxxxx is the sha1 hash of the input video file

temporary files are stored in output/xxxxxx-input-video.mp4/
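
The output directory naming described above can be sketched as follows. This is only an illustration, not the code srtgen itself uses: the helper names `sha1_of_file` and `output_dir_for` are hypothetical, and the sketch assumes that `xxxxxx` is the full hexadecimal SHA-1 digest of the file contents (the usage text does not say whether the hash is shortened).

```python
import hashlib
from pathlib import Path


def sha1_of_file(path, block_size=1 << 20):
    """Return the hex SHA-1 digest of a file, read in blocks to keep memory use low."""
    digest = hashlib.sha1()
    with open(path, "rb") as f:
        # iter() with a sentinel stops at the first empty read (end of file)
        for block in iter(lambda: f.read(block_size), b""):
            digest.update(block)
    return digest.hexdigest()


def output_dir_for(video_path, root="output"):
    """Build <root>/<sha1>-<file name>, assuming the full digest is used (hypothetical)."""
    video_path = Path(video_path)
    return Path(root) / f"{sha1_of_file(video_path)}-{video_path.name}"
```

Because the directory name depends on the file contents, running the tool again on the same video would map to the same directory, which is presumably why the temporary files are stored there.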

features

  • works around the request size limits of the Google API
    • no need for Google Cloud Storage (gs:// URIs)
    • per request, duration is limited to 60 seconds
    • per request, file size is limited to 10485760 bytes (10 MiB)
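
One way to stay within both limits is to cut the audio into chunks before sending it. The sketch below only plans the chunk boundaries; it is not srtgen's actual splitting logic. The function name `plan_chunks` and its parameters are hypothetical, and it assumes uncompressed audio with a constant byte rate, ignores container headers, and treats both limits as inclusive.

```python
MAX_SECONDS = 60       # duration limit quoted above
MAX_BYTES = 10485760   # size limit quoted above


def plan_chunks(total_ms, bytes_per_second,
                max_seconds=MAX_SECONDS, max_bytes=MAX_BYTES):
    """Return (start_ms, end_ms) pairs that cover the range 0..total_ms.

    Each chunk is at most max_seconds long and, at the given constant
    byte rate, at most max_bytes large.
    """
    if total_ms < 0 or bytes_per_second <= 0:
        raise ValueError("total_ms must be >= 0 and bytes_per_second > 0")
    # Longest chunk (in milliseconds) that still fits the size limit
    max_ms_by_size = (max_bytes * 1000) // bytes_per_second
    chunk_ms = min(max_seconds * 1000, max_ms_by_size)
    if chunk_ms <= 0:
        raise ValueError("byte rate too high for the size limit")
    return [(start, min(start + chunk_ms, total_ms))
            for start in range(0, total_ms, chunk_ms)]
```

For 16 kHz mono 16-bit audio (32000 bytes per second), the 60 second limit is the one that applies, so `plan_chunks(150000, 32000)` yields three chunks. With pydub, each pair could be applied as a millisecond slice such as `audio[start_ms:end_ms]`. A real splitter would probably prefer to cut at silence rather than at fixed offsets, so that words are not split between requests.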

dependencies

  • ffmpeg
  • python
    • pydub
    • google.cloud.speech
      • API key
      • pricing
        • speech recognition needs lots of space and time = there is no free lunch
        • https://cloud.google.com/speech-to-text/pricing#pricing_table
          • first hour is free
            • TODO one hour per month or one hour per google account?
          • Speech Recognition without Data Logging: $0.006 / 15 seconds = $0.024 / 1 minute = $1.44 / 1 hour
          • Speech Recognition with Data Logging: $0.004 / 15 seconds = $0.016 / 1 minute = $0.96 / 1 hour
          • Data Logging = feedback of manually corrected text to improve quality of service
            • TODO implement upload of corrected text
      • TODO Automatic punctuation

related

based on

postprocessing tools

similar tools

todo

  • use speech_recognition module, so srtgen can use multiple backend services
  • hybrid of offline and online speech recognition
    • deepspeech for offline speech recognition
    • google for online speech recognition
    • can deepspeech return confidence values?
    • run deepspeech with different models? (and manually select the best result?)
  • automatic postprocessing
    • reduce manual work
    • split long sentences
    • merge short sentences
