egorka13 / Recognition_miem Public

Educational project for video and speech recognition

files/
.gitattributes
.gitignore
FFMPEGFrames.py
README.md
SpeechRecognition.py
TextFromPicture.py
convert_audio2mp3.py
convert_video2mp3.py
import_to_db.py
main.py
mp3_to_wav.py
requirements.txt
video2mp3.py
video_to_audio.py

Recognition_miem

To run this script, you need to:

Install the dependencies: ffmpeg, pytesseract, cv2 (OpenCV), pydub, pdf2image, and SpeechRecognition:

    sudo pip3 install ffmpeg-python pytesseract opencv-python pydub pdf2image SpeechRecognition
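
Note that several of these pip packages are Python wrappers around external programs: ffmpeg-python calls the ffmpeg executable, pytesseract calls the tesseract executable, and pdf2image relies on Poppler's command-line tools. Those programs have to be installed separately. A minimal pre-flight check might look like the sketch below; the executable names are assumptions about a typical setup, and the helper `missing_tools` is hypothetical, not part of this repository.

```python
import shutil

# Executable names are assumptions; adjust them for your platform.
REQUIRED_TOOLS = ("ffmpeg", "tesseract", "pdftoppm")


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the entries of `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing external programs:", ", ".join(missing))
    else:
        print("All external programs found.")
```
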

Put your video file into the ./files folder in the project root.
Run one of the following commands from the command line:
```
    python3 main.py -vi files/videoplayback.mp4 output.txt 1
```
This command converts the video into a series of images at 1 frame per second. The text recognized in those images is then written to the .txt file.
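
A minimal sketch of this two-step pipeline, not necessarily how main.py implements it: it calls the ffmpeg command-line tool with its `fps` video filter to write a fixed number of frames per second, then runs pytesseract on each saved frame. The function names, the `frames` directory, and the `frame_%05d.png` naming pattern are assumptions made for illustration.

```python
import subprocess
from pathlib import Path


def frame_command(video, out_dir, fps=1):
    """Build an ffmpeg command that saves `fps` frames per second as PNG files."""
    pattern = str(Path(out_dir) / "frame_%05d.png")
    return ["ffmpeg", "-i", str(video), "-vf", f"fps={fps}", pattern]


def video_to_text(video, output_txt, fps=1, out_dir="frames"):
    """Extract frames with ffmpeg, OCR each one, and write the text to a file."""
    # Imported here so frame_command works without the OCR stack installed.
    import pytesseract
    from PIL import Image

    Path(out_dir).mkdir(parents=True, exist_ok=True)
    subprocess.run(frame_command(video, out_dir, fps), check=True)

    with open(output_txt, "w", encoding="utf-8") as out:
        # Zero-padded names keep the frames in chronological order when sorted.
        for frame in sorted(Path(out_dir).glob("frame_*.png")):
            out.write(pytesseract.image_to_string(Image.open(frame)))
            out.write("\n")
```
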
```
    python3 main.py -va files/videoplayback.mp4 output.txt
```
This command extracts the audio track from the video as a .wav file, then transcribes the audio and writes the text to the .txt file.
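
The sketch below shows one way these two steps could be done, assuming the ffmpeg command-line tool for the conversion and the SpeechRecognition package with its Google Web Speech backend for the transcription. The repository may use a different recognizer or different audio settings; the function names, the intermediate `audio.wav` file, and the 16 kHz mono format are illustrative assumptions.

```python
import subprocess


def wav_command(video, wav_path):
    """Build an ffmpeg command that drops the video stream and writes 16-bit PCM WAV."""
    return ["ffmpeg", "-y", "-i", str(video), "-vn",
            "-acodec", "pcm_s16le", "-ac", "1", "-ar", "16000", str(wav_path)]


def transcribe_wav(wav_path):
    """Transcribe a WAV file with the SpeechRecognition package."""
    import speech_recognition as sr  # lazy import; needs SpeechRecognition installed

    recognizer = sr.Recognizer()
    with sr.AudioFile(str(wav_path)) as source:
        audio = recognizer.record(source)  # read the whole file into memory
    # recognize_google sends the audio to Google's Web Speech API (network required).
    return recognizer.recognize_google(audio)


def video_to_transcript(video, output_txt, wav_path="audio.wav"):
    """Convert the video to WAV, transcribe it, and write the text to a file."""
    subprocess.run(wav_command(video, wav_path), check=True)
    with open(output_txt, "w", encoding="utf-8") as out:
        out.write(transcribe_wav(wav_path))
```
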
```
    python3 main.py -i files output.txt 
```
This command recognizes the text in the images in the 'files' directory and writes it to the .txt file.
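
When a whole directory is passed, the script has to decide which files count as images and in what order to process them. The following sketch illustrates one possible approach, filtering by file extension and sorting by name before running pytesseract; the extension set and function names are assumptions, not taken from the repository.

```python
from pathlib import Path

# Assumed set of image extensions; extend it as needed.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".bmp", ".tif", ".tiff"}


def list_images(directory):
    """Return the image files in `directory`, sorted by name."""
    return sorted(
        p for p in Path(directory).iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )


def images_to_text(directory, output_txt):
    """OCR every image in `directory` and write the results to one text file."""
    import pytesseract  # lazy import; also needs the tesseract executable
    from PIL import Image

    with open(output_txt, "w", encoding="utf-8") as out:
        for path in list_images(directory):
            out.write(f"--- {path.name} ---\n")
            out.write(pytesseract.image_to_string(Image.open(path)))
            out.write("\n")
```
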
```
    python3 main.py -a files/videoplayback.wav output.txt
```
This command transcribes a .wav audio file and writes the text to the .txt file.
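
The four commands share one shape: a mode flag followed by an input path, then the output file, and for -vi an optional frame rate. As an illustration only, such a command line could be parsed with argparse as sketched below. This is an assumption inferred from the commands shown here, not the parsing code in main.py, and the `dest` names and the `parse` helper are hypothetical.

```python
import argparse


def build_parser():
    """Build a parser for: main.py (-vi|-va|-i|-a) INPUT OUTPUT [FPS]."""
    parser = argparse.ArgumentParser(prog="main.py")
    mode = parser.add_mutually_exclusive_group(required=True)
    mode.add_argument("-vi", dest="video_images", metavar="VIDEO",
                      help="video -> frames -> text")
    mode.add_argument("-va", dest="video_audio", metavar="VIDEO",
                      help="video -> audio -> text")
    mode.add_argument("-i", dest="images", metavar="DIR",
                      help="directory of images -> text")
    mode.add_argument("-a", dest="audio", metavar="WAV",
                      help=".wav audio -> text")
    parser.add_argument("output", help="path of the .txt file to write")
    parser.add_argument("fps", nargs="?", type=int, default=1,
                        help="frames per second (used with -vi)")
    return parser


def parse(argv):
    """Return (mode, input_path, output_path, fps) for a list of arguments."""
    args = build_parser().parse_args(argv)
    for mode in ("video_images", "video_audio", "images", "audio"):
        source = getattr(args, mode)
        if source is not None:
            return mode, source, args.output, args.fps
```
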