Codes for Video & Audio Preprocessing and 3D CNN for Lip Reading

meghbhalerao/lip-reading

Lip-Reading

This repository contains code for lip reading using 3D cross audio-visual Convolutional Neural Networks.
Link to our project report: ref

Brief Description of the project

In this small project, we tried to reproduce the approach of [1], using a similar network architecture but our own data and different video and audio preprocessing techniques, as described below. Because the audio and visual preprocessing is computationally expensive, we trained the model on a dummy dataset filled with random placeholder values instead of actual intensity values.
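As an illustration of what training on random placeholders can look like, the following is a minimal sketch using NumPy. The function name `make_dummy_batch`, the tensor shapes, and the binary match/non-match labels are assumptions made for this example. They are not taken from this repository's code.

```python
import numpy as np

def make_dummy_batch(batch_size, video_shape=(9, 60, 100),
                     audio_shape=(15, 40, 3), seed=0):
    """Build one batch of random placeholder inputs.

    The shapes are hypothetical: video_shape stands for
    (frames, height, width) of mouth crops and audio_shape for
    (time steps, frequency bins, channels) of audio features.
    """
    rng = np.random.default_rng(seed)
    # Uniform values in [0, 1) stand in for real intensity values.
    video = rng.random((batch_size, *video_shape), dtype=np.float32)
    audio = rng.random((batch_size, *audio_shape), dtype=np.float32)
    # Assumed labels: 1 if the audio and video streams match, else 0.
    labels = rng.integers(0, 2, size=batch_size)
    return video, audio, labels

video, audio, labels = make_dummy_batch(4)
print(video.shape, audio.shape, labels.shape)
```

A batch like this only exercises the data flow through the network. A model trained on it learns nothing about real speech.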

Steps to run the code

Audio and Video Preprocessing

  1. Download either the VidTIMIT or the BBC Lip Reading in the Wild dataset and place it in the ./dataset/ folder.
  2. To extract the lip region (bounding box) using Histogram of Oriented Gradients: cd Visual_Preprocessing. Then run python mouth_cropping_in_video.py to get the crops of the mouth region from the video.
  3. To run the audio preprocessing: cd Audio_Preperocessing. Then run the file: matlab MMSESTSA84.m, which performs the audio preprocessing using the MMSE STSA method. A second audio preprocessing option, energy-based Voice Activity Detection, is also supported and can be run using python unsupervised_vad.py.
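To give an idea of the energy-based Voice Activity Detection mentioned in step 3, here is a minimal sketch in plain Python. It is not the code in unsupervised_vad.py. The function names, frame length, hop length, and threshold ratio are all assumptions chosen for illustration.

```python
import math

def frame_energies(samples, frame_len, hop_len):
    """Return the mean squared amplitude of each full frame."""
    energies = []
    for start in range(0, len(samples) - frame_len + 1, hop_len):
        frame = samples[start:start + frame_len]
        energies.append(sum(x * x for x in frame) / frame_len)
    return energies

def energy_vad(samples, frame_len=400, hop_len=160, ratio=0.1):
    """Flag frames whose energy exceeds ratio * the maximum frame energy.

    The defaults correspond to 25 ms frames with a 10 ms hop at 16 kHz
    (assumed values, not taken from the repository).
    """
    energies = frame_energies(samples, frame_len, hop_len)
    if not energies:
        return []
    threshold = ratio * max(energies)
    return [e > threshold for e in energies]

# Synthetic test signal at 16 kHz: silence, a 440 Hz tone, silence.
sample_rate = 16000
half_second = sample_rate // 2
silence = [0.0] * half_second
tone = [0.5 * math.sin(2 * math.pi * 440 * n / sample_rate)
        for n in range(half_second)]
signal = silence + tone + silence

flags = energy_vad(signal)
print(sum(flags), "of", len(flags), "frames flagged as speech-like")
```

A threshold relative to the loudest frame keeps the sketch independent of recording level, but it will flag any loud sound, not only speech.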

Training the CNN Model

  1. To train the CNN model, run python train.py with the appropriate paths to the audio and video files.

Dependencies

References

  1. 3D Convolutional Neural Networks for Cross Audio-Visual Matching Recognition. Amirsina Torfi, Seyed Mehdi Iranmanesh, Nasser Nasrabadi, Jeremy Dawson et al. IEEE Access, Volume 5.
