This Python script uses the MTCNN (Multi-Task Cascaded Convolutional Networks) algorithm to detect faces and then examines each face's features for signs of drunken behavior. It combines computer vision techniques with deep learning models to identify potential signs of intoxication.
- Real-time detection: Captures video frames from a webcam and processes them in real-time.
- Facial detection: Utilizes the MTCNN (Multi-Task Cascaded Convolutional Networks) algorithm for detecting faces in the video stream.
- Eye detection: Uses OpenCV's Haar cascade classifier for detecting eyes within each detected face region.
- Drunk behavior detection: Analyzes eye regions for signs of redness, indicative of drunken behavior.
- Simple interface: Provides visual feedback by drawing rectangles around detected faces and displaying labels ("Drunk" or "Sober") on the video stream.
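The redness check behind the "Drunk" and "Sober" labels can be illustrated with a small, dependency-free sketch. It is not the script's actual code: the function names, the hue margin, the saturation and value floors, and the 20% decision threshold are all assumptions chosen for illustration. It uses the standard-library `colorsys` module on plain `(r, g, b)` tuples, whereas the script itself presumably works on OpenCV image arrays.

```python
import colorsys


def is_red(r, g, b, min_saturation=0.5, min_value=0.3, hue_margin=10 / 360):
    """Return True if an 8-bit RGB pixel falls in the red part of HSV space.

    All thresholds are illustrative assumptions, not values from the script.
    """
    h, s, v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
    # Red sits at both ends of the hue circle, so both ends are checked.
    near_red_hue = h <= hue_margin or h >= 1 - hue_margin
    return near_red_hue and s >= min_saturation and v >= min_value


def red_fraction(pixels):
    """Fraction of (r, g, b) pixels classified as red; 0.0 for an empty region."""
    pixels = list(pixels)
    if not pixels:
        return 0.0
    return sum(is_red(*p) for p in pixels) / len(pixels)


def label_region(pixels, threshold=0.2):
    """Map a red-pixel fraction to the labels shown on the video stream."""
    return "Drunk" if red_fraction(pixels) >= threshold else "Sober"


RED, WHITE = (255, 0, 0), (255, 255, 255)
print(label_region([RED] * 3 + [WHITE] * 7))  # 30% red pixels
print(label_region([RED] * 1 + [WHITE] * 9))  # 10% red pixels
```

With OpenCV, the same idea is usually expressed with `cv2.cvtColor(..., cv2.COLOR_BGR2HSV)` followed by `cv2.inRange`. Note that OpenCV stores 8-bit hue in the range 0 to 179, so the thresholds would need rescaling.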
- Clone the repository to your local machine.
- Install the required dependencies (OpenCV, NumPy, MTCNN, TensorFlow).
- Run the Python script (`drunk_detection.py`).
- Ensure your webcam is connected and accessible.
- Observe the real-time video stream with detected faces and their associated labels.
- This script is provided for demonstration purposes and should not be relied upon for real-world applications without proper testing and validation.
- The eye detection model (`facial_drunk.keras`) used in this script should be trained separately using appropriate data and methodologies.
- OpenCV (cv2)
- NumPy
- MTCNN (Multi-Task Cascaded Convolutional Networks)
- TensorFlow
MTCNN is a deep learning-based face detection algorithm that detects faces in images and provides bounding boxes and facial landmark points. It consists of three stages:
- Proposal Network (P-Net): Generates candidate bounding boxes for faces using a convolutional neural network (CNN) and applies bounding box regression to refine the proposals.
- Refinement Network (R-Net): Filters the candidate bounding boxes generated by the P-Net and performs further refinement to improve the accuracy of face detection.
- Output Network (O-Net): The final stage performs additional filtering and refinement to produce the final bounding boxes and facial landmark points.
MTCNN is widely used for face detection due to its accuracy and efficiency in detecting faces under various conditions, including variations in scale, pose, and illumination.
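The coarse-to-fine idea behind the three stages can be pictured as successive filters over a shrinking set of candidates. The toy sketch below is only an illustration of that cascade structure. The `Candidate` class, the pre-computed scores, and the threshold values are hypothetical, and a real MTCNN runs a CNN at each stage and also refines the boxes, which is omitted here.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    box: tuple      # (x, y, width, height) in pixels
    scores: tuple   # hypothetical face scores from (P-Net, R-Net, O-Net)


# Hypothetical per-stage score thresholds, chosen only for this illustration.
STAGE_THRESHOLDS = (0.6, 0.7, 0.7)


def run_cascade(candidates, thresholds=STAGE_THRESHOLDS):
    """Keep candidates that pass every stage; also report survivors per stage."""
    survivors = list(candidates)
    history = []
    for stage, threshold in enumerate(thresholds):
        # Each stage only looks at what the previous stage let through.
        survivors = [c for c in survivors if c.scores[stage] >= threshold]
        history.append(len(survivors))
    return survivors, history


candidates = [
    Candidate((10, 10, 50, 50), (0.9, 0.95, 0.99)),  # passes all three stages
    Candidate((200, 40, 30, 30), (0.7, 0.5, 0.2)),   # rejected at the second stage
    Candidate((5, 300, 20, 20), (0.3, 0.1, 0.1)),    # rejected at the first stage
]
faces, history = run_cascade(candidates)
print(history)  # [2, 1, 1]
```

Because cheap early stages discard most candidates, the more expensive later stages only run on a small number of regions.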
- Initialization: The MTCNN detector is initialized at the beginning of the `detect_drunk` function using the `MTCNN()` constructor.
- Face Detection: Once the PNG image is read and converted to RGB format, MTCNN is used to detect faces within the image. This is done by calling the `detect_faces` method of the MTCNN detector, passing the RGB image as input.
- Face Analysis: For each detected face, the program analyzes the facial features to determine signs of redness, which may indicate drunken behavior. This analysis includes extracting a region of interest (ROI) corresponding to the detected face, converting it to the HSV color space, and applying a red color mask to isolate red pixels.
- Result Visualization: After analyzing each detected face, the program draws bounding boxes around the faces and displays labels indicating whether the person is "Drunk" or "Sober" based on the analysis results.
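The four steps above might fit together roughly as in the following sketch. It is an approximation under stated assumptions, not the contents of `drunk_detection.py`: the helper names `clamp_box` and `analyze_frame`, the HSV bounds, and the 20% red threshold are hypothetical. It relies on `detect_faces` from the `mtcnn` package returning a list of dictionaries whose `"box"` entry is `[x, y, width, height]`. The box can extend past the image border, so it is clamped before slicing.

```python
def clamp_box(box, image_width, image_height):
    """Convert an [x, y, width, height] box to (x1, y1, x2, y2) inside the image."""
    x, y, w, h = box
    x1, y1 = max(0, x), max(0, y)
    x2, y2 = min(image_width, x + w), min(image_height, y + h)
    return x1, y1, x2, y2


def analyze_frame(frame_bgr, detector, red_threshold=0.2):
    """Detect faces, measure red pixels per face, and draw a label on the frame.

    `detector` is expected to be an already constructed MTCNN() instance.
    The HSV bounds and red_threshold are illustrative assumptions.
    """
    import cv2  # imported here so clamp_box stays usable without OpenCV installed

    rgb = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2RGB)  # MTCNN expects RGB input
    height, width = rgb.shape[:2]
    results = []
    for face in detector.detect_faces(rgb):
        x1, y1, x2, y2 = clamp_box(face["box"], width, height)
        if x2 <= x1 or y2 <= y1:
            continue  # box lies entirely outside the frame
        roi_hsv = cv2.cvtColor(frame_bgr[y1:y2, x1:x2], cv2.COLOR_BGR2HSV)
        # Red wraps around the hue axis, so two ranges are combined.
        low = cv2.inRange(roi_hsv, (0, 100, 80), (10, 255, 255))
        high = cv2.inRange(roi_hsv, (170, 100, 80), (179, 255, 255))
        red_pixels = cv2.countNonZero(cv2.bitwise_or(low, high))
        red_ratio = red_pixels / float((x2 - x1) * (y2 - y1))
        label = "Drunk" if red_ratio >= red_threshold else "Sober"
        color = (0, 0, 255) if label == "Drunk" else (0, 255, 0)  # BGR
        cv2.rectangle(frame_bgr, (x1, y1), (x2, y2), color, 2)
        cv2.putText(frame_bgr, label, (x1, max(15, y1 - 10)),
                    cv2.FONT_HERSHEY_SIMPLEX, 0.7, color, 2)
        results.append((label, red_ratio))
    return results
```

A caller would create the detector once (`from mtcnn import MTCNN`, then `MTCNN()`) and pass it to `analyze_frame` for every image or video frame, since constructing the detector loads the network weights and is comparatively slow.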
- Accuracy: MTCNN offers high accuracy in face detection, making it suitable for applications where precision is crucial.
- Robustness: It can handle various face orientations, lighting conditions, and occlusions, making it robust in real-world scenarios.
- Efficiency: Despite its multi-stage architecture, MTCNN is computationally efficient and can process images quickly, making it suitable for real-time applications such as video processing.
- The MTCNN face detection library is developed by Iván de Paz Centeno; the original MTCNN implementation is by Kaipeng Zhang (GitHub: kpzhang93).
- Haar cascade classifiers for eye detection are part of the OpenCV library.
- TensorFlow is an open-source deep learning framework developed by Google.
Personal license (stole it idc)
The results provided by this script are derived from image processing alone. They are for informational purposes only and should not be considered a definitive indication of a person's sobriety or level of intoxication. Factors such as lighting conditions, camera quality, and individual differences may affect the accuracy of the detection. Always use caution and judgment when interpreting the results.