This project aims to build an accurate and efficient model for recognizing characters in images. The dataset used in this project can be found here. It consists of two sections, Data and Data2, each with training and testing directories that contain 36 subdirectories, one per character class. The training data contains 573 images per class, and the testing data contains approximately 88 images per class. Understanding this structure is important for organizing and analyzing the data correctly.
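As a quick check of the structure described above, the following sketch counts the images in each class subdirectory of one split. It is only an illustration: the function name, the set of file extensions, and the example path in the comment are assumptions, not part of the repository.

```python
from pathlib import Path

# Assumption: images use one of these extensions; adjust to match the dataset.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".bmp"}


def count_images_per_class(split_dir):
    """Return {class_name: image_count} for one split directory.

    Each immediate subdirectory of split_dir is treated as one class.
    """
    counts = {}
    for class_dir in sorted(Path(split_dir).iterdir()):
        if not class_dir.is_dir():
            continue
        counts[class_dir.name] = sum(
            1
            for f in class_dir.iterdir()
            if f.is_file() and f.suffix.lower() in IMAGE_SUFFIXES
        )
    return counts


# Example usage (the path is hypothetical):
# counts = count_images_per_class("data/training_data")
# print(len(counts), "classes;", min(counts.values()), "to", max(counts.values()), "images each")
```

If the layout matches the description, a training split should report 36 classes with 573 images each, so any class that deviates is worth inspecting before training.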
The task is a computer vision challenge: detecting characters in input images using image processing, analysis, and modern deep learning techniques. It is more closely aligned with computer vision than with Optical Character Recognition (OCR), and it uses models such as ResNet, Xception, Inception, and MobileNet to process and analyze the dataset for accurate predictions.
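To make the role of these models concrete, here is a minimal sketch of how one of them could serve as a frozen feature extractor under a 36-class classification head in Keras. The specific variants (ResNet50V2, InceptionV3, MobileNetV2), the 224x224 input size, the dropout rate, and the function name are assumptions for illustration; the project's notebook may configure things differently.

```python
# Keras application class names for the four model families named above.
# Assumption: these particular variants are illustrative choices.
BACKBONES = {
    "ResNet": "ResNet50V2",
    "Xception": "Xception",
    "Inception": "InceptionV3",
    "MobileNet": "MobileNetV2",
}


def build_classifier(family, num_classes=36, image_size=224):
    """Build a classifier with a frozen ImageNet backbone and a new softmax head."""
    import tensorflow as tf  # imported lazily so the mapping above is usable without TF

    backbone_cls = getattr(tf.keras.applications, BACKBONES[family])
    backbone = backbone_cls(
        include_top=False,           # drop the original ImageNet classification head
        weights="imagenet",
        input_shape=(image_size, image_size, 3),
    )
    backbone.trainable = False       # train only the new head at first

    model = tf.keras.Sequential([
        backbone,
        tf.keras.layers.GlobalAveragePooling2D(),
        tf.keras.layers.Dropout(0.2),
        tf.keras.layers.Dense(num_classes, activation="softmax"),
    ])
    model.compile(
        optimizer="adam",
        loss="sparse_categorical_crossentropy",  # expects integer class labels
        metrics=["accuracy"],
    )
    return model


# Example usage:
# model = build_classifier("MobileNet")
# model.summary()
```

Each Keras application ships its own `preprocess_input` function, so inputs should be scaled the way the chosen backbone expects.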
The code for this project is hosted on GitHub. You can clone the repository using the following command:
gh repo clone Geo-y20/Standard-OCR-
- Setup: Importing the necessary modules and setting hyperparameters and constants.
- Data Loading: Loading the dataset into memory for processing.
- Data Processing: Converting raw data into model-ready input using techniques such as data augmentation, normalization, and image resizing.
- Data Visualization: Inspecting the dataset for insights and potential issues.
- Backbone Comparison: Comparing different pre-trained backbones to identify the best performer.
- Model Building: Constructing a model architecture using selected backbones.
- Model Predictions: Evaluating model performance on unseen data, analyzing predictions, and identifying areas for improvement.
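The loading, processing, and comparison steps above could look roughly like the sketch below. It assumes TensorFlow/Keras and a one-folder-per-class layout. The function names, the augmentation settings, and the example paths are hypothetical, and rescaling to [0, 1] is just one common normalization choice (a pre-trained backbone may expect a different input range).

```python
def load_split(directory, image_size=224, batch_size=32, shuffle=True):
    """Data loading: read a one-folder-per-class directory and resize every image."""
    import tensorflow as tf  # imported lazily; only needed when data is loaded

    return tf.keras.utils.image_dataset_from_directory(
        directory,
        labels="inferred",       # class labels come from the subdirectory names
        label_mode="int",
        image_size=(image_size, image_size),
        batch_size=batch_size,
        shuffle=shuffle,
    )


def add_processing(dataset, augment=False):
    """Data processing: normalize pixels and optionally apply light augmentation."""
    import tensorflow as tf

    steps = [tf.keras.layers.Rescaling(1.0 / 255)]
    if augment:
        # Small rotations and zooms only; flips could turn one character into another.
        steps += [
            tf.keras.layers.RandomRotation(0.05),
            tf.keras.layers.RandomZoom(0.1),
        ]
    pipeline = tf.keras.Sequential(steps)
    return dataset.map(lambda x, y: (pipeline(x, training=augment), y))


def rank_backbones(val_accuracy):
    """Backbone comparison: sort {backbone_name: validation_accuracy}, best first."""
    return sorted(val_accuracy.items(), key=lambda item: item[1], reverse=True)


# Example usage (paths are hypothetical):
# train_ds = add_processing(load_split("data/training_data"), augment=True)
# test_ds = add_processing(load_split("data/testing_data", shuffle=False))
```

Augmentation is applied only to the training split, so that evaluation on the testing split reflects unmodified images.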
- `app.py`: Contains the Flask web application code for image prediction.
- `templates/`: Directory for HTML templates.
- `static/`: Directory for static files (CSS, JS, images).
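For orientation, a Flask prediction app of this kind might be structured like the sketch below. This is not the repository's `app.py`: the `/predict` route, the `image` form field, the 224x224 input size, the [0, 1] scaling, and the class ordering (digits followed by uppercase letters) are all assumptions. It also returns JSON for brevity, whereas the real application presumably renders a page from `templates/`.

```python
import io

# Assumption: the 36 classes are the digits 0-9 followed by the letters A-Z,
# in the same order the model was trained with.
CLASS_NAMES = list("0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ")


def top_prediction(probabilities, class_names=CLASS_NAMES):
    """Return (label, probability) for the highest-scoring class."""
    best = max(range(len(probabilities)), key=probabilities.__getitem__)
    return class_names[best], float(probabilities[best])


def create_app(model_path):
    """Create a Flask app that classifies one uploaded image per request."""
    # Third-party imports are kept inside the factory so the helper above
    # can be used without Flask, TensorFlow, NumPy, or Pillow installed.
    import numpy as np
    import tensorflow as tf
    from flask import Flask, jsonify, request
    from PIL import Image

    app = Flask(__name__)
    model = tf.keras.models.load_model(model_path)  # e.g. a saved H5 file

    @app.route("/predict", methods=["POST"])  # hypothetical route name
    def predict():
        upload = request.files.get("image")   # hypothetical form field name
        if upload is None:
            return jsonify(error="no file uploaded under the 'image' field"), 400
        image = Image.open(io.BytesIO(upload.read())).convert("RGB").resize((224, 224))
        batch = np.asarray(image, dtype="float32")[None, ...] / 255.0
        probabilities = model.predict(batch)[0].tolist()
        label, confidence = top_prediction(probabilities)
        return jsonify(label=label, confidence=confidence)

    return app


# Example usage (the model path is hypothetical):
# create_app("model.h5").run(port=5000)
```

The preprocessing inside `predict` has to match whatever was applied during training; otherwise the model sees inputs in a range it was never trained on.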
- Install necessary libraries and dependencies.
- Ensure Python environment compatibility.
- Run `pip install -r requirements.txt` to install dependencies.
- Train and save your model using the provided dataset.
- Update `app.py` with the path to your trained model.
- Run the Flask application (`python app.py`) and navigate to `localhost:5000` in your browser.
- Access the application through the browser.
- Upload an image containing characters.
- Get predictions for the characters present in the image.
To download the H5 model file, click here.