🚀 Intelligent e-commerce product categorization using machine learning
- Overview
- Problem Statement
- Key Features
- Technology Stack
- Models Implemented
- Installation & Setup
- Usage Guide
- Project Structure
- Configuration
- Documentation
- Contributing
- Roadmap
- Contact
- Acknowledgments
This project trains machine learning models that classify product titles into their respective categories. Consistent categorization keeps product catalogs organized and makes products easier to find on e-commerce platforms.
Key Features:
- 🎯 Multi-level product categorization
- 🔍 Advanced text preprocessing
- 🤖 Multiple ML models (SVM, Random Forest, Decision Tree)
- 🌐 User-friendly web interface
- 📊 Real-time classification and performance visualization
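The "advanced text preprocessing" feature is not spelled out here, so the following is only a minimal sketch of what title cleaning commonly looks like: lowercasing, stripping punctuation, and dropping stop words. The function name `preprocess_title` and the small `STOP_WORDS` set are assumptions made for illustration. The project itself downloads NLTK's stop-word list, which is much larger.

```python
import re

# Assumption: a tiny illustrative stop-word set. The project downloads
# NLTK's English stop-word list, which would replace this in practice.
STOP_WORDS = {"a", "an", "and", "for", "in", "of", "the", "with"}


def preprocess_title(title: str) -> str:
    """Lowercase a product title, strip punctuation, and drop stop words."""
    text = title.lower()
    # Replace anything that is not a letter, digit, or whitespace with a space.
    text = re.sub(r"[^a-z0-9\s]", " ", text)
    # Splitting on whitespace also collapses the runs of spaces created above.
    tokens = [tok for tok in text.split() if tok not in STOP_WORDS]
    return " ".join(tokens)


if __name__ == "__main__":
    print(preprocess_title("The Apple iPhone 13 - 128GB, Blue (Unlocked)"))
```

Keeping digits matters for product titles, since tokens such as storage sizes or model numbers often help tell categories apart.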
Technology Stack:
- Backend: Python 3.x, Flask
- Machine Learning: scikit-learn, NLTK, pandas, numpy
- Data Visualization: matplotlib, seaborn
- Development Tools: Jupyter Notebook, Git, Virtual Environment (venv)
Models Implemented:
- Support Vector Machine (SVM)
- Random Forest Classifier
- Decision Tree Classifier
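As a rough illustration of how the three model families listed above could be trained on product titles with scikit-learn, the sketch below fits each one on TF-IDF features. Everything specific in it is an assumption: the toy titles and labels are invented for the example, `LinearSVC` stands in for whichever SVM variant the project uses, and the helper names `build_models` and `train_all` are hypothetical.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC
from sklearn.tree import DecisionTreeClassifier

# Hypothetical toy data, invented purely for illustration.
TITLES = [
    "wireless bluetooth headphones",
    "usb c charging cable",
    "noise cancelling earbuds",
    "mens cotton t shirt",
    "womens denim jacket",
    "kids winter hoodie",
]
LABELS = ["electronics", "electronics", "electronics",
          "clothing", "clothing", "clothing"]


def build_models():
    """Return fresh, unfitted estimators for the three model families."""
    return {
        "svm": LinearSVC(),  # assumption: a linear SVM
        "random_forest": RandomForestClassifier(n_estimators=50, random_state=0),
        "decision_tree": DecisionTreeClassifier(random_state=0),
    }


def train_all(titles, labels):
    """Fit one TF-IDF + classifier pipeline per model and return them by name."""
    fitted = {}
    for name, estimator in build_models().items():
        pipeline = make_pipeline(TfidfVectorizer(), estimator)
        pipeline.fit(titles, labels)
        fitted[name] = pipeline
    return fitted


if __name__ == "__main__":
    for name, pipeline in train_all(TITLES, LABELS).items():
        print(name, pipeline.predict(["bluetooth speaker cable"]))
```

Wrapping the vectorizer and classifier in one pipeline means raw title strings can be passed straight to `predict`, which avoids applying a mismatched vocabulary at inference time.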
Installation & Setup:
- Clone the repository:
    git clone https://github.com/muhammadhamzagova666/product-title-classification.git
    cd product-title-classification
- Create and activate a virtual environment:
    python3 -m venv venv
    source venv/bin/activate    # Linux/macOS
    .\venv\Scripts\activate     # Windows
- Install required packages:
    pip install -r requirements.txt
- Download NLTK data:
    python -m nltk.downloader stopwords
Usage Guide:
- Start the Flask application:
    python "Source Code/app.py"
  Note: Depending on your setup, the path to app.py may differ.
- Access the web interface:
  - Open your web browser.
  - Navigate to http://127.0.0.1:5000/.
- Upload and classify product titles:
  - Use the provided interface to upload product titles and descriptions for classification.
Product Title Classification/
├── Source Code/
│ ├── app.py # Main Flask application
│ ├── Utilities.py # Helper functions
│ ├── KNNImpute.py # KNN imputation implementation
│ ├── templates/ # HTML templates
│ ├── static/ # Static assets (CSS, images)
│ └── Svm_Models/ # Trained model files
├── data/ # Training and validation datasets (CSV files)
├── notebooks/ # Jupyter notebooks and model training scripts
└── other_files/ # Additional resources (e.g., labels.csv)
Note: Directory names may vary slightly between versions.
Create a .env file in the root directory with the following content:
FLASK_APP=app.py
FLASK_ENV=development
DEBUG=True
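How app.py consumes these values is not described here. As a hedged sketch only, the snippet below shows one way a `KEY=VALUE` file like the one above could be parsed with the Python standard library and how `DEBUG` could be turned into a boolean. The helpers `parse_env_text` and `as_bool` are hypothetical. In practice a package such as python-dotenv is a common choice for this job.

```python
import os


def parse_env_text(text: str) -> dict:
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def as_bool(value: str) -> bool:
    """Interpret common truthy spellings such as 'True', '1', or 'yes'."""
    return value.strip().lower() in {"1", "true", "yes", "on"}


if __name__ == "__main__":
    sample = "FLASK_APP=app.py\nFLASK_ENV=development\nDEBUG=True\n"
    settings = parse_env_text(sample)
    # A value already set in the real environment takes precedence here.
    debug = as_bool(os.environ.get("DEBUG", settings.get("DEBUG", "false")))
    print(settings["FLASK_APP"], debug)
```

Because values in a .env file are plain strings, converting `DEBUG` explicitly avoids the common mistake of treating the non-empty string "False" as true.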
For more detail on the project and its underlying models, see the Project Documentation directory.
Contributions are welcome! To contribute:
- Fork the repository.
- Create a feature branch:
    git checkout -b feature/YourFeature
- Commit your changes:
    git commit -m 'Add some feature'
- Push to the branch:
    git push origin feature/YourFeature
- Open a Pull Request.
Contributors:
- Muhammad Hamza Gova
- Muhammad Salar
- Talha Bilal
Roadmap:
- Add support for more languages
- Implement deep learning models
- Enhance API documentation
- Add batch processing capability
For any queries or further information, please contact:
- Muhammad Hamza Gova - @muhammadhamzagova666
Project Link: https://github.com/muhammadhamzagova666/product-title-classification
Acknowledgments:
- Many thanks to all the contributors for their support.
- Special thanks to the scikit-learn team for their excellent machine learning library.
- Appreciation goes to the Flask team for providing a robust web framework.