This project is a Spam Email Classifier that detects spam vs. non-spam (ham) messages using Natural Language Processing (NLP) and Machine Learning (ML).
- Uses Naïve Bayes Classifier for text classification
- Preprocesses text (removes stopwords, punctuation, and converts to lowercase)
- Uses TF-IDF Vectorization for feature extraction
- Achieves high accuracy (~97–99%) on the dataset
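The steps listed above can be sketched roughly as follows. This is a minimal illustration, not the contents of `spam_classifier.py`: the `preprocess` helper, the four training messages, and the use of scikit-learn's built-in English stopword list (rather than NLTK's) are all assumptions made for the example.

```python
import string

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline


def preprocess(text: str) -> str:
    """Lowercase the text and strip ASCII punctuation (hypothetical helper)."""
    return text.lower().translate(str.maketrans("", "", string.punctuation))


# Tiny made-up training set, for illustration only (not taken from spam.csv).
train_texts = [
    "WIN a FREE prize now!!! Click here",
    "Free cash offer, claim your prize today",
    "Are we still meeting for lunch tomorrow?",
    "Can you send me the report by Monday?",
]
train_labels = ["spam", "spam", "ham", "ham"]

# TfidfVectorizer runs the preprocessor, tokenizes, and drops English
# stopwords; MultinomialNB is then trained on the TF-IDF features.
model = make_pipeline(
    TfidfVectorizer(preprocessor=preprocess, stop_words="english"),
    MultinomialNB(),
)
model.fit(train_texts, train_labels)

predictions = model.predict(["Claim your free prize", "See you at lunch"])
print(list(predictions))
```

Wrapping the vectorizer and classifier in one pipeline means the same preprocessing is applied at training and prediction time, which avoids a common source of mismatched features.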
- The dataset used is `spam.csv`, which contains labeled spam and ham messages.
- Columns:
  - `Category`: Spam or Ham
  - `Message`: The actual text message
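As a rough sketch of how a file with the `Category` and `Message` columns might be loaded and given numeric labels: the two sample rows and the `is_spam` column name below are made up for illustration. With the real file, the `io.StringIO` object would be replaced by the path `"spam.csv"`.

```python
import io

import pandas as pd

# Two made-up rows in the column layout described above.
sample_csv = io.StringIO(
    "Category,Message\n"
    "ham,See you at lunch tomorrow\n"
    'spam,"WIN a FREE prize now, click here"\n'
)

df = pd.read_csv(sample_csv)

# Encode the label as 1 for spam and 0 for ham (hypothetical column name).
df["is_spam"] = (df["Category"].str.lower() == "spam").astype(int)

print(df[["Category", "is_spam"]])
```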
- Clone the repo:
git clone https://github.com/najeeb-08/Spam-Email-Classifier.git && cd Spam-Email-Classifier
- Install dependencies:
pip install pandas numpy scikit-learn nltk
- Run the script:
python spam_classifier.py
- Accuracy: ~97–99%
- Precision/Recall/F1-Score: High scores indicating good spam detection
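For readers unfamiliar with these metrics, the sketch below shows how precision, recall, and F1 are computed for the spam class. The function name and the five example labels are invented for illustration and are not results from this project. In practice, `sklearn.metrics.classification_report` gives the same figures.

```python
def precision_recall_f1(y_true, y_pred, positive="spam"):
    """Compute precision, recall, and F1 for one class from two label lists."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)  # spam caught
    fp = sum(t != positive and p == positive for t, p in pairs)  # ham flagged as spam
    fn = sum(t == positive and p != positive for t, p in pairs)  # spam missed

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if precision + recall
        else 0.0
    )
    return precision, recall, f1


# Made-up labels: 2 true positives, 1 false positive, 1 false negative.
y_true = ["spam", "spam", "ham", "ham", "spam"]
y_pred = ["spam", "ham", "ham", "spam", "spam"]

precision, recall, f1 = precision_recall_f1(y_true, y_pred)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

Spam datasets usually contain far more ham than spam, so accuracy alone can look high even when many spam messages are missed. Precision and recall expose that.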
This project is open-source under the MIT License.
🔹 Contributions are welcome! Feel free to fork and improve the model.
📩 Contact: najeeb-08