Spam Detector is a web-based application that allows users to train a machine learning model to detect spam messages.

adrien-dimitri/spam-detector

Spam Detector

Spam Detector is a web-based application that allows users to train a machine learning model to detect spam messages. It supports two feature extraction methods: Bag of Words (BoW) and Term Frequency-Inverse Document Frequency (TF-IDF).

General Info

This project is an SMS spam detector implemented as a Flask web application. It provides a user-friendly interface to train and evaluate a machine learning model for spam detection.

Deployment on PythonAnywhere

This application is deployed on PythonAnywhere, a cloud-based Python hosting service. You can access the live application at adriendimitri.pythonanywhere.com.

Data

The application uses the SMS Spam Collection dataset from the UC Irvine Machine Learning Repository.

The SMS Spam Collection v.1 (hereafter the corpus) is a set of tagged SMS messages collected for SMS spam research. It contains a single set of 5,574 English SMS messages, each classified as either ham (legitimate) or spam. Each line starts with the classification, followed by a space and then the SMS text itself.
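The line format described above can be read with a few lines of Python. This is a minimal sketch, not the project's own loader: the function name `parse_line` and the sample message are made up for illustration, and splitting on any whitespace is an assumption so that the same code also works if the label and text are separated by a tab.

```python
from typing import Tuple


def parse_line(line: str) -> Tuple[str, str]:
    """Split one corpus line into (label, message).

    The first whitespace-delimited word is the label; everything after
    it is the SMS text.
    """
    label, text = line.rstrip("\n").split(None, 1)
    if label not in ("ham", "spam"):
        raise ValueError(f"unexpected label: {label!r}")
    return label, text


# Invented sample line, not taken from the corpus.
sample = "spam WINNER!! You have been selected for a prize"
print(parse_line(sample))
```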

The SMS Spam Collection v.1 (text file: smsspamcollection) contains 4,827 legitimate SMS messages (86.6%) and 747 spam messages (13.4%).
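The class shares quoted above follow directly from the two counts. The short sketch below recomputes them; the helper name `class_shares` is hypothetical and only serves the illustration.

```python
from typing import Dict


def class_shares(counts: Dict[str, int]) -> Dict[str, float]:
    """Return each class's share of the total, as a percentage rounded to 0.1."""
    total = sum(counts.values())
    return {label: round(100 * n / total, 1) for label, n in counts.items()}


counts = {"ham": 4827, "spam": 747}
print(sum(counts.values()))   # total number of messages
print(class_shares(counts))   # percentage per class
```

Because the corpus is this imbalanced, a model that always answers "ham" would already be right about 86.6% of the time, so accuracy alone is a weak measure of a spam detector on this data.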

How it Works

For feature extraction, you may choose either Bag of Words (BoW) or Term Frequency-Inverse Document Frequency (TF-IDF). Both give good results.
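To make the difference between the two options concrete, here is a minimal pure-Python sketch. It is an illustration under stated assumptions, not the project's implementation: tokenization is a plain lowercase whitespace split, TF-IDF uses the textbook form tf × log(N / df) (libraries often apply smoothed variants), and the names `tokenize`, `bag_of_words`, and `tf_idf` are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, List


def tokenize(text: str) -> List[str]:
    """Lowercase and split on whitespace (a deliberately simple tokenizer)."""
    return text.lower().split()


def bag_of_words(docs: List[str]) -> List[Counter]:
    """BoW: one raw term-count mapping per document."""
    return [Counter(tokenize(doc)) for doc in docs]


def tf_idf(docs: List[str]) -> List[Dict[str, float]]:
    """TF-IDF: term frequency scaled by log(N / document frequency)."""
    counts = bag_of_words(docs)
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    df = Counter(term for doc_counts in counts for term in doc_counts)
    weights = []
    for doc_counts in counts:
        total = sum(doc_counts.values())
        weights.append({
            term: (n / total) * math.log(n_docs / df[term])
            for term, n in doc_counts.items()
        })
    return weights


# Invented example messages.
docs = ["free prize now", "call me now"]
print(bag_of_words(docs)[0])
print(tf_idf(docs)[0])
```

In this sketch BoW counts every word equally, while TF-IDF gives a weight of zero to a word such as "now" that appears in every document and a higher weight to words that are specific to one message.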

A Naive Bayes classifier is the core model. During training, it estimates and stores the parameters for the spam and ham classes from the previously extracted features.
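The idea of storing per-class parameters during training can be sketched as follows. The source does not say which Naive Bayes variant is used, so this example assumes a multinomial model over word counts with add-one (Laplace) smoothing; the class name `NaiveBayes` and the training messages are invented for illustration.

```python
import math
from collections import Counter, defaultdict
from typing import List, Tuple


class NaiveBayes:
    """Multinomial Naive Bayes with add-one smoothing (illustrative sketch)."""

    def fit(self, messages: List[Tuple[str, str]]) -> "NaiveBayes":
        # Parameters kept per class: message count and word counts.
        self.class_counts = Counter()
        self.word_counts = defaultdict(Counter)
        for label, text in messages:
            self.class_counts[label] += 1
            self.word_counts[label].update(text.lower().split())
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, text: str) -> str:
        total_docs = sum(self.class_counts.values())
        scores = {}
        for label, n_docs in self.class_counts.items():
            total_words = sum(self.word_counts[label].values())
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(n_docs / total_docs)
            for word in text.lower().split():
                if word not in self.vocab:
                    continue  # ignore words never seen in training
                count = self.word_counts[label][word]
                score += math.log((count + 1) / (total_words + len(self.vocab)))
            scores[label] = score
        return max(scores, key=scores.get)


# Invented toy training set.
train = [
    ("spam", "win a free prize now"),
    ("spam", "free cash prize"),
    ("ham", "are you coming home now"),
    ("ham", "see you at lunch"),
]
model = NaiveBayes().fit(train)
print(model.predict("free prize"))
print(model.predict("see you at lunch"))
```

Working in log space avoids numeric underflow when many small probabilities are multiplied, and the smoothing keeps a word that was seen in only one class from forcing the other class's probability to zero.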

Technologies

Python Version

Requirements

Setup

  1. Clone the repository.

  2. Navigate to the project directory:

    $ cd spam-detector

  3. Create a virtual environment (optional but recommended):

    $ python -m venv .venv

  4. Activate the virtual environment:

    • On Windows:
    $ .venv\Scripts\activate
    • On macOS and Linux:
    $ source .venv/bin/activate

  5. Install the required packages using the command below:

    $ pip install -r requirements.txt

  6. Run the Flask application:

    $ python run.py

Note: It's recommended to use a virtual environment to isolate the project dependencies. If you choose not to use a virtual environment, make sure to adapt the installation command (pip install -r requirements.txt) accordingly.
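If you are unsure whether the virtual environment is active before installing the requirements, a quick check from Python can help. This is a small sketch using only the standard library; the function name `in_virtualenv` is hypothetical, and the check applies to environments created with `python -m venv` as in the steps above.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the interpreter runs inside a venv-style environment.

    Inside a venv, sys.prefix points at the environment directory while
    sys.base_prefix still points at the base Python installation.
    """
    return sys.prefix != sys.base_prefix


print("virtual environment active:", in_virtualenv())
```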

Deployment

Once the Flask application is running, access it by navigating to http://localhost:5000 in your web browser.

Contributing

If you'd like to contribute to the development of the SMS Spam Detector, please follow the guidelines in CONTRIBUTING.md.
