Text Classification for Disaster Relief Matching

This project is a text classification system that categorizes disaster-related messages by the type of relief they call for. Given a message that mentions a disaster, the system tries to identify the types of response required. The classification model is a GradientBoostingClassifier trained on TF-IDF and message-statistic features.

Usage

1. Clone the Project and Install Requirements

git clone https://github.com/qiaochen/TextClsApp
cd TextClsApp
pip install -r requirements.txt

2. ETL process

cd data
python process_data.py disaster_messages.csv disaster_categories.csv DisasterResponse.db

This produces the database DisasterResponse.db. You may give the database a different name, but remember to update the app/run.py file and the training command in step 3 accordingly.
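To confirm that the ETL step wrote something, the database can be inspected from Python. The following is a minimal sketch using only the standard library; the helper name `table_row_counts` and the demo table are hypothetical, and nothing here assumes a particular table name inside DisasterResponse.db.

```python
import sqlite3


def table_row_counts(conn):
    """Map each user table in the database to its row count."""
    names = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        )
    ]
    # Table names cannot be bound as parameters, so quote them as identifiers.
    return {
        name: conn.execute(f'SELECT COUNT(*) FROM "{name}"').fetchone()[0]
        for name in names
    }


# Demo on an in-memory database; the table and its columns are made up.
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE messages (id INTEGER, message TEXT)")
demo.executemany(
    "INSERT INTO messages VALUES (?, ?)",
    [(1, "We need water"), (2, "The bridge is out")],
)
counts = table_row_counts(demo)
print(counts)
```

For the real file, pass `sqlite3.connect("DisasterResponse.db")` from the data directory. Note that `sqlite3.connect` creates an empty database when the path does not exist, so an empty result usually points to a wrong path.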

3. NLP and ML process

cd ../models
python train_classifier.py ../data/DisasterResponse.db classifier.pkl

Training may take a long time and ends by writing the trained model to classifier.pkl. You may give the model file a different name, but remember to update the app/run.py file accordingly.

4. Run the Flask Server

cd ../app
python run.py

This will start the web server.

In a new web browser window, enter the following URL:

http://0.0.0.0:3001/

If your browser cannot open that address, try http://localhost:3001/ instead.

You will see the front page of the web app. Type a sentence into the input bar and click the Classify Message button; the server will then show the classification result page.

Project Components

This project consists of three components:

1. ETL Pipeline

Implemented in process_data.py.

Loads the messages and categories datasets
Merges the two datasets for model training and testing
Cleans the data
Stores the cleaned data in a SQLite database
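The four steps above can be sketched with pandas and the standard sqlite3 module. This is an illustration under stated assumptions, not the contents of process_data.py: the column names (`id`, `message`, and the category columns), the table name `messages`, and the helper `run_etl` are all hypothetical, and the two CSV files are replaced by small in-memory stand-ins.

```python
import io
import sqlite3

import pandas as pd

# Tiny stand-ins for the two CSV files; the column names are assumptions.
MESSAGES_CSV = """id,message
1,We need water and food
2,Roads are blocked near the bridge
2,Roads are blocked near the bridge
"""
CATEGORIES_CSV = """id,water,food,transport
1,1,1,0
2,0,0,1
2,0,0,1
"""


def run_etl(messages_src, categories_src, conn, table="messages"):
    """Load both datasets, merge them on id, clean, and store in SQLite."""
    messages = pd.read_csv(messages_src)
    categories = pd.read_csv(categories_src)

    # Remove repeated ids first so the merge does not multiply rows.
    messages = messages.drop_duplicates(subset="id")
    categories = categories.drop_duplicates(subset="id")

    merged = messages.merge(categories, on="id", how="inner")
    merged = merged.dropna(subset=["message"])  # drop rows without text

    merged.to_sql(table, conn, index=False, if_exists="replace")
    return merged


conn = sqlite3.connect(":memory:")
cleaned = run_etl(io.StringIO(MESSAGES_CSV), io.StringIO(CATEGORIES_CSV), conn)
stored = pd.read_sql("SELECT * FROM messages", conn)
print(stored)
```

With real files, the two `io.StringIO` arguments would be replaced by the CSV paths and the connection by `sqlite3.connect("DisasterResponse.db")`.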

2. NLP and ML Pipeline

Implemented in train_classifier.py.

Loads data from the SQLite database
Splits the dataset into training and test sets
Builds a text processing pipeline
Builds a machine learning pipeline that incorporates feature extraction and transformation
Trains and tunes a model using GridSearchCV
Outputs results on the test set
Exports the final model as a pickle file
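The steps above can be sketched with scikit-learn as follows. This is a toy, single-label illustration rather than the code in train_classifier.py: the `MessageStats` transformer, the two statistics it computes, the parameter grid, and the example messages are all assumptions, and the real project may predict several categories per message.

```python
import pickle

import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import FeatureUnion, Pipeline


class MessageStats(BaseEstimator, TransformerMixin):
    """Hypothetical statistics: character count and word count per message."""

    def fit(self, X, y=None):
        return self

    def transform(self, X):
        return np.array([[len(t), len(t.split())] for t in X], dtype=float)


# Made-up messages standing in for the rows loaded from the database.
texts = [
    "we need clean drinking water", "no water in the village",
    "water supply is cut off", "please send bottled water",
    "our well water is dirty", "children have no water to drink",
    "we are hungry and need food", "no food left in the shelter",
    "please send rice and food", "food supplies ran out",
    "families need food urgently", "there is no food for the children",
]
labels = ["water"] * 6 + ["food"] * 6

X_train, X_test, y_train, y_test = train_test_split(
    texts, labels, test_size=0.25, stratify=labels, random_state=0
)

# TF-IDF and message statistics are computed side by side and concatenated.
pipeline = Pipeline([
    ("features", FeatureUnion([
        ("tfidf", TfidfVectorizer()),
        ("stats", MessageStats()),
    ])),
    ("clf", GradientBoostingClassifier(random_state=0)),
])

param_grid = {
    "features__tfidf__ngram_range": [(1, 1), (1, 2)],
    "clf__n_estimators": [10, 20],
}
search = GridSearchCV(pipeline, param_grid, cv=2)
search.fit(X_train, y_train)
print("best parameters:", search.best_params_)
print("test accuracy:", search.score(X_test, y_test))

# Export the tuned pipeline; here to bytes, in practice to a .pkl file.
blob = pickle.dumps(search.best_estimator_)
restored = pickle.loads(blob)
print(restored.predict(["we need drinking water"]))
```

One design consequence worth noting: pickle stores custom classes by reference, so any process that loads the model must be able to import the transformer class from the same module path it had at training time.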

3. Flask Web App

Visualizes the statistics of the dataset
Responds to user input and classifies the message
Displays the classification result
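The request flow of such an app can be sketched in a few lines of Flask. This is a minimal illustration, not the contents of app/run.py: the `/go` route, the `query` parameter, the inline templates, and the keyword-based `classify` stand-in (used here in place of the pickled model) are all assumptions.

```python
from flask import Flask, render_template_string, request

app = Flask(__name__)


def classify(message):
    """Stand-in for the trained model: a hypothetical keyword lookup."""
    keywords = {"water": "water", "food": "food", "road": "transport"}
    text = message.lower()
    return sorted({label for word, label in keywords.items() if word in text})


# Inline templates standing in for master.html and go.html.
MASTER = """<form action="/go" method="get">
<input type="text" name="query">
<button type="submit">Classify Message</button>
</form>"""
GO = """<p>Message: {{ query }}</p>
<ul>{% for label in labels %}<li>{{ label }}</li>{% endfor %}</ul>"""


@app.route("/")
def index():
    # Front page with the input bar.
    return render_template_string(MASTER)


@app.route("/go")
def go():
    # Result page: classify the submitted text and list the labels.
    query = request.args.get("query", "")
    return render_template_string(GO, query=query, labels=classify(query))


# To serve the sketch on the port used above:
#     app.run(host="0.0.0.0", port=3001)
```

Flask's test client exercises the routes without starting a server, for example `app.test_client().get("/go", query_string={"query": "We need water"})`.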

Folder Structure

- app
|- template
| |- master.html  # main page of web app
| |- go.html  # classification result page of web app
|- run.py  # Flask file that runs app

- data
|- disaster_categories.csv  # data to process 
|- disaster_messages.csv  # data to process
|- process_data.py        # program that processes data
|- DisasterResponse.db  # database to save clean data to

- models
|- train_classifier.py  # program that trains the classification model
|- feature_extractor.py # class and functions for nlp and extracting features
|- classifier.pkl  # saved model 

- README.md
