Sentiment Analysis using Support Vector Machine

This is an end-to-end implementation of sentiment analysis using a Support Vector Machine (SVM). It uses a self-made SVM classifier with tf-idf as the feature. For details of the SVM classifier and the source referred to, see: link

Any Pre-requisites?

YES!

CVXOPT: An optimization library used to solve the quadratic programming problem at the core of the classifier. It was not available for Windows when this was written, so you might want to switch to Ubuntu or another Debian-based distro.
Natural Language Toolkit (NLTK): Used for various operations such as stemming of words. (If you don't know what stemming is, look it up. Really cool stuff.)
NumPy

These libraries have to be installed beforehand.
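
As a quick check before running anything, the sketch below reports which of the three prerequisites cannot be imported. The helper name `missing_modules` is made up for this illustration and is not part of the repository; the import names `cvxopt`, `nltk` and `numpy` are the usual ones for these libraries.

```python
import importlib.util

# Import names of the three prerequisites listed above.
REQUIRED = ("cvxopt", "nltk", "numpy")


def missing_modules(names=REQUIRED):
    """Return the names that cannot be found on the current Python path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules()
    if missing:
        print("Missing libraries:", ", ".join(missing))
    else:
        print("All prerequisites are installed.")
```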

How to use?

svm2.py is the file you want to run.
It uses a .csv file as data. See training.csv for the expected data format.
The variable 'inp' specifies the data to use via its filename/location. Change it to point to the data you want.
stopwords.txt contains the stopwords, i.e. the uninformative words the classifier should ignore.
You can change split_ratio to set how the examples are divided between training and testing.
tf-idf is used as the feature vector. You can substitute any other feature representation.
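
For readers unfamiliar with tf-idf, here is a minimal sketch of one common variant (term frequency as count divided by document length, inverse document frequency as log(N / df)), with stopwords dropped first. It is not taken from svm2.py, which may weight or tokenize differently; the names `tokenize` and `tf_idf` are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text, stopwords):
    """Lowercase, keep runs of letters/apostrophes, and drop stopwords."""
    return [w for w in re.findall(r"[a-z']+", text.lower()) if w not in stopwords]


def tf_idf(documents, stopwords=frozenset()):
    """Return (vocabulary, vectors) with one tf-idf vector per document."""
    tokenized = [tokenize(doc, stopwords) for doc in documents]
    vocabulary = sorted({w for tokens in tokenized for w in tokens})
    n_docs = len(tokenized)
    # Document frequency: in how many documents each word appears.
    doc_freq = Counter(w for tokens in tokenized for w in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1  # avoid dividing by zero for empty documents
        vectors.append([
            (counts[w] / total) * math.log(n_docs / doc_freq[w])
            for w in vocabulary
        ])
    return vocabulary, vectors


vocab, vecs = tf_idf(
    ["the movie was great", "the movie was awful"],
    stopwords={"the", "was"},
)
```

In this toy input, "movie" appears in both documents and so gets a weight of zero, while "great" and "awful" carry all the signal. That is the property that makes tf-idf a reasonable default feature here.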

How to prepare the data?

The CSV file should consist of two columns. The first column should be the sentiment and the second column should be the data (strictly text).
The sentiments should be grouped: all examples of one sentiment should appear before all examples of the other, and the two should not be intermixed.
Only two sentiments are supported right now.
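
The three rules above can be checked before training. The sketch below is an illustration only: `load_grouped` and `split_per_sentiment` are hypothetical helpers, the file is assumed to have no header row, and the per-sentiment split is one possible reading of split_ratio rather than what svm2.py necessarily does.

```python
import csv
import io


def load_grouped(csv_file):
    """Read (sentiment, text) rows and check the layout described above."""
    rows = []
    for row in csv.reader(csv_file):
        if not row:
            continue  # skip blank lines
        if len(row) != 2:
            raise ValueError("expected 2 columns, found %d" % len(row))
        rows.append((row[0].strip(), row[1]))
    labels = [label for label, _ in rows]
    if len(set(labels)) != 2:
        raise ValueError("expected exactly two sentiments, found %d" % len(set(labels)))
    # Collapse runs of equal labels; a grouped file has exactly one run per sentiment.
    runs = [lab for i, lab in enumerate(labels) if i == 0 or lab != labels[i - 1]]
    if len(runs) != 2:
        raise ValueError("sentiments are intermixed; group each sentiment together")
    return rows


def split_per_sentiment(rows, split_ratio=0.8):
    """Put the first split_ratio share of each sentiment into the training set."""
    train, test = [], []
    for label in dict.fromkeys(lab for lab, _ in rows):  # labels in file order
        group = [row for row in rows if row[0] == label]
        cut = int(len(group) * split_ratio)
        train.extend(group[:cut])
        test.extend(group[cut:])
    return train, test


sample = io.StringIO("pos,great film\npos,loved it\nneg,terrible\nneg,boring\n")
rows = load_grouped(sample)
train, test = split_per_sentiment(rows, split_ratio=0.5)
```

Splitting within each sentiment keeps both classes represented in the training and testing sets, which a plain cut through a grouped file would not guarantee.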

How do I see the results?

In the function test_non_linear, you can use the numbers of correctly and incorrectly classified examples to calculate precision, recall, etc.
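
As a sketch of that calculation: precision and recall need the correct and incorrect counts broken down by class, so this assumes you tally true/false positives and negatives yourself inside or after test_non_linear. The function `binary_metrics` and the example counts are illustrative and do not come from the repository.

```python
def binary_metrics(true_pos, false_pos, true_neg, false_neg):
    """Compute accuracy, precision, recall and F1 from confusion counts."""
    total = true_pos + false_pos + true_neg + false_neg
    predicted_pos = true_pos + false_pos
    actual_pos = true_pos + false_neg
    # Each ratio falls back to 0.0 when its denominator is zero.
    accuracy = (true_pos + true_neg) / total if total else 0.0
    precision = true_pos / predicted_pos if predicted_pos else 0.0
    recall = true_pos / actual_pos if actual_pos else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Made-up counts, treating one of the two sentiments as the "positive" class.
metrics = binary_metrics(true_pos=40, false_pos=10, true_neg=45, false_neg=5)
```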

Where can I find data?

training.csv and trainingo.csv contain small amounts of data. However, inside the Data folder, training.csv contains a set of 2000 examples you can use.
