GitHub - y3nk0/Theano-CNN: A Theano CNN implementation for Text Classification in Python 3.5

Convolutional Neural Networks for Sentence Classification

Code for reproducing the results on all datasets from the paper Convolutional Neural Networks for Sentence Classification (EMNLP 2014).

Requirements

Code is written in Python (3.5) and requires Theano (1.0.2). We provide all datasets, and the implementation supports multi-class classification, as well as both cross-validation and pre-split datasets.
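For readers unfamiliar with the distinction: a pre-split dataset ships with fixed train/test partitions, while cross-validation rotates which part of a single dataset is held out. The sketch below illustrates the cross-validation case in plain Python. The function names, the choice of 10 folds, and the random fold assignment are assumptions for illustration and are not taken from this repository's code.

```python
import random


def assign_folds(n_examples, n_folds=10, seed=0):
    """Give every example a random fold index in [0, n_folds)."""
    rng = random.Random(seed)
    return [rng.randrange(n_folds) for _ in range(n_examples)]


def split_for_fold(examples, folds, test_fold):
    """Hold out the examples whose fold index equals test_fold."""
    train = [ex for ex, f in zip(examples, folds) if f != test_fold]
    test = [ex for ex, f in zip(examples, folds) if f == test_fold]
    return train, test


# Hypothetical usage: each fold serves once as the test set.
examples = list(range(100))
folds = assign_folds(len(examples))
for k in range(10):
    train, test = split_for_fold(examples, folds, k)
```

With random assignment the folds are only approximately equal in size, which is usually acceptable for datasets with thousands of sentences.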

Using the pre-trained word2vec vectors will also require downloading the binary file from https://code.google.com/p/word2vec/

Data Preprocessing

To process the raw data, run

python process_data.py path

where path points to the word2vec binary file (i.e. the GoogleNews-vectors-negative300.bin file). This will create a pickle object called mr.p in the same folder, which contains the dataset in the right format.
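To make the preprocessing step less opaque, here is a minimal sketch of how a word2vec binary file can be read using only the standard library. It assumes the usual word2vec binary layout: an ASCII header line with the vocabulary size and vector dimension, then for each word its bytes terminated by a space, followed by that many little-endian float32 values. The function name `load_word2vec_bin` and the optional `vocab` filter are hypothetical and not necessarily what process_data.py does.

```python
import struct


def load_word2vec_bin(path, vocab=None):
    """Read a word2vec binary file into a dict mapping word -> list of floats.

    If `vocab` is given, only words contained in it are kept.
    """
    vectors = {}
    with open(path, "rb") as f:
        # Header line: b"<vocab_size> <dim>\n"
        vocab_size, dim = map(int, f.readline().split())
        vector_bytes = 4 * dim  # each component is a float32
        for _ in range(vocab_size):
            # The word is a byte string terminated by a single space.
            chars = []
            while True:
                ch = f.read(1)
                if ch == b" " or ch == b"":
                    break
                if ch != b"\n":  # skip the newline that follows each vector
                    chars.append(ch)
            word = b"".join(chars).decode("utf-8", errors="ignore")
            raw = f.read(vector_bytes)
            if len(raw) < vector_bytes:
                break  # truncated file; stop instead of failing in unpack
            if vocab is None or word in vocab:
                vectors[word] = list(struct.unpack("<%df" % dim, raw))
    return vectors
```

Passing the dataset's vocabulary as `vocab` keeps memory use manageable, since the full GoogleNews file contains far more words than any of the datasets use.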

Running the models (CPU)

Example commands:

THEANO_FLAGS=mode=FAST_RUN,device=cpu,floatX=float32 python3.5 conv_net_sentence.py -nonstatic -rand
THEANO_FLAGS=mode=FAST_RUN,device=cpu,floatX=float32 python3.5 conv_net_sentence.py -static -word2vec
THEANO_FLAGS=mode=FAST_RUN,device=cpu,floatX=float32 python3.5 conv_net_sentence.py -nonstatic -word2vec

These commands run the CNN-rand, CNN-static, and CNN-nonstatic models from the paper, respectively.
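Per the paper, the three variants share one architecture and differ only in the word embeddings: randomly initialized or taken from word2vec (-rand vs. -word2vec), and kept fixed or updated during training (-static vs. -nonstatic). The shared feature-extraction step slides a filter over windows of consecutive word vectors and keeps the largest response (max-over-time pooling). The pure-Python sketch below illustrates that step for a single filter. The function name, the ReLU non-linearity, and the toy inputs are assumptions for illustration; it is not the repository's Theano code.

```python
def conv_max_over_time(sentence, filt, bias=0.0):
    """Apply one filter spanning h words to every window of h consecutive
    word vectors, apply ReLU, and return the maximum feature value.

    sentence: list of word vectors (each a list of floats), length >= h
    filt:     list of h weight rows, each the same length as a word vector
    """
    h = len(filt)
    if len(sentence) < h:
        raise ValueError("sentence is shorter than the filter height")
    features = []
    for i in range(len(sentence) - h + 1):
        window = sentence[i:i + h]
        # Element-wise product of filter and window, summed to one scalar.
        s = bias + sum(w * x
                       for f_row, x_row in zip(filt, window)
                       for w, x in zip(f_row, x_row))
        features.append(max(0.0, s))  # ReLU
    return max(features)  # max-over-time pooling


# Toy example: a 3-word sentence with 2-dimensional embeddings.
sentence = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
filt = [[1.0, 1.0], [1.0, 1.0]]  # filter height h = 2
feature = conv_max_over_time(sentence, filt)
```

A full model applies many such filters of several heights and feeds the pooled features to a softmax layer.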

Using the GPU

Running on a GPU gives roughly a 10x to 20x speed-up, so it is highly recommended. To use the GPU, change device=cpu to device=cuda0 (or whichever GPU device you are using). For example:

THEANO_FLAGS=mode=FAST_RUN,device=cuda0,floatX=float32 python3.5 conv_net_sentence.py -nonstatic -word2vec
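The same flags can also be set from Python rather than on the command line, which may be convenient in notebooks or wrapper scripts. The sketch below builds the flag string and places it in the environment; the helper name `theano_flags` is hypothetical. Theano reads THEANO_FLAGS when it is first imported, so the assignment has to happen before any `import theano`.

```python
import os


def theano_flags(device="cpu", mode="FAST_RUN", floatX="float32"):
    """Build the value of the THEANO_FLAGS environment variable."""
    return "mode={},device={},floatX={}".format(mode, device, floatX)


# Set the flags before the first `import theano` anywhere in the process.
os.environ["THEANO_FLAGS"] = theano_flags(device="cuda0")
```

Switching back to the CPU is then a matter of calling `theano_flags()` with its defaults in a fresh process.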

Other Implementations

TensorFlow

Denny Britz has an implementation of the model in TensorFlow:

https://github.com/dennybritz/cnn-text-classification-tf

He also wrote a nice tutorial on it, as well as a general tutorial on CNNs for NLP.

Torch

The HarvardNLP group has an implementation in Torch:

https://github.com/harvardnlp/sent-conv-torch

Hyperparameters

At the time of my original experiments I did not have access to a GPU, so I could not run many different experiments. Hence the paper is missing things like ablation studies and variance in performance, and some of the conclusions were premature (e.g. regularization does not always seem to help).

Ye Zhang has written a very nice paper doing an extensive analysis of model variants (e.g. filter widths, k-max pooling, word2vec vs. GloVe, etc.) and their effect on performance.
