Latent Keyphrase Inference (LAKI)

Publication

Jialu Liu, Xiang Ren, Jingbo Shang, Taylor Cassidy, Clare Voss and Jiawei Han, "Representing Documents via Latent Keyphrase Inference", Proc. of the 25th Int. Conf. on World Wide Web (WWW'16), Montreal, Canada, April 2016.

Notes

The current implementation requires SegPhrase to extract domain keyphrases. It has been added under this repository as a submodule.
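If the repository was cloned without its submodules, the SegPhrase directory will be empty and the build is likely to fail. The following is a minimal sketch for checking and fetching it. The helper names `segphrase_ready` and `ensure_segphrase` are hypothetical, and the sketch assumes the submodule lives in a directory named SegPhrase at the repository root.

```shell
# Succeed only if <root>/SegPhrase exists and contains at least one entry.
segphrase_ready() {
    [ -d "$1/SegPhrase" ] && [ -n "$(ls -A "$1/SegPhrase" 2>/dev/null)" ]
}

# Fetch the submodule with git only when the directory is missing or empty.
ensure_segphrase() {
    if segphrase_ready "$1"; then
        echo "SegPhrase already present"
    else
        git -C "$1" submodule update --init --recursive
    fi
}

# Usage, from the repository root:
#   ensure_segphrase .
```

Cloning with `git clone --recursive` in the first place should make this step unnecessary.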

Requirements

The commands below assume Ubuntu.

g++ 4.8

$ sudo apt-get install g++-4.8

python 2.7

$ sudo apt-get install python

scikit-learn

$ sudo apt-get install python-pip
$ sudo pip install scikit-learn

nltk

$ sudo pip install nltk

Build

Build LAKI from a terminal with the provided Makefile.

$ make

Default Run

$ ./train_dblp.sh  # train a LAKI model on the DBLP dataset
$ ./test/test_inference  # read a string query and return the top-ranked document keyphrases

Parameters

All parameters are set in train_dblp.sh.

INPUT=data/AMiner-Paper.txt

INPUT is the path to LAKI's input file. The default DBLP data can be downloaded from AMiner. To use another dataset, format it like the file indicated by RAW_TEXT (one document per line) and comment out lines 25-28 of train_dblp.sh.
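For a custom dataset, the one-document-per-line format can be produced with standard tools. The sketch below assumes a hypothetical layout in which each document is a separate .txt file in one directory; the function name `flatten_corpus` and all paths are illustrative and not part of LAKI. Check train_dblp.sh for how RAW_TEXT is set before substituting the result.

```shell
# Write each *.txt file in a directory as a single line of an output file.
# The output file should live outside the source directory (or not end in
# .txt) so that it is not picked up as an input document.
flatten_corpus() {
    src_dir="$1"
    out_file="$2"
    : > "$out_file"    # create or truncate the output file
    for f in "$src_dir"/*.txt; do
        [ -f "$f" ] || continue
        # Turn newlines, carriage returns and tabs into spaces,
        # squeeze repeated spaces, and trim both ends.
        line=$(tr '\r\n\t' '   ' < "$f" | tr -s ' ' | sed 's/^ //; s/ $//')
        # Skip documents that are empty after cleaning.
        if [ -n "$line" ]; then
            printf '%s\n' "$line" >> "$out_file"
        fi
    done
}

# Usage (hypothetical paths):
#   flatten_corpus my_corpus data/my_corpus.raw
```

Files are processed in the shell's glob order, so line N of the output corresponds to the N-th non-empty file name in sorted order.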

OMP_NUM_THREADS=4

Number of threads.

NUM_KEYPHRASES=40000

Number of domain keyphrases extracted by SegPhrase.

MIN_PHRASE_SUPPORT=10

Minimum number of occurrences in the corpus for a phrase to count as a valid domain keyphrase.
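When choosing this threshold for a new corpus, it can help to check how often a few expected keyphrases actually occur. The sketch below is a rough approximation only: it counts case-insensitive substring matches with grep, which is an assumption for illustration and not necessarily how SegPhrase computes support. The function name `phrase_support` is hypothetical.

```shell
# Print the number of case-insensitive occurrences of a fixed phrase in a file.
# $1: phrase (matched literally, not as a regular expression)
# $2: corpus file with one document per line
phrase_support() {
    grep -o -i -F -- "$1" "$2" | wc -l | tr -d ' '
}

# Usage (hypothetical file name):
#   phrase_support "data mining" data/my_corpus.raw
```

A phrase whose count falls below MIN_PHRASE_SUPPORT under this approximation is unlikely to survive extraction, so a small corpus may call for a lower value.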

For other parameters specific to an individual module, please check the corresponding cpp files.
