Structured Gradient Tree Boosting

Author: Yi Yang

Basic description

This is the Python implementation of the structured gradient tree boosting model for collective named entity disambiguation, described in

Yi Yang, Ozan Irsoy, and Kazi Shefaet Rahman 
"Collective Entity Disambiguation with Structured Gradient Tree Boosting"
NAACL 2018

[pdf]

BibTeX

@inproceedings{yang2018collective,
  title={Collective Entity Disambiguation with Structured Gradient Tree Boosting},
  author={Yang, Yi and Irsoy, Ozan and Rahman, Kazi Shefaet},
  booktitle={Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long Papers)},
  volume={1},
  pages={777--786},
  year={2018}
}

Data

The preprocessed AIDA-CoNLL data ('AIDA-PPR-processed.json') is available in the data folder.

The entity candidates are generated by the PPRforNED candidate generation system.
The model uses 19 local features: 3 prior features, 4 NER features, 2 entity popularity features, 4 entity type features, and 6 context features. See the paper for details.
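As a quick sanity check on the feature breakdown above, the group sizes can be written down and summed. This is only an illustrative sketch: the dictionary name and keys are hypothetical labels chosen here, not identifiers from the repository code.

```python
# Local feature group sizes as listed in this README.
# The keys are hypothetical labels, not names used by the repository.
LOCAL_FEATURE_GROUPS = {
    "prior": 3,
    "ner": 4,
    "entity_popularity": 2,
    "entity_type": 4,
    "context": 6,
}

# Total number of local features per mention-candidate pair.
NUM_LOCAL_FEATURES = sum(LOCAL_FEATURE_GROUPS.values())
```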

The model also uses entity-entity features, which can be computed quickly on the fly. For convenience, we provide pre-computed entity-entity features (3 features per entity pair) for the AIDA-CoNLL dataset, which are available in the data folder ('ent_ent_feats.txt.gz').
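Before training, it can help to peek at the two data files. The sketch below assumes it is run from the repository root, so that both files sit under 'data/'. It makes no assumption about the internal schema of either file: it only reports the top-level JSON container and prints the first few raw lines of the gzipped feature file. The helper names `summarize_json` and `head_gzip_text` are hypothetical and not part of the repository.

```python
import gzip
import json
from itertools import islice
from pathlib import Path


def summarize_json(path):
    """Load a JSON file and return (top-level type name, length or None)."""
    with open(path, encoding="utf-8") as f:
        data = json.load(f)
    size = len(data) if hasattr(data, "__len__") else None
    return type(data).__name__, size


def head_gzip_text(path, n=3):
    """Return the first n lines of a gzip-compressed text file."""
    with gzip.open(path, "rt", encoding="utf-8") as f:
        return [line.rstrip("\n") for line in islice(f, n)]


if __name__ == "__main__":
    data_dir = Path("data")  # assumption: current directory is the repo root
    json_path = data_dir / "AIDA-PPR-processed.json"
    feats_path = data_dir / "ent_ent_feats.txt.gz"

    if json_path.exists():
        print(summarize_json(json_path))
    if feats_path.exists():
        for line in head_gzip_text(feats_path):
            print(line)
```

Reading the gzipped file lazily with `islice` avoids decompressing the whole feature file just to inspect its layout.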

Reproduce results

You can reproduce the SGTB-BSG results by running:

python structured_learner.py --num-thread=16 --num-epoch=250

I got 95.32% accuracy on the test set. Training took 35 minutes with 16 threads.
