Refined Commonsense Knowledge from Large-Scale Web Contents

The pipeline is executed in the following order:

1. nlp_pipeline.pipeline
2. open_ie.open_ie
3. triple_filtering.filter
4. triple_grouping.group_per_c4_part, triple_grouping.group_all, triple_grouping.get_frequent_triples
5. triple_clustering.precompute_embeddings, triple_clustering.clustering
6. conceptnet_mapping.inference
7. ranking
8. final_filtering.final_filtering
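
The stage order above can be sketched as a small dry-run driver. This is a minimal sketch, not part of the repository: it assumes each stage can be launched with `python -m <module>` and no extra arguments, which may not hold (for example, `ranking` is listed only by package name, and individual scripts may expect options). The names `PIPELINE_STAGES` and `build_commands` are hypothetical.

```python
import sys

# Stage names copied from the list above, in execution order.
PIPELINE_STAGES = [
    "nlp_pipeline.pipeline",
    "open_ie.open_ie",
    "triple_filtering.filter",
    "triple_grouping.group_per_c4_part",
    "triple_grouping.group_all",
    "triple_grouping.get_frequent_triples",
    "triple_clustering.precompute_embeddings",
    "triple_clustering.clustering",
    "conceptnet_mapping.inference",
    "ranking",
    "final_filtering.final_filtering",
]


def build_commands(stages, python=sys.executable):
    """Return one `python -m <stage>` command per stage, preserving order.

    Assumption: every stage is runnable as a module without arguments.
    Check each script for the options it actually expects.
    """
    return [[python, "-m", stage] for stage in stages]


if __name__ == "__main__":
    # Dry run: print the commands instead of executing them.
    for command in build_commands(PIPELINE_STAGES):
        print(" ".join(command))
```

Printing the commands first makes it easy to review the order and adjust arguments before running anything on the full C4 data.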

Global configurations can be found in app_config.py.

The pipeline requires the following files:

Precomputed similarity scores between C4 documents and Wikipedia articles: https://nextcloud.mpi-inf.mpg.de/index.php/s/nJSSW5QBQR3XoxH (cf. triple_filtering/filter.py)
Subjects: https://nextcloud.mpi-inf.mpg.de/index.php/s/TiSm3rrJ9kEqfm8 (cf. triple_filtering/filter.py)
ConceptNet mapping train/dev files: https://nextcloud.mpi-inf.mpg.de/index.php/s/JeLRgsiNykcnRbs (cf. conceptnet_mapping/finetune.py)
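
Before starting a long run, it may help to confirm that the downloads listed above are in place. The following is a minimal sketch under stated assumptions: the local paths in `REQUIRED_FILES` are hypothetical placeholders, since the actual locations are determined by the repository's configuration and the referenced scripts, and `find_missing` is an illustrative helper, not a function from the repository.

```python
from pathlib import Path

# Hypothetical local paths; replace them with the locations that the
# repository's configuration and scripts actually expect.
REQUIRED_FILES = {
    "C4-Wikipedia similarity scores": "data/similarity_scores",
    "Subjects": "data/subjects",
    "ConceptNet mapping train/dev files": "data/conceptnet_mapping",
}


def find_missing(required, base_dir="."):
    """Return the labels of entries whose path does not exist under base_dir."""
    base = Path(base_dir)
    return [label for label, rel in required.items() if not (base / rel).exists()]


if __name__ == "__main__":
    missing = find_missing(REQUIRED_FILES)
    if missing:
        print("Missing inputs:", ", ".join(missing))
    else:
        print("All required inputs found.")
```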

If you use Ascent++, please cite the following paper:

@ARTICLE{ascentpp,
  author={Nguyen, Tuan-Phong and Razniewski, Simon and Romero, Julien and Weikum, Gerhard},
  journal={IEEE Transactions on Knowledge and Data Engineering}, 
  title={Refined Commonsense Knowledge from Large-Scale Web Contents}, 
  year={2022},
  doi={10.1109/TKDE.2022.3206505}
}

