The kgextension package lets you access and use Linked Open Data to augment existing datasets. It incorporates knowledge graph information into pandas DataFrames and can be used within scikit-learn pipelines.
Its functionality includes:
- Linking datasets to any Linked Open Data (LOD) source, such as DBpedia, Wikidata, or the EU Open Data Portal
- Generation of new features from the LOD sources
- Hierarchy-based feature selection algorithms
- Data integration of features from different sources
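To make the first two steps concrete, here is a minimal sketch of the general idea in plain pandas. It does not use kgextension's own API (see the package documentation for the actual functions); the `uri_lookup` mapping and the `kg_facts` table are hand-written stand-ins, assumed here for illustration, for what a linker and a SPARQL query against a source such as DBpedia would return.

```python
import pandas as pd

# Toy dataset: one row per city, identified by name only.
cities = pd.DataFrame({"city": ["Mannheim", "Berlin", "Paris"]})

# Step 1 (linking): attach a knowledge graph URI to each entity.
# Hypothetical hard-coded lookup; a real linker would resolve these.
uri_lookup = {
    "Mannheim": "http://dbpedia.org/resource/Mannheim",
    "Berlin": "http://dbpedia.org/resource/Berlin",
    "Paris": "http://dbpedia.org/resource/Paris",
}
cities["uri"] = cities["city"].map(uri_lookup)

# Step 2 (feature generation): join facts keyed by URI onto the dataset.
# Hand-written stand-in for the result of a query against the LOD source.
kg_facts = pd.DataFrame(
    {
        "uri": list(uri_lookup.values()),
        "country": ["Germany", "Germany", "France"],
        "is_capital": [False, True, True],
    }
)

# A left join keeps every original row, even if no facts were found for it.
augmented = cities.merge(kg_facts, on="uri", how="left")
print(augmented)
```

The resulting DataFrame keeps the original `city` column and gains `uri`, `country`, and `is_capital`, which can then be fed into feature selection or a downstream model.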
The project started in March 2020 as a Master's team project at the University of Mannheim.
The newest stable release can be found on the Python Package Index (PyPI).
pip install kgextension
Detailed documentation and usage instructions can be found in the kgextension documentation.
The contributors can be reached by email: [email protected].