This repository contains all information needed to reproduce the knowledge graph built for the PlantHub QueryBuilder.
It includes:
- (1) All datasets used to build the graph.
- (2) All Ontotext Refine projects and editing histories to reproduce the final datasets.
- (3) All SQL exports to create a database from the datasets.
- (4) All R2RML mappings to create an RDF graph from the database.
- (5) The finished knowledge graph, available at './graph/output.zip'.
For a more detailed description of the workflow, please refer to the publication "Semantic technologies for interdisciplinary research: A case study on improving data synthesis and integration in the biodiversity domain" at BTW2025.
The archive './graph/output.zip' contains the final knowledge graph as a .ttl file, which can be loaded into a triplestore of your choice.
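As a minimal sketch of the unpacking step, the following Python snippet extracts every .ttl file from the archive using only the standard library. The function name `extract_turtle_files` and the target directory are illustrative assumptions, not part of this repository; the name of the .ttl file inside the archive is not assumed.

```python
import zipfile
from pathlib import Path


def extract_turtle_files(archive_path, target_dir):
    """Extract every .ttl member of the archive and return the extracted paths.

    Hypothetical helper for illustration; not shipped with this repository.
    """
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    extracted = []
    with zipfile.ZipFile(archive_path) as archive:
        for name in archive.namelist():
            # Only Turtle files are relevant for the triplestore upload.
            if name.lower().endswith(".ttl"):
                extracted.append(Path(archive.extract(name, path=target)))
    return extracted
```

Calling `extract_turtle_files("./graph/output.zip", "./graph/unpacked")` would return the paths of the extracted files, which can then be imported through the bulk-load or import facility of your triplestore.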
Visitors to this repository can reproduce and use the contained data and mappings in a variety of ways. The simplest way of arriving at a finished RDF graph is to load the SQL exports into a database and then execute the R2RML mappings on that database. This results in a .ttl file that can be uploaded to a triplestore of your choice.
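The first step of this workflow, loading the SQL exports into a database, could look roughly like the sketch below. It rests on assumptions not stated in this README: that the exports are plain `CREATE TABLE`/`INSERT` scripts that SQLite accepts, and that they sit together in one directory as `*.sql` files. The function name and paths are hypothetical; for another database system, use that system's own client instead. Executing the R2RML mappings requires a separate R2RML processor and is not shown here.

```python
import sqlite3
from contextlib import closing
from pathlib import Path


def load_sql_exports(sql_dir, db_path):
    """Run every *.sql script in sql_dir against an SQLite database.

    Hypothetical helper; assumes the exports are SQLite-compatible scripts.
    Returns the sorted names of the tables present after loading.
    """
    with closing(sqlite3.connect(db_path)) as conn:
        for sql_file in sorted(Path(sql_dir).glob("*.sql")):
            # executescript runs all statements contained in one script.
            conn.executescript(sql_file.read_text(encoding="utf-8"))
        conn.commit()
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    return [row[0] for row in rows]
```

The returned table names give a quick check that every export was loaded before the mappings are run against the database.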
The SQL statements are exported from the Ontotext Refine projects containing the datasets uploaded to this repository. We publish the finished datasets and provide the project files that take the raw data to their processed state, so that all operations on the data can be viewed. We also publish the operation histories exported from the project files.
The data shared in this repository contains 100K entries from TRY, the plant trait database, and the first 100K entries from the naturgucker.de dataset hosted on GBIF (https://doi.org/10.15468/dl.e6ry9b).
In an update, we include a further subset of the naturgucker.de dataset, which contains plant occurrences in Germany from 01.01.2024 to 07.02.2025 (https://doi.org/10.15468/dl.x5cexq).
We enrich these datasets with additional information from Wikidata, the GBIF API, the IUCN API, and the OpenElevationAPI.
All sources are licensed under CC BY 4.0.