Prez4 Ready #36

Open
wants to merge 31 commits into
base: master
31 commits
bd3a352
Update index.json
Metaduck Sep 27, 2023
8cf35a2
Create CountryCode.ttl
Metaduck Sep 27, 2023
6028bcf
Update index.json
Metaduck Sep 28, 2023
6f21d38
Uploading the next round of general geology vocabs
Metaduck Sep 28, 2023
77259b1
Update index.json
Metaduck Sep 28, 2023
b85a52f
Delete vocabularies/ConfidenceLevelBoreholes.ttl
Metaduck Sep 28, 2023
fb6dfd9
Delete vocabularies/EntityTypeOrFeatureBorehole.ttl
Metaduck Sep 28, 2023
de45407
Delete vocabularies/QaStatusCode.ttl
Metaduck Sep 28, 2023
ebc5f55
Delete vocabularies/SampleType.ttl
Metaduck Sep 28, 2023
adea97f
Delete vocabularies/SamplingMethod.ttl
Metaduck Sep 28, 2023
5074664
Trying again to load the ttl
Metaduck Sep 28, 2023
04d5dec
Pin validator to v4.6
vedgell Oct 12, 2023
e611c45
Update README.md
Metaduck Nov 2, 2023
a4b2c48
all vocabs VocPub 4.7 valid
nicholascar Jan 4, 2024
59aac10
merge in PR #16
nicholascar Jan 4, 2024
70a997a
Merge remote-tracking branch 'ga/develop'
nicholascar Feb 21, 2024
a7df3f6
consistent namespaces; LongTurtle formatting
nicholascar Feb 21, 2024
c63fdc8
remove duplicate vocabs; fix validation errors; improve CC with altLa…
nicholascar Feb 21, 2024
f09572d
remove redundant dcterms:identifier
nicholascar Feb 21, 2024
3c33018
fix index.json
nicholascar Feb 21, 2024
332b8bc
minor file name change
nicholascar Feb 21, 2024
6b2e6b5
minor file name change
nicholascar Feb 21, 2024
59ebd1e
Pre4 ready
nicholascar Oct 16, 2024
e966297
fix catalogu IRI
nicholascar Oct 17, 2024
ca2e5de
fix SKOS link in ADOC
nicholascar Oct 17, 2024
abd5d50
label for Coral Reef
nicholascar Oct 25, 2024
ad0dec1
regenerated labels
nicholascar Dec 30, 2024
643b67a
manifest
nicholascar Dec 30, 2024
79fc13e
catalogue reformat
nicholascar Dec 30, 2024
35f59ac
typo
nicholascar Dec 30, 2024
f284f1f
move GA org metadata from vocabs to background/labels.ttl
nicholascar Jan 22, 2025
37 changes: 37 additions & 0 deletions README.adoc
@@ -0,0 +1,37 @@
= Geoscience Australia's Vocabularies

This repository contains the source files of Geoscience Australia (GA)'s public vocabularies.

All these vocabularies - from multiple sources, not just from GA itself - are presented as https://www.w3.org/TR/skos-reference/[Simple Knowledge Organization System (SKOS)] vocabularies and are delivered online at:

https://vocabs.ga.gov.au/

== License
Geoscience Australia's vocabularies in this repository are licensed under the https://creativecommons.org/licenses/by/4.0/[CC BY 4.0] licence. See the link:LICENSE[LICENSE file] for the deed.

Other vocabularies reproduced by GA may have other licensing arrangements. See the individual vocabulary files for details.


== Custodian
https://www.ga.gov.au[Geoscience Australia]'s Data Catalogue Team

== Contact
Manager Client Services +
Geoscience Australia +
[email protected] +
ph: 1800 800 173 +
Cnr Jerrabomberra Ave and Hindmarsh Dr GPO Box 378, Canberra, ACT, 2601, Australia


== Prez resources

This listing of the resources in this repository is used by the https://kurrawong.ai/products/prez/[Prez System] to display the vocabularies correctly.

|===
| Resource | Location | Notes

| Catalogue Definition | `catalogue.ttl` |
| Items | `./vocabularies/*.ttl` | Multiple ttl files
| Profile Definition | https://github.com/RDFLib/prez/blob/main/prez/reference_data/profiles/ogc_records_profile.ttl[Prez Records Profile] | Default Prez profile for Records API
| Context Resources | `_background/*.ttl` | Multiple labels files for ontologies, licenses & agents
|===
17 changes: 0 additions & 17 deletions README.md

This file was deleted.

483 changes: 483 additions & 0 deletions _background/labels.ttl

Large diffs are not rendered by default.

54 changes: 54 additions & 0 deletions catalogue.ttl
@@ -0,0 +1,54 @@
PREFIX dcat: <http://www.w3.org/ns/dcat#>
PREFIX dcterms: <http://purl.org/dc/terms/>
PREFIX schema: <https://schema.org/>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>

<http://pid.geoscience.gov.au/catalogue/ga-vocabs>
a dcat:Catalog ;
dcterms:hasPart
<https://pid.geoscience.gov.au/def/voc/ga/AssociationType> ,
<https://pid.geoscience.gov.au/def/voc/ga/BoreholeConstructionMaterial> ,
<https://pid.geoscience.gov.au/def/voc/ga/BoreholeConstructionType> ,
<https://pid.geoscience.gov.au/def/voc/ga/BoreholePurpose> ,
<https://pid.geoscience.gov.au/def/voc/ga/BoreholeStatus> ,
<https://pid.geoscience.gov.au/def/voc/ga/BoreholesSamplingMethod> ,
<https://pid.geoscience.gov.au/def/voc/ga/CDCS> ,
<https://pid.geoscience.gov.au/def/voc/ga/ConfidenceLevel> ,
<https://pid.geoscience.gov.au/def/voc/ga/ContactCharacter> ,
<https://pid.geoscience.gov.au/def/voc/ga/ContactType> ,
<https://pid.geoscience.gov.au/def/voc/ga/CountryCodes> ,
<https://pid.geoscience.gov.au/def/voc/ga/DateQualifier> ,
<https://pid.geoscience.gov.au/def/voc/ga/DirectionalSurveyAzimuth> ,
<https://pid.geoscience.gov.au/def/voc/ga/DirectionalSurveyClass> ,
<https://pid.geoscience.gov.au/def/voc/ga/DirectionalSurveyMethod> ,
<https://pid.geoscience.gov.au/def/voc/ga/DirectionalSurveyPathComputeMethod> ,
<https://pid.geoscience.gov.au/def/voc/ga/DirectionalSurveyRecordingMode> ,
<https://pid.geoscience.gov.au/def/voc/ga/DrillingMethods> ,
<https://pid.geoscience.gov.au/def/voc/ga/EntityTypeOrFeature> ,
<https://pid.geoscience.gov.au/def/voc/ga/FieldSitePurpose> ,
<https://pid.geoscience.gov.au/def/voc/ga/FieldSiteTypes> ,
<https://pid.geoscience.gov.au/def/voc/ga/GeologySampleType> ,
<https://pid.geoscience.gov.au/def/voc/ga/LandformTypes> ,
<https://pid.geoscience.gov.au/def/voc/ga/Legislation> ,
<https://pid.geoscience.gov.au/def/voc/ga/LocationMethod> ,
<https://pid.geoscience.gov.au/def/voc/ga/ModesOfOccurence> ,
<https://pid.geoscience.gov.au/def/voc/ga/OnlineFunctions> ,
<https://pid.geoscience.gov.au/def/voc/ga/PetrophysicalProperty> ,
<https://pid.geoscience.gov.au/def/voc/ga/ProportionTerms> ,
<https://pid.geoscience.gov.au/def/voc/ga/QaStatusCode> ,
<https://pid.geoscience.gov.au/def/voc/ga/SampleMaterialClass> ,
<https://pid.geoscience.gov.au/def/voc/ga/SourceRockQuality> ,
<https://pid.geoscience.gov.au/def/voc/ga/StatisticalResultQualifier> ,
<https://pid.geoscience.gov.au/def/voc/ga/StatisticalUncertaintyTypes> ,
<http://qudt.org/community/ga/voc> ;
skos:historyNote "This catalogue was created in 2024 from pre-existing vocabularies"@en ;
schema:codeRepository "https://github.com/GeoscienceAustralia/ga-vocabs" ;
schema:contributor <https://kurrawong.ai> ;
schema:creator <https://linked.data.gov.au/org/ga> ;
schema:dateCreated "2017"^^xsd:gYear ;
schema:dateModified "2024-12-30"^^xsd:date ;
schema:description "Geoscience Australia's vocabularies of controlled terms" ;
schema:name "GA Vocabularies" ;
schema:publisher <https://linked.data.gov.au/org/ga> ;
.
Binary file not shown.
Binary file added excel/0.4.3/Borehole_Material_Boreholes.xlsx
Binary file not shown.
Binary file added excel/0.4.3/Borehole_Purpose_Boreholes.xlsx
Binary file not shown.
Binary file added excel/0.4.3/Borehole_status_Boreholes.xlsx
Binary file not shown.
Binary file added excel/0.4.3/Confidence_Level_Boreholes.xlsx
Binary file not shown.
Binary file added excel/0.4.3/Date_Qualifiers_Boreholes.xlsx
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file added excel/0.4.3/Drilling_Methods_Boreholes.xlsx
Binary file not shown.
Binary file not shown.
Binary file added excel/0.4.3/Legislation_Boreholes.xlsx
Binary file not shown.
Binary file added excel/0.4.3/LocationMethodBoreholes.xlsx
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file added excel/0.4.3/QA_status_code_Boreholes.xlsx
Binary file not shown.
Binary file added excel/0.4.3/Sample_Type_Boreholes.xlsx
Binary file not shown.
Binary file added excel/0.4.3/Sampling_Methods_boreholes.xlsx
Binary file not shown.
Binary file added excel/0.4.3/Source_Rock_Quality_Boreholes.xlsx
Binary file not shown.
Binary file not shown.
Binary file not shown.
33 changes: 33 additions & 0 deletions manifest.ttl
@@ -0,0 +1,33 @@
PREFIX mrr: <https://prez.dev/ManifestResourceRoles/>
PREFIX prez: <https://prez.dev/>
PREFIX prof: <http://www.w3.org/ns/dx/prof/>
PREFIX schema: <https://schema.org/>

[]
a prez:Manifest ;
prof:hasResource
[
prof:hasArtifact "catalogue.ttl" ;
prof:hasRole mrr:CatalogueData ;
schema:description "The definition of, and metadata for, the container which here is a dcat:Catalog object" ;
schema:name "Catalogue Definition"
] ,
[
prof:hasArtifact "vocabularies/*.ttl" ;
prof:hasRole mrr:ResourceData ;
schema:description "skos:ConceptScheme objects in RDF (Turtle) files in the vocabularies/ folder" ;
schema:name "Resource Data"
] ,
[
prof:hasArtifact "https://github.com/RDFLib/prez/blob/main/prez/reference_data/profiles/ogc_records_profile.ttl" ;
prof:hasRole mrr:CatalogueAndResourceModel ;
schema:description "The default Prez profile for Records API" ;
schema:name "Profile Definition"
] ,
[
prof:hasArtifact "_background/labels.ttl" ;
prof:hasRole mrr:CompleteCatalogueAndResourceLabels ;
schema:description "An RDF file containing all the labels for the container content" ;
schema:name "Labels" ;
] ;
.
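The `prof:hasArtifact` values in this manifest mix single files, a glob pattern and a remote URL. The sketch below shows one way a consumer might resolve the local ones, assuming they are interpreted as glob patterns relative to the manifest's folder. That interpretation is an assumption for illustration; how Prez itself resolves artifacts is not shown in this PR. The function name `resolve_artifact` and the `https://example.org/profile.ttl` URL are hypothetical.

```python
import tempfile
from pathlib import Path


def resolve_artifact(manifest_dir: Path, artifact: str) -> list[str]:
    """Resolve one prof:hasArtifact string to a sorted list of locations.

    Assumption: http(s) values are returned untouched and everything else is
    treated as a glob pattern relative to the manifest's folder.
    """
    if artifact.startswith(("http://", "https://")):
        return [artifact]
    return sorted(
        p.relative_to(manifest_dir).as_posix() for p in manifest_dir.glob(artifact)
    )


# Build a throwaway repository layout to demonstrate the resolution
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "vocabularies").mkdir()
    (root / "_background").mkdir()
    for name in (
        "catalogue.ttl",
        "vocabularies/a.ttl",
        "vocabularies/b.ttl",
        "_background/labels.ttl",
    ):
        (root / name).write_text("# placeholder\n")

    resolved = {
        artifact: resolve_artifact(root, artifact)
        for artifact in (
            "catalogue.ttl",
            "vocabularies/*.ttl",
            "_background/labels.ttl",
            "https://example.org/profile.ttl",
        )
    }

for artifact, locations in resolved.items():
    print(artifact, "->", locations)
```

A pattern that resolves to an empty list is a useful signal that the manifest and the folder layout have drifted apart.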
18 changes: 18 additions & 0 deletions scripts/add_change_notes.py
@@ -0,0 +1,18 @@
from pathlib import Path
from rdflib import Graph
from rdflib.namespace import RDF, SKOS
from rdflib import Literal


THIS_FILE = Path(__file__).resolve()
ITEMS_DIR = THIS_FILE.parent.parent / "vocabularies"

for f in ITEMS_DIR.glob("*.ttl"):
    print()
    print(f)
    g = Graph().parse(f)
    for cs in g.subjects(RDF.type, SKOS.ConceptScheme):
        print(cs)
        g.add((cs, SKOS.changeNote, Literal("2024-10-16 NJC: update to be VocPub 4.10 valid", lang="en")))

    g.serialize(destination=f, format="longturtle")
23 changes: 23 additions & 0 deletions scripts/add_org_rdf.py
@@ -0,0 +1,23 @@
import glob
from rdflib import Graph
from rdflib.namespace import SDO

for f in glob.glob("/Users/nick/Work/ga/ga-vocabs/vocabularies/*.ttl"):
    g = Graph().parse(f)
    addition = """
PREFIX sdo: <https://schema.org/>
PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>


<https://linked.data.gov.au/org/ga>
    a sdo:Organization ;
    sdo:description "Geoscience Australia is Australia's pre-eminent public sector geoscience organisation. It is the nation's trusted advisor on the geology and geography of Australia. It applies science and technology to describe and understand the Earth for the benefit of Australia."@en ;
    sdo:name "Geoscience Australia" ;
    sdo:url "https://www.ga.gov.au"^^xsd:anyURI ;
.
"""
    g += Graph().parse(data=addition, format="turtle")
    g.bind("sdo", SDO)
    g.serialize(destination=f, format="longturtle")

    print(f"Done {f}")
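The glob in this script is tied to one developer's machine. The other scripts in this PR derive the vocabularies folder from the script's own location, and the same idea applied here might look like the following sketch. `vocab_files` is a hypothetical helper; the layout (a `scripts/` folder beside `vocabularies/`) is taken from this repository.

```python
import tempfile
from pathlib import Path


def vocab_files(script_path: Path) -> list[Path]:
    """List the .ttl files in the vocabularies/ folder beside scripts/.

    Mirrors the THIS_FILE.parent.parent / "vocabularies" pattern used by the
    other scripts, so no absolute, machine-specific path is needed.
    """
    items_dir = script_path.resolve().parent.parent / "vocabularies"
    return sorted(items_dir.glob("*.ttl"))


# Demonstration against a temporary copy of the layout; in the real script
# the call would be vocab_files(Path(__file__)).
with tempfile.TemporaryDirectory() as tmp:
    repo = Path(tmp)
    (repo / "scripts").mkdir()
    (repo / "vocabularies").mkdir()
    (repo / "vocabularies" / "CountryCodes.ttl").write_text("# placeholder\n")
    (repo / "vocabularies" / "notes.txt").write_text("not a vocabulary\n")
    found = [p.name for p in vocab_files(repo / "scripts" / "add_org_rdf.py")]

print(found)
```

With a helper like this the loop would run unchanged on any checkout of the repository.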
15 changes: 15 additions & 0 deletions scripts/process_043_excel.sh
@@ -0,0 +1,15 @@

# /Users/nick/Library/Caches/pypoetry/virtualenvs/vocexcel-XgwzAsEM-py3.11/bin/python /Users/nick/Work/rdflib/VocExcel/vocexcel/__main__.py Borehole_Construction_Type_Boreholes.xlsx > Borehole_Construction_Type_Boreholes.ttl
# echo "Borehole_Construction_Type_Boreholes"


FILES=/Users/nick/Work/ga/ga-vocabs/excel/0.4.3/*.xlsx

for f in $FILES
do
  echo "Processing $f"
  fttl="${f%.*}.ttl"

  # quote the paths so file names containing spaces are passed through intact
  /Users/nick/Library/Caches/pypoetry/virtualenvs/vocexcel-XgwzAsEM-py3.11/bin/python /Users/nick/Work/rdflib/VocExcel/vocexcel/__main__.py "$f" > "$fttl"
  echo "Completed $fttl"
done
2 changes: 0 additions & 2 deletions scripts/requirements.txt

This file was deleted.

14 changes: 14 additions & 0 deletions scripts/tidy.py
@@ -0,0 +1,14 @@
from pathlib import Path
from rdflib import Graph

THIS_FILE = Path(__file__).resolve()
ITEMS_DIR = THIS_FILE.parent.parent / "vocabularies"

all_ttl_files = Path(ITEMS_DIR).glob("*.ttl")

for f in ITEMS_DIR.glob("**/*.ttl"):
    print(f"Tidying {f}")
    g = Graph().parse(f)
    open(f, "w").write(g.serialize(format="longturtle"))

print("done")
39 changes: 20 additions & 19 deletions scripts/validate_vocabs.py
@@ -1,33 +1,34 @@
 from pathlib import Path
 from pyshacl import validate
-import httpx
+from rdflib import Graph

 WARNINGS_INVALID = False  # Allows warnings to flag as invalid when true
 SHOW_WARNINGS = True
+THIS_FILE = Path(__file__).resolve()
+ITEMS_DIR = THIS_FILE.parent.parent / "vocabularies"

-def main():
-    # get the validator
-    r = httpx.get("https://w3id.org/profile/vocpub/validator/4.6", follow_redirects=True)
-    assert r.status_code == 200
+SHACL_GRAPH = Graph().parse("vocpub-4.10.ttl")

+CONTEXT_GRAPH = Graph().parse(THIS_FILE.parent.parent / "_background" / "agents.ttl")
+
+def main():
     # for all vocabs...
     warning_vocabs = {}  # format {vocab_filename: warning_msg}
     invalid_vocabs = {}  # format {vocab_filename: error_msg}
-    vocabs_dir = Path(__file__).parent.parent / "vocabularies"
-    for f in vocabs_dir.glob("**/*"):
+    for f in ITEMS_DIR.glob("*.ttl"):
         # ...validate each file
-        if f.name.endswith(".ttl"):
-            try:
-                v = validate(str(f), shacl_graph=r.text, shacl_graph_format="ttl")
-                if not v[0]:
-                    if "Severity: sh:Violation" in v[2]:
-                        invalid_vocabs[f.name] = v[2]
-                    elif "Severity: sh:Warning" in v[2]:
-                        warning_vocabs[f.name] = v[2]
-
-            # syntax errors crash the validate() function
-            except Exception as e:
-                invalid_vocabs[f.name] = e
+        try:
+            DATA_GRAPH = Graph().parse(f) + CONTEXT_GRAPH
+            v = validate(DATA_GRAPH, shacl_graph=SHACL_GRAPH, allow_warnings=True)
+            if not v[0]:
+                if "Severity: sh:Violation" in v[2]:
+                    invalid_vocabs[f.name] = v[2]
+                elif "Severity: sh:Warning" in v[2]:
+                    warning_vocabs[f.name] = v[2]
+
+        # syntax errors crash the validate() function
+        except Exception as e:
+            invalid_vocabs[f.name] = e

     # check to see if we have any invalid vocabs
     if len(warning_vocabs.keys()) > 0 and SHOW_WARNINGS: