| 1 | +Writing a new profile |
| 2 | +===================== |
| 3 | + |
| 4 | +This page is about writing a SHACL validation profile for a new or |
| 5 | +existing RO-Crate profile. It does *not* offer guidance on creating the |
| 6 | +RO-Crate profile itself - for that, see the |
| 7 | +`RO-Crate page on Profiles <https://www.researchobject.org/ro-crate/profiles#making-an-ro-crate-profile>`_. |
| 8 | + |
| 9 | +Learning SHACL |
| 10 | +-------------- |
| 11 | + |
| 12 | +The validator profiles are written in SHACL (Shapes Constraint Language), a |
| 13 | +language for validating RDF graphs against a set of conditions. |
| 14 | +To use SHACL effectively, you also need some familiarity with RDF |
| 15 | +(Resource Description Framework), the technology which underpins |
| 16 | +JSON-LD and therefore RO-Crate. |
| 17 | + |
| 18 | +For an RDF introduction, try the `RDF 1.1 Primer <https://www.w3.org/TR/rdf11-primer/>`_ or |
| 19 | +`Introduction to the Principles of Linked Open Data <https://programminghistorian.org/en/lessons/intro-to-linked-data>`_. |
| 20 | + |
| 21 | +This `chapter on SHACL <https://book.validatingrdf.com/bookHtml011.html>`_ |
| 22 | +from the book `Validating RDF Data <https://book.validatingrdf.com>`_ |
| 23 | +has examples of most of SHACL's features and is a good place |
| 24 | +to start learning. Other chapters in that book may provide an understanding |
| 25 | +of *why* SHACL is our language of choice for this purpose. |
| 26 | + |
| 27 | +For complex validation, you may also need some knowledge of SPARQL, an RDF |
| 28 | +query language. You can learn about SPARQL in the tutorial |
| 29 | +`Using SPARQL to access Linked Open Data <https://programminghistorian.org/en/lessons/retired/graph-databases-and-SPARQL>`_. |
| 30 | + |
| 31 | +All these tools are best learned through practice and examples, so when building a |
| 32 | +profile, we encourage you to use the
| 33 | +`other profiles <https://github.com/crs4/rocrate-validator/tree/develop/rocrate_validator/profiles>`_ |
| 34 | +as a point of reference. |
| 35 | + |
| 36 | +Setting up profile files and tests |
| 37 | +---------------------------------- |
| 38 | + |
| 39 | +These instructions assume you are familiar with code development using Python and Git. |
| 40 | + |
| 41 | +#. `Install the repository from source <https://rocrate-validator.readthedocs.io/en/latest/1_installation/#installation>`_. |
| 42 | +#. From the root folder of the repo, create a folder for the profile under |
| 43 | + `rocrate_validator/profiles <https://github.com/crs4/rocrate-validator/tree/develop/rocrate_validator/profiles>`_. |
| 44 | +#. To set up the profile metadata, copy across ``profile.ttl`` from another |
| 45 | + profile folder to the folder you created |
| 46 | + (`example <https://github.com/crs4/rocrate-validator/blob/develop/rocrate_validator/profiles/workflow-ro-crate/profile.ttl>`_) |
| 47 | + and update that metadata to reflect your profile. In particular:
| 48 | + |
| 49 | + #. Change the token for the profile to a new and unique name, e.g.
| 50 | + ``prof:hasToken "workflow-ro-crate-linkml"``. This is the name
| 51 | + used to select the profile with the ``--profile-identifier``
| 52 | + argument (and should also be the name of the folder).
| 53 | + #. Ensure the URI of the profile is unique (the first line after the |
| 54 | + ``@prefix`` statements), to prevent conflation between this profile |
| 55 | + and any other profile in the package. |
| 56 | + #. If this profile inherits from another profile in the validator |
| 57 | + (including the base specification), set ``prof:isProfileOf`` / |
| 58 | + ``prof:isTransitiveProfileOf`` to that profile's URI (which can be found |
| 59 | + in that profile's own ``profile.ttl``). |
| 60 | + |
| 61 | +#. Create a ``profile-name.ttl`` file in the folder you created - this is |
| 62 | + where you will write the SHACL for the validation. If you have a lot of |
| 63 | + checks to write, you can create multiple files - the validator will |
| 64 | + collect them all automatically at runtime. |
| 65 | + |
| 66 | + * Note: some profiles split the checks into folders called ``must/``, |
| 67 | + ``should/`` and ``may/`` according to the requirement severity. This |
| 68 | + is not mandatory - you can also label individual checks/shapes with |
| 69 | + ``sh:severity`` in the SHACL code instead. |
| 70 | + |
| 71 | +#. From the root folder of the repo, create a test folder for the profile |
| 72 | + under |
| 73 | + `tests/integration/profiles <https://github.com/crs4/rocrate-validator/tree/develop/tests/integration/profiles>`_. The name should match the folder you made earlier. |
| 74 | +#. Copy the style of other profiles' tests to build up a test suite for the |
| 75 | + profile. Add any required RO-Crate test data under |
| 76 | + `tests/data/crates/ <https://github.com/crs4/rocrate-validator/tree/develop/tests/data/crates>`_ |
| 77 | + and create corresponding classes in |
| 78 | + `tests/ro_crates.py <https://github.com/crs4/rocrate-validator/blob/develop/tests/ro_crates.py>`_ |
| 79 | + which can be used to fetch the data during the tests. |
| 80 | +#. When your profile & tests are written, open a pull request to contribute |
| 81 | + it back to the repository! |
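
The folder layout described in the steps above can be sketched as a small script. This is only an illustration and not part of the validator: ``scaffold_profile`` is a hypothetical helper, it assumes ``repo_root`` is the root folder of the repo, that the metadata is copied from the ``workflow-ro-crate`` profile (the linked example), and that the SHACL file is named after the profile. The copied ``profile.ttl`` still has to be edited by hand as described in the sub-steps.

```python
import shutil
from pathlib import Path


def scaffold_profile(
    repo_root: Path, name: str, template: str = "workflow-ro-crate"
) -> list[Path]:
    """Create the profile and test folders for a new profile (hypothetical helper)."""
    profiles_dir = repo_root / "rocrate_validator" / "profiles"
    profile_dir = profiles_dir / name
    test_dir = repo_root / "tests" / "integration" / "profiles" / name

    # Fail rather than overwrite an existing profile folder.
    profile_dir.mkdir(parents=True, exist_ok=False)
    test_dir.mkdir(parents=True, exist_ok=True)

    # Copy the metadata from an existing profile; it must be edited afterwards
    # (token, profile URI, prof:isProfileOf).
    metadata_file = profile_dir / "profile.ttl"
    shutil.copy(profiles_dir / template / "profile.ttl", metadata_file)

    # Empty file that will hold the SHACL shapes (file name is an assumption).
    shapes_file = profile_dir / f"{name}.ttl"
    shapes_file.touch()

    return [metadata_file, shapes_file, test_dir]
```

For example, calling ``scaffold_profile(Path("."), "my-new-profile")`` from the root of the repo would create ``rocrate_validator/profiles/my-new-profile/`` and ``tests/integration/profiles/my-new-profile/``.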
| 82 | + |
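
Because the profile token has to be unique, it can be worth checking for clashes before opening a pull request. The sketch below is a rough, hypothetical check: ``find_duplicate_tokens`` is not a validator function, it assumes each profile folder contains a ``profile.ttl`` directly inside it, and it uses a plain text match on ``prof:hasToken "..."`` rather than a real Turtle parser, so it may miss tokens written in other syntactic forms.

```python
import re
from collections import defaultdict
from pathlib import Path

# Matches declarations such as: prof:hasToken "workflow-ro-crate"
TOKEN_RE = re.compile(r'prof:hasToken\s+"([^"]+)"')


def find_duplicate_tokens(profiles_dir: Path) -> dict[str, list[Path]]:
    """Map each token declared in more than one profile.ttl to those files."""
    seen: dict[str, list[Path]] = defaultdict(list)
    for ttl in sorted(profiles_dir.glob("*/profile.ttl")):
        for token in TOKEN_RE.findall(ttl.read_text(encoding="utf-8")):
            seen[token].append(ttl)
    # Keep only the tokens that appear in more than one file.
    return {token: files for token, files in seen.items() if len(files) > 1}
```

An empty result from ``find_duplicate_tokens(Path("rocrate_validator/profiles"))`` would suggest that no two profiles share a token, within the limits of the text match.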
| 83 | +Running validator & tests during profile development |
| 84 | +---------------------------------------------------- |
| 85 | + |
| 86 | +To run the test suite, run ``pytest``. Tests for the new profile should be picked up
| 87 | +automatically.
| 88 | + |
| 89 | +When running the validator manually, use ``--profile-identifier`` to select the desired profile. |
| 90 | + |
| 91 | +The crates in ``tests/data/crates`` can be used as sample input for the validator. For example: ::
| 92 | + |
| 93 | + rocrate-validator validate --profile-identifier your-profile-name tests/data/crates/invalid/1_wroc_crate/no_mainentity/ |
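
If you want to run the same command repeatedly during development (for instance over several test crates), it can be assembled in Python. This is a minimal sketch: ``validator_command`` is a hypothetical helper that only builds the argument list for the command shown above; it does not run the validator and makes no assumption about its output or exit codes.

```python
import shlex


def validator_command(profile: str, crate_path: str) -> list[str]:
    """Build the argument list for the validate command shown above."""
    return [
        "rocrate-validator",
        "validate",
        "--profile-identifier",
        profile,
        crate_path,
    ]


cmd = validator_command(
    "your-profile-name",
    "tests/data/crates/invalid/1_wroc_crate/no_mainentity/",
)
# Print the command as it would be typed in a shell.
print(shlex.join(cmd))

# To actually run it (requires the validator to be installed):
#   import subprocess
#   subprocess.run(cmd, check=False)
```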