diff --git a/docs/untp-data-model/README.md b/docs/untp-data-model/README.md new file mode 100644 index 00000000..742f4990 --- /dev/null +++ b/docs/untp-data-model/README.md @@ -0,0 +1 @@ +Currently the release process doesn't transparently identify the changes between UNTP versions or the reasons for those changes. Until we get a chance to update the process to include this, the files in this directory serve to make this transparent for our future selves working on the data model. diff --git a/docs/untp-data-model/changes-v0.6.0-alpha1.md b/docs/untp-data-model/changes-v0.6.0-alpha1.md new file mode 100644 index 00000000..d0570548 --- /dev/null +++ b/docs/untp-data-model/changes-v0.6.0-alpha1.md @@ -0,0 +1,269 @@ +# Changes proposed for the v0.6.0-alpha1 UNTP data model + +These changes are the result of various inputs, mostly from Patrick St-Louis, regarding using the v0.5.0 UNTP data model. + +Each section identifies issues with that model that result in invalid jsonld, or in valid docs that are deemed invalid by the json schema, or in credentials that are deemed valid by our JSON Schema but invalid by the Verifiable Credentials schema, or that otherwise need improvement. Each section also identifies the proposed solution that will be included for 0.6.0 once reviewed and agreed upon. + +These changes are in addition to the requested jargon changes for the schema output which we've worked with Alastair to ensure are present, recorded in this [schema diff example](https://github.com/absoludity/tests-untp/pull/1/files). + +Current jargon-dependent issues: +- [Terms used indirectly from an imported domain are not exported in the generated context](https://github.com/jargon-sh/issues/issues/21) - fixing this will fix the missing property definitions below that lead to dropped fields. 
+- [[jargon.reference] support for object with id](https://github.com/jargon-sh/issues/issues/23) - adding this support will enable us to have a *valid* doc (according to the generated schema) when using rich references. +- [Enable control over JSON-Schema fields published for a model in different contexts](https://github.com/jargon-sh/issues/issues/20) - enabling support for this will allow us to control which extra fields should be in the JSON Schema for a reference. + +## Digital Conformity Credential + +### Required: Removing the redefinition of `issuedToParty` + +This was identified by Patrick on (TODO find reference, originally slack but ash created issue). + +The problem comes down to the fact that `ConformityAttestation` inherits from `untp.Attestation` but tries to redefine the `issuedToParty` property to use the DCC-specific, cut-down version of `Party` rather than `untp.Party`. This breaks json-ld parsers due to the redefinition of a protected term. + +I've suggested removing the re-definition of `ConformityAttestation.issuedToParty` by simplifying `ConformityAttestation` to just inherit and define the extra fields: +``` +ConformityAttestation:untp.Attestation + scope:ConformityAssessmentScheme + assessment:ConformityAssessment[] +``` +The result is that the `ConformityAttestation.issuedToParty` property is then a normal `untp.Party`, which does not mean you need to set all the properties of `Party`, though without any extra info, the JSON-Schema generated by Jargon may currently include all the other fields (with the required `name`?). I've [asked about the possibility of customising the JSON-Schema for references to types](https://github.com/jargon-sh/issues/issues/20), but if that's not possible within Jargon, we may (later) consider doing so outside of Jargon (post-processing perhaps). 
Right now, we can use the `[jargon.reference]=true` option on the `untp.Attestation.issuedToParty` (or possibly redefine the one property identically, but with this extra key-value pair? Try both) so that only the id is required for now, until the above issue 20 has a resolution or we post-process. + +Note that we could have also just changed the type of `issuedToParty` to be `untp.Party` but it's neater to not redefine things unnecessarily (though again, depending on the outcome of [20](https://github.com/jargon-sh/issues/issues/20) we may want to re-define it identically to associate custom key-values). + + +### Optional: Removing the `dcc.Party` with the unset fields + +I've also suggested removing the re-definition of `dcc:Party` in the dcc jargon defs, which re-defines the untp-core `Party` class but with fields removed. The intent was to simplify the JSON-Schema *presentation* to devs, but the effect is to redefine terms (ie. as it stands, it would not be possible for a future credential to import the untp-core context as well as the dcc context). On [jargon issue 20](https://github.com/jargon-sh/issues/issues/20) I have captured Steve's intent: to be able to reference a type on the model (such as `ConformityAttestation.issuedToParty: untp.Party`) without the JSON-Schema including all the details that are present on the `untp.Party`, only those that are relevant in that specific context. This applies to the other `Party` references on the DCC model as well. + + +### Optional: Simplifying `dcc.Product` and `dcc.Facility` with the unset fields + +Similar to `dcc.Party` above, the `dcc.Product` and `dcc.Facility` classes redefine their untp-core counterparts, but in this case, add an extra field `IDverifiedByCAB`. 
For the same reason as above, I've simplified these classes so that they *only* inherit and add the field, no longer unsetting a bunch of fields, and will defer the intent to keep references simple to [jargon issue 20](https://github.com/jargon-sh/issues/issues/20). + +These are slightly odd in that they are references to a `Product` or `Facility` but also add extra information, so it's more than a rich reference. But nothing json-ld can't handle (the extra term will be merged). + + +### Required: `issuingParty` should be a `Party` not an `Identifier` in the schema + +This isn't a jsonld error, but a semantic error which caught Patrick. You can't currently validate a document which sets a party for the `ConformityAssessmentScheme.issuingParty` field because it expects an `Identifier` and a `Party` is not an `Identifier`. + +`ConformityAssessmentScheme` is a `Standard`, and `Standard.issuingParty` is an `Identifier`. I've discussed this on [Issue 201](https://github.com/uncefact/tests-untp/issues/201) and am suggesting: +- the untp-core `Standard.issuingParty` should be a `Party` rather than an `Identifier`. +- similarly, the `Facility.operatedByParty` and `Product.producedByParty` should be `Party` rather than `Identifier`. +- similarly, the `Product.producedAtFacility` should be a `Facility` +Note that this will initially include the whole model in the schema for these fields, but we will update the schema for rich references with [Jargon issue 20](https://github.com/jargon-sh/issues/issues/20). + +Steve agrees with the above changes for `issuingParty`. These are all changes to untp-core, not DCC, since the `issuingParty` property is inherited from `untp-core.Standard`. + +Where things get trickier is with three other references in untp-core for `Identifier`: `CredentialIssuer.otherIdentifier`, `Party.otherIdentifier` and `Facility.otherIdentifier`. 
According to Steve, these are "also known as" type fields, so the `Party.otherIdentifier` is meant to reference one or more `Party`'s by which the `Party` is also known, but Steve has concerns about recursive references. Similarly, a `CredentialIssuer`'s `$id` is a DID, representing an identity of some sort, where the "also known as" can only be referring to a `Party` (not a `Facility` or `Product`), while `Facility.otherIdentifier` should refer to a `Facility`. + +`otherIdentifier` can just be a list of ids (references) or richer references. For now, to avoid any recursive references (which can be handled as [outlined on the JSON-Schema blog](https://json-schema.org/blog/posts/modelling-inheritance#adding-a-recursive-reference)) I'll set these to be `[jargon.reference]=true` for `Party.partyAlsoKnownAs` and `Facility.facilityAlsoKnownAs` so that they are simple string arrays, until we have a [solution for rich references in jargon](https://github.com/jargon-sh/issues/issues/20) (this is not necessary for `CredentialIssuer.issuerAlsoKnownAs` since there's no recursive reference). +- TODO: Check if they're used in the in-situ presentation, but even if so, we'll need to wait for [jargon issue 20](https://github.com/jargon-sh/issues/issues/20) or otherwise post-process for a rich-reference validation, or not use a reference and have a recursive reference for now. + +Note that all properties in the domain with the same name must all be of the same type, so rather than `otherIdentifier`, I've switched to `partyAlsoKnownAs`, `facilityAlsoKnownAs` and `issuerAlsoKnownAs` based on a related comment from Steve. 
+ +### Required: Some terms used in the DCC are not defined in our imported context + +[JSON-LD has the requirement](https://www.w3.org/TR/json-ld11/#node-objects) that + +> All keys which are not [IRIs](https://tools.ietf.org/html/rfc3987#section-2), [compact IRIs](https://www.w3.org/TR/json-ld11/#dfn-compact-iri), [terms](https://www.w3.org/TR/json-ld11/#dfn-term) valid in the [active context](https://www.w3.org/TR/json-ld11/#dfn-active-context), or one of the following [keywords](https://www.w3.org/TR/json-ld11/#dfn-keyword) (or alias of such a keyword) _MUST_ be ignored when processed + +Currently, if I run our same test DCC 0.5.0 document through the `jsonld lint`, I see "Invalid Property" warnings with "Dropping property that did not expand into an absolute IRI or keyword" for the following terms: +- `addressCountry` +- `addressLocality` +- `addressRegion` +- `postalCode` +- `streetAddress` +- `geoBoundary` +- `geoLocation` +- `plusCode` +- `file` [1] +- `fileName` [1] +- `fileType` [1] + +This is because the fields, for example, of Address, are defined in our **untp-core jsonld**[2] as follows: +```json +"streetAddress": { + "@id": "untp-core:streetAddress", + "@type": "xsd:string" +}, +``` + +The issue is that **this property is not defined in the DCC sample credential doc because it is not part of the dcc jsonld vocabulary**. + +Currently, jargon re-exports any untp-core classes (with their properties) that we directly reference in a domain, such as `untp.Facility` and all its properties such as `Address`, but it is not exporting properties of those properties, such as `Address.streetAddress` and so they are missing from the context. I have reported this on jargon as [Terms used indirectly from an imported domain are not exported in the generated context](https://github.com/jargon-sh/issues/issues/21). 
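To make the dropped-term behaviour above concrete, here is a minimal sketch of the check, assuming a flat set of term names stands in for the generated context. It is not a JSON-LD processor (it ignores scoped contexts, `@vocab` and real IRI parsing), and the function name `undefined_terms`, the sample context and the sample facility are all hypothetical.

```python
# Simplified illustration only: a real JSON-LD processor also handles scoped
# contexts, @vocab, proper IRI parsing and keyword aliases.
KEYWORDS = {"@context", "@id", "@type", "@value", "@graph", "@list", "@set"}


def undefined_terms(node, context_terms):
    """Return keys (recursively) that are not keywords, defined terms or IRI-like."""
    dropped = set()
    if isinstance(node, dict):
        for key, value in node.items():
            is_iri_like = ":" in key  # crude stand-in for IRIs and compact IRIs
            if key not in KEYWORDS and key not in context_terms and not is_iri_like:
                dropped.add(key)
            dropped |= undefined_terms(value, context_terms)
    elif isinstance(node, list):
        for item in node:
            dropped |= undefined_terms(item, context_terms)
    return dropped


# Hypothetical context: `address` is exported, but its nested properties are not.
context_terms = {"id", "type", "name", "address"}
facility = {
    "type": ["Facility"],
    "id": "https://example.com/facility/1",
    "name": "Example facility",
    "address": {"streetAddress": "1 Example St", "addressCountry": "AU"},
}
print(sorted(undefined_terms(facility, context_terms)))
```

With this sample data the nested `streetAddress` and `addressCountry` keys are reported, which mirrors the situation where a class is exported but the properties of its properties are not.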
+ +As I understand it, the `untp-core` domain is meant to be used like a library of re-usable parts and was not intended to be imported in an LD document, which is why jargon currently includes any untp-core models used in a credential model, such as the DCC, in the generated linked data. This simplifies the credentials themselves too, as they only need to import the VC context and the DCC context (for example). + +We could alternatively switch to require credentials to import the VC, untp-core and the specific DCC context, but that would require changes on jargon too (since we wouldn't want to re-define all the untp-core models in the DCC context). So for now, I'm waiting to hear back on [the above issue](https://github.com/jargon-sh/issues/issues/21). + +This also answers another question I had: +- How do the redefinitions of certain classes, but with fields removed, not cause jsonld redefinition errors? Answer: because we're not importing the core context at all, so the redefinition is the only definition. + +[1] Some of the fields, such as the `file` ones marked with [1], are actually redefined in the dcc context, so it's unclear to me why the jsonld lint tool is saying they weren't resolved. + +[2] Steve did try originally to link fields to schema.org, but because schema.org defines many properties (eg. [Country](https://schema.org/Country)) the schema becomes pretty ridiculous, which is why we have our own definitions. We could possibly switch to use those that make sense, like `streetAddress` (just text on schema.org too). + +### Optional: `@id` URLs required on all models + +Mentioned by Patrick on [issue 184](https://github.com/uncefact/tests-untp/issues/184): +> > Required ID fields +> +> * The test suite expects all objects to have an id field. In our case, we simply do not have these values for all objects. For example, we do not have a url for our permits. We do however, have a url for our governance documents and legal acts. 
I wouldn't make id required for all fields, otherwise there needs to be guidance for implementers that do not have values for these fields. + +Note that we don't need to specify URLs for every required id; a URI is OK. That is, it does not need to be a URL that points to an online resource (even though that's preferred). So for example, for a permit, the id `permit:com.example.permits.12345` will validate fine (or any other URI format that people use to uniquely identify a node) and can be used to reference the node from an external document. If external referencing isn't required for a node, a blank node identifier (such as `_:permit1`) can be used to define a [blank node](https://www.w3.org/TR/json-ld11/#identifying-blank-nodes) (a blank node is designed to allow referencing internally within the document only, when external referencing is not required). + +That said, JSON-LD doesn't require all nodes to be identified with an `@id`, and indeed we don't require them for things like `Address`, `Location`, `Metric` etc. as they aren't nodes we need to reference. Looking through the defs, all the nodes with a required `@id` are things that should be identifiable (like permits) even if they're not dereferenceable, so I think we just need to update the docs to make it clear that the required `@id` just needs to be a unique URI, that is, a uniform resource identifier (so `permit:com.example.permits.12345` is fine), not necessarily a URL, that is, a uniform resource locator (such as `https://permits.example.com/12345`). + +Verified with Patrick and he is OK with these being required as long as we ensure the docs have clear instructions for creating them. + +### Required: DateTime format with regex + +Mentioned by Patrick on [issue 184](https://github.com/uncefact/tests-untp/issues/184): +> > Inconsistent date values +> +> * The VCDM has a specific datetime format. Dates can be complicated, I would suggest to align with this format instead of requiring a different format for other dates within the DCC. 
+ +The date format used by VCDM requires a timezone, whereas the `datetime` format of JSON-Schema does not. We can manually add the following required (regex) pattern to the `validUntil` etc. fields (taken directly from the [VCDM Schema](https://github.com/w3c/vc-data-model/blob/bbebf31de4feed0a182a857490c807cc6885acff/schema/verifiable-credential/verifiable-credential-schema.json#L229)): + +``` + "pattern": "-?([1-9][0-9]{3,}|0[0-9]{3})-(0[1-9]|1[0-2])-(0[1-9]|[12][0-9]|3[01])T(([01][0-9]|2[0-3]):[0-5][0-9]:[0-5][0-9](\\.[0-9]+)?|(24:00:00(\\.0+)?))(Z|(\\+|-)((0[0-9]|1[0-3]):[0-5][0-9]|14:00))" +``` + +[Jargon supports specifying the pattern](https://docs.jargon.sh/#/pages/data_definitions?id=jargon-recognised-key-value-pairs) for a property using `[jargon.pattern] = ^...$` + +We should do this at least for the VC properties that we use, as otherwise we may validate a DPP, but it fails as a VC due to the missing timezone (for example). + + +## UNTP Core + +The changes to untp-core are mainly the result of the DCC changes above which required core changes. Currently this is: +- `VerifiableCredential.validFrom` and `validUntil` - set patterns as defined for each credential (should get imported to each from core). +- The `Identifier` relationship with Facility/Product/Party (IS-A or not) related to `Standard.issuingParty`, `operatedByParty`, `producedByParty` and `otherIdentifiers`. See above in the DCC section. +- In addition to the use of Identifier from core in DCC, there's also `Regulation.administeredBy`, and `Endorsement.issuingAuthority` which have both been switched to `Party` (without the `[jargon.reference]=true` for now, since there's no recursion risk). +- With those changes, `Identifier` is no longer used anywhere and is therefore removed. 
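As a quick sanity check of the `validFrom` / `validUntil` pattern discussed above, the sketch below applies the VCDM regex in Python. The pattern is copied from the snippet above with the JSON escaping (`\\.`) undone, and `fullmatch` is used as a stand-in for the `^...$` anchors. The helper name `is_vcdm_datetime` and the sample values are hypothetical, and this only approximates what a JSON Schema validator would do with the `pattern` keyword.

```python
import re

# Pattern copied from the VCDM schema, with the JSON escaping (\\.) undone.
VCDM_DATETIME = re.compile(
    r"-?([1-9][0-9]{3,}|0[0-9]{3})-(0[1-9]|1[0-2])-(0[1-9]|[12][0-9]|3[01])"
    r"T(([01][0-9]|2[0-3]):[0-5][0-9]:[0-5][0-9](\.[0-9]+)?|(24:00:00(\.0+)?))"
    r"(Z|(\+|-)((0[0-9]|1[0-3]):[0-5][0-9]|14:00))"
)


def is_vcdm_datetime(value: str) -> bool:
    """True if the whole string matches the VCDM date-time pattern."""
    return VCDM_DATETIME.fullmatch(value) is not None


# Values with a timezone pass; the same instant without one does not.
for sample in (
    "2024-05-01T12:00:00Z",
    "2024-05-01T12:00:00+10:00",
    "2024-05-01T12:00:00",
):
    print(sample, is_vcdm_datetime(sample))
```

The third sample is the case this change is meant to catch: a value with no timezone that would then fail as a VC.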
+ + +## Digital Product Passport + +### Deferred: many terms used in the DPP are not defined in our imported context + +[JSON-LD has the requirement](https://www.w3.org/TR/json-ld11/#node-objects) that + +> All keys which are not [IRIs](https://tools.ietf.org/html/rfc3987#section-2), [compact IRIs](https://www.w3.org/TR/json-ld11/#dfn-compact-iri), [terms](https://www.w3.org/TR/json-ld11/#dfn-term) valid in the [active context](https://www.w3.org/TR/json-ld11/#dfn-active-context), or one of the following [keywords](https://www.w3.org/TR/json-ld11/#dfn-keyword) (or alias of such a keyword) _MUST_ be ignored when processed + +Currently, if I run our test example DPP 0.5.0 document through the `jsonld lint`, I see "Invalid Property" warnings with "Dropping property that did not expand into an absolute IRI or keyword" for the following terms: +- `materialCircularityIndicator` +- `recyclableContent` +- `recycledContent` +- `recyclingInformation` +- `repairInformation` +- `utilityFactor` +- `encryptionMethod` +- `hashDigest` +- `hashMethod` +- `linkName` +- `linkType` +- `linkURL` +- `accuracy` [2] +- `metricName` [2] +- `metricValue` [2] +- `score` [2] +- `accuracy` [2] +- `height` +- `length` +- `volume` +- `weight` +- `width` +- `carbonFootprint` +- `declaredUnit` +- `operationalScope` +- `primarySourcedRatio` +- `reportingStandard` +- `hazardous` +- `massAmount` +- `massFraction` +- `materialSafetyInformation` +- `materialType` +- `originCountry` +- `recycledAmount` +- `symbol` + +See the discussion in the Digital Conformity Credential section regarding this issue and the possible jargon fix vs us possibly importing untp-core. + +I can confirm these fields are missing from the rdf/graph by dropping a valid 0.5.0 DPP example credential into the jsonld playground and viewing the flattened data / graph. 
The `metricName` and `metricValue` properties are present (as they are defined in the dcc context) but none of the others from the doc which match, such as `recycledAmount`, `recyclableContent` or `recycledContent`, are present in the rdf/graph. + +[2] Some of the fields, such as the `metric` ones marked with [2], are actually redefined in the dcc context, so it's unclear to me why the jsonld lint tool is saying they weren't resolved. + + +### Required: DateTime format with regex + +See the same section for Digital Conformity Credential above. + + +### Required: Removing class redefinitions + +The `dpp.Product` class inherits but unnecessarily redefines all inherited fields; the proposal is to revert to a simple inherit that defines only the additional field. + +The `dpp.Claim` class inherits from `untp.Declaration` and redefines all the existing fields but, apparently by mistake, defines the core property `declaredValue` as `declaredValues`. As a result, the model has *both* the inherited `declaredValue` and the newly defined `declaredValues`. I have left the existing field redefinitions as they are used to display the linked imported classes (which disappear if you don't re-define the fields and add the metadata), but have updated the definition of the `declaredValues` property to match the core `declaredValue`. + + +## Digital Facility Record + + +### Deferred: many terms used in the DFR are not defined in our imported context + +See above explanation for the Digital Conformity Credential's similar issue and planned resolution. + +Dropped properties: +- `addressCountry` +- `addressLocality` +- `addressRegion` +- `postalCode` +- `streetAddress` +- `encryptionMethod` +- `hashDigest` +- `hashMethod` +- `linkName` +- `linkType` +- `linkURL` +- `geoBoundary` +- `geoLocation` +- `plusCode` + +See the discussion in the Digital Conformity Credential section regarding this issue and the possible jargon fix vs us possibly importing untp-core. 
+ +### Required: DateTime format with regex + +See the same section for Digital Conformity Credential above. + + +## Digital Traceability Events + + +### Optional: many terms used in the DTE are not defined in our imported context + +See above explanation for the Digital Conformity Credential's similar issue and planned resolution. + +Dropped properties: +- `productId` +- `productName` + +These are obviously pretty serious properties to be dropped from the rdf/graph. In this case, the issue is not related to missing untp property definitions, but rather because we're creating two new terms and omitting their definitions (due to inheriting the term types). + +The jargon DTE model defines a `QuantityElement` with these two terms imported from the untp-core properties `productId:untp.Product.id` and `productName:untp.Product.name` respectively. Given that these imported property types have `[jsonld.contextOmit]=true` set, the `productId` and `productName` properties here inherit that key-value and are omitted from the context. I've tried overriding this key-value on the DTE property to be false. + + +### Required: DateTime format with regex + +See the same section for Digital Conformity Credential above. + + +## Digital Identity Anchor + +There were no other issues with jsonld lint, so the datetime regex pattern is almost the only change. + +The exception: with the removal of `untp.Identity` (which was meant to represent either a Party, Facility or Product), the `DigitalIdentityAnchor.RegisteredIdentity` needs to be a `Party` directly. + +### Required: DateTime format with regex + +See the same section for Digital Conformity Credential above. 
diff --git a/docs/untp-data-model/changes-v0.6.0-alpha2.md b/docs/untp-data-model/changes-v0.6.0-alpha2.md new file mode 100644 index 00000000..fabdc77d --- /dev/null +++ b/docs/untp-data-model/changes-v0.6.0-alpha2.md @@ -0,0 +1,48 @@ + +These changes are the changes from v0.6.0-alpha1 to v0.6.0-alpha2, since the [jargon issue 20 supporting rich references](https://github.com/jargon-sh/issues/issues/20) was implemented, allowing models to reference `Party` or `Facility` by `id` while including extra fields in the reference and JSON Schema. + +The following two issues remain and although they result in failed validation of 0.6.0-alpha2 credentials (due to fields being dropped during json-ld processing), they are of less importance as they do not change the data structures: +- [Terms used indirectly from an imported domain are not exported in the generated context](https://github.com/jargon-sh/issues/issues/21) - fixing this will fix the main missing property definitions that lead to dropped fields. +- [Imported field attributes can't be overwritten](https://github.com/jargon-sh/issues/issues/27) - this will fix 2 missing property definitions. + + +## Digital Conformity Credential + +Updated +- `assessedProduct`, +- `assessedFacility`, +- `assessedOrganisation` and +- `auditor` + +to use the new jargon key-value `[jargon.object_reference]` so that each property is a rich reference for the full type (such as `Product` or `Facility`) in jargon and json-ld, while only having the id and selected fields in the json schema. Specifically, added the key-value `[jargon.objectReference]=id,name,registeredId,idScheme` (note: `idScheme` is not primitive so won't be included in the JSON-Schema currently). 
+ + +## UNTP-Core + +Updated: + - `CredentialIssuer.issuerAlsoKnownAs`, + - `Party.partyAlsoKnownAs`, + - `Facility.operatedByParty`, + - `Facility.facilityAlsoKnownAs`, + - `Product.producedByParty`, + - `Product.producedAtFacility`, + - `Standard.issuingParty`, + - `Attestation.issuedToParty`, + - `Regulation.administeredBy`, + - `Endorsement.issuingAuthority` + +to use the new jargon key-value `[jargon.object_reference]` so that each property is a rich reference for the full type (such as `Product` or `Facility`) in jargon and json-ld, while only having the id and selected fields in the json schema. Specifically, added the key-value `[jargon.objectReference]=id,name,registeredId,idScheme` (note: `idScheme` is not primitive so won't be included in the JSON-Schema currently). + + +## Digital Product Passport + +No changes other than updating core version. + +## Digital Traceability Event + +The 0.6.0-alpha1 work-around fix for `productId` and `productName` did not work, so I've created [Imported field attributes can't be overwritten](https://github.com/jargon-sh/issues/issues/27) + +## Digital Identity Anchor + +Updated `RegisteredIdentity` from inheriting a `untp.Party` (ie. IS-A `Party`) to instead having an `identity:Party` property (ie. HAS-A `Party`) that can then use the `[jargon.objectReference]=id,name,registeredId` object reference. Note that we cannot use the object reference on the `DigitalIdentityAnchor.credentialSubject` field instead, as it does not support non-primitive fields. +