NIST 1500 in RDF #8
cwulfman started this conversation in Design & Development
Proposal: Write an OWL ontology for NIST 1500-100 and use an RDF triple store as the application back end
The Election Results Common Data Format is actually a graph ontology.
The NIST SP 1500-100 standard defines a common data format for pre-election setup information and post-election results reporting; the report calls this the Election Results Common Data Format (ERCDF). The publication defines the data with a UML data model and then provides XML and JSON schemas generated automatically from the UML.
Crucially, the documentation declares the UML model to be the primary specification:
As the document notes, the UML class model is a graph data structure, while the two supported implementation formats, XML and JSON, are tree structures.
Thus, although NIST supports two implementation formats, XML and JSON, the definition itself resides in the UML. Why, then, does NIST support two implementation formats? The reasons are almost certainly pragmatic: implementers are most likely to adopt the standard through JSON and, to a lesser extent today, XML. But there is no reason to be wedded to these two formats, and modeling the specification as an ontology is truer to its graph-structured UML definition.
Indeed, it is better to think of XML and JSON as equivalent serialization formats.
I propose that we consider developing an ontology for NIST 1500-100, based on the UML model, and using an RDF triple store as the back-end data store.
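To make the idea of a triple-store back end concrete, here is a minimal Python sketch of the core operation such a store provides: adding (subject, predicate, object) statements and querying them by pattern. This is only a toy in-memory illustration, not a proposal for the actual implementation. `ReportingUnit` is a class from the standard, but the `ex:` and `ercdf:` prefixes, the `ex:name` property, and the identifiers are hypothetical names chosen for this example.

```python
from typing import Iterator, Optional, Set, Tuple

Triple = Tuple[str, str, str]


class TinyTripleStore:
    """Toy in-memory stand-in for a real RDF triple store."""

    def __init__(self) -> None:
        self._triples: Set[Triple] = set()

    def add(self, subject: str, predicate: str, obj: str) -> None:
        """Assert one statement; duplicates are ignored because a set is used."""
        self._triples.add((subject, predicate, obj))

    def match(
        self,
        subject: Optional[str] = None,
        predicate: Optional[str] = None,
        obj: Optional[str] = None,
    ) -> Iterator[Triple]:
        """Yield triples matching the pattern; None acts as a wildcard."""
        for s, p, o in sorted(self._triples):
            if subject is not None and s != subject:
                continue
            if predicate is not None and p != predicate:
                continue
            if obj is not None and o != obj:
                continue
            yield (s, p, o)


# Hypothetical data: prefixes, property names, and identifiers are illustrative only.
store = TinyTripleStore()
store.add("ex:gp-county", "rdf:type", "ercdf:ReportingUnit")
store.add("ex:gp-county", "ex:name", "Mercer County")
store.add("ex:gp-precinct-1", "rdf:type", "ercdf:ReportingUnit")

# Every subject typed as a reporting unit.
reporting_units = [s for s, _, _ in store.match(predicate="rdf:type")]
```

A production back end would replace this class with a real triple store and a SPARQL endpoint; the point here is only that every question about the data reduces to matching triple patterns.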
Sample RDF
Before we consider defining the ontology, let's experiment with expressing some data as RDF triples. Here are some simple Geopolitical Units:
Here is the equivalent JSON:
Here is the equivalent RDF:
The RDF is without question more perspicuous than either the XML or the JSON representations, yet it conveys the same information. But because this is RDF, it can convey much more.
First, the <ComponentUnitIds> element is awkward. It is used to represent geopolitical composition: a particular county contains particular municipalities, a municipality contains particular precincts, and so on. The ID/IDREF feature of XML is used to link the elements together.
Mereotopology is a rich area of study, and there are numerous systems for describing the relationships among geopolitical units.
So we can replace the awkward <ComponentUnitIds> element with relational properties, using one of several already-established semantics. This is one of the primary features of the Semantic Web: by sharing ontologies, you compound the ways your data can be linked with other data. (Unfortunately, this feature is often abused: one must be careful not to adopt an ontology simply because it uses the same English words to describe things; those things and relations may mean something very different in the domain for which the ontology was developed.)
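As a rough sketch of what replacing the ID list with relational properties could look like, the Python below turns each entry of a `ComponentUnitIds` list into one part-whole triple and then walks those triples to find everything a unit contains, directly or indirectly. The unit identifiers, the town and precinct names, and the `ex:hasPart` property are all hypothetical; a real ontology would reuse a property from an established vocabulary after checking that its semantics actually fit.

```python
from typing import Dict, List, Set, Tuple

Triple = Tuple[str, str, str]

# Hypothetical property name; an established vocabulary would supply the real one.
HAS_PART = "ex:hasPart"

# Hypothetical units, keyed by object ID, mimicking the tree-style ID lists.
gp_units: Dict[str, dict] = {
    "gp-county": {"Name": "Mercer County", "ComponentUnitIds": ["gp-town-a", "gp-town-b"]},
    "gp-town-a": {"Name": "Town A", "ComponentUnitIds": ["gp-precinct-1"]},
    "gp-town-b": {"Name": "Town B", "ComponentUnitIds": []},
    "gp-precinct-1": {"Name": "Precinct 1", "ComponentUnitIds": []},
}


def composition_triples(units: Dict[str, dict]) -> List[Triple]:
    """Emit one (whole, hasPart, part) triple per ID in each ComponentUnitIds list."""
    return [
        (f"ex:{unit_id}", HAS_PART, f"ex:{child_id}")
        for unit_id, unit in units.items()
        for child_id in unit.get("ComponentUnitIds", [])
    ]


def all_parts(triples: List[Triple], whole: str) -> Set[str]:
    """Return every unit reachable from `whole` by following hasPart links."""
    children: Dict[str, List[str]] = {}
    for s, p, o in triples:
        if p == HAS_PART:
            children.setdefault(s, []).append(o)
    found: Set[str] = set()
    stack = [whole]
    while stack:
        node = stack.pop()
        for child in children.get(node, []):
            if child not in found:
                found.add(child)
                stack.append(child)
    return found


triples = composition_triples(gp_units)
county_parts = all_parts(triples, "ex:gp-county")
```

Once composition is an explicit relation rather than a list of IDREFs, questions such as "which precincts fall within this county?" become simple graph traversals, which a triple store can answer with a property-path query.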
The other, much-touted feature of the Semantic Web is its composability. We might be able to say:
(Or, if Wikidata is not authoritative, some government-maintained authority file.)
By doing so, Mercer County "inherits" all the properties defined in Wikidata.
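The linking idea can be sketched in Python as follows, assuming the link is expressed with `owl:sameAs` (one common choice; the post does not fix the property). The sketch collects every identifier declared equivalent to a local node and then gathers the statements made about any of them. `wd:Q_PLACEHOLDER` deliberately stands in for the real Wikidata identifier, and the "external" triples are invented placeholders, not actual Wikidata content.

```python
from typing import Set, Tuple

Triple = Tuple[str, str, str]

SAME_AS = "owl:sameAs"

# Local data: the identifier and the ex:name property are hypothetical.
local_triples: Set[Triple] = {
    ("ex:mercer-county", "ex:name", "Mercer County"),
    ("ex:mercer-county", SAME_AS, "wd:Q_PLACEHOLDER"),
}

# Placeholder for statements published by the external authority (not real data).
external_triples: Set[Triple] = {
    ("wd:Q_PLACEHOLDER", "ex:externalProperty", "placeholder value"),
}


def properties_of(node: str, triples: Set[Triple]) -> Set[Tuple[str, str]]:
    """Collect (predicate, object) pairs for `node` and all of its sameAs aliases."""
    aliases = {node}
    changed = True
    while changed:  # repeat until no new alias is discovered
        changed = False
        for s, p, o in triples:
            if p != SAME_AS:
                continue
            if s in aliases and o not in aliases:
                aliases.add(o)
                changed = True
            elif o in aliases and s not in aliases:
                aliases.add(s)
                changed = True
    return {(p, o) for s, p, o in triples if s in aliases and p != SAME_AS}


merged = properties_of("ex:mercer-county", local_triples | external_triples)
```

In a real deployment a reasoner or a federated query would do this work; note also that `owl:sameAs` asserts strict identity, so a weaker linking property may be more appropriate when the external record is only a close match.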