Reading W3C Documents
At some point during your research into linked data, you will find yourself being directed towards the World Wide Web Consortium (W3C) website and, more specifically, the standards that they produce. W3C are in fact a standards agency and serve as the de facto custodians of many major web standards, including those surrounding the semantic web such as RDF, RDFS and OWL, as well as many widely used vocabularies and adjacent protocols. If you are interested in new and emerging web technologies, you may also find yourself on the W3C site reading draft proposals to get an idea of the direction things might be taking and what we can expect in the years to come.
These are seldom trivial documents, and reading through one can be a significant undertaking as many W3C specifications are composed of multiple separate documents, collectively running to potentially hundreds of pages. The terminology used also tends to be very formal and academic, which can make them even more difficult to consume for people who aren't used to that kind of language.
The documents themselves are not terribly well signposted on the W3C site either, so it can be hard to figure out where to begin (search engines also have a tendency to "drop you in the deep end" rather than directing you to a good starting point). The standards produced by W3C, of which there are well over 1000 at this point, are searchable via their website. Whilst they do already provide an extremely thorough guide to reading their own documentation, that in itself is over 80 pages long and is also not a trivial document.
This page aims to provide a more concise introduction to how to understand W3C documents and to explain the main approaches and things to look out for when making the decision to read one.
W3C comprises around 50 staff and over 400 member companies including the BBC, Amazon, CERN, Google and Microsoft. Each of these companies must appoint one "Advisory Committee representative", with the Advisory Committee being responsible for making the final decision about whether a document becomes an official W3C recommendation or not. That means that complete member consensus across hundreds of people is required in order for something to become a recommendation (this is one of the reasons these documents use such verbose language - the process is nothing if not rigorous!). The W3C hosts various events throughout the year, including scheduled meetings, ceremonies and workshops. It is at these events that technical problems may be raised or discussed, and groups or small communities made up of W3C members may form around solving those problems. It is these groups which produce the documents.
The W3C recognises two distinct types of group: Interest Groups and Working Groups. Both types of group are required to have a charter (the W3C has a pretty robust definition of what they mean when they say charter, but it essentially boils down to "a goal and a plan for how to achieve that goal"). There are also formal requirements around particular roles and resources that need to exist within a group, for example both group types need to have a Chair and a 'mailing list'.
The main difference between the two group types is that a Working Group is typically working towards producing a deliverable (e.g. a recommendation) and members of Working Groups carry a much greater obligation to participate. Interest Groups, by contrast, are formed to create a forum for the exchange of ideas and the evaluation of technologies, approaches and policies. Interest Groups do not produce recommendations, although they may still produce other documentation (for example "Notes", see below).
The first thing to figure out when you first open a W3C document is "what am I actually looking at?". W3C broadly publishes three different flavours of document: Standards, Notes and Registries.
- Standards are formal specifications and are the core output of the W3C. They are very detailed implementation guides for some of the most important building blocks used on the web today. These documents are verbose by necessity as the intended audience is implementors. You may nevertheless find yourself consulting the standards as a user (certainly this will be the case for semantic web technologies, where there is precious little documentation outside of the official specifications).
- Notes are useful documents that aim to provide less formal explanations for particular topics. Free from the required verbosity of specifications, notes may include things like examples, use cases and the motivation or justification behind an initiative. Notes do not constitute recommendations or endorsements themselves, and any notes that do become endorsed by W3C are instead called "statements". The intended audience for notes is much wider than for standards and specifications, so when you are first researching a topic you will likely have an easier time starting with the notes (particularly those marked as "primer" or "explainer") before tackling the full standard itself.
- Registries are the least common type of document produced by the W3C. Their aim is to serve as an authority for a particular type of lookup. An example of a W3C-produced registry is the W3C Alternative and Augmented Communication (AAC) Symbol Registry, which provides a list of established symbols used to supplement or replace speech or writing. The registry was created as part of the WAI-Adapt proposal, which aims to improve the web user experience for people with particular needs.
You should be able to establish which type of document you are looking at pretty easily - it's usually indicated in the document subtitle.
Once you've figured out what kind of document you have, next you should establish the current status of the document. You can normally see this at the very top of the document directly under the title, beside the date. It is vitally important to check you are reading the appropriate version - search engines have a tendency to favour the newer "working draft" versions in their results rather than the older, established specifications (which might be the things you actually want to read).
The nearly-100-pages-long W3C process document details exactly how a document progresses through the various statuses before becoming a recommendation, but it essentially boils down to this:
- Editor's Draft - This is an in-flight, unpublished, unreviewed version of a document that is still being internally iterated upon. Due to their nature, Editor's Drafts represent the most up-to-date version of a particular draft, but the content may be unstable.
- Working Draft - Eventually a group may take the decision to publish an Editor's Draft, at which point it becomes a time-stamped "Working Draft". These are still considered work-in-progress documents and can be thought of in the same way as software version releases or tags. Throughout its life, a particular document may go through many successive Working Drafts before being put forward as a Candidate Recommendation.
- Candidate Recommendation - At this stage the group who created the document is declaring that they believe it now "works as intended". It is at this point that the group will seek more detailed feedback from outside of the working group, encouraging practical implementation and testing from the wider community, so you may see things appearing in browsers as experimental features or prototypes being built. A recent example is the W3C page transitions standard, one of the most transformative proposals on the horizon from a web UX perspective, which is currently a Candidate Recommendation and has been implemented as an experimental feature in Chrome.
- Proposed Recommendation - After feedback has been gathered, the document can be put forward as a Proposed Recommendation, at which point it enters a review period of at least 4 weeks. During this period, the W3C Advisory Committee are expected to review the document and may appeal the decision to advance.
- W3C Recommendation - This is the final stage a document reaches after being reviewed and approved by the W3C Advisory Committee. By the time a document gets here, it is considered to be endorsed and ready for deployment and can be cited as a "W3C standard". Despite the finality of this stage, it is still possible for recommendations to be superseded, rescinded or made obsolete afterwards.
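The progression above can be sketched as a small ordered type. This is only an illustrative model of the five stages as this page describes them, not anything published by the W3C: the names `Status`, `is_published`, `is_endorsed` and `next_stage` are hypothetical, and the sketch deliberately ignores the fact that recommendations can later be superseded, rescinded or made obsolete.

```python
from enum import IntEnum
from typing import Optional


class Status(IntEnum):
    """The stages described above, ordered from least to most mature."""
    EDITORS_DRAFT = 1
    WORKING_DRAFT = 2
    CANDIDATE_RECOMMENDATION = 3
    PROPOSED_RECOMMENDATION = 4
    RECOMMENDATION = 5


def is_published(status: Status) -> bool:
    """Everything from a Working Draft onwards is a published, time-stamped document."""
    return status >= Status.WORKING_DRAFT


def is_endorsed(status: Status) -> bool:
    """Only a full W3C Recommendation can be cited as a "W3C standard"."""
    return status is Status.RECOMMENDATION


def next_stage(status: Status) -> Optional[Status]:
    """Return the stage that follows, or None once a document is a Recommendation."""
    if status is Status.RECOMMENDATION:
        return None
    return Status(status + 1)
```

Treating the stages as an ordered scale makes the practical question ("how far along is the thing I am reading?") a simple comparison, which is usually all a reader needs.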
W3C documents make a point of separating out what they call normative sections from non-normative. So what does that mean?
If a particular section is described as normative, it means that it forms an official and essential part of the specification and should be considered the exact definition of how something is intended to work. It is in these sections you will see emphasised, capitalised words like MUST and SHOULD. These words should be interpreted as per RFC 2119 but, as you can likely already infer, "MUST" means that something is an absolute, non-negotiable requirement within the specification whereas "SHOULD" is only a recommendation (there may, for example, exist valid reasons to not implement a particular recommended feature).
Counter to that are non-normative sections, sometimes called "informative" sections. These do not contain any formal specification detail and are usually there to further explain the normative sections via worked examples or other supporting detail such as diagrams. Sometimes sections are explicitly marked as being non-normative but there may also be blanket-statements made in the document (normally in the "Conformance" section) which label certain types of content as all being non-normative (for example, statements like 'all diagrams in this document are non-normative' are often made). Primer or "explainer" documents, such as the RDF Primer, are entirely non-normative and will indicate this either in the "Status of this document" section or in their introduction.
You will often find comments indicating that the English version of a document is the only "normative" version. This is because the word normative is very powerful, and if two normative statements contradict each other then it's considered a very serious error in the documentation. For this reason, working groups try to only maintain one normative version of a specification (you wouldn't want the Spanish version to be contradicting the English version due to a translation error, for example). Normative content always takes precedence over non-normative content if any contradictions are found between them.
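If you have the text of a section to hand, a rough way to gauge how much normative weight it carries is to count the RFC 2119 requirement keywords in it. The following is a minimal sketch, assuming plain text in which the keywords appear in capitals; the function name `count_keywords` and the sample sentence are made up for illustration.

```python
import re
from collections import Counter

# The requirement keywords defined by RFC 2119. Negated phrases come first so
# that "MUST NOT" is not counted as a bare "MUST".
KEYWORDS = [
    "MUST NOT", "SHALL NOT", "SHOULD NOT",
    "MUST", "REQUIRED", "SHALL", "SHOULD", "RECOMMENDED", "MAY", "OPTIONAL",
]
PATTERN = re.compile(r"\b(" + "|".join(KEYWORDS) + r")\b")


def count_keywords(text: str) -> Counter:
    """Count capitalised RFC 2119 keywords; lowercase uses are ignored."""
    return Counter(PATTERN.findall(text))


# Hypothetical sample text, not taken from any real specification.
sample = (
    "A parser MUST reject malformed input. It SHOULD log the error "
    "and MUST NOT crash. A lowercase must is not a keyword."
)
counts = count_keywords(sample)
```

A section with no hits at all is not necessarily non-normative, so treat this as a quick orientation aid rather than a substitute for checking how the document labels its sections.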
There are a few other sections and features that you will commonly find in W3C documentation that it's worth taking note of.
- Metadata - the very top of the document provides information such as when the document was published, links to previous versions and the names of the authors.
- Abstract - this is normally one of the first sections in the document and is intended to give you an overview of what is to come.
- References - links to other documents that are referenced within the one you're currently reading. These references are also often split into normative and non-normative references.
- Status of this document - This section is also one of the first you will see in a document and can serve a number of useful purposes - firstly it may formally outline what the 'working status' of the document means and re-iterate some of the information in the metadata, but there may also be useful information in here about whether the document is part of a suite of documents (and you may find links to those here too). It is also within this section that you often can discover which working group is behind the document.
It is recommended to read at least the Status of this Document and the Abstract sections thoroughly before proceeding!
Here are a few final pieces of advice which form a quick summary of the above.
- Make acclimatisation a priority. Understand why the document exists, who wrote it, when it was written and what broader ecosystem or set of documents it belongs to before diving in.
- Always make sure you are reading the correct version. Search engines make it very easy to accidentally start reading an early draft rather than an approved recommendation. Check the status, or in the case of a working draft consult the metadata to make sure you're looking at the latest version.
- Read the primers first. Before diving head-first into a massive specification document, see if there's an associated "note" which gives you a broad overview. W3C almost always produce these alongside their main standards and they're much more digestible guides. You will likely find these listed in the references or sign-posted during the abstract/introduction.
- Editor's Drafts should be considered bleeding edge versions even if they are years old - remember that you cannot rely on them to remain unchanged, they are unstable documents!
- Don't neglect other sources. Despite W3C being the main "source" of the standards information, they might not be the best place to start looking at a topic. A specification is not a user manual and much of the W3C guidance is written by deep subject matter experts who are likely to be less cognisant of what newcomers need to know. If you're finding the documentation there heavy going, maybe try looking elsewhere first (blogs, articles, Wikipedia, MDN, books) before tackling the W3C documentation itself.
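As an extra aid to the "correct version" check, the address of a published snapshot often carries the status and date as well. The sketch below assumes the dated URL layout `/TR/<year>/<CODE>-<shortname>-<YYYYMMDD>/` commonly seen on the W3C site; the code-to-status mapping is illustrative and not exhaustive, and the names `STATUS_CODES`, `Snapshot` and `describe` are hypothetical. The status line inside the document itself remains the authority.

```python
import re
from datetime import date
from typing import NamedTuple, Optional

# Assumed mapping from URL status codes to the stages described in this guide.
# Illustrative only; other codes exist and are passed through unchanged.
STATUS_CODES = {
    "WD": "Working Draft",
    "CR": "Candidate Recommendation",
    "PR": "Proposed Recommendation",
    "REC": "W3C Recommendation",
    "NOTE": "Note",
}

# Assumed layout of a dated snapshot URL: /TR/<year>/<CODE>-<shortname>-<YYYYMMDD>/
DATED_URL = re.compile(
    r"/TR/(?P<year>\d{4})/(?P<code>[A-Z]+)-(?P<shortname>.+)-(?P<stamp>\d{8})/?$"
)


class Snapshot(NamedTuple):
    status: str
    shortname: str
    published: date


def describe(url: str) -> Optional[Snapshot]:
    """Extract status, short name and date from a dated URL, or None if undated."""
    match = DATED_URL.search(url)
    if match is None:
        return None
    stamp = match["stamp"]
    published = date(int(stamp[:4]), int(stamp[4:6]), int(stamp[6:]))
    status = STATUS_CODES.get(match["code"], match["code"])
    return Snapshot(status, match["shortname"], published)


snapshot = describe("https://www.w3.org/TR/2014/REC-rdf11-concepts-20140225/")
undated = describe("https://www.w3.org/TR/rdf11-concepts/")
```

An undated address returns `None` here, which is itself a useful signal: you are probably looking at a "latest version" link and should check the metadata at the top of the page to see what it currently points to.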
© Crown Copyright GCHQ 2024 - This content is licensed under the Open Government License 3.0