OAI · handrews · Jun 18, 2025 · Jun 22, 2025 · Jun 27, 2025 · Jun 28, 2025
@@ -309,7 +309,7 @@ Using a `contentEncoding` of `base64url` ensures that URL encoding (as required
 
 The `contentMediaType` keyword is redundant if the media type is already set:
 
-* as the key for a [MediaType Object](#media-type-object)
+* as the key for a [Media Type Object](#media-type-object)
 * in the `contentType` field of an [Encoding Object](#encoding-object)
 
 If the [Schema Object](#schema-object) will be processed by a non-OAS-aware JSON Schema implementation, it may be useful to include `contentMediaType` even if it is redundant. However, if `contentMediaType` contradicts a relevant Media Type Object or Encoding Object, then `contentMediaType` SHALL be ignored.
@@ -1257,6 +1257,8 @@ See [Working With Examples](#working-with-examples) for further guidance regardi
 
 This object MAY be extended with [Specification Extensions](#specification-extensions).
 
+Note that correlating Encoding Objects with Schema Objects may require [schema searches](#searching-schemas) for keywords such as `properties`, `prefixItems`, and `items`.
+
 See also the [Media Type Registry](#media-type-registry).
 
 ##### Complete vs Streaming Content
@@ -1639,7 +1641,7 @@ These fields MAY be used either with or without the RFC6570-style serialization
 
 | Field Name | Type | Description |
 | ---- | :----: | ---- |
-| <a name="encoding-content-type"></a>contentType | `string` | The `Content-Type` for encoding a specific property. The value is a comma-separated list, each element of which is either a specific media type (e.g. `image/png`) or a wildcard media type (e.g. `image/*`). Default value depends on the property type as shown in the table below. |
+| <a name="encoding-content-type"></a>contentType | `string` | The `Content-Type` for encoding a specific property. The value is a comma-separated list, each element of which is either a specific media type (e.g. `image/png`) or a wildcard media type (e.g. `image/*`). The default value depends on the type (determined by a [schema search](#searching-schemas)) as shown in the table below. |
 | <a name="encoding-headers"></a>headers | Map[`string`, [Header Object](#header-object) \| [Reference Object](#reference-object)] | A map allowing additional information to be provided as headers. `Content-Type` is described separately and SHALL be ignored in this section. This field SHALL be ignored if the media type is not a `multipart`. |
 
 This object MAY be extended with [Specification Extensions](#specification-extensions).
@@ -2599,6 +2601,10 @@ Note that JSON Schema Draft 2020-12 does not require an `x-` prefix for extensio
 The [`format` keyword (when using default format-annotation vocabulary)](https://www.ietf.org/archive/id/draft-bhutton-json-schema-validation-01.html#section-7.2.1) and the [`contentMediaType`, `contentEncoding`, and `contentSchema` keywords](https://www.ietf.org/archive/id/draft-bhutton-json-schema-validation-01.html#section-8.2) define constraints on the data, but are treated as annotations instead of being validated directly.
 Extended validation is one way that these constraints MAY be enforced.
 
+In addition to extended validation, annotations are the most effective way to determine whether these keywords impact the type and structure of the fully parsed data.
+For example, formats such as `int64` can be applied to JSON strings, as JSON numbers have limitations that make large integers non-portable.
+If annotation collection is not available, implementations MUST perform a [schema search](#searching-schemas) for these keywords, and MUST document the limitations this imposes.
-If annotation collection is not available, implementations MUST perform a [schema search](#searching-schemas) for these keywords, and MUST document the limitations this imposes.
+If annotation collection is not available, implementations MUST perform a [schema search](#searching-schemas) for these keywords, and SHOULD document the limitations this imposes.
+
 ###### Validating `readOnly` and `writeOnly`
 
 The `readOnly` and `writeOnly` keywords are annotations, as JSON Schema is not aware of how the data it is validating is being used.
@@ -2611,6 +2617,108 @@ Even when read-only fields are not required, stripping them is burdensome for cl
 
 Note that the behavior of `readOnly` in particular differs from that specified by version 3.0 of this specification.
 
+##### Working with Schemas
+
+In addition to schema evaluation, which encompasses both validation and annotation, some OAS features require inspecting schemas in other ways.
+
+###### Preparing Data for Schema Evaluation
+
+When the data source is a JSON document, preparing the data is trivial as parsing JSON produces a suitable data structure.
+Some other media types, as well as URL components and header values, lack sufficient type information to parse directly to suitable data types.
+
+Consider this URL-encoded form:
+
+```uri
+foo=42&bar=42
+```
+
+As URL query parameters are strings, this would naturally parse to something equivalent to the following JSON:
+
+```json
+{
+  "foo": "42",
+  "bar": "42"
+}
+```
+
+But consider this [Media Type Object](#media-type-object) for the form:
+
+```yaml
+application/x-www-form-urlencoded:
+  schema:
+    type: object
+    properties:
+      foo:
+        type: string
+      bar:
+        type: integer
+```
+
+From the `schema` field, we can tell that the correct data structure would actually be equivalent to:
+
+```json
+{
+  "foo": "42",
+  "bar": 42
+}
+```
+
+In order to prepare the correct data structure for evaluation in such cases, implementations MUST perform a [schema search](#searching-schemas) for the `type` keyword.
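As a rough illustration of the form example above, the following Python sketch coerces parsed form values using the `type` found for each property. The helper name `coerce_form` is hypothetical, and the sketch assumes the relevant `properties` and `type` keywords sit directly in the starting schema; a fuller schema search would also follow `$ref` and `allOf`.

```python
from urllib.parse import parse_qsl


def coerce_form(body: str, schema: dict) -> dict:
    """Parse a URL-encoded form and coerce each value using the property's `type`.

    Hypothetical helper: only inspects `properties` in the schema it is given.
    """
    properties = schema.get("properties", {})
    result = {}
    for name, raw in parse_qsl(body, keep_blank_values=True):
        declared = properties.get(name, {}).get("type", "string")
        if declared == "integer":
            result[name] = int(raw)
        elif declared == "number":
            result[name] = float(raw)
        else:
            # Strings (and anything this sketch does not handle) stay as-is.
            result[name] = raw
    return result


schema = {
    "type": "object",
    "properties": {
        "foo": {"type": "string"},
        "bar": {"type": "integer"},
    },
}

prepared = coerce_form("foo=42&bar=42", schema)
```

Here `prepared` keeps `foo` as the string `"42"` while `bar` becomes the integer `42`, matching the data structure shown above.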
+
+###### Applying Further Type Information
+
+The `format` keyword provides more fine-grained type information, and can even change the underlying data type for the purposes of the application.
+For example, if `foo` had the schema `{"type": "string", "format": "int64"}`, the data structure used for validation would still be the same, but the application will need to convert the string `"42"` to the 64-bit integer `42`.
+Similarly, the `content*` keywords can indicate further structure within a string.
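A minimal sketch of the `int64` case, assuming the application applies the conversion after validation. The function name `apply_int64_format` is hypothetical, and the range check reflects the usual signed 64-bit bounds rather than anything this text mandates.

```python
def apply_int64_format(value, schema: dict):
    """Convert a validated string to an int when the schema says format: int64.

    Hypothetical helper: any other value or schema is returned unchanged.
    """
    if (
        isinstance(value, str)
        and schema.get("type") == "string"
        and schema.get("format") == "int64"
    ):
        number = int(value)
        # Assumed bounds: signed 64-bit integer range.
        if not -(2**63) <= number < 2**63:
            raise ValueError(f"{value!r} does not fit in a signed 64-bit integer")
        return number
    return value


converted = apply_int64_format("42", {"type": "string", "format": "int64"})
unchanged = apply_int64_format("42", {"type": "string"})
```

The instance validated by the schema is still the string `"42"`; only the value handed to the application changes.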
+
+Implementations MUST either use [annotation collection](#extended-validation-with-annotations) to gather this information, or perform a [schema search](#searching-schemas), and MUST document which approach it implements.
-Implementations MUST either use [annotation collection](#extended-validation-with-annotations) to gather this information, or perform a [schema search](#searching-schemas), and MUST document which approach it implements.
+Implementations MUST either use [annotation collection](#extended-validation-with-annotations) to gather this information, or perform a [schema search](#searching-schemas), and SHOULD document which approach it implements.
+
+Note that parsing string contents based on `contentMediaType` carries the same security risks as parsing HTTP message bodies based on `Content-Type`; see [Handling External Resources](#handling-external-resources) for further information.
+
+###### Schema Evaluation and Binary Data
+
+Few JSON Schema implementations directly support working with binary data, as doing so is not a mandatory part of that specification.
+
+OAS implementations that do not have access to a binary-instance-supporting JSON Schema implementation MUST examine schemas and apply them in accordance with [Working with Binary Data](#working-with-binary-data).
+When the entire instance is binary, this is straightforward as few keywords are relevant.
+
+However, `multipart` media types can mix binary and text-based data, leaving implementations with two options for schema evaluations:
+
+1. Use a placeholder value, on the assumption that no assertions will apply to the binary data and no conditional schema keywords will cause the schema to treat the placeholder value differently (e.g. a part that could be either plain text or binary might behave unexpectedly if a string is used as a binary placeholder, as it would likely be treated as plain text and subject to different subschemas and keywords).
+2. Perform [schema searches](#searching-schemas) to find the appropriate keywords (`properties`, `prefixItems`, etc.) in order to break up the subschemas and apply them separately to binary and JSON-compatible data.
+
+Implementations MUST document which strategy or strategies they use, as well as any known limitations.
-Implementations MUST document which strategy or strategies they use, as well as any known limitations.
+Implementations SHOULD document which strategy or strategies they use, as well as any known limitations.
+
+##### Searching Schemas
+
+Several OAS features require searching Schema Objects for keywords indicating the data type and/or structure.
+Each feature that needs such a search documents which keywords or structures need to be found.
+
+Even if the requirement is given in terms of schema keywords, if the data is in a form [suitable for schema evaluation](#preparing-data-for-schema-evaluation) and the necessary information (including type) can be determined by inspecting the data (and possibly also annotations such as `format`), implementations MUST support doing so as this is effective regardless of how schemas are structured.
+
+If this is not possible, the schemas MUST be searched to see if the information can be determined without performing evaluation.
+As schema organization can become very complex, implementations are not expected to handle every possible schema layout.
+However, given a known starting point schema (usually the value of the nearest `schema` field), implementations MUST search the following for the relevant keywords, which vary depending on the use case but might include `type`, `format`, `contentMediaType`, `properties`, `prefixItems`, `items`, etc.:
+
+* The starting point schema itself
+* Any schema reachable from there solely through `$ref` and/or `allOf`
+
+These schemas are guaranteed to be applied to any instance.
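The minimum search described above could be sketched in Python as follows. The names `resolve_pointer` and `find_keyword` are hypothetical, and the sketch assumes every `$ref` is a same-document JSON Pointer fragment without percent-encoding; external references and `$dynamicRef` are out of scope.

```python
def resolve_pointer(root: dict, ref: str):
    """Resolve a same-document reference such as '#/components/schemas/Pet'."""
    if not ref.startswith("#"):
        raise ValueError(f"only same-document references are handled: {ref}")
    node = root
    pointer = ref[1:]
    if pointer:
        for token in pointer.lstrip("/").split("/"):
            # JSON Pointer escapes: ~1 is '/', ~0 is '~'.
            token = token.replace("~1", "/").replace("~0", "~")
            node = node[int(token)] if isinstance(node, list) else node[token]
    return node


def find_keyword(root: dict, schema, keyword: str, _seen=None) -> list:
    """Collect `keyword` values from the starting schema and from any schema
    reachable solely through `$ref` and/or `allOf`."""
    seen = set() if _seen is None else _seen
    if not isinstance(schema, dict) or id(schema) in seen:
        return []  # boolean schemas and already-visited schemas add nothing
    seen.add(id(schema))
    found = [schema[keyword]] if keyword in schema else []
    if "$ref" in schema:
        target = resolve_pointer(root, schema["$ref"])
        found += find_keyword(root, target, keyword, seen)
    for subschema in schema.get("allOf", []):
        found += find_keyword(root, subschema, keyword, seen)
    return found


doc = {
    "components": {
        "schemas": {
            "Base": {"type": "object", "properties": {"id": {"type": "integer"}}},
            "Pet": {
                "allOf": [
                    {"$ref": "#/components/schemas/Base"},
                    {"properties": {"name": {"type": "string"}}},
                ],
                "oneOf": [{"type": "string"}],  # not searched by this sketch
            },
        }
    }
}

pet = doc["components"]["schemas"]["Pet"]
types = find_keyword(doc, pet, "type")
props = find_keyword(doc, pet, "properties")
```

In this example `types` contains only `"object"`, since the `oneOf` branch is deliberately ignored, and `props` contains both `properties` objects found through `allOf`.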
+
+In some cases, such as correlating [Encoding Objects](#encoding-object) with Schema Objects using fields in a [Media Type Object](#media-type-object), it is necessary to first find a keyword such as `properties`, and then treat its subschema(s) as starting point schemas for further searches.
+
+Implementations MAY analyze subschemas of other keywords such as `oneOf` or `dependentSchemas`, or examine possible `$dynamicRef` targets, and MUST document the extent and nature of any such additional support.
+
+###### Handling Multiple Types
+
+When searching for `type`, if the `type` keyword has multiple values, one of which is `"null"` (e.g. `type: ["number", "null"]`), the non-null type MUST be treated as the relevant type if a single type is needed to determine behavior.
+
+For other multi-valued `type` keywords, the behavior is implementation-defined but MUST either follow a documented process or be documented to produce an informative error.
+
+If an implementation supports handling multi-valued `type` keywords for type searches, it SHOULD attempt to use non-`"string"` types before using `"string"` (if `"string"` is one of the types) as all current type interpretation use cases involve data stored in string form by default.
+
+Implementations MAY treat the order of types in the `type` keyword as significant, except when it conflicts with the above requirements.
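One possible documented process for multi-valued `type` keywords is sketched below in Python. The function name `candidate_types` is hypothetical, and the ordering policy is just one choice an implementation might make: drop `"null"`, try non-`"string"` types first, and otherwise keep the declared order.

```python
def candidate_types(type_value) -> list:
    """Return the types to try, in order, for a `type` keyword value.

    Hypothetical policy: ignore "null", move "string" to the end, and
    otherwise preserve the order in which the types were declared.
    """
    types = [type_value] if isinstance(type_value, str) else list(type_value)
    non_null = [t for t in types if t != "null"]
    # Stable sort: False (non-string) sorts before True (string).
    return sorted(non_null, key=lambda t: t == "string")


nullable_number = candidate_types(["number", "null"])
string_or_integer = candidate_types(["string", "integer", "null"])
single = candidate_types("integer")
```

With this policy `["number", "null"]` yields the single type `"number"`, while `["string", "integer", "null"]` yields `"integer"` followed by `"string"`, so a caller would attempt integer parsing before falling back to the raw string.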
+
 ##### Data Modeling Techniques
 
 ###### Composition and Inheritance (Polymorphism)