Reference

Media: string-encoding non-JSON data

New in draft 7

JSON schema has a set of keywords to describe and optionally validate non-JSON data stored inside JSON strings. Due to the difficulty in writing validators for all media types, JSON schema validators are not required to validate the contents of JSON strings based on these keywords. However, applications that consume validated JSON use these keywords to encode and decode data during the storage and transmission of media types.

contentMediaType and contentEncoding

The contentMediaType keyword specifies the media type of the content of a string, as described in RFC 2046. The Internet Assigned Numbers Authority (IANA) has officially registered a comprehensive list of media types, but the set of supported types depends on the application and operating system. Mozilla Developer Network maintains a shorter list of media types that are important for the web

Example

The following schema specifies a string containing an HTML file using the document's default encoding.

schema
{ "type": "string", "contentMediaType": "text/html"}
data
"<!DOCTYPE html><html xmlns=\"http://www.w3.org/1999/xhtml\"><head></head></html>"
compliant to schema

The contentEncoding keyword specifies the encoding used to store the contents, as specified in RFC 2054, part 6.1 and RFC 4648.

The acceptable values are the following:

  • quoted-printable
  • base16
  • base32
  • base64

    If not specified, the encoding is the same as the containing JSON document.

There are two main scenarios:

  1. Same encoding as JSON document: Leave contentEncoding unspecified and include the content in a string as-is. This is suitable for text-based content types (e.g., text/html, application/xml) and assumes UTF-8 encoding in most cases.
  2. Binary data: Set contentEncoding to base64 and encode the content using Base64. This is appropriate for binary content types such as images (image/png) or audio files (audio/mpeg).

Example

The following schema indicates that a string contains a PNG file and is encoded using Base64:

schema
{ "type": "string", "contentEncoding": "base64", "contentMediaType": "image/png"}
data
"iVBORw0KGgoAAAANSUhEUgAAABgAAAAYCAYAAADgdz34AAAABmJLR0QA/wD/AP+gvaeTAAAA..."
compliant to schema

To better understand how contentEncoding and contentMediaType are applied in practice, let's consider the process of transmitting non-JSON data:

1block-beta
2   columns 9
3   A space B space C space D space E
4   F space:5 G space:2
5
6   A{{"Sender"}} --> F{"contentEncoding
7contentMediaType"}
8   F{"contentEncoding
9contentMediaType"} --> B{{"Encoded data"}}
10   B{{"Encoded data"}} --> C(["Transmission"])
11   C(["Transmission"]) --> D{{"Consumer application"}}
12   D{{"Consumer application"}} --> G{"contentEncoding
13contentMediaType"}
14    G{"contentEncoding
15contentMediaType"} --> E{{"Decoded data"}}
  1. The sender encodes the content, using contentEncoding to specify the encoding method (e.g., base64) and contentMediaType to indicate the media type of the original content.
  2. The encoded data is then transmitted.
  3. Upon receiving the data, the consumer application uses the contentEncoding and contentMediaType information to select the appropriate decoding method.
  4. Finally, the consumer application decodes the data, restoring it to its original form.

This process ensures that the non-JSON content is properly encoded for transmission and accurately decoded by the recipient, maintaining the integrity of the data throughout the process.

contentSchema

New in draft 2019-09

The value of contentSchema must be a valid JSON schema that you can use to define the structure and constraints of the content. It is used in conjunction with contentMediaType when the instance is a string. If contentMediaType is absent, the value of contentSchema is ignored.

Full example

The following schema indicates that a string contains a JSON object encoded using Base64:

schema
{ "$schema": "https://json-schema.org/draft/2020-12/schema", "type": "object", "properties": { "data": { "type": "string", "contentMediaType": "application/json", "contentEncoding": "base64", "contentSchema": { "type": "object", "properties": { "name": { "type": "string" }, "age": { "type": "integer" } }, "required": ["name", "age"] } } }}
data
"eyJuYW1lIjoiSm9obiBEb2UiLCJ0b21lIjoiMjUifQ=="
compliant to schema
data
{ "name": "John Doe", "age": 25}
compliant to schema

Need Help?

Did you find these docs helpful?

Help us make our docs great!

At JSON Schema, we value docs contributions as much as every other type of contribution!

Still Need Help?

Learning JSON Schema is often confusing, but don't worry, we are here to help!.