Add JSON encoding #14021

Merged, 16 commits
Dec 10, 2024
3 changes: 2 additions & 1 deletion NOTICE
@@ -3,9 +3,10 @@ LEGAL NOTICE INFORMATION

All the files in this distribution are copyright to the terms below.

+== lib/elixir/src/elixir_json.erl
== lib/elixir/src/elixir_parser.erl (generated by build scripts)

-Copyright Ericsson AB 1996-2015
+Copyright Ericsson AB 1996-2024

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
366 changes: 366 additions & 0 deletions lib/elixir/lib/json.ex
@@ -0,0 +1,366 @@
defprotocol JSON.Encoder do
@moduledoc """
A protocol for custom JSON encoding of data structures.

If you have a struct, you can derive the implementation of this protocol
by specifying which fields should be encoded to JSON:

@derive {JSON.Encoder, only: [....]}
defstruct ...

It is also possible to encode all fields or skip some fields via the
`:except` option, although this should be used carefully to avoid
accidentally leaking private information when new fields are added:

@derive JSON.Encoder
defstruct ...

Finally, if you don't own the struct you want to encode to JSON,
you may use Protocol.derive/3 placed outside of any module:

Protocol.derive(JSON.Encoder, NameOfTheStruct, only: [...])
Protocol.derive(JSON.Encoder, NameOfTheStruct)
"""

@undefined_impl_description """
the protocol must be explicitly implemented.

If you have a struct, you can derive the implementation specifying \
which fields should be encoded to JSON:

@derive {JSON.Encoder, only: [....]}
defstruct ...

It is also possible to encode all fields, although this should be \
used carefully to avoid accidentally leaking private information \
when new fields are added:

@derive JSON.Encoder
defstruct ...

Finally, if you don't own the struct you want to encode to JSON, \
you may use Protocol.derive/3 placed outside of any module:

Protocol.derive(JSON.Encoder, NameOfTheStruct, only: [...])
Protocol.derive(JSON.Encoder, NameOfTheStruct)\
"""

@impl true
defmacro __deriving__(module, opts) do
fields = module |> Macro.struct_info!(__CALLER__) |> Enum.map(& &1.field)
fields = fields_to_encode(fields, opts)
vars = Macro.generate_arguments(length(fields), __MODULE__)
kv = Enum.zip(fields, vars)

{io, _prefix} =
Enum.flat_map_reduce(kv, ?{, fn {field, value}, prefix ->
key = IO.iodata_to_binary([prefix, :elixir_json.encode_binary(Atom.to_string(field)), ?:])
{[key, quote(do: encoder.(unquote(value), encoder))], ?,}
end)

io = if io == [], do: "{}", else: io ++ [?}]

quote do
defimpl JSON.Encoder, for: unquote(module) do
def encode(%{unquote_splicing(kv)}, encoder) do
unquote(io)
end
end
end
end

defp fields_to_encode(fields, opts) do
cond do
only = Keyword.get(opts, :only) ->
case only -- fields do
[] ->
only

error_keys ->
raise ArgumentError,
"unknown struct fields #{inspect(error_keys)} specified in :only. Expected one of: " <>
"#{inspect(fields -- [:__struct__])}"
end

except = Keyword.get(opts, :except) ->
case except -- fields do
[] ->
fields -- [:__struct__ | except]

error_keys ->
raise ArgumentError,
"unknown struct fields #{inspect(error_keys)} specified in :except. Expected one of: " <>
"#{inspect(fields -- [:__struct__])}"
end

true ->
fields -- [:__struct__]
end
end

@doc """
A function invoked to encode the given term.
"""
def encode(term, encoder)
end

defimpl JSON.Encoder, for: Atom do
def encode(value, encoder) do
case value do
nil -> "null"
true -> "true"
false -> "false"
_ -> encoder.(Atom.to_string(value), encoder)
end
end
end

defimpl JSON.Encoder, for: BitString do
def encode(value, _encoder) do
:elixir_json.encode_binary(value)
end
end

defimpl JSON.Encoder, for: List do
def encode(value, encoder) do
:elixir_json.encode_list(value, encoder)
end
end

defimpl JSON.Encoder, for: Integer do
def encode(value, _encoder) do
:elixir_json.encode_integer(value)
end
end

defimpl JSON.Encoder, for: Float do
def encode(value, _encoder) do
:elixir_json.encode_float(value)
end
end

defimpl JSON.Encoder, for: Map do
def encode(value, encoder) do
:elixir_json.encode_map(value, encoder)
end
end

defmodule JSON do
@moduledoc ~S"""
JSON encoding and decoding.

Both encoder and decoder fully conform to [RFC 8259](https://tools.ietf.org/html/rfc8259) and
[ECMA 404](https://ecma-international.org/publications-and-standards/standards/ecma-404/)
standards.

## Encoding

Elixir built-in data structures are encoded to JSON as follows:

| **Elixir** | **JSON** |
|------------------------|----------|
| `integer() \| float()` | Number |
| `true \| false` | Boolean |
| `nil` | Null |
| `binary()` | String |
| `atom()` | String |
| `list()` | Array |
| `%{binary() => _}` | Object |
| `%{atom() => _}` | Object |
| `%{integer() => _}` | Object |

You may also implement the `JSON.Encoder` protocol for custom data structures.

## Decoding

Elixir built-in data structures are decoded from JSON as follows:

| **JSON** | **Elixir** |
|----------|------------------------|
| Number | `integer() \| float()` |
| Boolean | `true \| false` |
| Null | `nil` |
| String | `binary()` |
| Object | `%{binary() => _}` |

"""

@moduledoc since: "1.18.0"

@type decode_error ::
Member commented:

Should we call this decode_error_reason? We could also document the reasons here maybe.

{:unexpected_end, non_neg_integer()}
| {:invalid_byte, non_neg_integer(), byte()}
| {:unexpected_sequence, non_neg_integer(), binary()}

@doc ~S"""
Decodes the given JSON.

Returns `{:ok, decoded}` or `{:error, reason}`.

## Examples

iex> JSON.decode("[null,123,\"string\",{\"key\":\"value\"}]")
{:ok, [nil, 123, "string", %{"key" => "value"}]}

## Error reasons

The error tuple will have one of the following reasons.

* `{:unexpected_end, position}` if `binary` contains an incomplete JSON value
* `{:invalid_byte, position, byte}` if `binary` contains an unexpected byte or an invalid UTF-8 byte
* `{:unexpected_sequence, position, bytes}` if `binary` contains an invalid UTF-8 escape
"""
@spec decode(binary()) :: {:ok, term()} | {:error, decode_error()}
def decode(binary) when is_binary(binary) do
with {decoded, :ok, rest} <- decode(binary, :ok, []) do
if rest == "" do
{:ok, decoded}
else
{:error, {:invalid_byte, byte_size(binary) - byte_size(rest), :binary.at(rest, 0)}}
end
end
end

@doc ~S"""
Decodes the given JSON with the given decoders.

Returns `{decoded, acc, rest}` or `{:error, reason}`.
See `decode/1` for the error reasons.

## Decoders

All decoders are optional. If not provided, they will fall back to
implementations used by the `decode/1` function:

* for `array_start`: `fn _ -> [] end`
* for `array_push`: `fn elem, acc -> [elem | acc] end`
* for `array_finish`: `fn acc, old_acc -> {Enum.reverse(acc), old_acc} end`
* for `object_start`: `fn _ -> [] end`
* for `object_push`: `fn key, value, acc -> [{key, value} | acc] end`
* for `object_finish`: `fn acc, old_acc -> {Map.new(acc), old_acc} end`
* for `float`: `&String.to_float/1`
* for `integer`: `&String.to_integer/1`
* for `string`: `&Function.identity/1`
* for `null`: the atom `nil`

For streaming decoding, see Erlang's `:json` module.
"""
@spec decode(binary(), term(), keyword()) :: {term(), term(), binary()} | {:error, decode_error()}
def decode(binary, acc, decoders) when is_binary(binary) and is_list(decoders) do
decoders = Keyword.put_new(decoders, :null, nil)

try do
:elixir_json.decode(binary, acc, Map.new(decoders))
catch
:error, :unexpected_end ->
{:error, {:unexpected_end, byte_size(binary)}}
Member commented:

Have you considered a JSON.Error or JSON.DecodeError exception with these, matching Jason.DecodeError? In Req we gracefully handle json errors like this:

iex> Req.get(plug: & &1 |> Plug.Conn.put_resp_content_type("application/json") |> Plug.Conn.send_resp(200, "{"))
{:error, %Jason.DecodeError{position: 1, token: nil, data: "{"}}

(which I'm still somewhat skeptical of cause it's not like users can do much with these errors anyway but this was a somewhat common request to have this graceful handling)

so it'd be nice to have an exception struct I could return instead. Not a big deal though, I can invent one for Req.

Member Author replied:

We can raise a specific error on JSON.decode!, and I will make it so, but we don't use the {:error, Exception.t()} struct style anywhere in Elixir, so I am not sure we should do it here.

Member replied:

got it, thanks!
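
For readers following this thread, here is a minimal sketch of the kind of caller-side wrapper mentioned above ("I can invent one for Req"). The module name `MyApp.JSONDecodeError`, its fields, and the `decode/1` helper are hypothetical and not part of this PR; the sketch only assumes the error tuples documented for `JSON.decode/1`.

```elixir
defmodule MyApp.JSONDecodeError do
  # Hypothetical exception struct; neither the module nor its fields come from this PR.
  defexception [:reason, :position]

  @impl true
  def message(%{reason: reason, position: position}) do
    "JSON decode error #{inspect(reason)} at position #{position}"
  end

  # Wraps JSON.decode/1 and converts its error tuples into the struct above.
  def decode(binary) when is_binary(binary) do
    case JSON.decode(binary) do
      {:ok, decoded} -> {:ok, decoded}
      {:error, reason} -> {:error, from_reason(reason)}
    end
  end

  defp from_reason({:unexpected_end, position}),
    do: %__MODULE__{reason: :unexpected_end, position: position}

  defp from_reason({:invalid_byte, position, byte}),
    do: %__MODULE__{reason: {:invalid_byte, byte}, position: position}

  defp from_reason({:unexpected_sequence, position, bytes}),
    do: %__MODULE__{reason: {:unexpected_sequence, bytes}, position: position}
end

# Usage: an incomplete document yields an {:error, exception} tuple.
{:error, error} = MyApp.JSONDecodeError.decode("{")
IO.puts(Exception.message(error))
```

This keeps `JSON.decode/1` returning plain tuples, as the author prefers, while letting a library expose an exception struct of its own.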


:error, {:invalid_byte, byte} ->
{:error, {:invalid_byte, position(__STACKTRACE__), byte}}

:error, {:unexpected_sequence, bytes} ->
{:error, {:unexpected_sequence, position(__STACKTRACE__), bytes}}
end
end

defp position(stacktrace) do
with [{_, _, _, opts} | _] <- stacktrace,
%{cause: %{position: position}} <- opts[:error_info] do
position
else
_ -> 0
end
end

@doc ~S"""
Decodes the given JSON but raises an exception in case of errors.

Returns the decoded content. See `decode/1` for possible errors.

## Examples

iex> JSON.decode!("[null,123,\"string\",{\"key\":\"value\"}]")
[nil, 123, "string", %{"key" => "value"}]
"""
def decode!(binary) when is_binary(binary) do
case decode(binary) do
{:ok, decoded} ->
decoded

{:error, {:unexpected_end, position}} ->
raise ArgumentError, "unexpected end of JSON binary at position #{position}"

{:error, {:invalid_byte, position, byte}} ->
raise ArgumentError, "invalid byte #{byte} at position #{position}"

{:error, {:unexpected_sequence, position, bytes}} ->
raise ArgumentError, "unexpected sequence #{inspect(bytes)} at position #{position}"
end
end

@doc ~S"""
Encodes the given term to JSON as a binary.

The second argument is a function that is recursively
invoked to encode a term.

## Examples

iex> JSON.encode!([123, "string", %{key: "value"}])
"[123,\"string\",{\"key\":\"value\"}]"

"""
def encode!(term, encoder \\ &encode_value/2) do
Member commented:

Spec?

IO.iodata_to_binary(encoder.(term, encoder))
end

@doc ~S"""
Encodes the given term to JSON as an iodata.

This is the most efficient format if the JSON is going to be
used for IO purposes.

The second argument is a function that is recursively
invoked to encode a term.

## Examples

iex> data = JSON.encode_to_iodata!([123, "string", %{key: "value"}])
iex> IO.iodata_to_binary(data)
"[123,\"string\",{\"key\":\"value\"}]"

"""
def encode_to_iodata!(term, encoder \\ &encode_value/2) do
encoder.(term, encoder)
end

@doc """
This is the default function used to recursively encode each value.
"""
def encode_value(value, encoder) when is_atom(value) do
Member Author commented:

It is the same as the protocol, inlined for performance. But I am not a fan of its name. Suggestions? What about encode_callback? or default_encode? or default_encode_callback?

Contributor replied:

Given that it isn't technically an @callback, the suffix _callback might be confusing even as a generic term, because of its specific meaning in OTP.
default_encode captures the intent well, I think.
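
For context on the function being named here: the second argument of `JSON.encode!/2` is a function that is called recursively for every nested term. Below is a minimal sketch of a caller-supplied encoder, assuming the caller wants tuples encoded as JSON arrays. The tuple handling is an illustration only and not behaviour added by this PR; everything else falls back to the `JSON.Encoder` protocol.

```elixir
# Hypothetical custom encoder: tuples become arrays, anything else goes
# through the JSON.Encoder protocol. The same function is passed down so
# nested values are handled by it as well.
encoder = fn
  tuple, enc when is_tuple(tuple) -> enc.(Tuple.to_list(tuple), enc)
  other, enc -> JSON.Encoder.encode(other, enc)
end

json = JSON.encode!({1, "a", %{ok: true}}, encoder)
IO.puts(json)
```

Because the encoder receives itself as its second argument, the list and map implementations call back into it for each element, which is why the default function under discussion has to accept `(value, encoder)`.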

case value do
nil -> "null"
true -> "true"
false -> "false"
_ -> encoder.(Atom.to_string(value), encoder)
end
end

def encode_value(value, _encoder) when is_binary(value),
do: :elixir_json.encode_binary(value)

def encode_value(value, _encoder) when is_integer(value),
do: :elixir_json.encode_integer(value)

def encode_value(value, _encoder) when is_float(value),
do: :elixir_json.encode_float(value)

def encode_value(value, encoder) when is_list(value),
do: :elixir_json.encode_list(value, encoder)

def encode_value(%{} = value, encoder) when not is_map_key(value, :__struct__),
do: :elixir_json.encode_map(value, encoder)

def encode_value(value, encoder),
do: JSON.Encoder.encode(value, encoder)
end
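
The decoders described in the `JSON.decode/3` documentation in this diff can be exercised as in the following sketch. It assumes the default callbacks listed there (in particular that `object_push` prepends `{key, value}` pairs) and overrides only `:object_finish` and `:null`. The input documents are made up for illustration.

```elixir
# Sketch: keep object members as an ordered list of {key, value} pairs
# instead of building a map. The default object_push prepends pairs, so
# reversing the accumulator restores the order found in the document.
decoders = [
  object_finish: fn acc, old_acc -> {Enum.reverse(acc), old_acc} end
]

{decoded, acc, rest} = JSON.decode(~S({"b": 1, "a": [2.5, null]}), :ok, decoders)
IO.inspect({decoded, acc, rest})

# Sketch: decode JSON null to a different term than the default nil.
{with_null, :ok, ""} = JSON.decode("[null]", :ok, null: :null)
IO.inspect(with_null)
```

Unlike `decode/1`, `decode/3` returns the remaining input as the third tuple element and leaves it to the caller to decide what to do with it.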
2 changes: 2 additions & 0 deletions lib/elixir/scripts/elixir_docs.exs
@@ -99,6 +99,7 @@ canonical = System.fetch_env!("CANONICAL")
Float,
Function,
Integer,
+JSON,
Module,
NaiveDateTime,
Record,
@@ -159,6 +160,7 @@ canonical = System.fetch_env!("CANONICAL")
Protocols: [
Collectable,
Enumerable,
+JSON.Encoder,
Inspect,
Inspect.Algebra,
Inspect.Opts,