MaxBrandtner edited this page May 3, 2024 · 16 revisions

jv is the C type of all jq values in jq's internal library, libjq. All jv values are immutable, which is a requirement if you want to implement jq's backtracking while remaining approximately sane.

The jv/jq APIs are public, mostly -- we've not made any stability commitments, but we're not likely to make breaking changes either.

Immutability means that functions operating on values of C type jv tend to be referentially transparent: you can't pass an empty array to a function and expect it to be filled in when the function returns. If you want a function to return some information, it has to actually return a new object, since it can't modify its arguments.

This means that some of the API usage will look a little odd. For instance, the functions jv_array_get() and jv_array_set() can be used to get and set elements of an array. The usage of jv_array_get() is fairly standard:

    jv elem = jv_array_get(array, 42);

But to use jv_array_set(), you have to know that it returns the new array. You can't ignore the return value.

    array = jv_array_set(array, 42, elem);

jv values are reference-counted. With very few exceptions (see below), the jv_*() functions consume one reference for every argument of type jv. In array = jv_array_set(array, 42, elem); the old value of array is effectively jv_free()ed by jv_array_set(), which returns a new value that we assign back to array. A sequence of such calls is therefore correct:

    jv array = jv_array();
    array = jv_array_append(array, jv_number(0));
    array = jv_array_append(array, jv_number(1));
    array = jv_array_append(array, jv_number(2));

This pattern is so common that there is a variadic macro for it, JV_ARRAY().

Kinds

The "kind" of a jv value is one of the following, defined in the enum jv_kind:

  • JV_KIND_INVALID
  • JV_KIND_NULL
  • JV_KIND_FALSE
  • JV_KIND_TRUE
  • JV_KIND_NUMBER
  • JV_KIND_STRING
  • JV_KIND_ARRAY
  • JV_KIND_OBJECT

All but the first represent normal JSON values. The "Errors" section below explains invalid values. You can check the kind of a value by calling jv_get_kind() on it.

Functions

Memory management:

  • jv jv_copy(jv)

    This "copies" the given jv value. Actually it increments the value's reference count.

  • void jv_free(jv)

    This frees the given jv value. Actually it decrements the value's reference count, freeing its resources only if the reference count falls to zero.

  • void* jv_mem_alloc(size_t) Like malloc(), but is guarded by a memory exhausted handler.

  • void* jv_mem_alloc_unguarded(size_t) Like jv_mem_alloc(), but doesn't have the handler.

  • void* jv_mem_calloc(size_t, size_t)

  • void* jv_mem_calloc_unguarded(size_t, size_t)

  • void jv_mem_free(void*)

  • void* jv_mem_realloc(void*, size_t)

  • char *jv_mem_strdup(const char *)

  • char *jv_mem_strdup_unguarded(const char *)

  • void jv_nomem_handler(jv_nomem_handler_f, void*)

Scalar constructors:

  • jv jv_null(void)
  • jv jv_bool(int)
  • jv jv_true(void)
  • jv jv_false(void)
  • jv jv_invalid(void) (see below, under "Invalid" jv functions)

Array functions:

  • jv jv_array(void) (constructs an empty array)
  • jv jv_array_append(jv, jv) (appends the second item to an array)
  • jv jv_array_concat(jv, jv) (concatenates two arrays)
  • jv jv_array_get(jv, int) (returns the nth item from the array)
  • jv jv_array_indexes(jv, jv)
  • int jv_array_length(jv) (returns the length of the array)
  • jv jv_array_set(jv, int, jv) (sets the nth value in the array)
  • jv jv_array_sized(int) (creates an empty array with space preallocated for as many items as requested)
  • jv jv_array_slice(jv, int, int) (creates a new array that has a slice of the given array from the start to the end indices)

Object functions:

  • jv jv_object(void) (constructs an empty object)
  • jv jv_object_delete(jv, jv) (deletes a key from an object)
  • jv jv_object_get(jv, jv) (returns the value for a key, or jv_invalid otherwise)
  • int jv_object_has(jv, jv) (checks if a key is in the object)
  • int jv_object_iter(jv) (return a value for keeping track of iteration)
  • jv jv_object_iter_key(jv, int) (return the key corresponding to an iteration value)
  • int jv_object_iter_next(jv, int) (change to the next key value pair)
  • int jv_object_iter_valid(jv, int) (check if an iteration value is valid)
  • jv jv_object_iter_value(jv, int) (return the value corresponding to an iteration value)
  • int jv_object_length(jv) (returns the number of keys in the object)
  • jv jv_object_merge(jv, jv) (copies keys from the second object, overwriting matching keys)
  • jv jv_object_merge_recursive(jv, jv) (same as jv_object_merge(), but merges object values instead of overwriting them)
  • jv jv_object_set(jv, jv, jv) (sets a key in an object)

String functions:

  • jv jv_string(const char *) (creates a string from a null terminated C string)
  • jv jv_string_append_buf(jv, const char *, int) (appends n bytes from a C string)
  • jv jv_string_append_codepoint(jv, uint32_t) (appends a single UTF-8 codepoint)
  • jv jv_string_append_str(jv, const char *) (appends a null terminated C string)
  • jv jv_string_concat(jv, jv) (concatenates two strings)
  • jv jv_string_empty(int) (constructs a string of the given length, filled with NUL bytes)
  • jv jv_string_explode(jv) (returns an array of the UTF-8 codepoints in the string)
  • jv jv_string_fmt(const char *, ...) (returns a formatted string just like printf())
  • unsigned long jv_string_hash(jv) (returns the hash of the string)
  • jv jv_string_implode(jv) (converts an array of UTF-8 codepoints into a string, replacing invalid codepoints)
  • jv jv_string_indexes(jv, jv) (finds the indexes where the second string appears)
  • int jv_string_length_bytes(jv) (returns the length of the string, in bytes in UTF-8 encoding)
  • int jv_string_length_codepoints(jv) (returns the length of the string, in Unicode code points)
  • jv jv_string_sized(const char *, int) (creates a string from n bytes of a C string)
  • jv jv_string_slice(jv, int, int) (creates a new string that has a slice of the given string from the start to the end indices)
  • jv jv_string_split(jv, jv) (splits a string by another string, or into UTF-8 codepoints if the second string is empty)
  • const char *jv_string_value(jv) (returns the null terminated UTF-8 C string)
  • jv jv_string_vfmt(const char *, va_list) (same as jv_string_fmt() but takes a va_list)

Number functions:

  • jv jv_number(double) (construct a number from a double)
  • const char *jv_number_get_literal(jv) (return the literal value, if the number is decimal and not infinite, otherwise NULL)
  • int jv_number_has_literal(jv) (is a number a decimal number instead of a double)
  • double jv_number_value(jv) (get the value as a double)
  • jv jv_number_with_literal(const char *) (create a number with a literal value)

"Invalid" jv functions:

  • jv jv_invalid(void) (construct an empty invalid value, used to signal the end of a generator)
  • jv jv_invalid_get_msg(jv) (get the message from an invalid value)
  • int jv_invalid_has_msg(jv) (check if an invalid value has a message)
  • jv jv_invalid_with_msg(jv) (construct an invalid value with a message, used as an error)

JSON parsing functions:

  • jv jv_parse(const char *) (parse one JSON value from a null terminated C string)
  • void jv_parser_free(struct jv_parser *) (destroy a parser data structure)
  • struct jv_parser* jv_parser_new(int) (construct a parser data structure with flags)
  • jv jv_parser_next(struct jv_parser *) (yield the next value from the parser)
  • int jv_parser_remaining(struct jv_parser *) (returns the number of remaining characters)
  • void jv_parser_set_buf(struct jv_parser *, const char *, int, int) (add a chunk of characters to parse with a length and if there's more)
  • jv jv_parse_sized(const char *, int) (parse one JSON value from n characters of a buffer)

JSON formatting functions:

  • void jv_dump(jv, int) (the int parameter is a flags field; the flags are from enum jv_print_flags)
  • void jv_dumpf(jv, FILE *, int) (ditto)
  • jv jv_dump_string(jv, int) (ditto)
  • char *jv_dump_string_trunc(jv, char *, size_t) (write a possibly-truncated, NUL-terminated JSON text to the given buffer)
  • void jv_show(jv, int) (prints the JSON representation of the given value to stdout, mainly for use in debuggers)

Comparison functions:

  • int jv_cmp(jv, jv) (comparator of two values)
  • int jv_equal(jv, jv) (checks if the two values are equal)
  • int jv_identical(jv, jv) (checks if the two values have the same memory representation)

Misc. utility functions

  • int jv_contains(jv, jv) (recursively checks if a value is contained in another value)
  • jv jv_delpaths(jv, jv) (delete values at paths in an object or array)
  • jv jv_get(jv, jv) (same as jq a[b])
  • jv_kind jv_get_kind(jv) (get the type of a value)
  • jv jv_getpath(jv, jv) (get a value at a path)
  • int jv_get_refcnt(jv) (get the reference count)
  • jv jv_group(jv, jv) (return an array of groups grouped by the keys)
  • jv jv_has(jv, jv) (check if a value contains another value as a key)
  • int jv_is_integer(jv) (check if a number is an integer)
  • jv jv_keys(jv) (get a sorted array of keys of an array or object)
  • jv jv_keys_unsorted(jv) (get an array of keys in an object)
  • const char *jv_kind_name(jv_kind) (get a type name for a jv type)
  • jv jv_load_file(const char *, int) (load a file into a jv string or parse it as JSON)
  • jv jv_set(jv, jv, jv) (same as jq a[b]=c)
  • jv jv_setpath(jv, jv, jv) (set a value at a path)
  • jv jv_sort(jv, jv) (sort an array with another array of keys)

Errors

As well as the normal kinds of JSON values (array, bool, string, etc.), jv supports objects of kind JV_KIND_INVALID. Such objects are used to signal errors. Some of them carry error messages, which may be an arbitrary JSON value (you can check with jv_invalid_has_msg() and retrieve it with jv_invalid_get_msg()).

Generally, the kind-specific functions like jv_array_get() require that their argument be of the correct kind, and trigger an assertion failure aborting the program if not. That is, the program will crash if you pass a string to jv_array_get(): you must check that the argument is an array before using this function.

The functions are forgiving as long as the kinds are right. If you call jv_array_get() with an out-of-bounds index, then you will get an object of kind JV_KIND_INVALID back. This definitely indicates an invalid index; it is impossible to store an object of kind JV_KIND_INVALID in an array (or in anything else, for that matter).

You may find it more convenient to use the higher-level functions from <jv_aux.h>, which do more runtime error-checking and are implemented in terms of the primitives in <jv.h>. For instance, jv_get from <jv_aux.h> takes a value and an index. If the value is an array and the index is in-bounds, it returns the corresponding element. If the value is an object and the index is a valid string key, it returns the corresponding entry. Otherwise, it returns a JV_KIND_INVALID with a suitable error message.

Memory management

jv refcounts all heap-allocated objects. The usual objection to refcounting is that it fails when objects contain cycles. This is true. Luckily, due to the immutability of jv objects, it's impossible to create a cycle.

This is a pleasant property; as well as getting rid of pointer aliasing (a fertile source of bugs), it also limits us to acyclic heap structures. Since JSON does not support cyclic structures, this means that any jv object can be rendered as JSON.

Most jv functions are said to "consume" their arguments. That is, once you have passed the arguments to the function you may no longer use them and their memory may be reused. For instance, in the jv_array_get() example above, it is invalid to use the variable array after that line has executed. If you need to reuse a jv value, you can call jv_copy() to get a second copy of it. jv_copy() does not consume its argument.

It may seem like jv_copy() does a deep copy of the object. It certainly behaves in this way, and if you keep that model in mind when writing jv code you'll get the right answer. However, jv_copy() is in fact very cheap; see "Implementation" below for how it works.

You must consume every jv value, otherwise there may be memory leaks (the tests won't pass if so, as they're run under valgrind). If you have nothing else to do with a value, pass it to jv_free(), which consumes its argument and does nothing with it.

Functions that do not decrement their jv arguments' references

  • jv_copy() (naturally)
  • jv_get_kind()
  • jv_string_value()
  • jv_number_value()
  • jv_show()

Implementation

The jv API can be used as though every operation copied the entire object and jv_copy did a deep-copy. That's a useful mental model to program with, but it would be horrendously slow. Instead, jv uses a copy-on-write scheme for all objects.

In the worst case, the jv functions will need to copy their input object. However, most of the time there's no reason to keep the old version around as it will never be used again. In this case, the refcount of the input object will be 1 (only one reference) and the function would have to free it. So, all of the functions that return a new version of an object (e.g. jv_array_set) first check whether the refcount is 1. If so, they know they can safely modify the object in-place without allocating any new memory. Thus, most of the time, jv_array_set won't copy anything.

jv_copy is then implemented by increasing the reference count by 1. This means that the object won't be modified by future calls to jv_array_set and the like. Instead, jv_array_set will copy the object and modify that.
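
The scheme can be illustrated with a toy model in plain C. This is not jq's actual data layout or code: toy_array and the toy_*() functions are hypothetical stand-ins that only mimic the refcount check described above.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for a jv array: a reference-counted block of ints. */
typedef struct {
    int refcnt;
    int len;
    int items[];
} toy_array;

toy_array *toy_new(int len) {
    toy_array *a = calloc(1, sizeof *a + (size_t)len * sizeof(int));
    assert(a != NULL);
    a->refcnt = 1;
    a->len = len;
    return a;
}

/* "Copying" only adds a reference. */
toy_array *toy_copy(toy_array *a) {
    a->refcnt++;
    return a;
}

/* Drops one reference and releases the memory when none remain. */
void toy_free(toy_array *a) {
    if (--a->refcnt == 0)
        free(a);
}

/* Consumes one reference to `a` and returns an array with items[i] == v. */
toy_array *toy_set(toy_array *a, int i, int v) {
    assert(i >= 0 && i < a->len);
    if (a->refcnt > 1) {
        /* Someone else still holds the old version: copy before writing. */
        toy_array *fresh = toy_new(a->len);
        memcpy(fresh->items, a->items, (size_t)a->len * sizeof(int));
        toy_free(a); /* give up the reference we consumed */
        a = fresh;
    }
    /* Here refcnt is 1, so nobody else can observe the write. */
    a->items[i] = v;
    return a;
}
```

With a single reference, toy_set() returns the same pointer it was given. After toy_copy(), the next toy_set() returns a different pointer and leaves the original contents untouched.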
