
[Dev] Generate CPP code from the OpenAPI REST spec #157


Open
wants to merge 65 commits into
base: main

Tishj
Collaborator

@Tishj Tishj commented Apr 7, 2025

This PR creates CPP header+source files for all the schemas in the Iceberg OpenAPI REST catalog spec.

The yyjson_val* root is parsed into a tree of POD types, vector, case_insensitive_map_t and <schema class> objects, with the exception of:

  • LiteralExpression's value property.
  • UnaryExpression's value property.

These remain yyjson_val*, only because the spec leaves them entirely ambiguous:

        value:
          type: object

Parse the OpenAPI spec

First we create a dictionary of Property objects from the schemas in the spec.
Whenever we encounter an object type inside a property or somewhere that isn't directly part of a schemas entry, we create a custom ($ref) entry for it (these are the classes named Object[0-9]+).
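
As a rough illustration of this hoisting step, here is a minimal sketch. It is not the PR's actual code: the function name `hoist_inline_objects`, the numbering order, and the rule that only inline objects with a `properties` key are hoisted are all assumptions made for the example.

```python
import itertools


def hoist_inline_objects(schemas):
    """Return new schemas where nested inline objects are replaced by $ref entries.

    Hypothetical helper: names and numbering are assumptions, not the PR's code.
    """
    counter = itertools.count(1)
    result = {}

    def visit(node, top_level):
        if not isinstance(node, dict):
            return node
        visited = {}
        for key, value in node.items():
            if key == 'properties' and isinstance(value, dict):
                # Recurse first so deeper inline objects are hoisted too
                visited[key] = {name: visit(prop, False) for name, prop in value.items()}
            elif key == 'items':
                visited[key] = visit(value, False)
            else:
                visited[key] = value
        # Assumption: only objects that declare properties get their own class
        if not top_level and visited.get('type') == 'object' and 'properties' in visited:
            name = f'Object{next(counter)}'
            result[name] = visited
            return {'$ref': f'#/components/schemas/{name}'}
        return visited

    for name, schema in schemas.items():
        result[name] = visit(schema, True)
    return result
```

In this sketch a bare `type: object` with no properties is left alone, which matches the two `value` properties above that stay as yyjson_val*.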

Whenever we encounter an object that consists of only an allOf with a single $ref entry, we collapse that object into the allOf's single entry.
(I have been told this is done to be able to include a description along with the $ref in an OpenAPI schema definition)
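
The collapse could look roughly like the sketch below. The helper name `collapse_single_ref_allof` is hypothetical, and it assumes that a sibling `description` is the only extra key allowed on the wrapper (and that dropping it is acceptable).

```python
def collapse_single_ref_allof(schema):
    """Replace {'allOf': [{'$ref': ...}]} wrappers by the $ref entry itself.

    Hypothetical helper; anything that is not a pure wrapper is returned unchanged.
    """
    all_of = schema.get('allOf')
    is_single_ref = (
        isinstance(all_of, list)
        and len(all_of) == 1
        and isinstance(all_of[0], dict)
        and set(all_of[0]) == {'$ref'}
    )
    # Assumption: a description next to the allOf is the only extra key allowed
    extra_keys = set(schema) - {'allOf', 'description'}
    if is_single_ref and not extra_keys:
        return all_of[0]
    return schema
```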

Whenever a property can contain itself (Expression, Type) we mark it as recursive.
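
One way to detect this is to follow $ref edges and check whether a schema can reach itself. The sketch below does that; `referenced_names` and `is_recursive` are hypothetical names, and the PR may detect recursion differently.

```python
def referenced_names(node):
    """Yield the schema names referenced through $ref anywhere under `node`."""
    if isinstance(node, dict):
        for key, value in node.items():
            if key == '$ref' and isinstance(value, str):
                # '#/components/schemas/Expression' -> 'Expression'
                yield value.rsplit('/', 1)[-1]
            else:
                yield from referenced_names(value)
    elif isinstance(node, list):
        for item in node:
            yield from referenced_names(item)


def is_recursive(name, schemas):
    """Return True if `name` can reach itself by following $ref edges."""
    seen = set()
    stack = list(referenced_names(schemas[name]))
    while stack:
        current = stack.pop()
        if current == name:
            return True
        if current in seen or current not in schemas:
            continue
        seen.add(current)
        stack.extend(referenced_names(schemas[current]))
    return False
```

The check covers indirect cycles as well, so a schema that contains itself only through another schema is still marked.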

Generate CPP Code

From a Property we create a CPPClass, which contains all the logic needed to construct the body of the TryFromJSON method, the variables, and the nested classes (the Object[0-9]+ we mentioned earlier).

If a property is recursive, it's wrapped in a unique_ptr.
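
The wrapping is needed because a C++ class cannot hold a member of its own (still incomplete) type by value, while a pointer to it is fine. A minimal sketch of how a generator might emit such a member follows; `member_declaration` is a hypothetical helper, and the tab-indented output style is only inferred from the generator snippet quoted in the review.

```python
def member_declaration(cpp_type, name, recursive=False):
    """Render one C++ member variable line for a generated class.

    Hypothetical helper: recursive members are held through unique_ptr,
    since a class cannot contain itself by value.
    """
    if recursive:
        cpp_type = f'unique_ptr<{cpp_type}>'
    return f'\t{cpp_type} {name};'
```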

We generate the following from these CPPClass objects:

  • a CMakeLists.txt containing all the generated source files (in src/rest_catalog/objects)
  • a list.hpp file containing all the generated header files (in src/include/rest_catalog/objects)
  • a source file for every entry in schemas (in src/rest_catalog/objects)
  • a header file for every entry in schemas (in src/include/rest_catalog/objects)
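
The two aggregate files in this list could be rendered along these lines. This is a sketch only: the include path prefix, the `add_library(... OBJECT ...)` layout, the target name `rest_catalog_objects`, and the file stems are assumptions, not taken from the PR.

```python
def render_list_header(stems):
    """Render a list.hpp that includes every generated header (assumed layout)."""
    lines = ['#pragma once', '']
    lines += [f'#include "rest_catalog/objects/{stem}.hpp"' for stem in sorted(stems)]
    return '\n'.join(lines) + '\n'


def render_cmake(stems, target='rest_catalog_objects'):
    """Render a CMakeLists.txt listing every generated source (assumed layout)."""
    lines = [f'add_library({target} OBJECT']
    lines += [f'  {stem}.cpp' for stem in sorted(stems)]
    lines.append(')')
    return '\n'.join(lines) + '\n'
```

Sorting the stems keeps the generated files stable between runs, so regenerating does not produce spurious diffs.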

Future Work

We might benefit from the ability to do the reverse of this (C++ -> JSON) if some of the REST endpoints require JSON to be sent in the body of the request.

It should be possible to generate the serialization code for this, and perhaps also the helper (setter) methods to populate the object, plus a validity method to check compliance with the spec (required fields are populated, allOf schemas are respected, etc.).

Tishj added 30 commits March 27, 2025 14:38
…<string>', add better ordering to properties
'public:',
f'\tstatic {self.name} FromJSON(yyjson_val *obj);',
'public:',
'\tstring TryFromJSON(yyjson_val *obj);',
Collaborator Author

@Tishj Tishj Apr 7, 2025

Just a thought, maybe (as future work) we want to create a stack of errors:

bool TryFromJSON(yyjson_val *obj, stack<string> &error);

Especially since we have these Object[0-9]+ classes, these errors aren't very descriptive if they happen at deeper levels.
With a stack we can use:

if (!TryFromJSON(val, error)) {
    error.push("<ClassName> failed to parse");
    return false;
}
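
On the generator side, emitting that wrapper for a nested member might look like the sketch below. `emit_nested_parse` is a hypothetical helper, and calling TryFromJSON on the member itself is an assumption about how the generated code would be shaped.

```python
def emit_nested_parse(class_name, member, json_var):
    """Return the C++ lines that parse a nested member and push context on failure.

    Hypothetical helper: assumes the proposed signature
    bool TryFromJSON(yyjson_val *obj, stack<string> &error).
    """
    return [
        f'if (!{member}.TryFromJSON({json_var}, error)) {{',
        f'\terror.push("{class_name} failed to parse");',
        '\treturn false;',
        '}',
    ]
```

Because each level pushes its message after the inner call fails, popping the stack would yield the outermost class first and the innermost error last.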

Collaborator Author

Tishj commented Apr 7, 2025

I'm working on integrating them here; I figured those changes would get lost in the noise in this PR.

See the diff here
