Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add JSON schema for top-level config file + use for validation #489

Merged
merged 17 commits into from
Aug 27, 2024

Conversation

alexdewar
Copy link
Contributor

This PR adds a JSON schema for top-level config files and changes the code to use it for validation. In contrast to the existing schema for index.json, I've split the schema into lots of small subschemas (see #452) for readability and maintainability and I've set additionalProperties to false everywhere (see #486).

There are a couple of small changes that will be needed to existing top-level config files in order for them to work with this new code:

  • The schema URL needs to be specified ("$ref": "https://raw.githubusercontent.com/imperialCHEPI/healthgps/main/schemas/v1/config.json")
  • The version property should be dropped

Other than that, it should work as before. I've tested that all the config files in the examples repo work (or, rather, they will after we merge imperialCHEPI/healthgps-examples#13), but it is possible that the schema is overly restrictive (e.g. a given value can be null, but we aren't allowing for this).

I also had to refactor the C++ code for working with schemas to make it more generic, but it essentially does what it did before.

Closes #449.

@jzhu20 FYI. If you send me the config files you're using at present I can check they work with the new schema (hopefully they will!).

Copy link
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

clang-tidy made some suggestions

src/HealthGPS.Input/configuration.cpp Outdated Show resolved Hide resolved
@@ -35,6 +36,9 @@
namespace {
using namespace hgps::input;

static constexpr const char *ConfigSchemaFileName = "config.json";
static constexpr int ConfigSchemaVersion = 1;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: 'ConfigSchemaVersion' is a static definition in anonymous namespace; static is redundant here [readability-static-definition-in-anonymous-namespace]

Suggested change
static constexpr int ConfigSchemaVersion = 1;
constexpr int ConfigSchemaVersion = 1;

src/HealthGPS.Input/datamanager.cpp Outdated Show resolved Hide resolved
src/HealthGPS.Input/datamanager.cpp Outdated Show resolved Hide resolved
src/HealthGPS.Input/datamanager.cpp Outdated Show resolved Hide resolved
src/HealthGPS.Input/schema.cpp Outdated Show resolved Hide resolved
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Copy link
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

clang-tidy made some suggestions

src/HealthGPS.Input/configuration.cpp Outdated Show resolved Hide resolved
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Copy link
Contributor

@TinyMarsh TinyMarsh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool, just a couple of questions.

},
"coverage_cutoff_time": {
"type": "integer",
"minValue": 0
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't really know, but it seems the keyword for setting a minimum value is minimum, rather than minValue. See here:
https://json-schema.org/understanding-json-schema/reference/numeric#range

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good spot! I'm not sure how I managed that 🤦. We also have a pre-commit hook to check the schemas are valid, but it seems you can add extra properties and it won't complain 🤷

},
"child_cutoff_age": {
"type": "integer",
"minValue": 0
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See above.

Comment on lines +12 to +16
"function_parameters": {
"type": "array",
"items": {
"type": "number"
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just checking this array doesn't require a minimum number of values? Same applies for other arrays in this schema I guess.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure... @jamesturner246?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think these depend on the SES model -- e.g. normal has mean, std. Maybe there are other models that have none? Never uses the SES module.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm... ok. I'll leave as is for now. We can always add extra constraints in future if needed.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do any of the string types here require a minLength to ensure they are not empty?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good idea. I don't know whether I can be bothered to put minLengths all over the place though, but maybe I'm just being lazy 😆.

What do you think @jamesturner246? How pedantic do we want to be?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I won't push you -- one would hope users know not to put empty string fields... Certainly won't hurt though.

@alexdewar
Copy link
Contributor Author

Thanks for going through this @TinyMarsh. I know it's probably not much fun just reviewing a bunch of JSON 😆

Copy link
Contributor

@jamesturner246 jamesturner246 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

I won't push to add extra validation like string length checking now if it's tons of work. This looks like the bulk of it. And some of the arrays like SES module I don't even know how long they are supposed to be. We can look at it later if problems occur.

Comment on lines +40 to +41
// **YUCK**: We have to read in the data with jsoncons here rather than reusing the
// nlohmann-json representation :-(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would there be any benefit to switching over to another JSON lib, e.g. jsoncons in future? I saw an issue opened by someone: #173 (comment). No need to worry now.

Copy link
Contributor Author

@alexdewar alexdewar Aug 27, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure. nlohmann-json is kind of the standard one, but it seems like jsoncons has a very similar interface. It bugs me that we're using two JSON libraries, but I fear it might involve a fair amount of refactoring to swap everything over to jsoncons.

@alexdewar
Copy link
Contributor Author

It seems that you can provide relative paths to schemas more tersely, so I've fixed that.

@alexdewar
Copy link
Contributor Author

After our discussion in the meeting, I think I'll change this so that it's fully backwards compatible with existing config files, just to make everyone's life easier. Shouldn't be hard to do.

Copy link
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

clang-tidy made some suggestions

throw std::runtime_error(message);
} else {
fmt::print(fmt::fg(fmt::color::dark_salmon), "{}\n", message);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: do not use 'else' after 'throw' [readability-else-after-return]

Suggested change
}
} fmt::print(fmt::fg(fmt::color::dark_salmon), "{}\n", message);

Copy link
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

clang-tidy made some suggestions

src/HealthGPS.Input/schema.cpp Show resolved Hide resolved
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Copy link
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

clang-tidy made some suggestions

throw std::runtime_error(message);
} else {
fmt::print(fmt::fg(fmt::color::dark_salmon), "{}\n", message);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

warning: do not use 'else' after 'throw' [readability-else-after-return]

Suggested change
}
} fmt::print(fmt::fg(fmt::color::dark_salmon), "{}\n", message);
fmt::print(fmt::fg(fmt::color::dark_salmon), "{}\n", message);

@alexdewar alexdewar merged commit 470eaf0 into main Aug 27, 2024
5 checks passed
@alexdewar alexdewar deleted the config_schema branch August 29, 2024 08:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add JSON schema for top-level configuration file
3 participants