Schema Evolution #125

Open
2 tasks
dominiklohmann opened this issue Nov 13, 2023 · 1 comment
Labels
engine (Core pipeline and storage engine), performance (Improvements or regressions of performance)

Comments

@dominiklohmann
Member

Schemas change over time. With schema inference in particular, we often end up with multiple schemas that share a name but differ in their actual layout.

Right now, the schema ID lets users filter these variants apart, but that just shifts the burden onto the user. Instead, we want to transparently cast events to a superset schema covering all schemas of the same name whenever a partition is accessed.
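
To make the idea concrete, here is a minimal sketch of what "cast to a superset schema" could mean. It is not the engine's implementation: the dict-based schema representation, the names `superset_schema` and `cast_event`, and the sample fields are all assumptions made up for illustration. Conflicting field types would need a real promotion policy; the sketch simply refuses them.

```python
from typing import Any

# Hypothetical representation: a schema maps field names to type names.
Schema = dict[str, str]
Event = dict[str, Any]


def superset_schema(schemas: list[Schema]) -> Schema:
    """Union the fields of all schemas, keeping first-seen field order."""
    merged: Schema = {}
    for schema in schemas:
        for field, type_name in schema.items():
            known = merged.setdefault(field, type_name)
            if known != type_name:
                # A real system needs a type promotion policy here.
                raise TypeError(f"field {field!r}: {known} vs {type_name}")
    return merged


def cast_event(event: Event, target: Schema) -> Event:
    """Project an event onto the target schema; absent fields become None."""
    return {field: event.get(field) for field in target}


# Two variants of the "same" schema that differ in one field each.
schema_a = {"timestamp": "time", "src_ip": "ip", "proto": "string"}
schema_b = {"timestamp": "time", "src_ip": "ip", "app_proto": "string"}
merged = superset_schema([schema_a, schema_b])

event_a = {"timestamp": "2023-11-13T10:00:00Z", "src_ip": "10.0.0.1", "proto": "TCP"}
widened = cast_event(event_a, merged)
```

After the cast, every event has the same set of fields regardless of which variant it was stored with, so the user no longer has to distinguish variants by schema ID.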

Definition of Done

@dominiklohmann
Member Author

This was (indirectly) requested by a customer alongside #123. Because the JSON parser infers schemas and their Suricata instance is configured not to write null values, they end up with many slightly different schemas for Suricata. Partitions with different schemas cannot be merged, so partition sizes stay suboptimal and exports from the node slow down.
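
A small illustration of how omitted null fields multiply schemas. This is a deliberately naive stand-in for schema inference, not the actual JSON parser: `infer_schema` is a hypothetical helper, and the sample events are made up rather than real Suricata output.

```python
import json


def infer_schema(event: dict) -> tuple:
    """Naive inference: the schema is the sorted (field, type name) pairs."""
    return tuple(sorted((key, type(value).__name__) for key, value in event.items()))


# Made-up events in which fields that would be null are simply left out.
lines = [
    '{"event_type": "alert", "src_ip": "10.0.0.1", "app_proto": "http"}',
    '{"event_type": "alert", "src_ip": "10.0.0.2"}',
    '{"event_type": "alert", "src_ip": "10.0.0.3", "flow_id": 42}',
]

schemas = {infer_schema(json.loads(line)) for line in lines}
distinct = len(schemas)

# All fields seen across the variants, i.e. the fields of a superset schema.
all_fields = sorted({field for schema in schemas for field, _ in schema})
```

Three events of the same kind yield three distinct inferred schemas here, even though all of them fit into a single schema with four fields.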

@dominiklohmann added the performance and engine labels on Dec 1, 2023
@dominiklohmann transferred this issue from another repository on Dec 8, 2023