Parsers/json: rewrite and speedup #210

Open
laskoviymishka opened this issue Feb 8, 2025 · 4 comments
Labels
enhancement (New feature or request), good first issue (Good for newcomers), help wanted (Extra attention is needed)

Comments

@laskoviymishka
Contributor

laskoviymishka commented Feb 8, 2025

🚀 Rewrite JSON Parser

Background

The current JSON parser (generic_parser.go) is complex, slow (~40-50MB/s), and overloaded with configuration options that are rarely used. We need a more efficient and streamlined solution while maintaining reasonable backward compatibility.

Why?

  • Performance bottleneck: parsing speed is suboptimal.
  • Overly complex configuration, making it harder to maintain.
  • Many features are not widely used but impact performance.
  • Some features are not used at all and cannot even be enabled, yet their code is still there.

Scope

✅ Canon tests should pass for both the old and the new parser.
✅ Faster parsing using tidwall/gjson, or the current library's fastjson.Get, for efficient field extraction.
✅ Simplified implementation with fewer unnecessary options.
✅ _rest column to store fields not explicitly defined in the schema.
✅ Optional system columns.
✅ Better maintainability and performance.
✅ Performance measured with benchmarks.
❌ Unnecessary configuration options that add complexity without significant benefit.
❌ Legacy behaviors that are too expensive to backport.
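
To make the _rest item concrete, here is a minimal sketch of splitting a record into schema fields plus a single _rest value. It uses only encoding/json so it runs without extra dependencies; a real rewrite would presumably do the extraction with gjson or fastjson instead. The names splitRest and schema, and the map-based row layout, are hypothetical and not taken from generic_parser.go.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// splitRest parses one JSON object and separates the fields named in schema
// from all other fields, which are re-encoded into a single "_rest" value.
// The function name, schema slice, and row layout are hypothetical.
func splitRest(line []byte, schema []string) (map[string]json.RawMessage, error) {
	var all map[string]json.RawMessage
	if err := json.Unmarshal(line, &all); err != nil {
		return nil, fmt.Errorf("parse json: %w", err)
	}
	row := make(map[string]json.RawMessage, len(schema)+1)
	for _, name := range schema {
		if v, ok := all[name]; ok {
			row[name] = v
			delete(all, name) // whatever stays in all ends up in _rest
		}
	}
	if len(all) > 0 {
		rest, err := json.Marshal(all) // map keys are emitted in sorted order
		if err != nil {
			return nil, fmt.Errorf("encode _rest: %w", err)
		}
		row["_rest"] = rest
	}
	return row, nil
}

func main() {
	line := []byte(`{"id":1,"name":"a","extra":true,"tags":["x"]}`)
	row, err := splitRest(line, []string{"id", "name"})
	if err != nil {
		panic(err)
	}
	// prints: 1 "a" {"extra":true,"tags":["x"]}
	fmt.Println(string(row["id"]), string(row["name"]), string(row["_rest"]))
}
```

Keeping the values as json.RawMessage avoids decoding fields that are only passed through, which is the same idea the targeted field lookups in gjson or fastjson rely on.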

Compatibility Considerations

  • The new parser should maintain core functionality but will not guarantee full backward compatibility if a feature is too costly to support.
  • _rest and system columns must be preserved to avoid breaking existing workflows.
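
One way to see exactly where the new parser departs from the old one is a small parity check over shared inputs. This is only a sketch: parseFunc is a hypothetical common signature, and stdParse and dropNulls are stand-ins for the old and new parser, not the project's code or its canon test harness.

```go
package main

import (
	"encoding/json"
	"fmt"
	"reflect"
)

// parseFunc is a hypothetical common signature for the old and new parser.
type parseFunc func(line []byte) (map[string]any, error)

// mismatches returns the indexes of inputs for which the two parsers disagree,
// either in the decoded row or in whether they report an error.
func mismatches(oldP, newP parseFunc, inputs [][]byte) []int {
	var bad []int
	for i, in := range inputs {
		oldRow, oldErr := oldP(in)
		newRow, newErr := newP(in)
		if (oldErr == nil) != (newErr == nil) || !reflect.DeepEqual(oldRow, newRow) {
			bad = append(bad, i)
		}
	}
	return bad
}

// stdParse stands in for the existing parser in this sketch.
func stdParse(line []byte) (map[string]any, error) {
	var row map[string]any
	err := json.Unmarshal(line, &row)
	return row, err
}

// dropNulls stands in for a rewritten parser that intentionally omits null fields.
func dropNulls(line []byte) (map[string]any, error) {
	row, err := stdParse(line)
	for k, v := range row {
		if v == nil {
			delete(row, k)
		}
	}
	return row, err
}

func main() {
	inputs := [][]byte{[]byte(`{"a":1}`), []byte(`{"a":null}`), []byte(`not json`)}
	fmt.Println(mismatches(stdParse, dropNulls, inputs)) // prints: [1]
}
```

Each reported index is then either a bug in the rewrite or a legacy behavior that was deliberately dropped, which keeps the "too costly to support" decisions explicit.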
laskoviymishka added the enhancement, good first issue, and help wanted labels on Feb 8, 2025
@HemanthKR07

Hello @laskoviymishka,
I can give my best to fix this one. Can you assign it to me? Thanks.

@laskoviymishka
Contributor Author

@HemanthKR07 feel free to take it

@HemanthKR07

HemanthKR07 commented Feb 10, 2025

Hello @laskoviymishka,
url
Please review and let me know if I can create a PR.
Thanks.

@laskoviymishka
Contributor Author

Hello @laskoviymishka, url Please review and let me know if I can create a PR. Thanks.

As a starting point it looks legit, but let's try to integrate it with benchmark tests and the existing test infra.
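
For the benchmark part, a rough sketch of how throughput in MB/s could be measured with the standard testing package, so the result is comparable to the ~40-50MB/s figure in the issue. parseLines uses encoding/json purely as a placeholder for the parser under test, and both function names are made up for this example. Inside the repository this would more naturally be a regular BenchmarkXxx function using b.SetBytes and run through go test -bench.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"testing"
)

// parseLines decodes newline-delimited JSON objects and returns how many rows
// it parsed. encoding/json is a placeholder; swap in the parser under test.
func parseLines(data []byte) (int, error) {
	rows := 0
	for _, line := range bytes.Split(data, []byte("\n")) {
		if len(line) == 0 {
			continue // skip blank lines, including the one after a trailing newline
		}
		var row map[string]any
		if err := json.Unmarshal(line, &row); err != nil {
			return rows, err
		}
		rows++
	}
	return rows, nil
}

// throughputMBps runs parseLines under the testing package's benchmark driver
// and converts the result to megabytes per second.
func throughputMBps(data []byte) float64 {
	testing.Init() // registers testing flags; no effect if already called
	res := testing.Benchmark(func(b *testing.B) {
		b.SetBytes(int64(len(data)))
		for i := 0; i < b.N; i++ {
			if _, err := parseLines(data); err != nil {
				b.Fatal(err)
			}
		}
	})
	return float64(res.Bytes) * float64(res.N) / res.T.Seconds() / 1e6
}

func main() {
	sample := bytes.Repeat([]byte(`{"id":1,"name":"a","tags":["x","y"]}`+"\n"), 1000)
	fmt.Printf("%.1f MB/s\n", throughputMBps(sample))
}
```

Running the old and new parser against the same sample data gives a like-for-like number to report in the PR.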
