find_enrichment_table_records wildcard match #22920

Open
nzxwang opened this issue Apr 21, 2025 · 0 comments · May be fixed by #23074
Labels
domain: enrichment_tables (Anything related to the Vector's enrichment tables)
type: feature (A value-adding code addition that introduce new functionality.)

nzxwang commented Apr 21, 2025

A note for the community

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment

Use Cases

We would like to configure the sample rate of log events based on tags such as location, service, and level, while still being able to define blanket rules that cover many tag values at once. To this end, we want to match our logs' attributes against an enrichment table in which * stands for any value. For example:

location,service,level,rate
usw2,login-service,*,1
*,*,ERROR,1
*,*,WARN,10
*,*,INFO,100
*,*,DEBUG,1000
*,*,*,10000

Attempted Solutions

Currently, we call find_enrichment_table_records once per combination of wildcards, append the results to a list, and then use custom business logic to select the best one. This is very verbose and scales poorly: with n columns there are 2^n combinations to look up.

_,_ = find_enrichment_table_records("sampling_rules", {"service": service, "location": location, "level": level})
_,_ = find_enrichment_table_records("sampling_rules", {"service": "*", "location": location, "level": level})
_,_ = find_enrichment_table_records("sampling_rules", {"service": service, "location": "*", "level": level})
_,_ = find_enrichment_table_records("sampling_rules", {"service": service, "location": location, "level": "*"})
_,_ = find_enrichment_table_records("sampling_rules", {"service": "*", "location": "*", "level": level})
...
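
To make the growth concrete, here is a minimal Python sketch (not VRL) that enumerates the lookup conditions the workaround above has to issue. The helper name wildcard_conditions and the sample event are hypothetical and used only for illustration.

```python
from itertools import product

WILDCARD = "*"


def wildcard_conditions(event):
    """Return every condition in which each field is either its real value or the wildcard.

    Hypothetical helper: it only builds the condition dicts; it does not query any table.
    """
    fields = list(event)
    conditions = []
    # Each field is either kept (False) or replaced by the wildcard (True),
    # so there are 2 ** len(fields) masks in total.
    for mask in product((False, True), repeat=len(fields)):
        conditions.append(
            {
                field: WILDCARD if use_wildcard else event[field]
                for field, use_wildcard in zip(fields, mask)
            }
        )
    return conditions


# Sample event, assumed for illustration only.
event = {"location": "usw2", "service": "login-service", "level": "INFO"}
conditions = wildcard_conditions(event)
print(len(conditions))  # 8 lookups for 3 columns
```

Each returned condition corresponds to one find_enrichment_table_records call in the current approach, so adding a fourth column would double the count to 16.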

Proposal

Enhance find_enrichment_table_records with an optional wildcard parameter. When it is specified, any column of a row whose value equals the wildcard is treated as a match for that column, while the remaining columns of the row must still match the rest of the provided condition.
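
The intended matching semantics can be sketched in Python against the sample table from the Use Cases section. This is only an illustration of the proposal under stated assumptions, not Vector's implementation: the function find_records and its wildcard keyword are hypothetical stand-ins, and the sample event is made up.

```python
import csv
import io

# The sample rules table from the Use Cases section.
RULES_CSV = """location,service,level,rate
usw2,login-service,*,1
*,*,ERROR,1
*,*,WARN,10
*,*,INFO,100
*,*,DEBUG,1000
*,*,*,10000
"""


def find_records(rows, condition, wildcard=None):
    """Return rows whose columns match the condition.

    Hypothetical stand-in for the proposed behaviour: a column matches if it
    equals the condition value, or if a wildcard is given and the row's
    column equals that wildcard.
    """
    matches = []
    for row in rows:
        if all(
            row[column] == value
            or (wildcard is not None and row[column] == wildcard)
            for column, value in condition.items()
        ):
            matches.append(row)
    return matches


rows = list(csv.DictReader(io.StringIO(RULES_CSV)))
# Sample event, assumed for illustration only.
event = {"location": "usw2", "service": "login-service", "level": "INFO"}

with_wildcard = find_records(rows, event, wildcard="*")
without_wildcard = find_records(rows, event)
print([row["rate"] for row in with_wildcard])
print(without_wildcard)
```

With the wildcard set, a single call returns all three applicable rules (rates 1, 100, and 10000), whereas the same condition without a wildcard returns nothing. Choosing the best rule among the matches would remain the caller's business logic.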

References

No response

Version

vector 0.46.1 (aarch64-apple-darwin 9a19e8a 2025-04-14 18:36:30.707862743)

@pront pront added type: feature A value-adding code addition that introduce new functionality. domain: enrichment_tables Anything related to the Vector's enrichment tables labels Apr 22, 2025
@nzxwang nzxwang linked a pull request May 19, 2025 that will close this issue