enhancement(match_datadog_query): add is_phrase flag to equals method by PSeitz · Pull Request #1334 · vectordotdev/vrl

PSeitz · 2025-03-11T10:44:20Z

Summary

Update the Filter trait's equals signature to include an is_phrase boolean flag.

Without that information there is no way to distinguish between phrased and non-phrased queries, which behave differently in Datadog on default fields.

This is for matching on default fields (typically message) in Datadog. The matching behavior for phrases differs from that of non-phrased queries: in a phrase, the tokens need to appear consecutively and in the same order. E.g.
Hello nice world => "Hello world" does not match
Hello nice world => Hello world matches
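The difference can be illustrated with a small std-only sketch. The `tokenize`, `matches_terms`, and `matches_phrase` helpers below are hypothetical and are not part of VRL. The tokenizer (split on non-alphanumeric characters, lowercase everything) is an assumption, since Datadog's actual tokenization is not described in this PR.

```rust
/// Hypothetical tokenizer: split on non-alphanumeric characters and lowercase.
/// This is an assumption, not Datadog's real tokenization.
fn tokenize(text: &str) -> Vec<String> {
    text.split(|c: char| !c.is_alphanumeric())
        .filter(|t| !t.is_empty())
        .map(|t| t.to_lowercase())
        .collect()
}

/// Non-phrased query: every query token must appear somewhere in the message.
fn matches_terms(message: &str, query: &str) -> bool {
    let haystack = tokenize(message);
    tokenize(query).iter().all(|q| haystack.contains(q))
}

/// Phrased query: the query tokens must appear consecutively and in order.
fn matches_phrase(message: &str, phrase: &str) -> bool {
    let haystack = tokenize(message);
    let needle = tokenize(phrase);
    if needle.is_empty() {
        // `windows(0)` would panic; treat an empty phrase as matching anything.
        return true;
    }
    haystack
        .windows(needle.len())
        .any(|window| window == needle.as_slice())
}
```

With these definitions, `matches_terms("Hello nice world", "Hello world")` holds while `matches_phrase("Hello nice world", "Hello world")` does not, which mirrors the two examples above.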

Alternative Options

There is an alternative solution where the tokenizer would emit multiple tokens (only for default fields):
Hello world would expand to the equivalent of _default_:Hello AND _default_:world.
"Hello world" would expand to the equivalent of _default_:"Hello world"
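A rough sketch of what such an expansion could look like, assuming a simplified query tree. The `Node` enum and `expand_default` function are hypothetical names for illustration only and do not correspond to the real query parser's types.

```rust
/// Hypothetical, simplified query tree for default-field queries.
#[derive(Debug, PartialEq)]
enum Node {
    /// A single term on the default field, e.g. `_default_:Hello`.
    Term(String),
    /// A quoted phrase on the default field, e.g. `_default_:"Hello world"`.
    Phrase(String),
    /// Conjunction of sub-queries.
    And(Vec<Node>),
}

/// Expand a raw default-field query: quoted input stays one phrase,
/// unquoted input becomes an AND over its whitespace-separated terms.
fn expand_default(query: &str) -> Node {
    let trimmed = query.trim();
    if trimmed.len() >= 2 && trimmed.starts_with('"') && trimmed.ends_with('"') {
        // Strip the surrounding quotes and keep the phrase intact.
        return Node::Phrase(trimmed[1..trimmed.len() - 1].to_string());
    }
    let mut terms: Vec<Node> = trimmed
        .split_whitespace()
        .map(|t| Node::Term(t.to_string()))
        .collect();
    if terms.len() == 1 {
        // A single term needs no conjunction wrapper.
        terms.remove(0)
    } else {
        Node::And(terms)
    }
}
```

Under this approach the `equals` signature would stay unchanged, because the phrase/non-phrase distinction is already resolved before the filter is invoked.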

Change Type

Bug fix
New feature
Non-functional (chore, refactoring, docs)
Performance

Is this a breaking change?

Yes
No

Does this PR include user facing changes?

Yes. Please add a changelog fragment based on our guidelines.
No. A maintainer will apply the "no-changelog" label to this PR.

Checklist

Our CONTRIBUTING.md is a good starting place.
If this PR introduces changes to LICENSE-3rdparty.csv, please run dd-rust-license-tool write and commit the changes. More details here.
For new VRL functions, please also create a sibling PR in Vector to document the new function.

pront · 2025-03-11T20:52:26Z

+        &self,
+        field: Field,
+        to_match: &str,
+        is_phrase: bool,


Which implementation needs this bool?
Also, please avoid passing flags if possible, e.g. we could introduce a new phrase_equals method.

This is for matching on default fields (typically message) in Datadog. The matching behavior for phrases differs from that of non-phrased queries: in a phrase, the tokens need to appear consecutively and in the same order. E.g.
Hello nice world => "Hello world" does not match
Hello nice world => Hello world matches

Btw, there is an alternative solution where the tokenizer would emit multiple tokens.
Hello world would expand to the equivalent of _default_:Hello AND _default_:world.
"Hello world" would expand to the equivalent of _default_:"Hello world"

I considered adding a new method phrase_equals, but wasn't sure, since this behavior applies only to default fields.

Updated to introduce a phrase fn callback.
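For illustration, here is one way a separate method can replace a boolean flag. This is a simplified sketch, not the actual VRL `Filter` trait: the `Field` enum, the `Matcher` alias, and `SubstringFilter` are stand-ins, and giving `phrase` a default body that delegates to `equals` is an assumption about how existing implementors could be left untouched.

```rust
/// Simplified stand-in for a field reference; the real `Field` type has more variants.
#[derive(Clone, Debug, PartialEq)]
enum Field {
    Default(String),
    Attribute(String),
}

/// Simplified stand-in for a boxed matcher over string values.
type Matcher = Box<dyn Fn(&str) -> bool>;

trait Filter {
    /// Match a non-phrased value against `field`.
    fn equals(&self, field: Field, to_match: &str) -> Matcher;

    /// Match a quoted phrase against `field`.
    /// Assumed default: delegate to `equals`, so implementors that do not
    /// distinguish phrases keep their current behavior.
    fn phrase(&self, field: Field, phrase: &str) -> Matcher {
        self.equals(field, phrase)
    }
}

/// Toy implementor that only overrides `equals` and inherits `phrase`.
struct SubstringFilter;

impl Filter for SubstringFilter {
    fn equals(&self, _field: Field, to_match: &str) -> Matcher {
        let needle = to_match.to_string();
        Box::new(move |value| value.contains(needle.as_str()))
    }
}
```

An implementor that needs stricter phrase semantics on `Field::Default` would override `phrase` and leave `equals` alone, so no call site has to pass a flag.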

pront · 2025-03-17T17:34:34Z

Thanks @PSeitz, this looks better now. Correct me if I am wrong, but is this a breaking change, since the filter matcher is now stricter?

bruceg · 2025-03-17T17:55:44Z

+            Field::Default(_) => {
+                let re = word_regex(phrase);
+                Ok(resolve_value(
+                    buf,
+                    Run::boxed(move |value| match value {
+                        Value::Bytes(val) => re.is_match(&String::from_utf8_lossy(val)),
+                        _ => false,
+                    }),
+                ))
+            }


How is this different from the condition in equals below? It looks word-for-word identical but maybe I missed something. I'd also like to see some unit test cases that demonstrate the different behavior.

It is the same: since I didn't know the impact of the change, the behavior is currently unchanged, so this is non-breaking.
The non-phrase behavior is currently also incorrect; both would need to be adjusted to include tokenization (and probably some adaptation of the query parser).

I implemented it partially here https://github.com/DataDog/pomsky/pull/79. For full compatibility I'd need to take a closer look at how percolation tokenizes.

20agbekodo · 2025-03-17T18:36:39Z

Thanks @PSeitz, this looks better now. Correct me if I am wrong, but is this a breaking change, since the filter matcher is now stricter?

I think yes, but it is not disruptive given what's done in Datadog (we want to get more in line with the Datadog Log Explorer behaviour).

enhancement(filter): add is_phrase flag to equals method

0a7999e

Update the Filter trait's equals signature to include an `is_phrase` boolean flag. Without that information there is no way to distinguish between phrased and non-phrased queries.

pront reviewed Mar 11, 2025

introduce fn phrase callback

0b0a095

PSeitz requested a review from pront March 17, 2025 08:27

pront changed the title ~~enhancement(filter): add is_phrase flag to equals method~~ enhancement(match_datadog_query): add is_phrase flag to equals method Mar 17, 2025

bruceg reviewed Mar 17, 2025

pront added the "meta: awaiting author" label Jun 24, 2025

PSeitz wants to merge 2 commits into vectordotdev:main from PSeitz:main


5 participants
