ENH: improve regex validation message #5447
Signed-off-by: George Chen <[email protected]>
    return grokCompiler.compile(item, grokProcessorConfig.isNamedCapturesOnly());
} catch (IllegalArgumentException e) {
    throw new RuntimeException(
            String.format("Invalid regex pattern in match.%s", entry.getKey()), e);
Do we want to stop stream processing when we encounter an IllegalArgumentException, or do we want to collect the errors and continue processing the stream?
Also, the exception we throw attaches the original e. Depending on who handles this exception, it could print the entire stack trace for each failure and create a lot of noise in the logs.
This will be a config validation error, which means Data Prepper will crash at runtime. The change I made makes the failure message clearer:
Previous:
s3-log-pipeline.processor.grok: caused by: Exception thrown from plugin "grok". caused by: No definition for key 'ses_logs' found, aborting
Now:
2025-02-20T13:14:40,784 [main] ERROR org.opensearch.dataprepper.core.validation.LoggingPluginErrorsHandler - 1. waf-access-log-pipeline.processor.grok: caused by: Exception thrown from plugin "grok". caused by: Invalid regex pattern in match.message caused by: No definition for key 'CUSTOM_PATTERN_FROM_FILE' found, aborting
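To illustrate how the extra context ends up in the message, here is a minimal sketch. It assumes the log handler simply joins the messages along the cause chain with " caused by: ", which is a guess based on the output above, not a description of Data Prepper's actual handler. `compilePattern`, `compileForKey`, and `describe` are hypothetical names, and `compilePattern` is a stand-in that always fails the way the grok compiler does for an unknown pattern key.

```java
import java.util.ArrayList;
import java.util.List;

final class CauseChainSketch {

    // Stand-in for the grok compile step (assumption): always fails like an unknown pattern key.
    static String compilePattern(final String pattern) {
        throw new IllegalArgumentException(
                "No definition for key 'CUSTOM_PATTERN_FROM_FILE' found, aborting");
    }

    // Wraps the low-level failure, names the offending match key, and keeps the original as the cause.
    static String compileForKey(final String matchKey, final String pattern) {
        try {
            return compilePattern(pattern);
        } catch (final IllegalArgumentException e) {
            throw new RuntimeException(
                    String.format("Invalid regex pattern in match.%s", matchKey), e);
        }
    }

    // Joins the messages along the cause chain (assumed formatting, for illustration only).
    static String describe(final Throwable throwable) {
        final List<String> messages = new ArrayList<>();
        for (Throwable t = throwable; t != null; t = t.getCause()) {
            messages.add(t.getMessage());
        }
        return String.join(" caused by: ", messages);
    }

    public static void main(final String[] args) {
        try {
            compileForKey("message", "%{CUSTOM_PATTERN_FROM_FILE}");
        } catch (final RuntimeException e) {
            System.out.println(describe(e));
        }
    }
}
```

Because the wrapper keeps the original exception as its cause, the library's own message is preserved and the match key is added in front of it.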
Got it. It is not a data processing error 👍
Just one thought: instead of RuntimeException, we could throw InvalidPluginConfigurationException to be more explicit.
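A rough sketch of what that suggestion could look like. `InvalidConfigSketchException` is a hypothetical local stand-in for InvalidPluginConfigurationException (its real constructors are not shown in this thread), and `Pattern.compile` stands in for the grok compile call so the example is self-contained.

```java
import java.util.regex.Pattern;

final class ConfigExceptionSketch {

    // Hypothetical stand-in for a dedicated configuration exception type.
    static final class InvalidConfigSketchException extends RuntimeException {
        InvalidConfigSketchException(final String message, final Throwable cause) {
            super(message, cause);
        }
    }

    // Pattern.compile stands in for the grok compiler here (assumption).
    // PatternSyntaxException is a subclass of IllegalArgumentException.
    static void validateMatchPattern(final String matchKey, final String pattern) {
        try {
            Pattern.compile(pattern);
        } catch (final IllegalArgumentException e) {
            throw new InvalidConfigSketchException(
                    String.format("Invalid regex pattern in match.%s", matchKey), e);
        }
    }

    public static void main(final String[] args) {
        try {
            validateMatchPattern("message", "(unclosed");
        } catch (final InvalidConfigSketchException e) {
            // Callers can catch the specific type instead of a generic RuntimeException.
            System.out.println(e.getMessage() + " caused by: " + e.getCause().getMessage());
        }
    }
}
```

The benefit of a dedicated type is that whatever handles startup failures can tell a configuration problem apart from any other runtime failure without parsing the message.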
Signed-off-by: George Chen <[email protected]>
}

private boolean validateRegex(final String pattern) {
    if (pattern != null && !Objects.equals(pattern, "")) {
Is an empty or null regex pattern valid?
In this config context it is valid.

private static Stream<Arguments> provideFromKeyRegexAndIsValid() {
    return Stream.of(
            Arguments.of("", true),
You can add a null case here too.
    return validateRegex(delimiterRegex);
}

private boolean validateRegex(final String pattern) {
This method is repeated multiple times. It is probably a good idea to move it into a static util class.
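A minimal sketch of such a shared helper, for discussion. The class and method names are hypothetical, and the body is an assumption: the diff only shows the null/empty guard, so compiling with `java.util.regex.Pattern` is my guess at what the rest of the method does. Null and empty patterns are treated as valid, matching the answer given earlier in this review.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

final class RegexValidationSketch {

    private RegexValidationSketch() {
        // Utility class, not meant to be instantiated.
    }

    // Returns true for null or empty patterns (treated as "not configured"),
    // and otherwise true only if the pattern compiles.
    static boolean isValidRegex(final String pattern) {
        if (pattern == null || pattern.isEmpty()) {
            return true;
        }
        try {
            Pattern.compile(pattern);
            return true;
        } catch (final PatternSyntaxException e) {
            return false;
        }
    }

    public static void main(final String[] args) {
        System.out.println(isValidRegex(null));
        System.out.println(isValidRegex(""));
        System.out.println(isValidRegex("a+b"));
        System.out.println(isValidRegex("(abc"));
    }
}
```

Each config class could then delegate to the one helper, and the null and empty cases would only need to be tested in one place.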

private static Stream<Arguments> provideDelimiterRegexAndIsValid() {
    return Stream.of(
            Arguments.of("", true),
Adding a null case would help here too.
Signed-off-by: George Chen <[email protected]>
        pluginMetrics, grokProcessorConfig, expressionEvaluator));
assertThat("No definition for key 'CUSTOMBIRTHDAYPATTERN' found, aborting", equalTo(throwable.getMessage()));
assertThat(throwable.getCause(), instanceOf(IllegalArgumentException.class));
assertThat("No definition for key 'CUSTOMBIRTHDAYPATTERN' found, aborting", equalTo(throwable
Could we remove the "aborting" part of this message? It is a small thing, but it seems a little weird for users to see.
It is from java-grok: https://github.com/thekrakken/java-grok/blob/master/src/main/java/io/krakens/grok/api/GrokCompiler.java#L177. It would be fragile to overwrite it.
Signed-off-by: George Chen <[email protected]>
Description
This PR
Issues Resolved
Resolves #[Issue number to be closed when this PR is merged]
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.