-
-
Notifications
You must be signed in to change notification settings - Fork 1k
improve string escape handling for run_cmd #730
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
@@ -16,12 +17,13 @@ import ( | |||
"github.com/gruntwork-io/terragrunt/util" | |||
) | |||
|
|||
var INTERPOLATION_PARAMETERS = `(\s*"[^"]*?"\s*,?\s*)*` | |||
var INTERPOLATION_SYNTAX_REGEX = regexp.MustCompile(fmt.Sprintf(`\$\{\s*\w+\(%s\)\s*\}`, INTERPOLATION_PARAMETERS)) | |||
var INTERPOLATION_QUOTED_STRING = `"(?:\\.|[^\\"])*?"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In other words, \\.
matches any escaped character, whereas [^\\"]
matches anything except the backslash or the quote. These are OR'd and then grouped in a non-capture group. Finally, we use the kleene star to say we want zero or more of them (non-greedy). Finally, the quotes are literal.
var INTERPOLATION_SYNTAX_REGEX = regexp.MustCompile(fmt.Sprintf(`\$\{\s*\w+\(%s\)\s*\}`, INTERPOLATION_PARAMETERS)) | ||
var INTERPOLATION_QUOTED_STRING = `"(?:\\.|[^\\"])*?"` | ||
var INTERPOLATION_QUOTED_REGEX = regexp.MustCompile(INTERPOLATION_QUOTED_STRING) | ||
var INTERPOLATION_PARAMETERS = fmt.Sprintf(`\s*(?:%s(?:\s*,\s*%s)*)?\s*`, INTERPOLATION_QUOTED_STRING, INTERPOLATION_QUOTED_STRING) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With our INTERPOLATION_QUOTED_STRING
above, we now construct a regex that will accept any number of them in a comma separate list.
The general pattern for comma separated list with at least one element: A(?:,A)*
To make it zero or more, we surround the entire thing with a non-capture group and a question mark for zero or one. This makes our pattern now look like this: (?:A(?:,A)*)?
Finally, throw in the \s*
everywhere, and we have our regex that will match a parameter list with zero or more parameters.
var INTERPOLATION_QUOTED_STRING = `"(?:\\.|[^\\"])*?"` | ||
var INTERPOLATION_QUOTED_REGEX = regexp.MustCompile(INTERPOLATION_QUOTED_STRING) | ||
var INTERPOLATION_PARAMETERS = fmt.Sprintf(`\s*(?:%s(?:\s*,\s*%s)*)?\s*`, INTERPOLATION_QUOTED_STRING, INTERPOLATION_QUOTED_STRING) | ||
var INTERPOLATION_PARAMETERS_REGEX = regexp.MustCompile(fmt.Sprintf(`^%s$`, INTERPOLATION_PARAMETERS)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here, the goal is to make sure it only matches a parameter list. Since an empty string is a valid parameter list, we add a ^$
anchors around it to ensure that it matches the whole string. This allows us to use this regex to validate that the string is indeed a parameter list.
var INTERPOLATION_QUOTED_REGEX = regexp.MustCompile(INTERPOLATION_QUOTED_STRING) | ||
var INTERPOLATION_PARAMETERS = fmt.Sprintf(`\s*(?:%s(?:\s*,\s*%s)*)?\s*`, INTERPOLATION_QUOTED_STRING, INTERPOLATION_QUOTED_STRING) | ||
var INTERPOLATION_PARAMETERS_REGEX = regexp.MustCompile(fmt.Sprintf(`^%s$`, INTERPOLATION_PARAMETERS)) | ||
var INTERPOLATION_SYNTAX_REGEX = regexp.MustCompile(fmt.Sprintf(`\$\{\s*(\w+)\((%s)\)\s*\}`, INTERPOLATION_PARAMETERS)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Finally, this adds the rest of the "stuff" around a properly formatted interpolation call.
The general pattern is: \$\{(\w+)\(A)\)\}
, which should match ${anything(parameter-list)}
. Again, throw in a bunch of \s*
s and we have our pattern.
Note, I capture the \w+ and the contents of the parameter list.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With the regex's set up, note that we now have a regex that can accurately describe any properly formatted interpolation string in the format of ${foo_bar("a", "b", "c")}
. HCL/HIL actually allows nesting, which I don't think that this will properly handle, but that's where we would have to go to a real parser as opposed to regex hackery.
Now, I no longer need to create additional regexes to help me validate or extract my strings. I should be able to:
- Detect an interpolation string with
INTERPOLATION_SYNTAX_REGEX
- Extract match[1] as the function name
- Extract match[2] as the full parameter list
- Use
INTERPOLATION_QUOTED_REGEX
to extract all my parameters.
This is exactly what the rest of the code attempts to do.
@@ -168,7 +170,7 @@ func processMultipleInterpolationsInString(terragruntConfigString string, includ | |||
// Given a string value from a Terragrunt configuration, parse the string, resolve any calls to helper functions using | |||
// Resolve a single call to an interpolation function of the format ${some_function()} in a Terragrunt configuration | |||
func resolveTerragruntInterpolation(str string, include *IncludeConfig, terragruntOptions *options.TerragruntOptions) (interface{}, error) { | |||
matches := HELPER_FUNCTION_SYNTAX_REGEX.FindStringSubmatch(str) | |||
matches := INTERPOLATION_SYNTAX_REGEX.FindStringSubmatch(str) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The HELPER_FUNCTION_SYNTAX_REGEX
was a generalized version of the INTERPOLATION_SYNTAX_REGEX
-- presumably with capture groups. I couldn't think of a reason why I wouldn't just use INTERPOLATION_SYNTAX_REGEX
after the capture groups were properly set up.
AFACT, this is a drop-in replacement.
if len(matches) < 2 { | ||
return envVariable, errors.WithStackTrace(InvalidGetEnvParams(parameters)) | ||
} | ||
parsed := parseParamList(parameters) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again, since we're able to describe the entire parameter list, with zero or more parameters, we should just use that regex to extract all the parameters, then throw errors if it doesn't have the right number of args.
That's exactly what parseParamList
does, so we just call that.
if name == "default" { | ||
envVariable.DefaultValue = strings.TrimSpace(matches[index]) | ||
} | ||
if len(parsed) != 2 || len(parsed[0]) == 0 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Then its just a matter of checking that we have the right number of parameters.
Incidentally, the original HELPER_FUNCTION_GET_ENV_PARAMETERS_SYNTAX_REGEX
also validated that the first parameter is non-empty. So we do that here.
@@ -318,22 +310,43 @@ var listQuotedParamRegex = regexp.MustCompile(`^"([^"]*?)"\s*,\s*(.*)$`) | |||
// foo("a", "b") -> return "a", "b", 2, nil | |||
// | |||
func parseOptionalQuotedParam(parameters string) (string, string, int, error) { | |||
parsed, err := splitParamList(parameters) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again, we just extract the full parameter list.
Except here, there were a few tests that called this function with an invalid parameter list. Now, in the normal flow of code, this would never happen because we're validating that up-front. But, it's simple enough to validate it again in this function.
Here, I decided to factor out the guts of parseParamList
to a new function splitParamList
that would return errors if the input string wasn't a valid parameter-list. This allows parseOptionalQuotedParams
to return a proper error if validation fails, as on line 315 below.
return parsed[0], "", 1, nil | ||
} else if len(parsed) == 2 { | ||
return parsed[0], parsed[1], 2, nil | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Otherwise, we're just managing the return value based on the number of parameters we extracted...
matches := oneQuotedParamRegex.FindStringSubmatch(trimmedParameters) | ||
if len(matches) == 2 { | ||
return matches[1], "", 1, nil | ||
if !INTERPOLATION_PARAMETERS_REGEX.MatchString(trimmedParameters) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here, as mentioned above, I re-validate that the input string is a valid parameter-list. Normally, this wouldn't happen but there are a few tests that call this function directly.
if err != nil { | ||
return []string{}, errors.WithStackTrace(InvalidStringParams(parameters)) | ||
} | ||
trimmedArgs = append(trimmedArgs, strings.TrimSpace(v)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Otherwise, we simply use the INTERPOLATION_QUOTED_REGEX
regex to find all matches, then unquote each match, trim it, and stick it in the parameters.
} | ||
|
||
return trimmedArgs | ||
return parsed |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note, with splitParamList
factored out, this function mainly just handles the error condition.
@@ -138,11 +138,17 @@ func TestRunCommand(t *testing.T) { | |||
expectedErr error | |||
}{ | |||
{ | |||
`"/bin/bash", "-c", ""echo -n foo""`, | |||
`"/bin/bash", "-c", "echo -n foo"`, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test doesn't make much sense.
By splitting the arguments into an array, there's no need to rely on quoting to handle spaces. This is something we do on the shell command-line because the string hasn't already been split.
This means that echo -n foo
is already treated as a single parameter being passed to bash
. There's no need to quote it again.
In fact, if we do, we're actually saying that we want to pass the string "echo -n foo"
to the bash interpreter -- which is of course an invalid command.
It just so happened to work in the old implementation because there is a unconditional removal of all quotes. With proper(ish) escape handling, this is no longer a valid test case.
Hopefully, no one was relying on this behavior.
terragruntOptionsForTest(t, homeDir), | ||
"foo", | ||
nil, | ||
}, | ||
{ | ||
`"/bin/bash", "-c", "FOO=\"a test\"; echo -n $FOO"`, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A simple test that verifies handling of escaped quotes. I didn't want to write an unit test that relies on jq
.
@@ -624,8 +637,8 @@ TERRAGRUNT_HIT","")}/bar`, | |||
nil, | |||
terragruntOptionsForTestWithEnv(t, "/root/child/"+DefaultTerragruntConfigPath, map[string]string{"TEST_ENV_TERRAGRUNT_OTHER": "SOMETHING"}), | |||
"", | |||
InvalidInterpolationSyntax(`${get_env("TEST_ENV_ | |||
TERRAGRUNT_HIT","")}`), | |||
InvalidGetEnvParams(`"TEST_ENV_ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here, the newline would previously fail to even recognize as valid interpolation syntax.
Since the new parser actually handles newlines, the error that gets thrown is actually InvalidGetEnvParams
While #684 addresses the real issues here, I made a small change for better handling of escapes in run_cmd.
This allowed me to use run_cmd in the following way:
The child's terraform.tfvars is nothing special
The config.json specifies the data the run_cmd fetches:
The two plans are simple:
Finally, the magic is in the root terraform.tfvars:
If you're unfamiliar with jq, the command is:
And the output of the run:
This basically works by having run_cmd extract the source & version key from config.json and build a path starting from the plans// directory.
I believe this resolves #684, as I have both prefixes and suffixes around run_cmd. It further allows better handling of escaped quotes within the run_cmd parameters.
The PR is a bit dense with regex, so I'll try to comment inline.
Lastly, I was able to run all the tests (and even added a few) in the config directory, but I was unable to pass all tests from the root. I believe there are several tests that actually rely on AWS -- and I am unable to do so.