Skip to content

improve string escape handling for run_cmd #730

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

twang817
Copy link

@twang817 twang817 commented Jun 6, 2019

While #684 addresses the real issues here, I made a small change for better handling of escapes in run_cmd.

This allowed me to use run_cmd in the following way:

$ tree .
.
├── child
│   ├── config.json
│   └── terraform.tfvars
├── plans
│   └── testplan
│       ├── v0.1
│       │   └── main.tf
│       └── v0.2
│           └── main.tf
└── terraform.tfvars

The child's terraform.tfvars is nothing special

$ cat child/terraform.tfvars
terragrunt {
    include {
        path = "${find_in_parent_folders()}"
    }
}

The config.json specifies the data the run_cmd fetches:

$ cat child/config.json
{
    "source": "testplan",
    "version": "v0.1"
}

The two plans are simple:

$ cat plans/testplan/v0.1/main.tf
output "version" {
    value = "v0.1"
}

$ cat plans/testplan/v0.2/main.tf
output "version" {
    value = "v0.2"
}

Finally, the magic is in the root terraform.tfvars:

$ cat terraform.tfvars
terragrunt {
    terraform {
        source = "${path_relative_from_include()}/plans//${run_cmd("jq", "-j", ". | \"\\(.source)/\\(.version)\"", "config.json")}/"
    }
}

If you're unfamiliar with jq, the command is:

$ jq -j '. | "\(.source)/\(.version)"' config.json
testplan/v0.1

And the output of the run:

[terragrunt] [/<snip>/child] 2019/06/06 03:50:40 Running command: terraform --version
[terragrunt] 2019/06/06 03:50:40 Reading Terragrunt config file at /<snip>/child/terraform.tfvars
[terragrunt] 2019/06/06 03:50:40 Running command: jq -j . | "\(.source)/\(.version)" config.json
testplan/v0.1[terragrunt] 2019/06/06 03:50:40 run_cmd output: [testplan/v0.1]
[terragrunt] 2019/06/06 03:50:40 Downloading Terraform configurations from file:///<snip>/plans into /<snip>/child/.terragrunt-cache/gFJBayBZFx9S0I_EdkWSPPExqVU/nsJSOJEkyeq5WgcW7pra2dH12qg using terraform init
[terragrunt] [/<snip>/child] 2019/06/06 03:50:40 Running command: terraform init -get=false -get-plugins=false -backend=false -from-module=file:///<snip>/plans -no-color /<snip>/child/.terragrunt-cache/gFJBayBZFx9S0I_EdkWSPPExqVU/nsJSOJEkyeq5WgcW7pra2dH12qg
Copying configuration from "file:///<snip>/plans"...
Terraform initialized in an empty directory!

The directory has no Terraform configuration files. You may begin working
with Terraform immediately by creating Terraform configuration files.
[terragrunt] 2019/06/06 03:50:40 Copying files from /<snip>/child into /<snip>/child/.terragrunt-cache/gFJBayBZFx9S0I_EdkWSPPExqVU/nsJSOJEkyeq5WgcW7pra2dH12qg/testplan/v0.1
[terragrunt] 2019/06/06 03:50:40 Setting working directory to /<snip>/child/.terragrunt-cache/gFJBayBZFx9S0I_EdkWSPPExqVU/nsJSOJEkyeq5WgcW7pra2dH12qg/testplan/v0.1
[terragrunt] [/<snip>/child] 2019/06/06 03:50:40 Running command: terraform init

Terraform has been successfully initialized!

You may now begin working with Terraform. Try running "terraform plan" to see
any changes that are required for your infrastructure. All Terraform commands
should now work.

If you ever set or change modules or backend configuration for Terraform,
rerun this command to reinitialize your working directory. If you forget, other
commands will detect it and remind you to do so if necessary.
[terragrunt] 2019/06/06 03:50:40 Running command: terraform apply

Apply complete! Resources: 0 added, 0 changed, 0 destroyed.

Outputs:

version = v0.1

This basically works by having run_cmd extract the source & version key from config.json and build a path starting from the plans// directory.

I believe this resolves #684, as I have both prefixes and suffixes around run_cmd. It further allows better handling of escaped quotes within the run_cmd parameters.

The PR is a bit dense with regex, so I'll try to comment inline.

Lastly, I was able to run all the tests (and even added a few) in the config directory, but I was unable to pass all tests from the root. I believe there are several tests that actually rely on AWS -- and I am unable to do so.

@@ -16,12 +17,13 @@ import (
"github.com/gruntwork-io/terragrunt/util"
)

var INTERPOLATION_PARAMETERS = `(\s*"[^"]*?"\s*,?\s*)*`
var INTERPOLATION_SYNTAX_REGEX = regexp.MustCompile(fmt.Sprintf(`\$\{\s*\w+\(%s\)\s*\}`, INTERPOLATION_PARAMETERS))
var INTERPOLATION_QUOTED_STRING = `"(?:\\.|[^\\"])*?"`
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The crux of the change is here. Here's a snippet from regex101 to help explain the regex:
image

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In other words, \\. matches any escaped character, whereas [^\\"] matches anything except the backslash or the quote. These are OR'd and then grouped in a non-capture group. Finally, we use the kleene star to say we want zero or more of them (non-greedy). Finally, the quotes are literal.

var INTERPOLATION_SYNTAX_REGEX = regexp.MustCompile(fmt.Sprintf(`\$\{\s*\w+\(%s\)\s*\}`, INTERPOLATION_PARAMETERS))
var INTERPOLATION_QUOTED_STRING = `"(?:\\.|[^\\"])*?"`
var INTERPOLATION_QUOTED_REGEX = regexp.MustCompile(INTERPOLATION_QUOTED_STRING)
var INTERPOLATION_PARAMETERS = fmt.Sprintf(`\s*(?:%s(?:\s*,\s*%s)*)?\s*`, INTERPOLATION_QUOTED_STRING, INTERPOLATION_QUOTED_STRING)
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With our INTERPOLATION_QUOTED_STRING above, we now construct a regex that will accept any number of them in a comma separate list.

The general pattern for comma separated list with at least one element: A(?:,A)*

To make it zero or more, we surround the entire thing with a non-capture group and a question mark for zero or one. This makes our pattern now look like this: (?:A(?:,A)*)?

Finally, throw in the \s* everywhere, and we have our regex that will match a parameter list with zero or more parameters.

Again, regex101:
image

var INTERPOLATION_QUOTED_STRING = `"(?:\\.|[^\\"])*?"`
var INTERPOLATION_QUOTED_REGEX = regexp.MustCompile(INTERPOLATION_QUOTED_STRING)
var INTERPOLATION_PARAMETERS = fmt.Sprintf(`\s*(?:%s(?:\s*,\s*%s)*)?\s*`, INTERPOLATION_QUOTED_STRING, INTERPOLATION_QUOTED_STRING)
var INTERPOLATION_PARAMETERS_REGEX = regexp.MustCompile(fmt.Sprintf(`^%s$`, INTERPOLATION_PARAMETERS))
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here, the goal is to make sure it only matches a parameter list. Since an empty string is a valid parameter list, we add a ^$ anchors around it to ensure that it matches the whole string. This allows us to use this regex to validate that the string is indeed a parameter list.

var INTERPOLATION_QUOTED_REGEX = regexp.MustCompile(INTERPOLATION_QUOTED_STRING)
var INTERPOLATION_PARAMETERS = fmt.Sprintf(`\s*(?:%s(?:\s*,\s*%s)*)?\s*`, INTERPOLATION_QUOTED_STRING, INTERPOLATION_QUOTED_STRING)
var INTERPOLATION_PARAMETERS_REGEX = regexp.MustCompile(fmt.Sprintf(`^%s$`, INTERPOLATION_PARAMETERS))
var INTERPOLATION_SYNTAX_REGEX = regexp.MustCompile(fmt.Sprintf(`\$\{\s*(\w+)\((%s)\)\s*\}`, INTERPOLATION_PARAMETERS))
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Finally, this adds the rest of the "stuff" around a properly formatted interpolation call.

The general pattern is: \$\{(\w+)\(A)\)\}, which should match ${anything(parameter-list)}. Again, throw in a bunch of \s*s and we have our pattern.

image

Note, I capture the \w+ and the contents of the parameter list.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With the regex's set up, note that we now have a regex that can accurately describe any properly formatted interpolation string in the format of ${foo_bar("a", "b", "c")}. HCL/HIL actually allows nesting, which I don't think that this will properly handle, but that's where we would have to go to a real parser as opposed to regex hackery.

Now, I no longer need to create additional regexes to help me validate or extract my strings. I should be able to:

  1. Detect an interpolation string with INTERPOLATION_SYNTAX_REGEX
  2. Extract match[1] as the function name
  3. Extract match[2] as the full parameter list
  4. Use INTERPOLATION_QUOTED_REGEX to extract all my parameters.

This is exactly what the rest of the code attempts to do.

@@ -168,7 +170,7 @@ func processMultipleInterpolationsInString(terragruntConfigString string, includ
// Given a string value from a Terragrunt configuration, parse the string, resolve any calls to helper functions using
// Resolve a single call to an interpolation function of the format ${some_function()} in a Terragrunt configuration
func resolveTerragruntInterpolation(str string, include *IncludeConfig, terragruntOptions *options.TerragruntOptions) (interface{}, error) {
matches := HELPER_FUNCTION_SYNTAX_REGEX.FindStringSubmatch(str)
matches := INTERPOLATION_SYNTAX_REGEX.FindStringSubmatch(str)
Copy link
Author

@twang817 twang817 Jun 6, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The HELPER_FUNCTION_SYNTAX_REGEX was a generalized version of the INTERPOLATION_SYNTAX_REGEX -- presumably with capture groups. I couldn't think of a reason why I wouldn't just use INTERPOLATION_SYNTAX_REGEX after the capture groups were properly set up.

AFACT, this is a drop-in replacement.

if len(matches) < 2 {
return envVariable, errors.WithStackTrace(InvalidGetEnvParams(parameters))
}
parsed := parseParamList(parameters)
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, since we're able to describe the entire parameter list, with zero or more parameters, we should just use that regex to extract all the parameters, then throw errors if it doesn't have the right number of args.

That's exactly what parseParamList does, so we just call that.

if name == "default" {
envVariable.DefaultValue = strings.TrimSpace(matches[index])
}
if len(parsed) != 2 || len(parsed[0]) == 0 {
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then its just a matter of checking that we have the right number of parameters.

Incidentally, the original HELPER_FUNCTION_GET_ENV_PARAMETERS_SYNTAX_REGEX also validated that the first parameter is non-empty. So we do that here.

@@ -318,22 +310,43 @@ var listQuotedParamRegex = regexp.MustCompile(`^"([^"]*?)"\s*,\s*(.*)$`)
// foo("a", "b") -> return "a", "b", 2, nil
//
func parseOptionalQuotedParam(parameters string) (string, string, int, error) {
parsed, err := splitParamList(parameters)
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, we just extract the full parameter list.

Except here, there were a few tests that called this function with an invalid parameter list. Now, in the normal flow of code, this would never happen because we're validating that up-front. But, it's simple enough to validate it again in this function.

Here, I decided to factor out the guts of parseParamList to a new function splitParamList that would return errors if the input string wasn't a valid parameter-list. This allows parseOptionalQuotedParams to return a proper error if validation fails, as on line 315 below.

return parsed[0], "", 1, nil
} else if len(parsed) == 2 {
return parsed[0], parsed[1], 2, nil
}
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Otherwise, we're just managing the return value based on the number of parameters we extracted...

matches := oneQuotedParamRegex.FindStringSubmatch(trimmedParameters)
if len(matches) == 2 {
return matches[1], "", 1, nil
if !INTERPOLATION_PARAMETERS_REGEX.MatchString(trimmedParameters) {
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here, as mentioned above, I re-validate that the input string is a valid parameter-list. Normally, this wouldn't happen but there are a few tests that call this function directly.

if err != nil {
return []string{}, errors.WithStackTrace(InvalidStringParams(parameters))
}
trimmedArgs = append(trimmedArgs, strings.TrimSpace(v))
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Otherwise, we simply use the INTERPOLATION_QUOTED_REGEX regex to find all matches, then unquote each match, trim it, and stick it in the parameters.

}

return trimmedArgs
return parsed
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note, with splitParamList factored out, this function mainly just handles the error condition.

@@ -138,11 +138,17 @@ func TestRunCommand(t *testing.T) {
expectedErr error
}{
{
`"/bin/bash", "-c", ""echo -n foo""`,
`"/bin/bash", "-c", "echo -n foo"`,
Copy link
Author

@twang817 twang817 Jun 6, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test doesn't make much sense.

By splitting the arguments into an array, there's no need to rely on quoting to handle spaces. This is something we do on the shell command-line because the string hasn't already been split.

This means that echo -n foo is already treated as a single parameter being passed to bash. There's no need to quote it again.

In fact, if we do, we're actually saying that we want to pass the string "echo -n foo" to the bash interpreter -- which is of course an invalid command.

It just so happened to work in the old implementation because there is a unconditional removal of all quotes. With proper(ish) escape handling, this is no longer a valid test case.

Hopefully, no one was relying on this behavior.

terragruntOptionsForTest(t, homeDir),
"foo",
nil,
},
{
`"/bin/bash", "-c", "FOO=\"a test\"; echo -n $FOO"`,
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A simple test that verifies handling of escaped quotes. I didn't want to write an unit test that relies on jq.

@@ -624,8 +637,8 @@ TERRAGRUNT_HIT","")}/bar`,
nil,
terragruntOptionsForTestWithEnv(t, "/root/child/"+DefaultTerragruntConfigPath, map[string]string{"TEST_ENV_TERRAGRUNT_OTHER": "SOMETHING"}),
"",
InvalidInterpolationSyntax(`${get_env("TEST_ENV_
TERRAGRUNT_HIT","")}`),
InvalidGetEnvParams(`"TEST_ENV_
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here, the newline would previously fail to even recognize as valid interpolation syntax.

Since the new parser actually handles newlines, the error that gets thrown is actually InvalidGetEnvParams

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

function interpolation in run_cmd is inconsistent
1 participant