Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BranchPlanner] Is it possible to generate a Plan in Json format in the BranchPlanner flow? #1228

Open
joaoarthurv opened this issue Mar 5, 2024 · 10 comments
Labels

Comments

@joaoarthurv
Copy link

Good afternoon everyone, I hope you're all doing well.

I'm working on a project to set up a GitOps workflow, and after some research, I've chosen the following tools (Flux + TFController (BranchPlanner) + Jenkins).

The idea is that when a PR is opened, the BranchPlanner is executed, and it should be possible to capture the plan in Json format. This Json-formatted plan will be captured by Jenkins, where I'll perform some validations.

While studying the tool and following the documentation, I noticed that when I open a PR, temporary PLANs are generated in ConfigMap, along with a secret related to the PLAN and another related to the TFState. So far, from what I could tell, neither of these is in Json format.

Currently, is it possible to capture the plan in Json format in the BranchPlanner flow?


I've noticed that if I merge the PR and the Terraform has the storeReadablePlan: json attribute enabled, a plan is created in the secret in .json format. However, since I expect this Json before the Merge, this feature doesn't fit well with what I need.

If anyone has any suggestions, they are very welcome.

I appreciate the support in advance.


apiVersion: infra.contrib.fluxcd.io/v1alpha2
kind: Terraform
metadata:
  name: pod-test
  namespace: tf-test
spec:
  serviceAccountName: tf-test
  interval: 3m
  approvePlan: auto
  branchPlanner:
    enablePathScope: true
  destroyResourcesOnDeletion: true
  refreshBeforeApply: true
  storeReadablePlan: json
  backendConfig:
    secretSuffix: pod-test
  path: ./project-test/infra/poc-tf/tf
  sourceRef:
    kind: GitRepository
    name: tf-test
    namespace: tf-test
resource "kubernetes_pod" "example" {
  metadata {
    name = "pod-test-json-validate-pr"
    namespace = "tf-test"
  }

  spec {
    container {
      image = "nginx:latest"
      name  = "nginx-container"
    }
  }
}
@chanwit
Copy link
Collaborator

chanwit commented Mar 6, 2024

Hi @joaoarthurv

Nice to see you're using the controller + a Jenkins pipeline.

We have a feature to externally validate the tfplan via a Web hook, which could be in your Jenkins system for example. But I haven't tested this Web hook feature in the Branch Planner environment yet.

@yitsushi might have some opinions

@chanwit
Copy link
Collaborator

chanwit commented Mar 6, 2024

@yitsushi
Copy link
Collaborator

yitsushi commented Mar 6, 2024

I think we wipe out the webhook part in Branch Planner, if we don't that can cause issues as I don't think we updated that part at all for Branch Planner and I don't know if we send enough information and the recipient can distinguish calls from stable and branch planner calls. If the request contains the resource name and branch name, they may be able to distinguish the two kind of requests and should be fine, but still sounds a hacky solution.

For the json output, right now the branch planner has the human readable flag enabled hard-coded and no way to change that at the moment. If Jenkins can read secrets from kubernetes, it is possible to read data from the secret, but I think that store only the machine format and the human format.

@joaoarthurv
Copy link
Author

First of all, thank you for the response, @chanwit and @yitsushi .

Understood. I cloned the tofu-controller project and I'm studying the workflows here. Indeed, so far, in the Branchplanner flow, as @yitsushi mentioned, Branchplanner stores only the human format in the configmap, and in the secret, it saves the machine format.

I'm studying the flow more and considering the possibility of including this feature: (In addition to creating the files related to the current flow, creating a temporary secret or configmap with the Plan also in Json format and not just in machine and human formats) maybe in a fork to test it here, and share it with you to see what you think.

In the project I'm working on, we already have a series of Policies configured in an internal API separated from the TFController solution. We integrated them into our old workflow. The idea would be to leverage this internal API for now, which already has all the configured policies. The only hurdle I'm facing is that this API expects the Plan in Json as a contract. If TFController already had this functionality, it would fit without any hurdles.

@chanwit
Copy link
Collaborator

chanwit commented Mar 6, 2024

Code we're referring to is here:

op, err := controllerutil.CreateOrUpdate(ctx, s.clusterClient, tf, func() error {
spec := originalTF.Spec.DeepCopy()
spec.SourceRef.Name = source.Name
spec.SourceRef.Namespace = source.Namespace
// DestroyResourcesOnDeletion must be false, otherwise plan deletion will destroy resources
spec.DestroyResourcesOnDeletion = false
spec.PlanOnly = true
spec.StoreReadablePlan = "human"
// We don't need to examine or use the outputs of the plan
spec.WriteOutputsToSecret = nil
spec.ApprovePlan = ""
spec.Force = false
// Support branch planning for Terraform Cloud
// By using local state and a local backend for the branch plan object
if spec.Cloud != nil || spec.CliConfigSecretRef != nil {
spec.Cloud = nil
spec.CliConfigSecretRef = nil
}
if spec.BackendConfig == nil {
spec.BackendConfig = &infrav1.BackendConfigSpec{
SecretSuffix: originalTF.Name,
InClusterConfig: true,
}
}
tf.Spec = *spec
tf.SetLabels(branchLabels)
return nil
})

I didn't see any codes that reset the webhook setting. Maybe it still would work, if you wanted to try.

@joaoarthurv
Copy link
Author

Hello @chanwit !

I'm testing the Webhook configuration over here, I made this setup:

I spun up the local Jenkins, exposed the webhook at a public endpoint.

However, I received this error on TFController: (code that broke)

failed to load webhook TLS certificate: open /etc/certs/<host-name>/tls.crt: no such file or directory

My assumption is that I need to create a volumeMount in the TFController pod, I'm heading in that direction. But if you have another suggestion, it will be welcome.


apiVersion: infra.contrib.fluxcd.io/v1alpha2
kind: Terraform
metadata:
  name:tf-test
  namespace: tf-test
spec:
  serviceAccountName: tf-test
  interval: 3m
  approvePlan: auto
  branchPlanner:
    enablePathScope: true
  destroyResourcesOnDeletion: true
  refreshBeforeApply: true
  storeReadablePlan: json
  backendConfig:
    secretSuffix: pod-test
  path: ./tf-test/infra/poc-tf/tf
  webhooks:
    - stage: post-planning
      url: https://<jenkins_url>/generic-webhook-trigger/invoke?token=test-tf-controller-token
      testExpression: "${{ .passed }}"
      errorMessageTemplate: "Violation: ${{ (index (index .violations 0).occurrences 0).message }}"
  writeOutputsToSecret:
    name: pod-tf-controller-outputs
  sourceRef:
    kind: GitRepository
    name: tf-test
    namespace: tf-test

@chanwit
Copy link
Collaborator

chanwit commented Mar 7, 2024

Hi @joaoarthurv

Yep, you would also skip TLS verification for testing purpose or bind your cert and key files to the controller pod at the location suggested by the error message.

disableWebhookTLSVerification := os.Getenv("DISABLE_WEBHOOK_TLS_VERIFY") == "1"
for _, webhook := range hooks {
log.Info("processing post-planning webhook", "webhook", webhook.URL)
// We skip webhook if it's not enabled
if webhook.IsEnabled() == false {
continue
}
log.Info("webhook is enabled, processing")
payloadBytes, err := r.prepareWebhookPayload(terraform, runnerClient, webhook.PayloadType, tfInstance)
if err != nil {
err = fmt.Errorf("failed to prepare webhook payload: %w", err)
return terraform, err
}
log.Info("webhook payload prepared")
cli := cleanhttp.DefaultClient()
if disableWebhookTLSVerification == false {
log.Info("webhook TLS verification is enabled")
// parse webhook.URL and get the server name
u, err := url.Parse(webhook.URL)
if err != nil {
err = fmt.Errorf("failed to parse webhook URL: %w", err)
return terraform, err
}
log.Info("webhook URL parsed", "host", u.Host)
caCertPath := "/etc/certs/" + u.Hostname() + "/ca.crt"
caCertPool := x509.NewCertPool()
caCert, err := ioutil.ReadFile(caCertPath)
if err == nil {
caCertPool.AppendCertsFromPEM(caCert)
}
log.Info("webhook CA cert loaded", "path", caCertPath)
tlsCertPath := "/etc/certs/" + u.Hostname() + "/tls.crt"
tlsKeyPath := "/etc/certs/" + u.Hostname() + "/tls.key"
certificate, err := tls.LoadX509KeyPair(tlsCertPath, tlsKeyPath)

@joaoarthurv
Copy link
Author

Hey @chanwit ,

Thank you for your tip. I followed by adding the environment variable DISABLE_WEBHOOK_TLS_VERIFY as true, and it worked.

The behavior was that the tool sends the body containing a tfplan in JSON, as expected, and that's amazing. However, there is an inconvenience: the tool continues to send an event to the webhook with each reconcile.

In the project I'm working on, Jenkins should only scan the project when a PR is opened. In the case of the existing webhook tool in the tf-controller, Jenkins is triggered with each reconcile made in the configured repository.

It was close that the tool didn't meet what I need.

Do you think it would be too challenging to implement a temporary Plan JSON file as output in BranchPlanner?

@yitsushi
Copy link
Collaborator

yitsushi commented Mar 7, 2024

Do you think it would be too challenging to implement a temporary Plan JSON file as output in BranchPlanner?

Kind of... not challenging, but can't call it "temporary".

tl;dr

Without changing the CRD, the Polling service, and the Informer, we can't get that behaviour without disturbing existing functionality.

Explanation

The code on the runner has an explicit "you can have only one of them":

if r.terraform.Spec.StoreReadablePlan == "json" {
planObj, err := r.tfShowPlanFile(ctx, TFPlanName)
if err != nil {
log.Error(err, "unable to get the plan output for json")
return nil, err
}
jsonBytes, err := json.Marshal(planObj)
if err != nil {
log.Error(err, "unable to marshal the plan to json")
return nil, err
}
if err := r.writePlanAsSecret(ctx, req.Name, req.Namespace, log, planId, jsonBytes, ".json", req.Uuid); err != nil {
return nil, err
}
} else if r.terraform.Spec.StoreReadablePlan == "human" {
rawOutput, err := r.tfShowPlanFileRaw(ctx, TFPlanName)
if err != nil {
log.Error(err, "unable to get the plan output for human")
return nil, err
}
if err := r.writePlanAsConfigMap(ctx, req.Name, req.Namespace, log, planId, rawOutput, "", req.Uuid); err != nil {
return nil, err
}
}

That means, we would need an extra field on the resource that tells the system to use json, but in that case the json content would be sent to the PR as comment as the Informer only reads the plan from the resource where we saved and sends it to github as it is (with a little bit of sugar coating).

I think the ideal solution would be to save both human readable and json plan and set a field to the "requested" format. So if a TF resource indicates it wants a "human" plan:

  • Now: save plan in human readable format
  • My suggestion: save both json and human readable formats, and save an extra field with value "human".

And that's not a temporary quick-fix. I can't think any solutions right now that wouldn't require changing CRD.

  • My suggested route wouldn't change the CRD, but I think the reason why we don't save both is simply resource limits (like Destroy failing because plan is too big for a Secret object #536). It feels that would cause a lot more trouble, and basically invalidates that route too.
  • Adding an extra field for Branch Planner to use json output would require a CRD update and a lot of extra tests to make sure it outputs human formats is more cases. I wouldn't call it "a temporary fix".
  • We could change the behaviour of the Branch Planner without changing CRD, but that may upset some people. We could say "we create a comment with a format defined on the original resource", so basically without this:
    spec.StoreReadablePlan = "human"
    . But with this, if JSON is used for automation on the main branch, all PR comments would be JSON too, which is (I think) not what most people want.

@yitsushi
Copy link
Collaborator

yitsushi commented Mar 8, 2024

Suggestion: I would like to move that issue to a discussion under "Ideas".

That would make the discussion a bit easier with more ideas in the future. This issue is a very good discussion about a feature, but it's not actionable in this form. As a result we may file multiple issues, but the way we can add this functionality can't be determined now. We may have to think of other solutions.

Most likely we can add an extra field to the CRD here:

type BranchPlanner struct {
// EnablePathScope specifies if the Branch Planner should or shouldn't check
// if a Pull Request has changes under `.spec.path`. If enabled extra
// resources will be created only if there are any changes in terraform files.
// +optional
EnablePathScope bool `json:"enablePathScope"`
}
. But still questionable: should we create two plan secrets (that's a big change in the runner too), or put JSON as comment in those cases (no matter how ugly it is)?

// and GitHub Issues are very hard to follow if we have more than one threads, I hope one they we can have similar to code comments/discussions but for simple comments under an issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
Status: Todo
Development

No branches or pull requests

3 participants