Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add e2e test for HorizontalPodAutoscaler #615

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

karlkfi
Copy link
Contributor

@karlkfi karlkfi commented May 22, 2023

We get a bunch of support tickets involving HPA and metrics-service. This test uses both, which should help us ensure this use case works going forward.

Current related known issues:

  • metrics-service is a single replica, and when it fails or is rescheduled, API Discovery will error.
  • HPA updates its status relatively frequently, which can sometimes trigger the fight detector.

This test likely won't trigger either of those issues on kind, but might on GKE, especially autopilot. If the test is flakey tho, we'll have more motivation to fix them.

Copy link

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please ask for approval from karlkfi. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@karlkfi karlkfi force-pushed the karl-hpa-e2e branch 6 times, most recently from d101c00 to a3357b7 Compare December 20, 2024 19:28
@karlkfi karlkfi changed the title [WIP] Add e2e test for HorizontalPodAutoscaler Add e2e test for HorizontalPodAutoscaler Jan 2, 2025
@karlkfi karlkfi requested review from tiffanny29631 and removed request for nan-yu January 3, 2025 19:32
Copy link

@code-review-assist code-review-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @karlkfi and reviewers,

I'm currently reviewing this pull request and will provide my feedback in a few minutes. In the meantime, I'm providing this summary of the pull request's content to help you and other reviewers quickly understand the changes and intent.

This pull request, "Add e2e test for HorizontalPodAutoscaler", aims to add an end-to-end test for HorizontalPodAutoscaler (HPA) functionality, addressing support tickets related to HPA and the metrics-service. The author notes that the metrics-service's single replica nature and frequent HPA status updates can cause issues.

Here's a summary of the changes:

  • Added a new e2e test: e2e/testcases/hpa_test.go This test covers the interaction between HPA and the metrics-service. It checks for the availability of the metrics-server and deploys a sample HPA to verify its functionality. The test also verifies that the HPA's metrics remain stable after it becomes available.
  • Updated client.go: Imports for autoscalingv2 were added at lines 25 and 99 to support the new HPA test.
  • Added test data: Several YAML files were added under e2e/testdata/hpa/ to define the namespace, deployment, and HPA for the test. These include ns-helloworld.yaml, deployment-helloworld.yaml, and hpa-helloworld.yaml.
  • Added metrics-server deployment: The e2e/testdata/metrics-server/components-v0.7.2.yaml file was added or modified to deploy the metrics-server, which is a dependency for the HPA test.
  • Updated kinds.go: Added HorizontalPodAutoscaler() function at line 274 to get the canonical HorizontalPodAutoscaler GroupVersionKind.
  • Updated go.mod: Added github.com/pkg/errors v0.9.1 at line 71 and removed it at line 143.

The author acknowledges potential flakiness on GKE, particularly with autopilot, due to known issues with the metrics-service and the fight detector. This test should help identify and resolve these issues.


Here's a little haiku to lighten the mood:

Tests run, green light shows,
HPAs scale, metrics flow,
Bugs flee in the night.

Copy link

@code-review-assist code-review-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This pull request adds an end-to-end test for the HorizontalPodAutoscaler (HPA) interacting with the metrics-server. This is a valuable addition given the described support tickets and potential flakiness related to these components. The test setup appears reasonable, deploying the necessary resources and verifying the HPA metrics stability. A few minor suggestions below for improvement.

I used the Effective Go style guide as a reference for this review: https://go.dev/doc/effective_go

e2e/testcases/hpa_test.go Outdated Show resolved Hide resolved
e2e/testcases/hpa_test.go Show resolved Hide resolved
e2e/testcases/hpa_test.go Outdated Show resolved Hide resolved
@karlkfi karlkfi force-pushed the karl-hpa-e2e branch 2 times, most recently from fddbb5f to 58f9fb0 Compare January 6, 2025 19:26
return errors.New("metrics have not changed")
}

// Once available, the spec.metrics should not change
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I get the gist of this, however is there any more context you can add as to why spec.metrics should not change?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, it's simply because the hello-app doesn't do anything unless queried. https://github.com/GoogleCloudPlatform/kubernetes-engine-samples/blob/main/quickstarts/hello-app/main.go

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Expanded the comment

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Follow up question - What exactly are the assertions that this test is aiming to verify?

I see from the commit message: "HPA updates its status relatively frequently, which can sometimes trigger the fight detector." Would this get caught by this test as it's written?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is mostly just a baseline compatibility test to make sure CS works with HPA and metrics-server. We frequently get customer issues related to these things for various reasons. I didn't add any more specific tests to cover known bugs, because they would just fail. But this way we have a baseline to copy and modify when those bugs are fixed.

This will just help us identify that the happy path works and continues to work as CS and GKE change.

Theres also been talk of replacing metrics-server in GKE, so this may give us advanced notice when that happens.

string(apiregistrationv1.Available), corev1.ConditionTrue))))

nt.T.Log("Deploy hello-world Deployment")
nt.Must(rootSyncGitRepo.Copy(fmt.Sprintf("%s/hpa/ns-helloworld.yaml", yamlDir), "acme/namespaces/helloworld/ns-helloworld.yaml"))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor nit: I prefer to inline small/simple k8s objects like this ns/deployment/hpa so that the test is readable/debuggable without needing to reference external yaml files

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤷 we already have this hello-app used in a few places. So I just left it consistent with those using testdata files. But I hear you.

Update hello-app to pull from GAR instead of GCR, and use the default
port.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants