-
Notifications
You must be signed in to change notification settings - Fork 44
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add e2e test for HorizontalPodAutoscaler #615
base: main
Are you sure you want to change the base?
Conversation
e46f4af
to
b018894
Compare
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
d101c00
to
a3357b7
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @karlkfi and reviewers,
I'm currently reviewing this pull request and will provide my feedback in a few minutes. In the meantime, I'm providing this summary of the pull request's content to help you and other reviewers quickly understand the changes and intent.
This pull request, "Add e2e test for HorizontalPodAutoscaler", aims to add an end-to-end test for HorizontalPodAutoscaler (HPA) functionality, addressing support tickets related to HPA and the metrics-service. The author notes that the metrics-service's single replica nature and frequent HPA status updates can cause issues.
Here's a summary of the changes:
- Added a new e2e test:
e2e/testcases/hpa_test.go
This test covers the interaction between HPA and the metrics-service. It checks for the availability of the metrics-server and deploys a sample HPA to verify its functionality. The test also verifies that the HPA's metrics remain stable after it becomes available. - Updated
client.go
: Imports forautoscalingv2
were added at lines 25 and 99 to support the new HPA test. - Added test data: Several YAML files were added under
e2e/testdata/hpa/
to define the namespace, deployment, and HPA for the test. These includens-helloworld.yaml
,deployment-helloworld.yaml
, andhpa-helloworld.yaml
. - Added metrics-server deployment: The
e2e/testdata/metrics-server/components-v0.7.2.yaml
file was added or modified to deploy the metrics-server, which is a dependency for the HPA test. - Updated
kinds.go
: AddedHorizontalPodAutoscaler()
function at line 274 to get the canonical HorizontalPodAutoscaler GroupVersionKind. - Updated
go.mod
: Addedgithub.com/pkg/errors v0.9.1
at line 71 and removed it at line 143.
The author acknowledges potential flakiness on GKE, particularly with autopilot, due to known issues with the metrics-service and the fight detector. This test should help identify and resolve these issues.
Here's a little haiku to lighten the mood:
Tests run, green light shows,
HPAs scale, metrics flow,
Bugs flee in the night.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This pull request adds an end-to-end test for the HorizontalPodAutoscaler (HPA) interacting with the metrics-server. This is a valuable addition given the described support tickets and potential flakiness related to these components. The test setup appears reasonable, deploying the necessary resources and verifying the HPA metrics stability. A few minor suggestions below for improvement.
I used the Effective Go style guide as a reference for this review: https://go.dev/doc/effective_go
fddbb5f
to
58f9fb0
Compare
e2e/testcases/hpa_test.go
Outdated
return errors.New("metrics have not changed") | ||
} | ||
|
||
// Once available, the spec.metrics should not change |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I get the gist of this, however is there any more context you can add as to why spec.metrics
should not change?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, it's simply because the hello-app doesn't do anything unless queried. https://github.com/GoogleCloudPlatform/kubernetes-engine-samples/blob/main/quickstarts/hello-app/main.go
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Expanded the comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Follow up question - What exactly are the assertions that this test is aiming to verify?
I see from the commit message: "HPA updates its status relatively frequently, which can sometimes trigger the fight detector." Would this get caught by this test as it's written?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is mostly just a baseline compatibility test to make sure CS works with HPA and metrics-server. We frequently get customer issues related to these things for various reasons. I didn't add any more specific tests to cover known bugs, because they would just fail. But this way we have a baseline to copy and modify when those bugs are fixed.
This will just help us identify that the happy path works and continues to work as CS and GKE change.
Theres also been talk of replacing metrics-server in GKE, so this may give us advanced notice when that happens.
string(apiregistrationv1.Available), corev1.ConditionTrue)))) | ||
|
||
nt.T.Log("Deploy hello-world Deployment") | ||
nt.Must(rootSyncGitRepo.Copy(fmt.Sprintf("%s/hpa/ns-helloworld.yaml", yamlDir), "acme/namespaces/helloworld/ns-helloworld.yaml")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
minor nit: I prefer to inline small/simple k8s objects like this ns/deployment/hpa so that the test is readable/debuggable without needing to reference external yaml files
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🤷 we already have this hello-app used in a few places. So I just left it consistent with those using testdata files. But I hear you.
Update hello-app to pull from GAR instead of GCR, and use the default port.
We get a bunch of support tickets involving HPA and metrics-service. This test uses both, which should help us ensure this use case works going forward.
Current related known issues:
This test likely won't trigger either of those issues on kind, but might on GKE, especially autopilot. If the test is flakey tho, we'll have more motivation to fix them.