-
Notifications
You must be signed in to change notification settings - Fork 34
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
set webhooks failure policy to Ignore #69
set webhooks failure policy to Ignore #69
Conversation
I think it would be useful to state in the commit message the recently reported mode of failure: macpool prevented the upgrade of OpenShiftSDN. starting the SDN Pod requires the macpool, but the macpool uses SDN addresses, which are unavailable when the SDN is down. I suspect we have to add a test for SDN recovery when macpool is enabled. |
can we change the PR to ignore only on the pods? |
I clearly recall a promise: "on the next blocker bug we go to Ignore". It was assertive with no questions ;-) I'm all for testing granular changes, and solving all bugs, but I think that we have waited enough. I vote for Ignore everywhere - but we should file an Issue to track a proper solution. |
6a24485
to
c5b1f8e
Compare
@dankenigsberg commit message updated, issue opened #70 |
@dankenigsberg @SchSeba lot of our tests expect failure to happen. We are blocking not only cases where Kubernetes fails to reach the webhook, but also cases where the validation simply fails. |
Oh, that's bad. I suspect there is no IgnoreOnlyIfNonresponsive. I don't see a way around this. Currently we can:
Earlier I voted against the third option, but it does allow us to save SOME face. Can we reconsider it? Either |
I will go over the test we'd need to skip tomorrow to see if it makes sense. |
+1 on the "ignore only Pod failures and skip some tests" |
@SchSeba would it be possible register the webhook so that it ignores |
@dankenigsberg we looked into it. You can (de-)select Namespaces only based on their labels, not names. And AFAIK, openshift-* namespaces don't have any specific label. Worth double-checking though. |
@dankenigsberg we can list all namespaces we want to ignore. However, I think we should just set ignore (and revert shortly after release). |
We can't do it. For now we can only specify a label on a namespace that if it exist we skip or we enforce. |
We can run on every start on all the namespaces check for openshift-* and update the namespace with a label but we can't control if someone (operator) remove or update that namespace labels |
Go go go, then. |
c5b1f8e
to
32d125b
Compare
@dankenigsberg @SchSeba please review |
tests/virtual_machines_test.go
Outdated
@@ -252,6 +252,8 @@ var _ = Describe("Virtual Machines", func() { | |||
//2165 | |||
Context("when trying to create a VM after a MAC address has just been released duo to a VM deletion", func() { | |||
It("should re-use the released MAC address for the creation of the new VM and not return an error", func() { | |||
Skip("Failures on Pod reconciliation are merely ignored https://github.com/K8sNetworkPlumbingWG/kubemacpool/pull/69") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test is for VM why we need to skip it if we change the polocy for pods?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the only one that failed with Ignore policy on pod.
32d125b
to
233bbc3
Compare
We are still facing many issues with inaccessible webhook blocking creation of VMs and especially pods. Let's make our failure policy ignore for now and investigate mentioned issues only in test environments for now. One of these cases happened recently. Update of OpenShiftSDN prevented KubeMacPool from reconciling and ended up in a dead-lock, where SDN pods were trying to reach out to KubeMacPool disconnected from the network. Signed-off-by: Petr Horacek <[email protected]>
233bbc3
to
0cec651
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
the error is not related to this PR
We are still facing many issues with inaccessible webhook blocking
creation of VMs and especially pods. Let's make our failure policy
ignore for now and investigate mentioned issues only in test
environments for now.