This repository has been archived by the owner on Apr 27, 2020. It is now read-only.

Wait until ready returns success immediately #84

Open
mattdodge opened this issue Nov 15, 2019 · 4 comments
@mattdodge

Is this a BUG REPORT or FEATURE REQUEST?:

  • bug

What happened:
The put step with a wait_until_ready_selector is returning success immediately. It's almost too fast for its own good!

What you expected to happen:
I expect the wait step to wait until the deployment update is complete.

How to reproduce it (as minimally and precisely as possible):
Have a Kubernetes deployment with a normal RollingUpdate strategy. Use this kubernetes-resource to put changes to the deployment like so:

- put: prod-kube
  params:
    kubectl: apply -f ymls/my-app-deployment.yml
    wait_until_ready: 60
    wait_until_ready_selector: app=my-app

When this step runs I see this in the output:

+ kubectl apply -f ymls/my-app-deployment.yml
deployment.extensions/my-app configured
Waiting for pods to be ready for 60s (interval: 3s, selector: app=my-app)
Waiting for pods to be ready... (0/0)

The step succeeds immediately despite the fact that the new pod/replicaset hasn't actually spun up yet. It seems like the resource is checking the ready status before the new pods are even created, likely some kind of race condition with Kubernetes.
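
For reference, the (0/0) makes me suspect the readiness check is roughly equivalent to the following (my guess, not the resource's actual source), which passes trivially when the selector matches zero pods:

+ # My guess at the check (not the resource's actual source): count pods
+ # matching the selector, and how many of those report the Ready condition.
selector="app=my-app"
total=$(kubectl get pods -l "$selector" -o name | wc -l)
ready=$(kubectl get pods -l "$selector" \
  -o jsonpath='{range .items[*]}{.status.conditions[?(@.type=="Ready")].status}{"\n"}{end}' \
  | grep -c '^True$')
echo "ready/total: ${ready}/${total}"
# Immediately after `kubectl apply`, the new replicaset may not have created
# its pods yet, so total can be 0 and a 0/0 comparison passes right away.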

Environment:

  • Concourse CI version: 5.7.0
  • kubernetes-resource image version: 1.15
@superbrothers superbrothers self-assigned this Nov 15, 2019
@mumoshu

mumoshu commented Nov 16, 2019

@mattdodge Hey! Thanks a lot for your detailed report.

This does seem like a race issue.

We currently rely solely on wait_until_ready_selector to select which pods are examined for readiness. I believe this can end up including pods from the old generation of the deployment.

A possible fix here would be to leverage the "revision" number of the deployment, which is inherited by its replicaset and tied to the pods via pod-template-hash.

  1. Look for the new "revision" of the deployment:

    apiVersion: extensions/v1beta1
    kind: Deployment
    metadata:
      annotations:
        deployment.kubernetes.io/revision: "3"
    # snip
  2. Look for the replicaset generated for that revision. You could use the deployment.kubernetes.io/revision: "3" annotation in combination with the owner reference to select the replicaset:

    apiVersion: extensions/v1beta1
    kind: ReplicaSet
    metadata:
      annotations:
        deployment.kubernetes.io/desired-replicas: "2"
        deployment.kubernetes.io/max-replicas: "4"
        deployment.kubernetes.io/revision: "3"
      creationTimestamp: "2019-11-05T09:08:38Z"
      generation: 1
      labels:
        app: envoy
        component: controller
        pod-template-hash: 67dc7dd6c5
      name: envoy-67dc7dd6c5
      namespace: default
      ownerReferences:
      - apiVersion: apps/v1
        blockOwnerDeletion: true
        controller: true
        kind: Deployment
        name: envoy
    # snip
  3. Use wait_until_ready_selector along with pod-template-hash or the owner reference so that only pods from the latest revision are counted (see the kubectl sketch after this list).

    apiVersion: v1
    kind: Pod
    metadata:
      annotations:
      creationTimestamp: "2019-11-05T09:08:38Z"
      generateName: envoy-67dc7dd6c5-
      labels:
        app: envoy
        component: controller
        pod-template-hash: 67dc7dd6c5
      name: envoy-67dc7dd6c5-8m6hc
      namespace: default
      ownerReferences:
      - apiVersion: apps/v1
        blockOwnerDeletion: true
        controller: true
        kind: ReplicaSet
        name: envoy-67dc7dd6c5
        uid: dab0f913-ffab-11e9-92cd-025000000001
    # snip
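
To make steps 1-3 concrete, here's a rough kubectl sketch against the example manifests above (deployment envoy, selector app=envoy). This is just an illustration of the idea, not the proposed implementation:

# 1. Read the deployment's current revision annotation.
revision=$(kubectl get deployment envoy \
  -o jsonpath='{.metadata.annotations.deployment\.kubernetes\.io/revision}')

# 2. Among the replicasets matching the same selector, find the one whose
#    revision annotation matches, and take its pod-template-hash label.
hash=""
for rs in $(kubectl get replicasets -l app=envoy -o name); do
  rs_rev=$(kubectl get "$rs" \
    -o jsonpath='{.metadata.annotations.deployment\.kubernetes\.io/revision}')
  if [ "$rs_rev" = "$revision" ]; then
    hash=$(kubectl get "$rs" -o jsonpath='{.metadata.labels.pod-template-hash}')
  fi
done

# 3. Count only pods from the latest revision by narrowing the user-supplied
#    selector with the pod-template-hash.
kubectl get pods -l "app=envoy,pod-template-hash=${hash}"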

But implementing this straightforwardly would require us to change the configuration syntax, perhaps adding wait_until_ready_deployment: DEPLOYMENT_NAME and deprecating, then removing, the existing wait_until_ready_selector. I'd hope we can avoid that if possible.

@superbrothers WDYT? Did you have any reason to avoid relying on the "revision" number?

@mattdodge
Author

I would be in favor of a wait_until_ready_deployment (or something similar) option. It's not quite as flexible as the label selector but it probably covers most of the common use cases.

If we're going that route we could probably make use of the kubectl rollout status command too, rather than digging through revision numbers and all that. This command will wait for the deployment to successfully roll out or will return an error if the timeout is hit. This is actually how I'm getting around this race condition for now. I have a pipeline YML that looks like this:

- put: prod-kube
  params:
    kubectl: apply -f ymls/my-app-deployment.yml
- put: prod-kube
  params:
    kubectl: rollout status deployment/my-app --timeout 60s
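
As far as I understand, rollout status blocks until the rollout finishes and exits non-zero when the timeout elapses, so the second put step's success or failure actually reflects the deployment's state. Roughly:

# rollout status exits 0 once the new replicaset is fully rolled out and
# non-zero if --timeout elapses first, so the put step only succeeds when
# the rollout really completed.
kubectl rollout status deployment/my-app --timeout 60s \
  && echo "rollout complete" \
  || echo "rollout did not finish in time"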

@superbrothers
Contributor

superbrothers commented Nov 28, 2019

I'm sorry for the late reply.

But implementing this straightforwardly would require us to change the configuration syntax, perhaps adding wait_until_ready_deployment: DEPLOYMENT_NAME and deprecating, then removing, the existing wait_until_ready_selector. I'd hope we can avoid that if possible.

Adding a wait_until_ready_<resource> parameter is not flexible, so I'd like to avoid it as much as possible.

If we're going that route we could probably make use of the kubectl rollout status command too, rather than digging through revision numbers and all that.

Yes, I think this is the right way. However, wait_until_ready is a required parameter, so changing it to an optional parameter would be a breaking change. For now, I recommend setting wait_until_ready to 0, which means don't wait. (I know this is too verbose...)

- put: prod-kube
  params:
    kubectl: apply -f ymls/my-app-deployment.yml
    wait_until_ready: 0
- put: prod-kube
  params:
    kubectl: rollout status deployment/my-app --timeout 60s
    wait_until_ready: 0

@superbrothers
Contributor

I will consider deleting the wait_until_ready param and bumping up the major version.
