ProgressDeadlineExceeded cannot be extended #23883

Open
syzmekk opened this issue Sep 30, 2019 · 3 comments
Labels
lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness.

Comments

syzmekk commented Sep 30, 2019

Pods are killed after about 100 minutes and the deployment is marked as failed progressing.
We have some big boys in our environment: our image is about 5 GB and needs around 90-110 minutes to be up and running (1st deployment).
After approximately 100 minutes our pods are deleted for no apparent reason, even though everything inside them was going well.

Version

oc v3.11.0+0cbc58b
kubernetes v1.11.0+d4cacc0
features: Basic-Auth GSSAPI Kerberos SPNEGO
openshift v3.11.69
kubernetes v1.11.0+d4cacc0

Steps To Reproduce

  1. Prepare DeploymentConfig with a ~5GB image
  2. Set all timeouts to 7200 (2hrs)
  3. Deploy it

Current Result

Example from one of the timeouts:

 conditions:
    - lastTransitionTime: '2019-09-27T05:33:28Z'
      lastUpdateTime: '2019-09-27T05:33:28Z'
      message: Deployment config does not have minimum availability.
      status: 'False'
      type: Available
    - lastTransitionTime: '2019-09-27T07:30:32Z'
      lastUpdateTime: '2019-09-27T07:30:32Z'
      message: replication controller "app-wls-1" has failed progressing
      reason: ProgressDeadlineExceeded
      status: 'False'
      type: Progressing

We have also tried to patch the DC and to include the setting in the DC YAML file, but it seems that OpenShift ignores this field in the YAML (is it available only for Deployments?). The patch command returns the following output:

$ oc patch dc app --patch='{"spec":{"progressDeadlineSeconds":7200}}'
deploymentconfig.apps.openshift.io/app not patched
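
A note on the patch result above, offered as a sketch rather than a confirmed diagnosis: `progressDeadlineSeconds` is a field of Kubernetes Deployments, and the DeploymentConfig spec does not appear to define it, which would be consistent with the `not patched` response (an unknown field is dropped, so nothing changes). The closest DeploymentConfig settings are the strategy timeouts. The fragment below assumes the Rolling strategy and reuses the DC name `app` and the 7200-second value from this report; the reporter's actual DC is not shown in the issue, so the rest is illustrative.

```yaml
# Illustrative fragment only: assumes a DeploymentConfig named "app" that uses
# the Rolling strategy. Selector, template and triggers are omitted.
apiVersion: apps.openshift.io/v1
kind: DeploymentConfig
metadata:
  name: app
spec:
  strategy:
    type: Rolling
    # Upper bound, in seconds, on how long the deployer pod may run
    # (defaults to 21600 when unset).
    activeDeadlineSeconds: 21600
    rollingParams:
      # Time, in seconds, to wait for the rollout before it is marked failed
      # (defaults to 600 when unset).
      timeoutSeconds: 7200
```

With the Recreate strategy, the corresponding field would be `spec.strategy.recreateParams.timeoutSeconds`.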

Expected Result

Successful deployment without exceeding any deadline. ;)

Additional Information

$ oc get all -o yaml -n szymon-sandbox >> namespace.yml

namespace.yml

$ oc describe rc/app-wls-1

rc.yml

Please, kindly advise. :) Feel free to ask me for any additional info or missing details.
Thanks!

syzmekk commented Oct 2, 2019

Another pod randomly deleted after about an hour:

  Type    Reason            Age   From                    Message
  ----    ------            ----  ----                    -------
  Normal  SuccessfulCreate  1h    replication-controller  Created pod: app-wls-10-mtlpz
  Normal  SuccessfulDelete  13m   replication-controller  Deleted pod: app-wls-10-mtlpz

I have been forwarding the logs to a local file and the last entry is:

rpc error: code = Unknown desc = Error: No such container: df6088d60dd12b4d2ff69108b350a66487db07541d1ab45712a06e8ea5e42956

I believe it's not related to the application itself :)

@openshift-bot

Issues go stale after 90d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle stale

@openshift-ci-robot openshift-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Dec 31, 2019

syzmekk commented Jan 3, 2020

/remove-lifecycle stale
/lifecycle frozen

@openshift-ci-robot openshift-ci-robot added lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Jan 3, 2020