ci: extend test timeout #1549

Closed
wants to merge 1 commit into from

minrk
Member

@minrk minrk commented Oct 12, 2022

Tests are getting cancelled because the 10-minute limit is not enough.

The time limits were added in #1518.

@minrk minrk added the maintenance Under the hood improvements and fixes label Oct 12, 2022
@betatim
Member

betatim commented Oct 12, 2022

Tests got cancelled after 15min :-/

From the logs of the helm tests:

============================= test session starts ==============================
platform linux -- Python 3.9.14, pytest-7.1.3, pluggy-1.0.0 -- /opt/hostedtoolcache/Python/3.9.14/x64/bin/python
cachedir: .pytest_cache
rootdir: /home/runner/work/binderhub/binderhub
plugins: cov-4.0.0, asyncio-0.19.0
asyncio: mode=strict
collecting ... collected 129 items / 118 deselected / 11 selected

and then it got cancelled. This seems weird, no? I was expecting it to run some tests or do something, not spend 12 minutes finding tests to run.

@consideRatio
Member

No, they are stuck.

User pods are stuck pending because the user scheduler doesn't support k8s 1.25.

See #1544

@betatim
Member

betatim commented Oct 12, 2022

Ah ok. I had expected to see some output about a particular test having started (and then getting stuck).

Looking at the PR you linked, I think we don't need to extend the timeout for the tests. Instead we should migrate BinderHub to be compatible with JupyterHub 2.

@manics
Member

manics commented Oct 12, 2022

> Ah ok. I had expected to see some output about a particular test having started (and then getting stuck).

Yes, that's why I originally added the timeout. I noticed that coding errors during development could lead to tests hanging due to an unexpected server response, which meant they got stuck for the default timeout of 6(?) hours.

Is there a way to set a default timeout for each pytest test (e.g. maybe 1 or 2 minutes?), since that would at least give us a bit more information on how extensive the failures are.

As a short term fix, will pinning the CI tests to K8s 1.24 solve the failures?

Edit: Also worth noting for anyone not aware, the Kubernetes namespace report section of the CI logs will show the state of the system, so you can check if BinderHub is running (or not). This report is included for failures:

- name: Kubernetes namespace report
  uses: jupyterhub/action-k8s-namespace-report@v1
  if: always()
  with:
    important-workloads: deploy/binder deploy/hub deploy/proxy
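
A similar check can be done by hand when reproducing a failure locally. As a rough sketch (not part of the action above), the following lists pods stuck in Pending, given the JSON printed by `kubectl get pods -o json`. The function name `pending_pods` and the sample data are made up for illustration.

```python
import json


def pending_pods(pod_list):
    """Return names of pods whose phase is Pending in a `kubectl get pods -o json` result."""
    return [
        item["metadata"]["name"]
        for item in pod_list.get("items", [])
        if item.get("status", {}).get("phase") == "Pending"
    ]


# Made-up sample in the shape kubectl emits; the pod names are illustrative only.
SAMPLE = json.loads("""
{"items": [
  {"metadata": {"name": "hub-abc"}, "status": {"phase": "Running"}},
  {"metadata": {"name": "user-pod-xyz"}, "status": {"phase": "Pending"}}
]}
""")

if __name__ == "__main__":
    print(pending_pods(SAMPLE))
```

In practice the input would come from piping the real `kubectl` output into `json.load`, rather than from a hard-coded sample.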

@minrk
Member Author

minrk commented Oct 12, 2022

> As a short term fix, will pinning the CI tests to K8s 1.24 solve the failures?

Yes, I think that's the right change. k8s version should be pinned and upgraded explicitly in CI anyway.

@minrk minrk closed this Oct 12, 2022
@minrk
Member Author

minrk commented Oct 12, 2022

> Is there a way to set a default timeout for each pytest test (e.g. maybe 1 or 2 minutes?)

Yeah, pytest-timeout should work.
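
For anyone unfamiliar with it: pytest-timeout is a pytest plugin that fails any test exceeding a time limit. The limit can be set suite-wide with the `--timeout` command-line option or the `timeout` ini setting, or per test with `@pytest.mark.timeout(...)`. The underlying idea, turning a hang into a failure, can be sketched with the standard library alone. This is only an illustration: it is not how the plugin is implemented and not what this repository does, and `run_with_timeout`, `stuck_request`, and `quick_check` are hypothetical names.

```python
import asyncio


async def stuck_request():
    # Hypothetical stand-in for a test that waits forever on a server response.
    await asyncio.Event().wait()


async def quick_check():
    # Hypothetical stand-in for a test that finishes promptly.
    await asyncio.sleep(0)


async def run_with_timeout(coro, seconds):
    """Await `coro`, reporting a hang as a failure instead of blocking forever."""
    try:
        await asyncio.wait_for(coro, timeout=seconds)
    except asyncio.TimeoutError:
        return f"FAILED: timed out after {seconds}s"
    return "PASSED"


if __name__ == "__main__":
    print(asyncio.run(run_with_timeout(quick_check(), 1)))
    print(asyncio.run(run_with_timeout(stuck_request(), 0.1)))
```

With a bound like this on each test, one stuck test is reported on its own and the rest of the suite still runs, which gives the "how extensive are the failures" information asked about above.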

@minrk
Member Author

minrk commented Oct 12, 2022

#1550 pins k3s to 1.24 to get tests working again (sorry @consideRatio for not seeing #1541 first). #1551 adds per-test timeouts so we shouldn't hang the whole test suite anymore when similar bugs crop up.
