Manage AWS Batch Unscheduled jobs #5936

jorgee
Contributor

@jorgee jorgee commented Apr 2, 2025

close #5897

Detects when a job is in an unscheduled state by checking whether its status reason contains MISCONFIGURATION:JOB_RESOURCE_REQUIREMENT.

By default, it only prints a warning when the condition is found.

A new aws.batch.killUnscheduled flag is added in the configuration to change the default behaviour. When true, Nextflow kills the unscheduled job and throws a ProcessException with the reason.
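
The behaviour described above can be sketched roughly as follows. This is a minimal illustration in Python, not the code from the PR: the `check_unscheduled` function, the `terminate` and `warn` callbacks, the dict shape of `job`, and the stand-in `ProcessException` class are all hypothetical. The marker is written with the `MISCONFIGURATION:` prefix as AWS Batch reports it.

```python
class ProcessException(Exception):
    """Hypothetical stand-in for Nextflow's ProcessException."""


# Status-reason marker for a job that cannot be scheduled.
UNSCHEDULED_MARKER = "MISCONFIGURATION:JOB_RESOURCE_REQUIREMENT"


def check_unscheduled(job, kill_unscheduled=False, terminate=None, warn=print):
    """Return True if the job looks unscheduled, False otherwise.

    `job` is assumed to be a dict with 'jobId' and 'statusReason' keys.
    `kill_unscheduled` mirrors the aws.batch.killUnscheduled flag.
    """
    reason = job.get("statusReason") or ""
    if UNSCHEDULED_MARKER not in reason:
        return False

    if not kill_unscheduled:
        # Default behaviour: only print a warning.
        warn(f"Job {job['jobId']} cannot be scheduled: {reason}")
        return True

    # Flag enabled: kill the job, then fail with the reason.
    if terminate is not None:
        terminate(job["jobId"], reason)
    raise ProcessException(f"Unscheduled job {job['jobId']}: {reason}")
```

With the flag left at its default, the caller keeps polling and the user only sees the warning; with the flag enabled, the exception carries the status reason so the failure is explained.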

Signed-off-by: jorgee <[email protected]>

AWS unscheduled jobs management

Signed-off-by: jorgee <[email protected]>
@jorgee jorgee requested a review from a team as a code owner April 2, 2025 16:46

netlify bot commented Apr 2, 2025

Deploy Preview for nextflow-docs-staging ready!

Name Link
🔨 Latest commit 133b9c7
🔍 Latest deploy log https://app.netlify.com/sites/nextflow-docs-staging/deploys/67fd1f87057fe5000894223e
😎 Deploy Preview https://deploy-preview-5936--nextflow-docs-staging.netlify.app

@jorgee jorgee changed the title from "Manage Unscheduled jobd" to "Manage AWS Batch Unscheduled jobs" Apr 4, 2025
@jorgee jorgee requested a review from bentsherman April 4, 2025 09:53
Signed-off-by: Paolo Di Tommaso <[email protected]>
@pditommaso
Member

Should this be "unscheduled" or "unschedulable" or "misconfigured"?

Co-authored-by: Chris Hakkaart <[email protected]>
Signed-off-by: Paolo Di Tommaso <[email protected]>
@jorgee
Contributor Author

jorgee commented Apr 10, 2025

I have received some suggestions to include other status reasons, such as MISCONFIGURATION:COMPUTE_ENVIRONMENT_MAX_RESOURCE. Another option would be to let users define the status reasons that cancel the AWS job as part of the Nextflow configuration. What do you think?

@pditommaso
Member

Contrary to what I said during our call, let's add MISCONFIGURATION:COMPUTE_ENVIRONMENT_MAX_RESOURCE. If more requests come in, we'll add a config option.
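
As a rough illustration of the two approaches discussed here (a fixed list of status reasons versus a user-defined one), the matching could be sketched like this in Python. The names `DEFAULT_REASONS` and `matching_reason` are hypothetical, and the `reasons` parameter only stands in for a possible future config option; nothing here is taken from the merged code.

```python
# Status reasons treated as "cannot be scheduled" in this sketch.
DEFAULT_REASONS = (
    "MISCONFIGURATION:JOB_RESOURCE_REQUIREMENT",
    "MISCONFIGURATION:COMPUTE_ENVIRONMENT_MAX_RESOURCE",
)


def matching_reason(status_reason, reasons=DEFAULT_REASONS):
    """Return the first marker contained in status_reason, or None.

    Passing a different `reasons` tuple models a user-defined list.
    """
    if not status_reason:
        return None
    for marker in reasons:
        if marker in status_reason:
            return marker
    return None
```

Keeping the list as a default argument means a configurable option could be added later without changing the callers.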

pditommaso and others added 3 commits April 13, 2025 16:38
Co-authored-by: Chris Hakkaart <[email protected]>
Signed-off-by: Paolo Di Tommaso <[email protected]>
Signed-off-by: Paolo Di Tommaso <[email protected]>
@pditommaso pditommaso requested a review from a team as a code owner April 13, 2025 21:11
@pditommaso
Member

Adjusted naming, added docs & scope options, added MISCONFIGURATION:COMPUTE_ENVIRONMENT_MAX_RESOURCE.

Please have a look and merge.

Collaborator

@christopher-hakkaart christopher-hakkaart left a comment


Docs look good

Signed-off-by: Ben Sherman <[email protected]>
@bentsherman bentsherman merged commit 44abe60 into master Apr 14, 2025
23 checks passed
@bentsherman bentsherman deleted the 5897-executing-on-aws-batch-hangs-forever-if-a-compute-env-does-not-meet-the-criteria branch April 14, 2025 14:53
riederd pushed a commit to riederd/nextflow that referenced this pull request Apr 16, 2025
---------

Signed-off-by: jorgee <[email protected]>
Signed-off-by: Paolo Di Tommaso <[email protected]>
Signed-off-by: Ben Sherman <[email protected]>
Co-authored-by: Paolo Di Tommaso <[email protected]>
Co-authored-by: Chris Hakkaart <[email protected]>
Co-authored-by: Ben Sherman <[email protected]>
Signed-off-by: Dietmar Rieder <[email protected]>
ejseqera pushed a commit to ejseqera/nextflow that referenced this pull request Apr 21, 2025
---------

Signed-off-by: jorgee <[email protected]>
Signed-off-by: Paolo Di Tommaso <[email protected]>
Signed-off-by: Ben Sherman <[email protected]>
Co-authored-by: Paolo Di Tommaso <[email protected]>
Co-authored-by: Chris Hakkaart <[email protected]>
Co-authored-by: Ben Sherman <[email protected]>
Development

Successfully merging this pull request may close these issues.

Executing on AWS Batch hangs forever if a compute env does not meet the criteria
4 participants