feat: Azure Batch eagerly terminates jobs after all tasks have been submitted #6159
Conversation
✅ Deploy Preview for nextflow-docs-staging canceled.
Integration tests failing, looks unrelated.
Related issue on Slack: https://nfcore.slack.com/archives/C02T98A23U7/p1753954588096009
Hi! If we don't need to care about resuming a workflow (since task outputs are stored outside of the workdir), a simple flag to delete the job once the last task has finished its lifecycle would be all we need. The downside is that to "resume" a workflow, we must find a way to tell Nextflow that the task outputs already exist and are stored at X location, as well as supply them. Which means the easiest solution would be to relaunch the whole workflow should an error arise somewhere during processing.
@ghislaindemael I'm not sure I understand this; task outputs have to be in the working directory. Even if they're published to a new location, they are copied out by Nextflow itself after the task completes.
My error; indeed I meant that since we publish the results outside of the workdir (e.g. in Blob Storage), we can query them from there and thus delete them from the Batch VMs to remove the load and free up jobs for the quota.
To clarify the flow here:
This PR strictly refers to 1 and 9 and does not interact with any files. If you are having issues with file storage, running out of space, etc., that would be a different issue.
@adamrtalbot we are also looking for a solution for this. We sometimes have executions with hundreds of jobs, so even in proper executions without errors we are limited to 2 or 3 parallel executions per Batch account. In our case, at any given moment, we would have something like: Run1: Run2: So, from Azure's perspective, we have 200+2+57+4 = 263 jobs ongoing. As the runs progress, we have more and more jobs open and we reach the limit very quickly. We are seeing if we can modify / extend the
Not resume, but retry with an errorStrategy: https://www.nextflow.io/docs/latest/reference/process.html#errorstrategy Here is the flow that may cause issues:
So using a cron job that validates when all tasks have completed successfully would work, right? In this case, the tasks have already completed. Or can we add it to the bit that runs and "decides" that all the tasks have been completed?
You might have the same issue, in that you terminate a job before you can resubmit a task.
Note none of this will help you if you just have too many active jobs. A job needs to be active to run a task, so if you simply have a lot of work to do this won't help. Really, the issue is the terrible design by Azure, but given they just fired most of their genomics staff I doubt they will bother to help 🤷.
I have too many active jobs because Nextflow does not close them, not because they are actually active. Any job that has been marked by Nextflow with a ✔ is, in my opinion, finished and will never be re-used for anything again; however, Nextflow does not close it until the full execution of the run is finished. This is the behaviour that I think is incorrect. If the run takes 2 days, the first task that finished 47 hours ago is still marked as "active" in Batch because Nextflow does not close it, even though it will never be used again. I think it is Nextflow that is not using the Batch account properly.
Right, so with especially long-running pipelines you have many jobs in active state which do not do anything. Unfortunately, there isn't a way for Nextflow to know the future and determine whether another task will be submitted to a job, which makes it tricky to know when to close one. Here's an alternative implementation (which has the added benefit of making @pditommaso happy because it won't use the trace observer!):
This should eagerly terminate jobs while still allowing users to submit all tasks as normal.
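For concreteness, here is a minimal Groovy sketch of that proposed flow, not the actual implementation: the `BatchOps` interface stands in for the real Azure Batch client wrapper, and the exception type used to detect a 409 is a placeholder, while `runTask`, `recreateJob`, and `allJobIds` mirror names used in this PR.

```groovy
// Sketch only: BatchOps stands in for the real Azure Batch client wrapper.
interface BatchOps {
    void submitTask(String jobId, String taskId)
    void setAutoTerminate(String jobId)     // sets onAllTasksComplete = TERMINATE_JOB
    String recreateJob(String oldJobId)     // creates a replacement job, returns its id
}

class EagerTermination {
    BatchOps batch
    Map<String,String> allJobIds = [:]      // process mapKey -> current job id

    void runTask(String mapKey, String taskId) {
        String jobId = allJobIds[mapKey]
        try {
            batch.submitTask(jobId, taskId)
        }
        catch( IllegalStateException e ) {  // placeholder for a 409 'job already completed' error
            // The job auto-terminated before this task arrived: recreate it and resubmit
            jobId = batch.recreateJob(jobId)
            allJobIds[mapKey] = jobId
            batch.submitTask(jobId, taskId)
        }
        // At this point the job holds at least one task, so it is safe to ask
        // Azure to terminate the job once all of its tasks have completed
        batch.setAutoTerminate(jobId)
    }
}
```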
Then how does it "decide" to add the ✔?
The check is shown when no more tasks for that process need to be executed, i.e. the process execution is complete.
Excellent, can we use that logic to terminate the Azure Job?
It could be done with a TraceObserver (V2). If I'm not wrong you already made a PR for that.
@ghislaindemael that would be very helpful for me as I don't have much time this week! Thank you!
I'll see what I can do. The problem is that our project doesn't make it easy to use a custom build of Nextflow.
It doesn't need to be a "real" example, as long as you can determine it will help your use case. Once we merge this, there will be a release of the Edge version quite soon, and you will find it easier to configure and test using that version. But I'd like to confirm we are on the right path before doing this.
Well our use case is just freeing up the quota so we can have more pipelines running, so I guess this solves our problem.
… recreation

- Make recreateJobForTask synchronized to avoid concurrent recreation of jobs for the same mapKey.
- Add logic to check if another thread has already recreated the job and reuse the new job if so.
- Ensures job mapping is updated atomically and only after successful job creation.
- Improves reliability when terminateJobsOnCompletion is enabled and multiple tasks may trigger job recreation.

Signed-off-by: adamrtalbot <[email protected]>
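A minimal Groovy sketch of the synchronization described in this commit; only the concurrency logic is shown, and `createJob` is a placeholder closure for the real Azure Batch job-creation call.

```groovy
// Sketch of the synchronized job recreation: if two tasks hit a terminated job
// at the same time, only one of them should create the replacement job.
class JobRecreator {
    Map<String,String> allJobIds = [:]   // mapKey -> current job id

    synchronized String recreateJobForTask(String mapKey, String failedJobId, Closure<String> createJob) {
        String current = allJobIds[mapKey]
        // Another thread may already have recreated the job for this mapKey;
        // if so, reuse the new job instead of creating yet another one
        if( current && current != failedJobId )
            return current
        String newJobId = createJob.call()   // update the mapping only after creation succeeds
        allJobIds[mapKey] = newJobId
        return newJobId
    }
}
```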
Btw @adamrtalbot we successfully managed to cherry-pick these changes and package them into our modified version of [email protected], and on execution the jobs are properly terminated once all tasks complete. So our problem is solved until the pipelines are upgraded to work with Nextflow's next update.
Thanks @ghislaindemael, that's very helpful. Hopefully, this will result in zero additional jobs in active state for normal operation.
bumpy bump face, I'd really like this one to sneak onto edge so people can have a go before 26.04 release.
Hi @adamrtalbot, sorry for the update, but we hit an edge case: if Nextflow submits a task to a job but the task stays Active (due to compute limitations, as all nodes are in use), Azure still terminates the job when onAllBatchTasksComplete gets fired. This causes the pipeline to halt without any error thrown. Our solution is, like you did at the start, to listen with a TraceObserver for onProcessTerminate, which fires when all of a process's tasks have been submitted and completed.
I'm not sure I follow this.
So the overall objective is achieved, but Nextflow isn't handling the Job re-creation correctly (step 6). Is my understanding correct?
…etion Signed-off-by: adamrtalbot <[email protected]>
Also note:
This will not help users such as @luanjot who have too many active jobs during a pipeline, so the solution should involve both.
Prevent tasks from being orphaned when jobs auto-terminate while tasks are waiting for compute resources. This fixes an issue where pipelines could silently die without error when resource-constrained pools caused tasks to remain in Active state.

Problem:
- Task submitted to job, enters Active state (waiting for resources)
- Other tasks complete, triggering onAllTasksComplete = TERMINATE_JOB
- Azure terminates job while task still in Active state
- Per Azure docs: "remaining active Tasks will not be scheduled"
- Task never runs, pipeline silently fails

Solution: Only set auto-terminate when job has running tasks. If all tasks are in Active state (waiting for resources), defer auto-terminate to avoid orphaning them. This ensures tasks get a chance to start running before the job can be terminated.

Changes:
- Modified setAutoTerminateIfEnabled() to check task states
- Only sets onAllTasksComplete=TERMINATE_JOB if tasks are running
- Defers termination if tasks are still waiting for compute resources
- Updated tests to verify new behavior

Related to #5839

Signed-off-by: adamrtalbot <[email protected]>
Ensure all Azure Batch jobs are set to auto-terminate when the workflow completes, even if eager termination was deferred earlier. This provides a safety net for jobs where eager termination was skipped due to tasks waiting in Active state for compute resources.

Without this fallback, jobs where eager termination was deferred would remain active indefinitely, consuming quota unnecessarily. This change guarantees that jobs will eventually terminate and free up quota.

Changes:
- Added terminateAllJobs() method to set onAllTasksComplete for all jobs
- Called from close() when terminateJobsOnCompletion is enabled
- Catches any jobs where eager termination was deferred
- Ensures quota is freed even if Nextflow dies mid-execution

Related to #5839

Signed-off-by: adamrtalbot <[email protected]>
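As an illustration of that fallback, here is a hedged Groovy sketch; the `setAutoTerminate` closure is a placeholder for the real call that sets onAllTasksComplete on a job.

```groovy
// Fallback sketch: on workflow close, request auto-termination for every job
// Nextflow created, covering any job where eager termination was deferred.
void terminateAllJobs(Map<String,String> allJobIds, Closure setAutoTerminate) {
    allJobIds.values().unique().each { String jobId ->
        try {
            setAutoTerminate.call(jobId)   // onAllTasksComplete = TERMINATE_JOB
        }
        catch( Exception e ) {
            // Best-effort cleanup: a failure here should not fail the workflow
            System.err.println("Unable to set auto-termination for job ${jobId}: ${e.message}")
        }
    }
}
```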
Add TraceObserver that sets jobs to auto-terminate when processes complete, providing an additional layer of job cleanup during workflow execution.

This complements the existing termination mechanisms:
1. Smart eager termination (per-task, when safe)
2. Process-level termination (this observer, when process completes)
3. Fallback termination (on workflow close)

The TraceObserver fires when a process completes (all tasks submitted and finished), ensuring jobs are terminated promptly without waiting for the entire workflow to finish. This is particularly useful for long-running workflows with many processes, as it frees up quota progressively.

Implementation:
- AzBatchProcessObserver: Implements TraceObserver.onProcessTerminate
- AzBatchProcessObserverFactory: Creates observer instances
- Service registration via META-INF/services for automatic discovery

Related to #5839

Signed-off-by: adamrtalbot <[email protected]>
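A simplified sketch of what such an observer does; the hook name follows the commit description (TraceObserver.onProcessTerminate), but the signature and the job lookup are deliberately abstracted here rather than copied from the real AzBatchProcessObserver.

```groovy
// Sketch of the process-level cleanup layer: when a process completes, all of
// its tasks have been submitted and finished, so its job can safely terminate.
class ProcessTerminationObserverSketch {
    Map<String,String> jobIdByProcess = [:]   // process name -> Azure Batch job id
    Closure setAutoTerminate                  // jobId -> sets onAllTasksComplete = TERMINATE_JOB

    void onProcessTerminate(String processName) {
        String jobId = jobIdByProcess[processName]
        if( jobId )
            setAutoTerminate.call(jobId)
    }
}
```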
The previous implementation checked if tasks were in "Running" state immediately after submission, but tasks need time to transition from Active → Running. This caused the eager termination to always defer, making it ineffective.

Changed the logic to check if the job has any tasks (count > 0) instead of checking task state. Since setAutoTerminateIfEnabled() is called AFTER task submission, the task will already be in the job, making this check reliable and timing-independent.

This ensures eager per-task termination actually works as intended, with the TraceObserver and close() fallbacks providing additional safety.

Related to #5839

Signed-off-by: adamrtalbot <[email protected]>
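In sketch form, the revised check reduces to the following; `countTasks` and `setAutoTerminate` are placeholder closures for the real client calls, while the flag name matches this PR's `terminateJobsOnCompletion` option.

```groovy
// Revised eager check: since this runs after task submission, the job already
// contains at least one task, so a simple count is timing-independent.
void setAutoTerminateIfEnabled(String jobId, boolean terminateJobsOnCompletion,
                               Closure<Integer> countTasks, Closure setAutoTerminate) {
    if( !terminateJobsOnCompletion )
        return
    if( countTasks.call(jobId) > 0 ) {
        setAutoTerminate.call(jobId)   // onAllTasksComplete = TERMINATE_JOB
    }
    // Otherwise defer; the TraceObserver and close() fallbacks still apply
}
```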
In our case:
Update @adamrtalbot: in my humble opinion, I would disable the auto-termination and just go with the termination on onProcessComplete, as it triggers only when we are sure the job isn't needed anymore. Otherwise we can't rule out a race condition where the job is picked up by the onAllBatchTasksComplete cleaner, but Nextflow, still seeing the job as Active, inserts a new task into it before the job is set to "Terminating", and the cleaner doesn't recheck. This leads to an 'orphaned' task that will not be deleted by Azure until the job is deleted, since according to the docs the maxWallClockTime only applies once a task starts to run, which won't happen in this case.
OK interesting, I didn't realise Azure Batch considers an 'active' task completed when it's actually pending. What a terrible system 🤦 In which case, we can only switch it on when a task is in running status. It should be straightforward to set up a reproducible example.
@pditommaso, how do you feel about using the TraceObserverV2 this way now?
I would like to think the Azure engineers thought about this case, but you can't guarantee that race conditions won't happen on systems of this size.
Summary
Fixes Azure Batch "job leak" issue where jobs remain in Active state even after task completion, causing quota exhaustion and preventing multiple pipelines from running simultaneously.
Problem: Jobs consume quota slots unnecessarily, blocking other workflows
Solution: Leverage Azure Batch's native auto-termination to release quota immediately when tasks complete
How Azure Batch Eager Job Termination Works
Problem Addressed
Azure Batch has a limitation where jobs remain in an "Active" state even after all their tasks complete. This causes:
Solution Implementation
Job Auto-Termination Configuration
Default behavior: `terminateJobsOnCompletion = true` (enabled by default)
Job Termination Mechanism
The service implements a two-phase termination approach:
Phase 1: Set Jobs to Auto-Terminate
Phase 2: Cleanup on Workflow Completion
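A condensed Groovy sketch of the two phases; the closures stand in for the Azure Batch client calls, and the real logic lives in AzBatchService.groovy.

```groovy
// Phase 1: after each task submission, eagerly mark the job so that Azure
// terminates it once all of its tasks complete.
void afterTaskSubmit(String jobId, Closure setAutoTerminate) {
    setAutoTerminate.call(jobId)            // onAllTasksComplete = TERMINATE_JOB
}

// Phase 2: on workflow completion, sweep every known job as a safety net,
// catching any job where eager termination was deferred.
void onWorkflowClose(Collection<String> jobIds, Closure setAutoTerminate) {
    jobIds.each { String jobId -> setAutoTerminate.call(jobId) }
}
```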
Azure Batch Native Feature Integration
How Auto-Termination Works
Leverages Azure Batch's native `OnAllBatchTasksComplete.TERMINATE_JOB` setting
Eager Termination Flow
Sets `OnAllTasksComplete = TERMINATE_JOB` on the job after task submission
Key Benefits
Resource Management
Operational Improvements
Configuration Options
Users can control the behavior:
```groovy
azure {
    batch {
        terminateJobsOnCompletion = true   // Enable eager termination (default)
        deleteJobsOnCompletion = false     // Optionally delete jobs entirely
        deleteTasksOnCompletion = true     // Clean up individual tasks
    }
}
```
Technical Implementation Details
Job Lifecycle Management
Uses the `updateJob` API for the termination setting
Error Handling
Impact
This implementation provides an elegant solution to Azure Batch's job quota problem by leveraging Azure's native auto-termination feature. It ensures that jobs automatically terminate when their tasks complete, preventing quota exhaustion while maintaining full compatibility with existing workflows.
Related
Note
Implements eager Azure Batch job auto-termination after task submission with automatic job recreation on 409 conflicts; updates tests, logging, and config docs.
- `plugins/nf-azure/.../AzBatchService.groovy`:
  - `setJobTermination(jobId)` and `setAutoTerminateIfEnabled(jobId, taskId)`; `runTask` now calls them after submission.
  - `terminateJobs()` usage from `close()`.
  - `submitTaskToJob(...)` handles `409` on auto-terminated jobs by `recreateJobForTask(...)` and resubmits.
  - `recreateJobForTask(...)` synchronizes and updates `allJobIds` mapping.
  - `destFile(...)` log level to `trace`; fix "creddentials" typo; minor comments/docs; preserve job constraints on create.
- `docs/reference/config.md`:
  - `azure.batch.terminateJobsOnCompletion`: set jobs to terminate on task completion (default `true`).

Written by Cursor Bugbot for commit 899f0a2. This will update automatically on new commits.