Bug report
When running Nextflow on Google Batch with gcsfuse-mounted directories and attempting to stage in many input files for a task, the task fails with the following error:
Error executing process > 'COPY_FILES'
Caused by:
No such file or directory: /mnt/disks/nf-tower-test-eu-1/scratch/1rnFXXfWvpW08n/f7/b2319081b9b7957733eebb846c0fd3/.command.stage
This appears to be related to the fix implemented in #4282 for the problem reported in #4279, which was intended to disable the separate staging script entirely for remote object storage. However, the fix does not work properly on Google Batch.
Steps to reproduce the problem
- Run a Nextflow pipeline on GCP with a task that stages in many input files (e.g., ~1000 or more files):

# create random files
for ((n=0;n<6000;n++)); do touch dummy_file_${n}.txt; done
# sync to a GCS bucket
gsutil -m rsync -r ./ gs://nf-tower-test-eu-1/esha/many_files_test/

One process workflow:

process COPY_FILES {
    input:
    path files

    output:
    path("outdir", type: 'dir')

    script:
    """
    mkdir -p outdir
    for f in ${files}; do
        cp \$f outdir/
    done
    """
}

workflow {
    Channel.fromPath(params.input).collect()
        | COPY_FILES()
}

- Instead of staging in the directory, stage in each individual file, which results in a large .command.run exceeding 1 MB.
- The task fails because it tries to access the .command.stage file, which isn't properly created or accessible.
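
To check which of these conditions applies to a failed task, a small shell helper can report the size of .command.run and whether .command.stage exists in the task work directory. This is only a minimal sketch: `check_task_dir` is a hypothetical name (not part of Nextflow), and it assumes the work directory is reachable as a local path, for example through the gcsfuse mount.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Nextflow): inspect a task work directory
# and report the wrapper script size and whether a staging script exists.
check_task_dir() {
    local workdir="$1"
    local run_file="${workdir}/.command.run"
    local stage_file="${workdir}/.command.stage"

    if [ ! -f "${run_file}" ]; then
        echo "run: missing"
        return 1
    fi

    # wc -c counts bytes; reading from stdin avoids printing the file name.
    local size
    size=$(wc -c < "${run_file}" | tr -d '[:space:]')
    echo "run: ${size} bytes"

    if [ -f "${stage_file}" ]; then
        echo "stage: present"
    else
        echo "stage: missing"
    fi
}
```

For the failure above, it could be called as `check_task_dir /mnt/disks/nf-tower-test-eu-1/scratch/1rnFXXfWvpW08n/f7/b2319081b9b7957733eebb846c0fd3`. A "stage: missing" line would be consistent with the reported error.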
Environment
- Nextflow version: 24.10.5
- Seqera Platform Cloud Version 24.3.0-cycle4_803f393