fix/prevent falco main module to process an empty fastq and fail #5769

Closed
glichtenstein wants to merge 5 commits into nf-core:5612-bcl-demultiplex from glichtenstein:fix/prevent-falco-module-to-process-an-empty-fastq-file

@glichtenstein

glichtenstein commented Jun 6, 2024

Contributor

PR checklist

Closes #5612
Replaces: #5638
To be merged on top of PR #5720

@k1sauce What about doing the filter directly in the falco module's main.nf? That way neither subworkflow/bcl_demultiplex/main.nf nor workflow/demultiplex.nf has to handle this issue.
I have used a file-size filter so that falco does not run when any file in ${reads} is 20 bytes or smaller.
Rationale: an empty text file compressed with gzip (.gz) is usually 20 bytes in size.

edited: (task.ext.when == null || task.ext.when) && reads.every { it.size() > 20 }
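
The 20-byte threshold can be checked outside Nextflow. The following is a minimal Python sketch, not part of the PR: it assumes the reads are plain gzip files written without a stored file name, and the helper name `all_reads_nonempty` is hypothetical. It shows the size of a gzip stream that wraps zero bytes and mirrors the `reads.every { it.size() > 20 }` predicate.

```python
import gzip
import os
import tempfile

# Size of a gzip stream that wraps zero bytes of data:
# 10-byte header + 2-byte empty deflate block + 8-byte trailer.
EMPTY_GZIP_SIZE = len(gzip.compress(b""))


def all_reads_nonempty(paths, min_bytes=EMPTY_GZIP_SIZE):
    """Hypothetical helper mirroring `reads.every { it.size() > 20 }`."""
    return all(os.path.getsize(p) > min_bytes for p in paths)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        empty = os.path.join(tmp, "empty.fastq.gz")
        full = os.path.join(tmp, "sample.fastq.gz")
        with open(empty, "wb") as fh:
            fh.write(gzip.compress(b""))
        with open(full, "wb") as fh:
            fh.write(gzip.compress(b"@r1\nACGT\n+\nIIII\n"))
        print(EMPTY_GZIP_SIZE)                    # 20
        print(all_reads_nonempty([full]))         # True
        print(all_reads_nonempty([full, empty]))  # False
```

One caveat with a pure size threshold: a gzip file that stores the original file name in its header is larger than 20 bytes even when its payload is empty, so such files would pass this check.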

Closes #5638

  • This comment contains a description of changes (with reason).
  • If you've fixed a bug or added code that should be tested, add tests!
  • If you've added a new tool - have you followed the module conventions in the contribution docs
  • If necessary, include test data in your PR.
  • Remove all TODO statements.
  • Emit the versions.yml file.
  • Follow the naming conventions.
  • Follow the parameters requirements.
  • Follow the input/output options guidelines.
  • Add a resource label
  • Use BioConda and BioContainers if possible to fulfil software requirements.
  • Ensure that the test works with either Docker / Singularity. Conda CI tests can be quite flaky:
    • For modules:
      • nf-core modules test <MODULE> --profile docker
      • nf-core modules test <MODULE> --profile singularity
      • nf-core modules test <MODULE> --profile conda
    • For subworkflows:
      • nf-core subworkflows test <SUBWORKFLOW> --profile docker
      • nf-core subworkflows test <SUBWORKFLOW> --profile singularity
      • nf-core subworkflows test <SUBWORKFLOW> --profile conda

@glichtenstein glichtenstein changed the title from "Fix/prevent falco module to process an empty fastq file" to "fix/prevent falco main module to process an empty fastq and fail" Jun 6, 2024
@glichtenstein glichtenstein marked this pull request as ready for review June 6, 2024 19:20
@glichtenstein glichtenstein requested review from a team and kpadm and removed request for a team June 6, 2024 19:20

@jfy133 jfy133 left a comment

Member

I'm not sure I like this. The process should not just silently skip execution when it gets an empty FASTQ file.

An empty FASTQ file implies something has gone wrong, so the user should be told why the module is not executing.

I would rather leave it as is and have pipeline developers handle this within the pipeline, in the input channel to the falco process, for example with a map that uses empty() or similar on the FASTQ file.

Otherwise, you could put an error check in the script setup block (where the prefix variables etc. are defined) and actually error out if it fails.

I don't think things like the when block should be modified, because it is a standard part of the template and changing it would make it harder to modify en masse.
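
As a language-neutral illustration of the two alternatives described above (filter the inputs with a visible warning, or fail loudly), here is a minimal Python sketch. It is not Nextflow code and not part of any nf-core module; the names `split_reads` and `require_nonempty`, and the 20-byte threshold taken from the PR description, are assumptions for illustration only.

```python
import os
import sys

MIN_BYTES = 20  # assumed threshold, taken from the PR description


def split_reads(paths, min_bytes=MIN_BYTES):
    """Partition paths into (usable, empty) and warn about each empty one."""
    usable, empty = [], []
    for p in paths:
        (usable if os.path.getsize(p) > min_bytes else empty).append(p)
    for p in empty:
        # Report the skipped file so the user knows why nothing ran for it.
        print(f"WARNING: skipping empty FASTQ: {p}", file=sys.stderr)
    return usable, empty


def require_nonempty(paths, min_bytes=MIN_BYTES):
    """Fail with an explicit message instead of skipping silently."""
    empty = [p for p in paths if os.path.getsize(p) <= min_bytes]
    if empty:
        raise ValueError(f"empty FASTQ input: {', '.join(empty)}")
    return list(paths)
```

The first function corresponds to handling the problem upstream of the process while still telling the user; the second corresponds to an explicit error check, where an empty input stops the run with a clear message.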

@glichtenstein

Contributor Author

I see your concern; it's a solid argument. I was trying to prevent falco from running when an empty fastq.gz is among the ${reads}, but this should be done at the pipeline level. I was doing it in the pipeline, but the refactoring of the subworkflow in PR #5720 led me to think of applying it here. I will close this PR, then, and look for alternative routes. Thank you very much for your valuable review and observation.

@jfy133

jfy133 commented Jun 9, 2024

Member

I see. Yes, in that case I do feel it makes more sense at a subworkflow level!

@SPPearce

SPPearce commented Jun 9, 2024

Contributor

Does it currently crash with a clear error message?

@glichtenstein glichtenstein deleted the fix/prevent-falco-module-to-process-an-empty-fastq-file branch June 9, 2024 19:08