-
Notifications
You must be signed in to change notification settings - Fork 10
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: add trimmomatic to tools/ #192
base: main
Are you sure you want to change the base?
Conversation
} | ||
|
||
command <<< | ||
cd /usr/local/share/trimmomatic-0.36-5 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you sure this is necessary? biocontainers should have added a trimmomatic
executable to the $PATH, as that's what they usually do. It should be unneeded to invoke the full path here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was trying to work out access to the executable, but without cd'ing to the exact directory I wasn't able to get it to work
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's likely because it's being called with java
. There is a utility wrapper script at /usr/local/bin/trimmomatic
that provides the functionality and is on PATH
. So you should be able to swap java -jar trimmomatic.jar
with trimmomatic
and remove the directory change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @adthrasher! Hadn't noticed that distinction
@a-frantz I think the PR is in enough of a stable feature state that your review/suggestions are welcome. Please let me know how I can shore this up so it's ready for approval/merging |
tools/trimmomatic.wdl
Outdated
|
||
command <<< | ||
ls -l /usr/local/share/trimmomatic-0.36-5/adapters/ | ||
trimmomatic PE -phred33 -threads ~{threads} \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think these tasks should be combined. You can dynamically select the subcommand with ~{if (defined(<read_two>)) then "PE" else "SE"}
tools/trimmomatic.wdl
Outdated
-trimlog trimlog.txt ~{input_file_forward} ~{input_file_reverse} \ | ||
output_forward.fq.gz output_reverse.fq.gz \ | ||
~{if (phred64) then "-phred64" else "-phred33"} \ | ||
ILLUMINACLIP:~{illumina_clip} \ | ||
~{if defined(leading) then "LEADING:~{leading}" else ""} \ | ||
~{if defined(trailing) then "TRAILING:~{trailing}" else ""} \ | ||
SLIDINGWINDOW:~{window_size}:~{window_quality} \ | ||
MINLEN:~{minlen} \ | ||
~{if defined(crop) then "CROP:~{crop}" else ""} \ | ||
~{if defined(headcrop) then "HEADCROP:~{headcrop}" else ""} \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
style: indent the line continuations by one level (four spaces) and drop this last slash
"leading": 3, | ||
"trailing": 3, | ||
"threads": 10, | ||
"window_size": 4, | ||
"window_quality": 15, | ||
"minlen": 36, | ||
"illumina_clip": "TruSeq3-SE:2:30:10", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are these the defaults of trimmomatic
? You should specify the defaults in the input
block so that users don't need to specify all this (unless they want to override the defaults).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since I'm unfamiliar with the standards for trimming these read files, I'm not sure if these are "defaults" per se, but they are values used in the documentation's examples. Would you like me to make them defaults, or should we consult with end users to see what the most likely defaults would be?
tools/trimmomatic.wdl
Outdated
parameter_meta { | ||
input_file: "Input FASTQ format file to run Trimmomatic on." | ||
output_name: "Output FASTQ format file of trimmed reads." | ||
illumina_clip: "Cut adapter and other illumina-specific sequences from the read." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you explain this param more? What syntax does trimmomatic expect?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will admit I don't fully understand what this adapter is, but there's more information about the Adapter on the trimmomatic documentation. It appears that this is an optional argument, so I can modify the wdl task accordingly.
The docs also mention: "Trimming occurs in the order which the steps are specified on the command line. It is recommended in most cases that adapter clipping, if required, is done as early as possible." Given this, I think the adapter cutting should be applied before other parameters, as it currently is in the wdl tasks.
tools/trimmomatic.wdl
Outdated
String outfile_name_two = ( | ||
if defined(read_two) then | ||
basename(read_two, ".fastq.gz") + ".reverse.trimmed.fastq.gz" | ||
else | ||
"" | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@a-frantz this block of code is giving me trouble, causing this issue:
(tools/trimmomatic.wdl Ln 34 Col 26) Expected String instead of File?; for basename argument #1 basename(read_two, ".fastq.gz") + ".reverse.trimmed.fastq.gz"
I'm confused, because a similar passing of a File into basename on line 28 does not cause any errors. Curious if you know how I should handle the optional outfile_name_two
variable assignment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is that the error message from sprocket
or miniwdl
? Either way it's misleading/incorrect.
The issue is that read_two
is optional, and must be coerced to a required param before being used in the basename
func. But this section is going to be rewritten once I get the REGEX to you, so don't worry about it for now
No description provided.