Add MAGs-visualization Galaxy wrappers #7922
Adjusted pathway test dataset size to ensure meaningful heatmap visualization while staying within IUC file size limits.
SaimMomin12
left a comment
Thanks @alexandrah1704
Some preliminary comments inline
| @@ -0,0 +1,167 @@ | |||
| <tool id="mags_visualization_comp_conta" name="MAGs-visualization comp-conta" version="@TOOL_VERSION@+galaxy@VERSION_SUFFIX@" profile="22.05"> | |||
| <tool id="mags_visualization_comp_conta" name="MAGs-visualization comp-conta" version="@TOOL_VERSION@+galaxy@VERSION_SUFFIX@" profile="22.05"> | |
| <tool id="mags_visualization_comp_conta" name="MAGs-visualization comp-conta" version="@TOOL_VERSION@+galaxy@VERSION_SUFFIX@" profile="@PROFILE@"> |
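For the `@PROFILE@` suggestion above, the token would typically be defined once in the shared macros file so that all wrappers pick up the same value. A minimal sketch, assuming a `macros.xml` that the tools already import and keeping the `22.05` value from the original line:

```xml
<macros>
    <!-- Shared Galaxy profile; each tool then uses profile="@PROFILE@". -->
    <!-- The value 22.05 is copied from the original tool line, not a recommendation. -->
    <token name="@PROFILE@">22.05</token>
</macros>
```

Bumping the profile later then only requires changing this one token.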
| <param name="checkm" type="data" format="tabular,tsv" label="CheckM results" /> | ||
| <param name="checkm2" type="data" format="tabular,tsv" label="CheckM2 results" /> |
Can we prefer either tabular or tsv here?
| </outputs> | ||
| <tests> | ||
| <test> |
| <test> | |
| <test expect_num_outputs="1"> |
Please add this to all tests.
| <param name="checkm" type="data" format="tabular,tsv" label="CheckM results" /> | ||
| <param name="checkm2" type="data" format="tabular,tsv" label="CheckM2 results" /> | ||
| <conditional name="mode"> |
I would prefer a different conditional name, as it is similar to the param name below.
| <help><![CDATA[ | ||
| **MAGs-visualization: comp-conta** | ||
A more detailed help section would be nice to have.
| </param> | ||
| <param name="max_col" type="integer" value="10" min="1" label="Maximum number of taxonomy labels" /> | ||
| <param name="no_log" type="boolean" checked="false" truevalue="true" falsevalue="" label="Disable log10 scaling for the top bar plot" /> |
| <param name="no_log" type="boolean" checked="false" truevalue="true" falsevalue="" label="Disable log10 scaling for the top bar plot" /> | |
| <param name="no_log" type="boolean" checked="false" truevalue="--no_log" falsevalue="" label="Disable log10 scaling for the top bar plot" /> |
| #if $no_log | ||
| --no_log | ||
| #end if |
| #if $no_log | |
| --no_log | |
| #end if | |
| $no_log |
| @@ -0,0 +1,80 @@ | |||
| <tool id="mags_visualization_taxa_sankey" name="MAGs-visualization taxa-sankey" version="@TOOL_VERSION@+galaxy@VERSION_SUFFIX@" profile="22.05"> | |||
| <tool id="mags_visualization_taxa_sankey" name="MAGs-visualization taxa-sankey" version="@TOOL_VERSION@+galaxy@VERSION_SUFFIX@" profile="22.05"> | |
| <tool id="mags_visualization_taxa_sankey" name="MAGs-visualization taxa-sankey" version="@TOOL_VERSION@+galaxy@VERSION_SUFFIX@" profile="@PROFILE@"> |
Thank you for the detailed review. I have applied all comments:
Please let me know if anything else should be adjusted.
bernt-matthias
left a comment
Great start. Some more comments from my side. Some apply to multiple/all tools.
| <param name="meta_bin_width" type="float" value="5.0" label="Bin width for numeric metadata"/> | ||
| </when> | ||
| </conditional> | ||
| <param name="top_n" type="integer" value="30" min="1" label="Top N categories"/> |
Categories of what?
| </section> | ||
| </inputs> | ||
| <outputs> | ||
| <collection name="plots" type="list" label="${tool.name} outputs"> |
If there is only one output, you should stick to the default label. Otherwise, use the prefix ${tool.name} on ${on_string}. The suffix "outputs" is definitely not needed: clearly these are outputs :)
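The two labelling cases could look roughly like this. This is a sketch only: the output names `plot` and `table` and the label suffixes are hypothetical and are not taken from the wrappers.

```xml
<!-- Case 1: a single output. No label attribute, so Galaxy's default label is used. -->
<outputs>
    <data name="plot" format="png"/>
</outputs>

<!-- Case 2: several outputs. Same prefix, plus a short suffix to tell them apart. -->
<outputs>
    <data name="plot" format="png" label="${tool.name} on ${on_string}: plot"/>
    <data name="table" format="tabular" label="${tool.name} on ${on_string}: table"/>
</outputs>
```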
| <param name="checkm2" value="checkm2.tsv"/> | ||
| <param name="format" value="png"/> | ||
| <output_collection name="plots" type="list"> | ||
| <element name="comp_conta_marginals_checkm"> |
Please add ftype to the outputs.
For images (png) we have assertions on width and height that could be used here.
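Putting both comments together, a test element might look like the sketch below. The element name is the one from the diff above; the `min="100"` thresholds are arbitrary lower bounds, and the image assertions are only available in sufficiently recent Galaxy and planemo versions.

```xml
<output_collection name="plots" type="list">
    <!-- ftype checks the datatype of the produced element. -->
    <element name="comp_conta_marginals_checkm" ftype="png">
        <assert_contents>
            <!-- Loose bounds on the image dimensions, in pixels. -->
            <has_image_width min="100"/>
            <has_image_height min="100"/>
        </assert_contents>
    </element>
</output_collection>
```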
| <param name="mode_cond|mode" value="tax"/> | ||
| <param name="mode_cond|gtdb" value="gtdb.tsv"/> | ||
| <param name="tax_level" value="phylum"/> | ||
| <param name="format" value="png"/> |
Can you adapt tests 2 and 3 to test for pdf and svg output?
| </when> | ||
| <when value="meta"> | ||
| <param name="metadata" type="data" format="tabular" label="Metadata table"/> | ||
| <param name="meta_col" type="text" label="Metadata column name"/> |
Maybe a data_column https://docs.galaxyproject.org/en/master/dev/schema.html#data-column parameter would be an option?
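A minimal sketch of the `data_column` variant, reusing the `metadata` and `meta_col` names from the diff. Note that a `data_column` parameter yields the column number rather than the header text, so the command section (or the tool itself) may need adapting if the CLI expects a column name.

```xml
<param name="metadata" type="data" format="tabular" label="Metadata table"/>
<!-- data_ref points at the dataset parameter whose columns are offered for selection. -->
<!-- use_header_names shows the first-line header values in the selection list. -->
<param name="meta_col" type="data_column" data_ref="metadata" use_header_names="true" label="Metadata column"/>
```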
| <when value="meta"> | ||
| <param name="metadata" type="data" format="tabular" label="Metadata table"/> | ||
| <param name="meta_col" type="text" label="Metadata column name"/> | ||
| <param name="meta_bin_width" type="float" value="5.0" label="Bin width for numeric metadata"/> |
Add min and/or max to numeric parameters if possible.
| <repeat name="tax_levels" title="Taxonomy levels" min="2"> | ||
| <param name="level" type="select" label="Taxonomic level"> | ||
| <option value="domain">domain</option> | ||
| <option value="phylum" selected="true">phylum</option> | ||
| <option value="class">class</option> | ||
| <option value="order">order</option> | ||
| <option value="family">family</option> | ||
| <option value="genus" selected="true">genus</option> | ||
| <option value="species">species</option> | ||
| </param> | ||
| </repeat> |
Replace by a select with multiple="true"?
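As a sketch, the repeat above could collapse into a single multi-select like this. The parameter name `tax_levels` and the option list are reused from the diff; `optional="false"` is an assumption to require at least one selection, and the original `min="2"` constraint is not enforced by this form.

```xml
<param name="tax_levels" type="select" multiple="true" optional="false" label="Taxonomic levels">
    <option value="domain">domain</option>
    <option value="phylum" selected="true">phylum</option>
    <option value="class">class</option>
    <option value="order">order</option>
    <option value="family">family</option>
    <option value="genus" selected="true">genus</option>
    <option value="species">species</option>
</param>
```

Galaxy passes the selected values as one comma-separated string, so the command section may need to split it if the CLI expects separate arguments.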
| </outputs> | ||
| <tests> | ||
| <test expect_num_outputs="1"> | ||
| <param name="drep" value="drep.csv"/> |
Can you also test with a tabular input?
| <repeat name="meta_cols" title="Metadata columns" min="0"> | ||
| <param name="col" type="text" label="Metadata column name"/> | ||
| </repeat> |
data_column with multiple="true"?
There is a new version available
Thank you for the review.
Hi @alexandrah1704! Thanks a lot, a few comments:
#if $mode_cond.mode == "tax"
--gtdb '$mode_cond.gtdb'
--tax_level '$mode_cond.tax_level'
#end if
#if $mode_cond.mode == "tax" <!-- DUPLICATE — remove this block? -->
--gtdb '$mode_cond.gtdb'
#end if
#if $metadata_cond.metadata and len($metadata_cond.meta_cols) > 0
Fix: Use the standard idiom:
#if $metadata_cond.metadata
--metadata '$metadata_cond.metadata'
#if $metadata_cond.meta_cols
--meta_cols
#for $mc in $metadata_cond.meta_cols
'$mc.col'
#end for
#end if
#end if
<assert_contents>
<has_size min="1"/>
</assert_contents>
e.g.
<has_image_width min="100"/>
<has_image_height min="100"/>
For SVG outputs,
<param name="tax_levels" value="phylum"/>
<param name="tax_levels" value="genus"/>
You can have multiple selections, correct? This:
<param name="tax_levels" value="phylum,genus"/>
<param name="meta_col" type="text" label="Metadata column name"/>
Galaxy's single-quoting prevents shell injection, but column names containing commas or semicolons could affect parsing inside the Python tool. Please add a
paulzierep
left a comment
need to continue later
| </section> | ||
| </inputs> | ||
| <outputs> | ||
| <collection name="plots" type="list"> |
You do not need a collection output here, since each tool produces only one output afaik. Since the file names are variable, you could do
cp outputs/${dynamic_name}.png outputs/final_plot.png
on the command line and then
<data name="plot" format="png" from_work_dir="outputs/final_plot.png"/>
Please adapt for all tools.
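If the output format is user-selectable, the single `data` output could switch its datatype with `change_format`. This is a sketch under two assumptions: that a top-level select parameter named `format` offers png, pdf and svg, and that the command line copies the plot to an extension-less `outputs/final_plot`.

```xml
<outputs>
    <!-- Default datatype is png; the copied file has no extension, so one path fits all formats. -->
    <data name="plot" format="png" from_work_dir="outputs/final_plot">
        <change_format>
            <!-- Switch the datatype when the user picks another output format. -->
            <when input="format" value="pdf" format="pdf"/>
            <when input="format" value="svg" format="svg"/>
        </change_format>
    </data>
</outputs>
```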
| <option value="variance">variance (differential modules)</option> | ||
| <option value="both">both</option> | ||
| </param> | ||
| <param argument="--format" type="select" label="Output format"> |
You have format and figure size for all tools; you could move those to the macros and then add the unit to the help. Please also add min and max.
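A shared macro for these parameters might look like the sketch below. Only `--format` appears in the diff; the macro name `figure_params`, the `--width` and `--height` arguments, their defaults, bounds and the unit are all assumptions and need to be matched to the actual CLI.

```xml
<macros>
    <xml name="figure_params">
        <param argument="--format" type="select" label="Output format">
            <option value="png" selected="true">png</option>
            <option value="pdf">pdf</option>
            <option value="svg">svg</option>
        </param>
        <!-- Hypothetical size parameters: names, defaults, bounds and unit are assumptions. -->
        <param argument="--width" type="float" value="10.0" min="1.0" max="50.0" label="Figure width" help="Width in inches"/>
        <param argument="--height" type="float" value="8.0" min="1.0" max="50.0" label="Figure height" help="Height in inches"/>
    </xml>
</macros>
```

Each tool would then pull the block in with `<expand macro="figure_params"/>` inside its `<inputs>`.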
| label="Metadata column" | ||
| use_header_names="true" | ||
| help="Select the column to color by"/> | ||
| <param name="meta_bin_width" type="float" value="5.0" min="0.1" label="Bin width for numeric metadata"/> |
| This tool visualizes genome clusters generated by dRep and annotates them using GTDB taxonomy. | ||
| Clusters are grouped by taxonomic levels (e.g. phylum, genus) based on representative genomes. |
| </param> | ||
| <section name="representatives" title="Representative MAG selection (optional)" expanded="false"> | ||
| <param name="drep" type="data" format="csv,tabular" optional="true" label="dRep cluster table"/> | ||
| <param name="gtdb" type="data" format="tabular" optional="true" label="GTDB annotation table"/> |
Why is gtdb needed for selection?
gtdb is needed to select the same representative genomes as in the drep-cluster-func plot. Without gtdb, there will be a different row order and different MAGs being shown. With gtdb both plots show identical MAGs in the same order, to make it directly comparable, so there is no confusion.
You could add a bit of help, e.g. via the help attribute, explaining what these inputs are used for and in which cases the user should supply them.
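For instance, the two parameters from the diff could carry help text along these lines. The wording is only a suggestion, derived from the explanation given in the reply above.

```xml
<param name="drep" type="data" format="csv,tabular" optional="true" label="dRep cluster table"
       help="Optional. Supply this to restrict the plot to representative MAGs."/>
<param name="gtdb" type="data" format="tabular" optional="true" label="GTDB annotation table"
       help="Optional. Supply this together with the dRep table to show the same representative MAGs, in the same order, as the drep-cluster-func plot."/>
```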
I've adjusted the following:
This PR adds Galaxy wrappers for MAGs-visualization, a toolkit for generating plots and dashboards for MAG evaluation results (CheckM, CheckM2, dRep, GTDB, QUAST, Bakta, CoverM, metadata, KEGG).
This PR supersedes my previous submission (#7583).
The previous single-wrapper implementation has been refactored into separate, modular wrappers for each subcommand.
The following tools are included:
It supports multiple visualization types (quality metrics, taxonomy, clustering, abundance, functional annotation).
Outputs are returned as dataset collections.
Links to example outputs included in help sections.
The wrappers use the mags-visualization CLI distributed via Bioconda: https://github.com/bioconda/bioconda-recipes/tree/master/recipes/mags-visualization
Source code:
https://github.com/usegalaxy-eu/MAGs-visualization
Tested locally with planemo test.
All local builds and tests pass.
Example outputs can be found in the MAGs-visualization use cases:
https://github.com/usegalaxy-eu/MAGs-visualization/tree/main/use-cases