New and updated Bioconductor tooling for single-cell analysis #7961
Changes from 25 commits
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,54 @@ | ||
| <?xml version="1.0"?> | ||
| <macros> | ||
| <token name="@TOOL_VERSION@">1.38.0</token> | ||
| <token name="@VERSION_SUFFIX@">0</token> | ||
| <token name="@PROFILE@">23.0</token> | ||
|
|
||
| <xml name="requirements"> | ||
| <requirements> | ||
| <requirement type="package" version="@TOOL_VERSION@">bioconductor-scater</requirement> | ||
| <requirement type="package" version="1.28.0">bioconductor-loomexperiment</requirement> | ||
| <yield/> | ||
| </requirements> | ||
| </xml> | ||
|
|
||
| <token name="@CMD@"><![CDATA[ | ||
| cp '$input_loom' sce.loom && | ||
| cat '$script_file' > '$hidden_output' && | ||
| Rscript '$script_file' >> '$hidden_output' | ||
| ]]></token> | ||
|
|
||
| <token name="@CMD_imports@"><![CDATA[ | ||
| library(scater) | ||
| library(LoomExperiment) | ||
| ]]></token> | ||
|
|
||
| <token name="@CMD_read_inputs@"><![CDATA[ | ||
| sce <- import('sce.loom', format = "loom", type = "SingleCellLoomExperiment") | ||
| ]]></token> | ||
|
|
||
| <xml name="input_sce"> | ||
| <param name="input_loom" type="data" format="loom" | ||
|
Member
Why loom? Why not sce.rds?
Author
I initially took the RDS path, but had a discussion here encouraging me to use loom instead.
Member
It's a great choice! Please make sure that the outputs are also always loom so that we don't need to use conversion tools in between.
||
| label="Input SingleCellExperiment (loom)"/> | ||
| </xml> | ||
|
|
||
| <xml name="inputs_common_advanced"> | ||
| <section name="advanced_common" title="Advanced Output" expanded="false"> | ||
| <param name="show_log" type="boolean" checked="false" | ||
| label="Output log?"/> | ||
| </section> | ||
| </xml> | ||
|
|
||
| <xml name="outputs_common_advanced"> | ||
| <data name="hidden_output" format="txt" | ||
| label="${tool.name} on ${on_string}: log"> | ||
| <filter>advanced_common['show_log']</filter> | ||
| </data> | ||
| </xml> | ||
|
|
||
| <xml name="citations"> | ||
| <citations> | ||
| <citation type="doi">10.1093/bioinformatics/btw777</citation> | ||
| </citations> | ||
| </xml> | ||
| </macros> | ||
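
The `@CMD@` token in the macros above stages the input under a fixed name, starts the log with the script source, then appends the script's output. The following is a minimal shell sketch of that pattern outside Galaxy. Every file name in it is a hypothetical stand-in, and `sh` stands in for `Rscript`; it is not the wrapper itself.

```shell
#!/bin/sh
# Sketch of the @CMD@ logging pattern with stand-in files (all names are hypothetical).
set -eu
workdir=$(mktemp -d)
cd "$workdir"

# Stand-ins for Galaxy's '$input_loom', '$script_file' and '$hidden_output'.
printf 'fake loom bytes\n' > input.dat
printf 'echo "script ran"\n' > script.sh
log=log.txt

# 1. Stage the input under a fixed name, as the wrapper does with sce.loom.
cp input.dat sce.loom
# 2. Start the log with the script source (this truncates any previous log).
cat script.sh > "$log"
# 3. Append whatever the script prints to stdout (sh stands in for Rscript).
sh script.sh >> "$log"

cat "$log"
```

The resulting log holds the script source first and its standard output second. Note that `>>` redirects only stdout, so anything written to stderr does not end up in this file.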
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,100 @@ | ||
| <?xml version="1.0"?> | ||
| <tool id="scater_plotcoldata" name="Scater: Plot column data" | ||
| version="@TOOL_VERSION@+galaxy@VERSION_SUFFIX@" profile="@PROFILE@"> | ||
| <description>from a SingleCellExperiment object</description> | ||
| <macros> | ||
| <import>macros.xml</import> | ||
| </macros> | ||
| <expand macro="requirements"/> | ||
| <command detect_errors="exit_code"><![CDATA[ | ||
| @CMD@ | ||
| ]]></command> | ||
| <configfiles> | ||
| <configfile name="script_file"><![CDATA[ | ||
| @CMD_imports@ | ||
| @CMD_read_inputs@ | ||
|
|
||
| p <- scater::plotColData( | ||
| object = sce, | ||
| y = '$y' | ||
| #if $x | ||
| , x = '$x' | ||
| #end if | ||
| ) | ||
|
|
||
| ggplot2::ggsave('output.png', plot = p) | ||
| ]]></configfile> | ||
| </configfiles> | ||
| <inputs> | ||
| <expand macro="input_sce"/> | ||
| <param name="y" type="text" | ||
| label="Y-axis column" | ||
| help="Column from colData to display on the y-axis (e.g. detected)"/> | ||
| <param name="x" type="text" optional="true" | ||
| label="X-axis column (optional)" | ||
| help="Column from colData to display on the x-axis (e.g. sum). Leave empty for a violin/beeswarm plot."/> | ||
| <expand macro="inputs_common_advanced"/> | ||
| </inputs> | ||
| <outputs> | ||
| <data name="output_png" format="png" from_work_dir="output.png" | ||
| label="${tool.name} on ${on_string}: plot"/> | ||
| <expand macro="outputs_common_advanced"/> | ||
| </outputs> | ||
| <tests> | ||
| <test expect_num_outputs="2"> | ||
| <!-- test1: y only --> | ||
| <param name="input_loom" location="https://zenodo.org/records/19665848/files/sce_after_addpercellqcmetrics.loom" ftype="loom"/> | ||
| <param name="y" value="detected"/> | ||
| <section name="advanced_common"> | ||
| <param name="show_log" value="true"/> | ||
| </section> | ||
| <output name="hidden_output"> | ||
| <assert_contents> | ||
| <has_text_matching expression="plotColData"/> | ||
| </assert_contents> | ||
| </output> | ||
| <output name="output_png" ftype="png"> | ||
| <assert_contents> | ||
| <has_size size="82793" delta="10"/> | ||
| </assert_contents> | ||
| </output> | ||
| </test> | ||
| <test expect_num_outputs="2"> | ||
| <!-- test2: y and x --> | ||
| <param name="input_loom" location="https://zenodo.org/records/19665848/files/sce_after_addpercellqcmetrics.loom" ftype="loom"/> | ||
| <param name="y" value="detected"/> | ||
| <param name="x" value="sum"/> | ||
| <section name="advanced_common"> | ||
| <param name="show_log" value="true"/> | ||
| </section> | ||
| <output name="hidden_output"> | ||
| <assert_contents> | ||
| <has_text_matching expression="plotColData"/> | ||
|
|
||
| </assert_contents> | ||
| </output> | ||
| <output name="output_png" ftype="png"> | ||
| <assert_contents> | ||
| <has_size size="43418" delta="10"/> | ||
|
Member
There are better asserts for images, such as
||
| </assert_contents> | ||
| </output> | ||
| </test> | ||
| </tests> | ||
| <help><![CDATA[ | ||
| **What it does** | ||
|
|
||
| Runs ``scater::plotColData()`` on a ``SingleCellExperiment`` object to visualise | ||
| column-level metadata. Without an x-axis variable, each column is shown as a | ||
| violin/beeswarm plot. With an x-axis variable, a scatter plot is produced. | ||
|
|
||
| **Inputs** | ||
|
|
||
| - A ``SingleCellExperiment`` object saved as a loom file. | ||
| - The name of a ``colData`` column to display on the y-axis (e.g. ``detected``). | ||
| - Optionally, the name of a ``colData`` column to display on the x-axis (e.g. ``sum``). | ||
|
|
||
| **Outputs** | ||
|
|
||
| - A PNG image of the plot. | ||
| ]]></help> | ||
| <expand macro="citations"/> | ||
| </tool> | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,10 @@ | ||
| categories: | ||
| - Transcriptomics | ||
| - Single-cell | ||
| description: Galaxy wrappers for the Bioconductor scuttle package (v1.20.0) | ||
|
kevinrue marked this conversation as resolved.
Outdated
|
||
| long_description: | | ||
| scuttle provides utility functions for single-cell RNA-seq analysis, | ||
| including quality control, normalisation, and transformation, built | ||
| around the SingleCellExperiment class. | ||
| name: scuttle | ||
| owner: iuc | ||
|
Member
Eventually, needs a
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,68 @@ | ||
| <?xml version="1.0"?> | ||
| <macros> | ||
| <token name="@TOOL_VERSION@">1.20.0</token> | ||
| <token name="@VERSION_SUFFIX@">0</token> | ||
| <token name="@PROFILE@">23.0</token> | ||
|
|
||
| <xml name="requirements"> | ||
| <requirements> | ||
| <requirement type="package" version="@TOOL_VERSION@">bioconductor-scuttle</requirement> | ||
| <requirement type="package" version="1.28.0">bioconductor-loomexperiment</requirement> | ||
| <yield/> | ||
| </requirements> | ||
| </xml> | ||
|
|
||
| <!-- Mirrors @CMD@ in seurat/macros.xml --> | ||
| <token name="@CMD@"><![CDATA[ | ||
| cp '$input_loom' sce.loom && | ||
| cat '$script_file' > '$hidden_output' && | ||
| Rscript '$script_file' >> '$hidden_output' | ||
| ]]></token> | ||
|
|
||
| <!-- Mirrors @CMD_imports@ in seurat/macros.xml --> | ||
| <token name="@CMD_imports@"><![CDATA[ | ||
| library(scuttle) | ||
| library(LoomExperiment) | ||
| ]]></token> | ||
|
|
||
| <!-- Mirrors @CMD_read_inputs@ in seurat/macros.xml --> | ||
| <token name="@CMD_read_inputs@"><![CDATA[ | ||
| sce <- import('sce.loom', format = "loom", type = "SingleCellLoomExperiment") | ||
| ]]></token> | ||
|
|
||
| <!-- Mirrors @CMD_loom_write_outputs@ in seurat/macros.xml --> | ||
| <token name="@CMD_loom_write_outputs@"><![CDATA[ | ||
| export(sce, "output.loom", format = "loom") | ||
| ]]></token> | ||
|
|
||
| <xml name="input_sce"> | ||
| <param name="input_loom" type="data" format="loom" | ||
| label="Input SingleCellExperiment (loom)"/> | ||
| </xml> | ||
|
|
||
| <xml name="output_sce"> | ||
| <data name="output_loom" format="loom" from_work_dir="output.loom" | ||
| label="${tool.name} on ${on_string}: loom"/> | ||
| <expand macro="outputs_common_advanced"/> | ||
| </xml> | ||
|
|
||
| <xml name="inputs_common_advanced"> | ||
| <section name="advanced_common" title="Advanced Output" expanded="false"> | ||
| <param name="show_log" type="boolean" checked="false" | ||
| label="Output log?"/> | ||
| </section> | ||
| </xml> | ||
|
|
||
| <xml name="outputs_common_advanced"> | ||
| <data name="hidden_output" format="txt" | ||
| label="${tool.name} on ${on_string}: log"> | ||
| <filter>advanced_common['show_log']</filter> | ||
| </data> | ||
| </xml> | ||
|
|
||
| <xml name="citations"> | ||
| <citations> | ||
| <citation type="doi">10.1093/bioinformatics/btw777</citation> | ||
| </citations> | ||
| </xml> | ||
| </macros> |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,110 @@ | ||
| <?xml version="1.0"?> | ||
| <tool id="scuttle_addpercellqcmetrics" name="Scuttle: Add per-cell QC metrics" | ||
| version="@TOOL_VERSION@+galaxy@VERSION_SUFFIX@" profile="@PROFILE@"> | ||
| <description>to a SingleCellExperiment object</description> | ||
| <macros> | ||
| <import>macros.xml</import> | ||
| </macros> | ||
| <expand macro="requirements"/> | ||
| <command detect_errors="exit_code"><![CDATA[ | ||
| @CMD@ | ||
| ]]></command> | ||
| <configfiles> | ||
| <configfile name="script_file"><![CDATA[ | ||
| @CMD_imports@ | ||
| @CMD_read_inputs@ | ||
|
|
||
| subsets <- list() | ||
| #for $s in $subsets_repeat | ||
| subsets[['${s.name}']] <- which(rownames(sce) %in% readLines('$s.feature_file')) | ||
| #end for | ||
|
|
||
| sce <- scuttle::addPerCellQCMetrics( | ||
| x = sce, | ||
| subsets = subsets, | ||
| assay.type = '$assay_type' | ||
| ) | ||
|
|
||
| @CMD_loom_write_outputs@ | ||
| ]]></configfile> | ||
| </configfiles> | ||
| <inputs> | ||
| <expand macro="input_sce"/> | ||
| <param name="assay_type" type="text" value="counts" | ||
| label="Assay to use for QC computation"/> | ||
| <repeat name="subsets_repeat" title="Gene subset" min="0"> | ||
| <param name="name" type="text" label="Subset name" | ||
| help="Short label, e.g. MT"/> | ||
| <param name="feature_file" type="data" format="txt" | ||
| label="Feature list" | ||
| help="Plain text file with one rowname per line, no header"/> | ||
| </repeat> | ||
| <expand macro="inputs_common_advanced"/> | ||
| </inputs> | ||
| <outputs> | ||
| <expand macro="output_sce"/> | ||
| </outputs> | ||
| <tests> | ||
| <test expect_num_outputs="2"> | ||
| <!-- test1: no subsets --> | ||
| <param name="input_loom" location="https://zenodo.org/records/19665848/files/sce_with_mitochondrial_features.loom" ftype="loom"/> | ||
| <section name="advanced_common"> | ||
| <param name="show_log" value="true"/> | ||
| </section> | ||
| <output name="hidden_output"> | ||
| <assert_contents> | ||
| <has_text_matching expression="addPerCellQCMetrics"/> | ||
| </assert_contents> | ||
| </output> | ||
| <output name="output_loom" ftype="loom"> | ||
| <assert_contents> | ||
| <has_size size="34822" delta="10"/> | ||
| </assert_contents> | ||
| </output> | ||
| </test> | ||
| <test expect_num_outputs="2"> | ||
| <!-- test2: MT subset --> | ||
| <param name="input_loom" location="https://zenodo.org/records/19665848/files/sce_with_mitochondrial_features.loom" ftype="loom"/> | ||
| <repeat name="subsets_repeat"> | ||
| <param name="name" value="MT"/> | ||
| <param name="feature_file" value="mitochondrial_gene_ids.txt" ftype="txt"/> | ||
| </repeat> | ||
| <section name="advanced_common"> | ||
| <param name="show_log" value="true"/> | ||
| </section> | ||
| <output name="hidden_output"> | ||
| <assert_contents> | ||
| <has_text_matching expression="addPerCellQCMetrics"/> | ||
| </assert_contents> | ||
| </output> | ||
| <output name="output_loom" ftype="loom"> | ||
| <assert_contents> | ||
| <has_size size="45032" delta="10"/> | ||
| </assert_contents> | ||
| </output> | ||
| </test> | ||
| </tests> | ||
| <help><![CDATA[ | ||
| **What it does** | ||
|
|
||
| Runs ``scuttle::addPerCellQCMetrics()`` on a ``SingleCellExperiment`` object | ||
| and returns the same object with new columns appended to ``colData``: | ||
|
|
||
| - ``sum`` – total counts per cell | ||
| - ``detected`` – number of detected features per cell | ||
| - ``subsets_<name>_sum``, ``subsets_<name>_detected``, ``subsets_<name>_percent`` | ||
| for each user-defined gene subset (e.g. mitochondrial genes) | ||
|
|
||
| **Inputs** | ||
|
|
||
| - A ``SingleCellExperiment`` object saved as a loom file. | ||
| - Optionally, one or more named gene subsets, each defined by a plain text file | ||
| listing the rownames to include (one per line, no header). | ||
|
|
||
| **Outputs** | ||
|
|
||
| - The same ``SingleCellExperiment`` with QC columns added to ``colData``, | ||
| saved as a loom file. | ||
| ]]></help> | ||
| <expand macro="citations"/> | ||
| </tool> |
`import` a scater function? or bioconductor function?

`import()` is a generic defined in `BiocIO` with methods defined in other packages like `rtracklayer`.

The `SingleCellLoomExperiment` type is defined in the package `LoomExperiment` here https://github.com/Bioconductor/LoomExperiment/blob/devel/R/import-method.R#L135

I haven't studied the internals of the `import()` function, but from the choice of three types it sounds like this helps the `import()` function work with the different "flavours" of Bioconductor single-cell objects that can be stored in loom files.