New and updated Bioconductor tooling for single-cell analysis #7961

Draft
kevinrue wants to merge 27 commits into galaxyproject:main from kevinrue:kra-dev-scuttle

Conversation

@kevinrue kevinrue commented May 6, 2026

FOR CONTRIBUTOR:

  • I have read the CONTRIBUTING.md document and this tool is appropriate for the tools-iuc repo.
  • License permits unrestricted use (educational + commercial)
  • This PR adds a new tool or tool collection
  • This PR updates an existing tool or tool collection
  • This PR does something else (explain below)

Two labels let you ignore specific (false-positive) tool linter errors:

  • skip-version-check: Use it if only a subset of the tools has been updated in a suite.
  • skip-url-check: Use it if the GitHub CI sees 403 errors, but the URLs work.

Context

For my BioFAIR Pathfinder project, I aim to wrap a number of Bioconductor packages and functions from the latest Bioconductor release to assemble a new GTN tutorial for single-cell analysis using (mostly) Bioconductor packages.
The two GTN single-cell tutorials that exist so far cover Seurat (R) and Scanpy (Python).

Draft PR

As a first time contributor to this repo, I strive to apply the best practices for Galaxy tool development, but I'm conscious that I might not be constantly aware of all of them.

This is why I am opening this draft PR. If anyone with experience could glance at my work so far and spot any bad practice, I'd be more than happy to fix the existing code where relevant, while writing better code in the tool wrappers that I have yet to write to complete the workflow.

I'll be busy teaching for most of the coming weeks, so it feels like a good use of time to have someone look at this PR while I'm not actively adding to it.

EDIT 1

"This PR does something else"

This PR implements wrappers for:

  • functions in 'new' Bioconductor packages not formally wrapped in Galaxy yet (scuttle, singlecellexperiment)
  • new functions in Bioconductor packages previously wrapped (scater::plotColData)

For clarity and the avoidance of name clashes, I've named my tool wrappers package-version. In particular, scater-1.38.0 avoids a clash with the existing scater (1.22.0), while scuttle-1.20.0 and singlecellexperiment-1.32.0 didn't really need the version number.

@kevinrue kevinrue changed the title Kra dev scuttle New and updated Bioconductor tooling for single-cell analysis May 6, 2026
Member

@pavanvidem pavanvidem left a comment

It is the right direction. Each of these functions has many more arguments that need to be integrated. For an LLM-assisted (?) initial version, it is not bad.

]]></token>

<token name="@CMD_read_inputs@"><![CDATA[
sce <- import('sce.loom', format = "loom", type = "SingleCellLoomExperiment")
Member

  • Is import() a scater function or a Bioconductor function?
  • Is "SingleCellLoomExperiment" really a valid type?

Author

import() is a generic defined in BiocIO with methods defined in other packages like rtracklayer.

The SingleCellLoomExperiment type is defined in the package LoomExperiment here https://github.com/Bioconductor/LoomExperiment/blob/devel/R/import-method.R#L135

I haven't studied the internals of the import() function, but from the choice of three types, it sounds like this argument lets import() handle the different "flavours" of Bioconductor single-cell objects that can be stored in loom files.
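To make the origin of import() visible to reviewers, the token could attach LoomExperiment explicitly before the call. This is only a sketch: it assumes that attaching LoomExperiment makes the BiocIO generic and its loom method available, and that the input is staged as sce.loom as in the diff above.

```xml
<token name="@CMD_read_inputs@"><![CDATA[
## import() is the BiocIO generic; LoomExperiment supplies the method for loom files
## (assumption: attaching LoomExperiment is enough to make both available)
library(LoomExperiment)
sce <- import('sce.loom', format = "loom", type = "SingleCellLoomExperiment")
]]></token>
```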

]]></token>

<xml name="input_sce">
<param name="input_loom" type="data" format="loom"
Member

why loom? why not sce.rds?

Author

I initially took the RDS path, but had a discussion here encouraging me to use loom instead

Member

It's a great choice! Please make sure that the outputs are also always loom so that we don't need to use conversion tools in between.
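A loom output declaration might look like the sketch below. The output name output_loom and the file name sce_out.loom are hypothetical; the wrapper's R script is assumed to write the modified object to that file in the job working directory.

```xml
<outputs>
    <!-- hypothetical output: the R script is assumed to write sce_out.loom in the working directory -->
    <data name="output_loom" format="loom" from_work_dir="sce_out.loom" label="${tool.name} on ${on_string}: SingleCellExperiment (loom)"/>
</outputs>
```

Declaring format="loom" on every tool in the suite means each output can be fed directly into the input_loom parameter of the next tool.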

</section>
<output name="hidden_output">
<assert_contents>
<has_text_matching expression="plotColData"/>
Member

<has_text text="plotColData"/> should be enough in this case
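Applied to the test above, the suggestion would read roughly as follows. has_text checks for a literal substring, whereas has_text_matching compiles its expression as a regular expression, which is unnecessary when the pattern contains no regex syntax.

```xml
<output name="hidden_output">
    <assert_contents>
        <!-- literal substring check; no regular expression needed -->
        <has_text text="plotColData"/>
    </assert_contents>
</output>
```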

</output>
<output name="output_png" ftype="png">
<assert_contents>
<has_size size="43418" delta="10"/>
Member

There are better assertions for images, such as has_image_height, has_image_width, and the other has_image_* assertions.
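A sketch of what the image assertions could look like for this output. The pixel dimensions are placeholders, not values taken from the PR; they would need to match the size at which the wrapper actually renders the plot, and the tool's Galaxy profile is assumed to be recent enough to support the has_image_* assertions.

```xml
<output name="output_png" ftype="png">
    <assert_contents>
        <!-- placeholder dimensions: replace with the real size of the test plot -->
        <has_image_width width="2100"/>
        <has_image_height height="2100"/>
    </assert_contents>
</output>
```

Unlike an exact byte size, image dimensions stay stable when the PNG encoder or a plotting dependency changes the compressed file size slightly.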

Comment thread tools/scuttle-1.20.0/.shed.yml Outdated
Comment thread tools/singlecellexperiment-1.32.0/.shed.yml Outdated
including quality control, normalisation, and transformation, built
around the SingleCellExperiment class.
name: scuttle
owner: iuc
Member

Eventually, this needs a suite section because there are multiple tools.

@kevinrue
Author

> For an LLM-assisted (?) initial version, it is not bad.

I used an LLM (Claude Code) to produce the first draft, which accounts for about 80-90% of what you see. I then manually tweaked bits and pieces to apply fixes and to fill in things the LLM couldn't guess (e.g. the size of the output files for the tests).

kevinrue and others added 2 commits May 11, 2026 16:44
Co-authored-by: Pavankumar Videm <pavanvidem@gmail.com>
Co-authored-by: Pavankumar Videm <pavanvidem@gmail.com>

2 participants