Add new nf-core module wrapping `modkit extract calls`, which emits a per-read per-position table of base modification calls using the same pass/fail thresholding as `modkit pileup`.

Complementary to `modkit/extract/full` (raw probabilities): this module emits the thresholded categorical decisions. Useful for per-read downstream analysis such as allele-specific methylation and methylation-aware phasing.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
5035e89 to fafc527
PR checklist
Summary
Adds a new nf-core module wrapping `modkit extract calls`, which produces a per-read per-position table of base-modification calls (pass / fail / filtered, with the called base) using the same thresholding algorithm as `modkit pileup`.
Complementary to `modkit/extract/full`: `extract calls` emits the thresholded categorical decision per site per read, while `extract full` emits the underlying probabilities.
The module auto-detects `--bgzf` in `ext.args` and adjusts the output filename suffix accordingly.
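For reviewers, the suffix detection described above can be sketched roughly as follows. This is an illustration in Python, not the module's actual Nextflow code: the function name `output_suffix` and the concrete suffixes `tsv` and `tsv.gz` are assumptions made for the example, and only the `--bgzf` flag and `ext.args` come from this PR.

```python
import shlex


def output_suffix(ext_args):
    """Pick an output filename suffix based on whether --bgzf was requested.

    Hypothetical helper: the suffix values are assumed for illustration.
    """
    # Tokenise like a shell would, so quoted arguments stay intact.
    tokens = shlex.split(ext_args or "")
    # Match the flag as a whole token, so a longer flag that merely
    # starts with "--bgzf" does not trigger compressed naming.
    return "tsv.gz" if "--bgzf" in tokens else "tsv"


def output_name(prefix, ext_args):
    """Build the output filename from a sample prefix and ext.args."""
    return f"{prefix}.{output_suffix(ext_args)}"
```

Matching on whole tokens rather than a substring search is a deliberate choice in this sketch: it keeps an unrelated argument that happens to contain the text `--bgzf` from changing the output name.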
Why
`modkit extract calls` suits per-read allele-specific methylation analysis, validation of methylation-aware phasing, and read-level QC, where you want the same thresholded labels as the pileup output but per read rather than per site.
Test data
Uses the existing `test.sorted.phased.bam` from nf-core/test-datasets (modules branch). No new test data required.
🤖 Generated with Claude Code