--targeted analysis #511

Open
dcarrillox opened this issue Mar 24, 2025 · 3 comments
Labels
enhancement New feature or request WIP Work in progress

Comments

@dcarrillox

Description of feature

Hi, we are currently working with a targeted methylation protocol and think it would be great if the methylseq pipeline could handle this scenario. My colleague @luigilamparelli and I will be working on this during the nf-core hackathon 2025 (slack). Here are the main points we plan to implement:

  • --targeted parameter: Currently, all detected methylation signals are included in the final results. However, some enrichment protocols target specific genomic regions. To address this, we propose adding a --targeted parameter that allows filtering results to retain only on-target methylation signals. The input would be a BED file specifying the target regions.

  • Enrichment metrics: Building on this idea, we propose adding a job to calculate on-target vs. off-target fractions of aligned reads, along with other enrichment metrics. A tool like CollectHsMetrics from Picard could be used for this purpose.
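As a rough illustration of the on-target vs. off-target fraction mentioned above, here is a minimal Python sketch. It is not how Picard's CollectHsMetrics works and not what the pipeline does; the function names (`merge_targets`, `overlaps_target`, `on_target_fraction`) and the in-memory `(chrom, start, end)` tuples are hypothetical, and a read is assumed to count as on-target if it overlaps any target interval by at least one base (0-based, half-open coordinates as in BED).

```python
from bisect import bisect_right
from collections import defaultdict


def merge_targets(targets):
    """Group (chrom, start, end) tuples by chromosome and merge overlapping intervals."""
    by_chrom = defaultdict(list)
    for chrom, start, end in targets:
        by_chrom[chrom].append((start, end))
    merged = {}
    for chrom, intervals in by_chrom.items():
        intervals.sort()
        out = [list(intervals[0])]
        for start, end in intervals[1:]:
            if start <= out[-1][1]:
                # Overlapping or touching: extend the previous interval.
                out[-1][1] = max(out[-1][1], end)
            else:
                out.append([start, end])
        # Store parallel lists of starts and ends for binary search.
        merged[chrom] = ([s for s, _ in out], [e for _, e in out])
    return merged


def overlaps_target(merged, chrom, start, end):
    """Return True if the half-open interval [start, end) overlaps any merged target."""
    if chrom not in merged:
        return False
    starts, ends = merged[chrom]
    # Index of the last target that starts before the read ends.
    i = bisect_right(starts, end - 1) - 1
    return i >= 0 and ends[i] > start


def on_target_fraction(reads, targets):
    """Fraction of aligned reads overlapping at least one target region."""
    reads = list(reads)
    if not reads:
        return 0.0
    merged = merge_targets(targets)
    hits = sum(overlaps_target(merged, c, s, e) for c, s, e in reads)
    return hits / len(reads)
```

In practice the read intervals would come from the aligned BAM and the targets from the BED file passed to `--targeted`; a dedicated tool such as CollectHsMetrics reports many more metrics than this single ratio.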

We'll work on this and let you know how it goes!

@dcarrillox dcarrillox added enhancement New feature or request WIP Work in progress labels Mar 24, 2025
@dcarrillox dcarrillox self-assigned this Mar 25, 2025
@dcarrillox dcarrillox linked a pull request Mar 25, 2025 that will close this issue
11 tasks
@bounlu
Contributor

bounlu commented Mar 28, 2025

How different will the methylation metrics be from those obtained by running the current pipeline as-is on targeted data?

@dcarrillox dcarrillox removed a link to a pull request Mar 28, 2025
11 tasks
@dcarrillox
Author

For targeted data, and with the pipeline as it is right now, methylation results also include the off-target signal, i.e., reads that align to regions other than the intended ones. The approach we followed is to filter the bedGraph from Bismark/MethylDackel with the target BED file.
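The filtering step described above could be sketched roughly as follows. This is only an illustration, not the code from the linked pull request: the helper names `read_bed` and `filter_bedgraph` are hypothetical, the bedGraph is assumed to be tab-separated with chromosome, start, and end in the first three columns, and a simple linear scan over the targets of each chromosome is used for clarity rather than speed.

```python
from collections import defaultdict


def read_bed(handle):
    """Read target regions from a BED file handle into {chrom: [(start, end), ...]}."""
    targets = defaultdict(list)
    for line in handle:
        # Skip blank lines and optional BED header/comment lines.
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end = line.split()[:3]
        targets[chrom].append((int(start), int(end)))
    return targets


def filter_bedgraph(handle, targets):
    """Yield bedGraph lines whose interval overlaps at least one target region.

    Header lines ("track ..." or "#...") are passed through unchanged.
    Coordinates are treated as 0-based, half-open, as in BED.
    """
    for line in handle:
        if line.startswith(("track", "#")):
            yield line
            continue
        fields = line.split("\t")
        if len(fields) < 3:
            continue  # blank or malformed line
        chrom, start, end = fields[0], int(fields[1]), int(fields[2])
        if any(t_start < end and start < t_end
               for t_start, t_end in targets.get(chrom, ())):
            yield line
```

A possible use, with hypothetical file names, would be to open `targets.bed` and `sample.bedGraph`, pass the handles to `read_bed` and `filter_bedgraph`, and write the yielded lines to a new file. The same result can usually be obtained with an interval tool such as bedtools; the sketch only makes the logic explicit.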

Only Bismark is supported in MultiQC, which takes its metrics from the Bismark log files. Since we filter after running Bismark, the log files don't change and the MultiQC results will still be "whole genome".

@bounlu
Contributor

bounlu commented Mar 28, 2025

Considering that off-target data is usually about 10-15% for a normal protocol, I am curious to know to what extent the overall methylation patterns will change. It will be interesting to see the difference once you implement this.

@dcarrillox dcarrillox mentioned this issue Mar 31, 2025
11 tasks

2 participants