rMATS vs rMATS-long #27

Open
jylee43 opened this issue Feb 7, 2025 · 1 comment


jylee43 commented Feb 7, 2025

Hello,

May I ask what the main difference is between rMATS (rMATS-turbo) and rMATS-long? I know it is a naive question, because tools built for different input types (short reads and long reads) and released a decade apart (2014 and 2024) are not simple to compare. I am trying to understand the key differences between rMATS and rMATS-long myself, but it is taking some time, so I would appreciate a brief overview from the developers.

I analyzed an ONT dataset with rMATS-long for my collaborators. rMATS-long worked very well, but it detected fewer than 60 significant isoforms.

Before asking me to analyze the same dataset, the collaborators had tried rMATS and obtained over 1000 significant isoforms. (The dataset was sequenced before rMATS-long was released.) After they submitted the manuscript, a reviewer recommended using rMATS-long instead of rMATS. We are puzzled by the large difference in the number of significant isoforms between rMATS and rMATS-long.

I am reading the key references for rMATS (PNAS, 2014), rMATS-turbo (Nature Protocols, 2024), and ESPRESSO (Science Advances, 2023). I have searched for a manuscript describing rMATS-long, but I haven't found one online. With ONT data, I previously used [minimap+StringTie+SUPPA] for differential alternative splicing events, and [minimap+salmon+DRIMSeq/DEXSeq/stageR] for differential transcript usage.

Thank you,
Jiyoung

@EricKutschera
Contributor

The main difference between rMATS-turbo and rMATS-long that could explain the change in the number of significant results is that rMATS-turbo identifies local splicing events (SE, A5SS, A3SS, MXE, RI), while rMATS-long identifies full-length isoforms.

One possible situation is that a gene has multiple isoforms that include a particular exon and multiple isoforms that skip it. If every inclusion isoform is used a bit more in group 1 and every skipping isoform is used a bit more in group 2, then rMATS-long may not report a significant result, since each individual isoform shows only a small change. rMATS-turbo, however, would see a larger difference, since it is essentially comparing the combined change in skipping isoforms against the combined change in inclusion isoforms.

Another possible situation is that multiple splicing events overlap in such a way that rMATS-turbo counts reads toward an event but rMATS-long does not. That could happen with an exon that is sometimes skipped and sometimes uses an alternative splice site.

Finally, some of the difference could be due to read coverage or other differences between the specific short-read and long-read datasets used.
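
To make the first situation above concrete, here is a minimal numeric sketch. It is not how either tool computes its statistics: the isoform names, the read counts, and the two helper functions are all made up for illustration, and it simply compares per-isoform proportions with an exon-level inclusion proportion built from the same counts.

```python
# Hypothetical gene with four isoforms: two include the exon, two skip it.
# All names and counts below are invented for illustration only.
INCLUDES_EXON = {"inc_1": True, "inc_2": True, "skip_1": False, "skip_2": False}
GROUP_1 = {"inc_1": 30, "inc_2": 30, "skip_1": 20, "skip_2": 20}
GROUP_2 = {"inc_1": 20, "inc_2": 20, "skip_1": 30, "skip_2": 30}


def isoform_proportions(counts):
    """Fraction of the gene's reads assigned to each isoform."""
    total = sum(counts.values())
    return {iso: n / total for iso, n in counts.items()}


def exon_inclusion_level(counts, includes_exon):
    """Fraction of the gene's reads that come from exon-including isoforms."""
    included = sum(n for iso, n in counts.items() if includes_exon[iso])
    return included / sum(counts.values())


p1 = isoform_proportions(GROUP_1)
p2 = isoform_proportions(GROUP_2)

# Isoform-level view: each isoform is compared on its own.
isoform_deltas = {iso: p1[iso] - p2[iso] for iso in p1}

# Event-level view: inclusion isoforms are pooled against skipping isoforms.
event_delta = (exon_inclusion_level(GROUP_1, INCLUDES_EXON)
               - exon_inclusion_level(GROUP_2, INCLUDES_EXON))

for iso, delta in isoform_deltas.items():
    print(f"{iso}: change in proportion = {delta:+.2f}")
print(f"exon-level change in inclusion = {event_delta:+.2f}")
```

With these made-up counts, each isoform's proportion shifts by only 0.10 between groups, while the pooled exon inclusion level shifts by 0.20. A change that is split across several isoforms can therefore fall below a per-isoform cutoff while still standing out at the event level.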
