Why does ribotish call so many ORFs compared to price and ribocode #36

Open
chenwt opened this issue May 31, 2024 · 6 comments

@chenwt

chenwt commented May 31, 2024

Hello,
I ran several ORF callers on the same dataset and found that Ribotish reports several times more ORFs than other tools such as Price and Ribocode. Which results are accurate, and how should I interpret this difference?

Best regards,
Tian

@zhpn1024
Owner

It depends on the running mode (-t/--framebest etc.) and the p-value/FDR threshold parameters.
You can also check the signal data (--transprofile) or plot the signals (transplot) to see whether a prediction is correct.
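
To illustrate how much the threshold alone changes the number of reported ORFs, here is a minimal sketch that re-filters two prediction rows posted later in this thread. It assumes that the 17th whitespace-separated field (index 16) holds the frame-test q-value that --fsqth is applied to; that index is inferred from the posted rows, not confirmed by the maintainer, and `kept_orfs` is a hypothetical helper, not part of ribotish.

```python
# Two rows copied verbatim from the pred.txt output posted later in this thread.
rows = [
    "ENSG00000181019.13 ENST00000379046.6 NQO1 protein_coding "
    "chr16:69710975-69718185:- CTG 385 856 Truncated 0 0 None "
    "0.005108092776242139 T None None 0.04428611834419486 None 156",
    "ENSG00000181019.13 ENST00000379046.6 NQO1 protein_coding "
    "chr16:69718249-69726569:- GTG 15 321 5'UTR 0 0 None "
    "0.00048346434406483193 T None None 0.005911783275077591 None 101",
]

# Assumption: 0-based index of the frame-test q-value field.
FRAME_Q_FIELD = 16


def kept_orfs(rows, qth):
    """Return (start codon, start, stop) for rows whose q-value is below qth."""
    kept = []
    for row in rows:
        fields = row.split()
        q = fields[FRAME_Q_FIELD]
        # "None" means the value was not computed; treat it as not passing.
        if q != "None" and float(q) < qth:
            kept.append((fields[5], int(fields[6]), int(fields[7])))
    return kept


for qth in (0.05, 0.01):
    print(qth, kept_orfs(rows, qth))
```

With these two rows, both ORFs pass at 0.05 but only the GTG 15-321 ORF passes at 0.01, so a stricter cutoff is one simple way to bring the ORF count closer to what other tools report.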

@chenwt
Author

chenwt commented May 31, 2024

Thank you for your helpful reply.
My parameters already include the --framebest and --fsqth 0.05 options. Additionally, I have randomly selected a transcript for transplot, but I am uncertain if this is correct. I have also rerun the --transprofile to generate a Psite file. Could you please help me interpret it?
[image]
[image]

@zhpn1024
Owner

In the plot, the green frame of the annotated ORF has many RPF counts. The significance test checks whether the RPF counts in the green frame are higher than those in the other frames within the ORF region. Judging from the figure, I think it is likely significant. The P-site file does not show all the data for the transcript (NOC2L?).

@chenwt
Author

chenwt commented Jun 2, 2024

Hello, Developer,
Thank you for your explanation, but I'm still not quite clear on how to tell directly from the plot whether an unannotated ORF (e.g., a 5' uORF) is conclusively supported. I have regenerated transplots for several candidate ORFs, along with their corresponding P-sites. Could you elaborate further?
Thank you once again.

ENST00000379046.6

ORF: ENST00000379046.6:15-321

[image]

Psite:
ENSG00000181019.13 ENST00000379046.6 NQO1 {} {117:3, 126:9, 151:1, 168:1, 201:2, 246:3, 252:8, 255:4, 380:2, 394:3, 427:1, 478:1, 487:1, 509:4, 511:3, 518:3, 599:2, 609:4, 649:4, 667:6, 685:5, 715:5, 753:1, 765:1, 802:2, 810:7, 811:1, 830:5, 835:3}

_pred.txt:
ENSG00000181019.13 ENST00000379046.6 NQO1 protein_coding chr16:69710975-69718185:- CTG 385 856 Truncated 0 0 None 0.005108092776242139 T None None 0.04428611834419486 None 156
ENSG00000181019.13 ENST00000379046.6 NQO1 protein_coding chr16:69718249-69726569:- GTG 15 321 5'UTR 0 0 None 0.00048346434406483193 T None None 0.005911783275077591 None 101

ENST00000263026.10

ORF: ENST00000263026.10:314-668

[image]

Psite:
ENSG00000103319.12 ENST00000263026.10 EEF2K {} {46:2, 206:1, 313:1, 314:6, 326:1, 485:2, 500:4, 549:1, 559:1, 572:2, 602:3, 605:4, 617:1, 673:1, 728:1, 729:1, 775:5, 877:4, 954:4, 1057:1, 1334:3, 1378:1, 1524:1, 1600:1, 1618:2, 1879:4, 1918:2, 1927:2, 1951:2, 1984:1, 1990:2, 2032:2, 2044:3, 2119:2, 2470:1, 2512:1, 2646:1}

_pred.txt:
ENSG00000103319.12 ENST00000263026.10 EEF2K protein_coding chr16:22258577-22283996:+ CTG 1591 2656 Truncated 0 0 None 9.783024909002418e-07 T None None 2.6231140551016775e-05 None 354
ENSG00000103319.12 ENST00000263026.10 EEF2K protein_coding chr16:22206591-22225919:+ CTG 314 668 5'UTR 0 0 None 0.000706709546140818 T None None 0.008215145787362092 None 117

ENST00000495184.5

ORF: ENST00000495184.5:44-1838
[image]
Psite:
ENSG00000107099.18 ENST00000495184.5 DOCK8 {} {49:1, 56:2, 74:10, 152:1, 755:1, 773:1, 917:4, 936:1, 977:8, 978:1, 980:8, 992:2, 993:2, 1124:1, 1220:1, 1627:3, 1632:2, 4070:1, 4325:2, 4370:3, 4529:2, 4664:1, 4665:1, 4741:1, 4901:1, 5144:1, 5318:1, 5363:6, 5453:1, 5483:2, 5543:2, 5567:4, 5686:2, 5885:4, 5942:2, 6079:1, 6146:2, 6203:3, 6234:1, 6400:3, 6638:3, 6647:1, 6755:2, 6794:1, 6852:1, 6992:2, 7106:1, 7194:2, 7328:3, 7355:2, 7412:1, 7424:1, 7520:3, 7538:2, 7763:1, 7916:1, 7934:1, 8061:1, 8240:1}

_pred.txt:
ENSG00000107099.18 ENST00000495184.5 DOCK8 protein_coding chr9:372271-464219:+ ATG 4049 8255 Novel:CDSFrameOverlap 0 0 None 1.9848242435242048e-10 T None None 1.1696320958504234e-08 None 1401
ENSG00000107099.18 ENST00000495184.5 DOCK8 protein_coding chr9:286487-368315:+ GTG 44 1838 Novel:CDSFrameOverlap 0 0 None 0.00289251798059119 T None None 0.02701417877808646 None 597

@zhpn1024
Owner

zhpn1024 commented Jun 2, 2024

The prediction results include GTG TIS codons, while the figure shows only ATG results. Add the '--alt' option when running the transplot module. The 5'UTR ORF at 15-321 corresponds to the red RPF signals. The CTG 314-668 5'UTR ORF corresponds to the blue signals, and the 44-1838 ORF is also blue. The CDSFrameOverlap mark means there is an annotated ORF here from another transcript's annotation.
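
The frame assignment can also be checked by hand from the posted P-site counts. The sketch below tallies P-site reads per reading frame inside an ORF, using the NQO1 P-site line and the 15-321 ORF from this thread. It assumes the P-site positions and the ORF start/stop share one transcript coordinate system and that the stop is exclusive; `parse_psite_counts` and `frame_counts` are hypothetical helpers, and this simple tally is not the statistical test ribotish runs.

```python
import ast

# P-site line for ENST00000379046.6 (NQO1), copied from the thread above.
psite_line = (
    "ENSG00000181019.13 ENST00000379046.6 NQO1 {} "
    "{117:3, 126:9, 151:1, 168:1, 201:2, 246:3, 252:8, 255:4, 380:2, "
    "394:3, 427:1, 478:1, 487:1, 509:4, 511:3, 518:3, 599:2, 609:4, "
    "649:4, 667:6, 685:5, 715:5, 753:1, 765:1, 802:2, 810:7, 811:1, "
    "830:5, 835:3}"
)


def parse_psite_counts(line):
    """Parse the last {position: count} dict literal on a P-site line."""
    return ast.literal_eval(line[line.rindex("{"):])


def frame_counts(counts, orf_start, orf_stop):
    """Sum P-site counts per frame (0, 1, 2) relative to orf_start.

    Assumes positions lie in the same coordinate system as the ORF and
    that the interval is half-open: orf_start <= pos < orf_stop.
    """
    totals = [0, 0, 0]
    for pos, n in counts.items():
        if orf_start <= pos < orf_stop:
            totals[(pos - orf_start) % 3] += n
    return totals


counts = parse_psite_counts(psite_line)
print(frame_counts(counts, 15, 321))  # -> [30, 1, 0]
```

Here 30 of the 31 P-site reads inside 15-321 fall in the ORF's own frame, which is the kind of frame bias the significance test is looking for.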

@zhpn1024
Owner

zhpn1024 commented Jun 2, 2024

If you have weixin/wechat, you can add me by my id.
