Heliano in EDTA? #524

chestnutbt · 2024-12-05T15:13:47Z

Hi Prof. Ou. Thank you very much for creating this tool! Would you consider including Heliano (https://github.com/Zhenlisme/heliano) in the EDTA workflow? In my tests, Heliano and HelitronScanner find very different Helitron sets in the same genome, and the overlap between their outputs is very low.

oushujun · 2024-12-05T16:56:19Z

More is not better. Do you have a way to find out which result is of higher quality? Shujun

chestnutbt · 2024-12-06T09:36:47Z

Hi!
I honestly cannot judge. As far as I understand, HelitronScanner first scans for 5' and 3' Helitron terminal sequences and then pairs them, assuming that the sequence between two Helitron ends is a Helitron (with a maximum distance of 20 kb between ends). Heliano first scans for Helitron transposase ORFs, then searches for 5' and 3' ends close to those ORFs, and uses those end sequences to find more Helitron elements. In principle the Heliano strategy sounds more reliable, because it first finds autonomous elements and then searches for non-autonomous ones. However, I don't know whether the transposase ORF scan can make mistakes, or whether the absence of Helitron transposases in a genome (according to Heliano) strictly means that the genome has no Helitrons.

What I have seen is this: if I run Heliano and HelitronScanner (using EDTA_raw.pl) to find Helitrons in Arabidopsis TAIR10, Heliano finds around 200 Helitrons and HelitronScanner finds around 300. 95 are common to both (not all of them are exactly the same sequences; there are some discrepancies). The TAIR10 TE annotation contains more than 10k sequences annotated as Helitrons (the annotation was made using RepeatMasker and the RepBase database, if I am not wrong). 94% of the Helitrons found by HelitronScanner and 97% of those found by Heliano are annotated as Helitrons in the TAIR10 TE GFF file.
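For anyone who wants to reproduce this kind of comparison, here is a minimal sketch of one way to count "common" predictions between two tools when the coordinates do not match exactly. It is not the method used for the numbers above. The 80% reciprocal-overlap threshold, the function names, and the coordinates are all assumptions made up for illustration; real input would be parsed from each tool's output files.

```python
from typing import List, Tuple

# (sequence name, start, end), 1-based inclusive as in GFF
Interval = Tuple[str, int, int]


def overlap_bp(a: Interval, b: Interval) -> int:
    """Number of bases shared by two intervals (0 if on different sequences)."""
    if a[0] != b[0]:
        return 0
    return max(0, min(a[2], b[2]) - max(a[1], b[1]) + 1)


def reciprocal_matches(
    set_a: List[Interval], set_b: List[Interval], min_frac: float = 0.8
) -> List[Tuple[Interval, Interval]]:
    """Return pairs whose overlap covers at least min_frac of BOTH intervals.

    min_frac = 0.8 is an arbitrary illustrative threshold, not a value
    taken from Heliano, HelitronScanner, or EDTA.
    """
    matches = []
    for a in set_a:
        len_a = a[2] - a[1] + 1
        for b in set_b:
            len_b = b[2] - b[1] + 1
            shared = overlap_bp(a, b)
            if shared >= min_frac * len_a and shared >= min_frac * len_b:
                matches.append((a, b))
    return matches


# Made-up coordinates standing in for the two tools' predictions.
heliano_calls = [("Chr1", 1000, 5000), ("Chr1", 20000, 23000), ("Chr2", 500, 1500)]
scanner_calls = [("Chr1", 1100, 5000), ("Chr1", 40000, 46000), ("Chr2", 100, 1500)]

common = reciprocal_matches(heliano_calls, scanner_calls)
print(f"{len(common)} of {len(heliano_calls)} predictions in the first set have a match")
```

The reciprocal requirement matters: the Chr2 pair shares every base of the shorter prediction, yet it is not counted because the shared region covers less than 80% of the longer one. The nested loop is quadratic, which is fine for a few hundred predictions; for larger sets an interval tool such as bedtools would be a better fit.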

oushujun · 2024-12-17T17:34:18Z

There are pros and cons to both approaches. HelitronScanner's assumption that two adjacent ends make a Helitron is too bold and can easily include non-Helitron sequences in the prediction. The approach of first finding Helitron ORFs and then extending to full-length elements helps reduce false identifications, but it assumes that all Helitrons in a genome are represented by at least one full-length element with ORFs. That is the other extreme and most likely underestimates the Helitron content of a genome.
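To make the "two adjacent ends" concern concrete, here is a toy sketch of end pairing under a maximum span, following the description earlier in this thread. It is not HelitronScanner's implementation: the real tool scores terminal matches and applies its own filters, none of which are modeled here. The function name, the single-strand simplification, and the hit positions are assumptions for illustration only.

```python
from bisect import bisect_right
from typing import List, Tuple

MAX_SPAN = 20_000  # the 20 kb maximum distance between ends mentioned in this thread


def pair_ends(
    five_prime: List[int], three_prime: List[int], max_span: int = MAX_SPAN
) -> List[Tuple[int, int]]:
    """Pair every 5' hit with every downstream 3' hit within max_span bases.

    Positions are hypothetical hit coordinates on one strand of one sequence.
    """
    three_sorted = sorted(three_prime)
    candidates = []
    for start in sorted(five_prime):
        # Index of the first 3' hit located strictly downstream of this 5' hit.
        first = bisect_right(three_sorted, start)
        for end in three_sorted[first:]:
            if end - start > max_span:
                break  # later 3' hits are even farther away
            candidates.append((start, end))
    return candidates


# Made-up hit positions: two 5' hits and three 3' hits.
five_hits = [1_000, 12_000]
three_hits = [9_000, 18_000, 40_000]

candidates = pair_ends(five_hits, three_hits)
print(candidates)
```

In this toy input, two 5' hits and three 3' hits already yield three candidate elements, and two of them share the same 3' end. Nothing in the pairing step itself says which candidate, if any, is a real Helitron, so any chance match to a terminal motif can pull unrelated sequence into a prediction.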

chestnutbt · 2025-01-15T10:41:40Z

Hi Prof. Ou. Thank you very much for your comment. This is why I wonder if these two tools could be complementary rather than competing with each other. Perhaps the two together could offer a more accurate prediction? Or would it be the opposite, and the combined Helitron prediction would be worse?
