@zztin I think you are probably right about the meaning of that code. Frankly, it's a weak criterion, but it should work for generating big-picture complexity results. It would not be a great approach if the purpose were, e.g., to accurately identify SNVs. It might lose information that would help at some important locus. Also, I've only ever considered how this should work for short reads.
Unfortunately I don't think I can answer your question now and I will leave this issue open, but here's one thing that might help: the best way to use Preseq is by providing a counts histogram (or the counts themselves). That would allow you to count duplicates according to your own definition.
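To illustrate the suggestion above, here is a minimal sketch of building a counts histogram under your own duplicate definition. It assumes reads have already been extracted as hypothetical `(chrom, start, strand)` tuples, where `start` is the leftmost mapped coordinate; the function names and the toy data are made up for illustration and are not part of Preseq. With `strand_aware=False`, forward and reverse reads sharing a start are counted as the same molecule.

```python
from collections import Counter


def counts_histogram(reads, strand_aware=False):
    """Return {j: number of molecules observed exactly j times}.

    reads: iterable of (chrom, start, strand) tuples (hypothetical input format).
    strand_aware: if False, strand is ignored when deciding what is a duplicate.
    """
    per_molecule = Counter()
    for chrom, start, strand in reads:
        # The key is the duplicate definition; change it to suit your library.
        key = (chrom, start, strand) if strand_aware else (chrom, start)
        per_molecule[key] += 1
    # Count how many molecules were seen once, twice, and so on.
    return Counter(per_molecule.values())


def format_histogram(hist):
    """Two tab-separated columns: observation count j, number of molecules."""
    return "\n".join(f"{j}\t{hist[j]}" for j in sorted(hist))


# Toy data, invented for this example.
reads = [
    ("chr1", 100, "+"),
    ("chr1", 100, "-"),  # same start as above, opposite strand
    ("chr1", 900, "+"),
    ("chr2", 50, "-"),
    ("chr2", 50, "-"),
    ("chr2", 50, "+"),
]

hist_ignore_strand = counts_histogram(reads)
hist_with_strand = counts_histogram(reads, strand_aware=True)
print(format_histogram(hist_ignore_strand))
```

As far as I recall, the resulting two-column text can be passed to Preseq through its histogram input option (`-H`), but please check the help output of your installed version to confirm the option name and expected format.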
Hi Andrew,
I have a nanopore sequencing library in which one molecule can produce reads on both the forward and reverse strands. I would like to know whether preseq regards such pairs as "distinct" observations or as "repeated" observations.
The definition of a "distinct" observation is not clearly stated in the documentation, so I tried to find it in the implementation:
Related lines:
preseq/src/load_data_for_complexity.cpp, line 90 in c0fc953
Related class: GenomicRegion.hpp
It seems that get_start() returns the smaller coordinate of a BAM read. Is that right?
If so, does that mean forward- and reverse-mapped reads with the same 'start' would be treated as the same molecule? That is the desired behavior in my use case.
I would love to hear from you!
The command I used:
Software version:
v2.0 downloaded precompiled binary
Best,
Li-Ting