Commit

long SV roll limit
lexicographically minimum rolling on huge sequences is slow and may
offer little benefit
ACEnglish committed Jan 13, 2025
1 parent 31b79f9 commit 0aedec2
Showing 1 changed file with 8 additions and 1 deletion.
9 changes: 8 additions & 1 deletion truvari/comparisons.py
@@ -38,8 +38,15 @@ def coords_within(qstart, qend, rstart, rend, end_within):
 def best_seqsim(a_seq, b_seq, st_dist):
     """
     Returns best of roll, unroll, and direct sequence similarity
+    .. warning::
+        `roll_seqsim` is only called when both sequences are < 500bp in length
     """
-    return max(roll_seqsim(a_seq, b_seq), unroll_seqsim(a_seq, b_seq, st_dist),
+    # Only allow rolling on < 500bp sequences, otherwise, it gets huge/slow
+    if len(a_seq) < 500 and len(b_seq) < 500:
+        rssm = roll_seqsim(a_seq, b_seq)
+    else:
+        rssm = 0
+    return max(rssm, unroll_seqsim(a_seq, b_seq, st_dist),
                unroll_seqsim(b_seq, a_seq, -st_dist), seqsim(a_seq, b_seq))
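
The length gate in this change can be tried in isolation with the rough sketch below. It is not Truvari's implementation: `min_rotation`, `similarity`, `gated_roll_similarity`, and `ROLL_LIMIT` are hypothetical names, the rotation is a naive quadratic version, and `difflib.SequenceMatcher` stands in for the library's own similarity measure. Only the 500 bp threshold and the fallback value of 0 come from the commit.

```python
from difflib import SequenceMatcher

# Threshold taken from the commit; the constant name is hypothetical.
ROLL_LIMIT = 500


def min_rotation(seq):
    """Naive lexicographically minimal rotation, O(n^2); illustration only."""
    if not seq:
        return seq
    return min(seq[i:] + seq[:i] for i in range(len(seq)))


def similarity(a_seq, b_seq):
    """Stand-in similarity in [0, 1]; an assumption, not Truvari's seqsim."""
    return SequenceMatcher(None, a_seq, b_seq, autojunk=False).ratio()


def gated_roll_similarity(a_seq, b_seq, limit=ROLL_LIMIT):
    """Compare minimal rotations, or return 0 if either sequence reaches the limit."""
    if len(a_seq) < limit and len(b_seq) < limit:
        return similarity(min_rotation(a_seq), min_rotation(b_seq))
    # Skipped comparisons score 0, so a later max() ignores them.
    return 0
```

Because the skipped case returns 0, taking `max()` over this value and the other similarity scores falls back to the unrolled and direct comparisons for long sequences, which matches the behaviour of the diff above.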

