Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Expose some bwa indel parameters through the off-target checker. #95

Open
wants to merge 2 commits into
base: main
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
5 changes: 5 additions & 0 deletions prymer/offtarget/bwa.py
Original file line number Diff line number Diff line change
Expand Up @@ -232,6 +232,7 @@ def __init__(
max_mismatches_in_seed: int = 3,
max_gap_opens: int = 0,
max_gap_extensions: int = -1,
min_indel_to_end_distance: int = 3,
seed_length: int = 20,
reverse_complement: bool = False,
include_alt_hits: bool = False,
Expand All @@ -248,6 +249,8 @@ def __init__(
max_gap_opens: the maximum number of gap opens allowed in the full query sequence
max_gap_extensions: the maximum number of gap extensions allowed in the full query
sequence
min_indel_to_end_distance: do not place an indel within this many bp of the ends of
the query sequence
seed_length: the length of the seed region
reverse_complement: reverse complement each query sequence before alignment
include_alt_hits: if true include hits to references with names ending in _alt,
Expand Down Expand Up @@ -288,6 +291,8 @@ def __init__(
f"{max_gap_opens}",
"-e",
f"{max_gap_extensions}",
"-i",
f"{min_indel_to_end_distance}",
"-l",
f"{seed_length}",
"-k",
Expand Down
12 changes: 11 additions & 1 deletion prymer/offtarget/offtarget_detector.py
Original file line number Diff line number Diff line change
Expand Up @@ -159,7 +159,9 @@ def __init__(
three_prime_region_length: int,
max_mismatches_in_three_prime_region: int,
max_mismatches: int,
max_amplicon_size: int,
max_gap_opens: int = 0,
max_gap_extends: int = -1,
max_amplicon_size: int = 1000,
Comment on lines +162 to +164
Copy link

@coderabbitai coderabbitai bot Nov 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Default max_gap_opens=0 is too restrictive

PR indicates indels affect primer annealing. Consider default of 1 to allow single indels.

-        max_gap_opens: int = 0,
+        max_gap_opens: int = 1,
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
max_gap_opens: int = 0,
max_gap_extends: int = -1,
max_amplicon_size: int = 1000,
max_gap_opens: int = 1,
max_gap_extends: int = -1,
max_amplicon_size: int = 1000,

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tfenne any reason why not? Or should we keep this at 0 for backwards compatibility?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like the humans are having a chat. I'll hop back into my burrow for now. If you need me again, just tag @coderabbitai in a new comment, and I'll come hopping out!

min_primer_pair_hits: int = 1,
cache_results: bool = True,
threads: Optional[int] = None,
Expand Down Expand Up @@ -201,6 +203,12 @@ def __init__(
three_prime_region_length
max_mismatches: the maximum number of mismatches allowed in the full length primer
(including any in the three prime region)
max_gap_opens: the maximum number of gaps (insertions or deletions) allowable in an
alignment of a oligo to the reference
max_gap_extends: the maximum number of gap extensions allowed; extending a gap
beyond a single base costs 1 gap extension. Can be set to -1 to allow
unlimited extensions up to max diffs (aka max mismatches), while disallowing
"long gaps".
max_amplicon_size: the maximum amplicon size to consider amplifiable
cache_results: if True, cache results for faster re-querying
threads: the number of threads to use when invoking bwa
Expand All @@ -220,6 +228,8 @@ def __init__(
seed_length=three_prime_region_length,
max_mismatches_in_seed=max_mismatches_in_three_prime_region,
max_mismatches=max_mismatches,
max_gap_opens=max_gap_opens,
max_gap_extensions=max_gap_extends,
Comment on lines +231 to +232
Copy link

@coderabbitai coderabbitai bot Nov 20, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Fix parameter name inconsistency

Parameter names differ between constructor and BWA initialization.

-            max_gap_extensions=max_gap_extends,
+            max_gap_extends=max_gap_extends,
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
max_gap_opens=max_gap_opens,
max_gap_extensions=max_gap_extends,
max_gap_opens=max_gap_opens,
max_gap_extends=max_gap_extends,

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tfenne I do think we should normalize the parameter names across the two signatures. "max gap extensions" appears more commonly in the repo:

MAX_GAP_EXTENSIONS: int = -1
"""The default maximum number of gap extensions allowed in the full query sequence"""

max_hits=max_primer_hits,
)
self._max_primer_hits = max_primer_hits
Expand Down
1 change: 1 addition & 0 deletions tests/offtarget/test_bwa.py
Original file line number Diff line number Diff line change
Expand Up @@ -222,6 +222,7 @@ def test_map_single_hit(
max_hits=1,
max_mismatches=5,
max_gap_opens=1,
min_indel_to_end_distance=5,
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Value mismatch with PR objective

PR states minimum distance should be 3, but test uses 5.

-        min_indel_to_end_distance=5,
+        min_indel_to_end_distance=3,
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
min_indel_to_end_distance=5,
min_indel_to_end_distance=3,

reverse_complement=reverse_complement,
) as bwa:
query = Query(bases=bases, id=f"{hit}")
Expand Down
Loading