-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Expose some bwa indel parameters through the off-target checker. #95
base: main
Are you sure you want to change the base?
Conversation
@ ameynert for your consideration |
WalkthroughThe pull request introduces updates to the In the Possibly related PRs
Suggested reviewers
Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🧹 Outside diff range and nitpick comments (1)
prymer/offtarget/offtarget_detector.py (1)
206-211
: Enhance gap parameter documentationAdd context about indel impact on primer annealing and define "long gaps".
max_gap_opens: the maximum number of gaps (insertions or deletions) allowable in an - alignment of a oligo to the reference + alignment of a oligo to the reference. Indels within 3bp of primer ends + significantly affect primer annealing. max_gap_extends: the maximum number of gap extensions allowed; extending a gap beyond a single base costs 1 gap extension. Can be set to -1 to allow - unlimited extensions up to max diffs (aka max mismatches), while disallowing - "long gaps". + unlimited extensions up to max diffs (aka max mismatches). Long gaps + (>1bp) near primer ends may prevent proper annealing.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (2)
prymer/offtarget/bwa.py
(3 hunks)prymer/offtarget/offtarget_detector.py
(3 hunks)
🔇 Additional comments (2)
prymer/offtarget/bwa.py (2)
235-235
: LGTM: Parameter addition and documentation are clear.
Parameter name and default value align with experimental findings about indel effects on primer annealing.
Also applies to: 252-253
294-295
: Verify BWA flag usage.
Confirm -i
is the correct BWA flag for min_indel_to_end_distance
.
max_gap_opens: int = 0, | ||
max_gap_extends: int = -1, | ||
max_amplicon_size: int = 1000, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Default max_gap_opens=0
is too restrictive
PR indicates indels affect primer annealing. Consider default of 1 to allow single indels.
- max_gap_opens: int = 0,
+ max_gap_opens: int = 1,
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
max_gap_opens: int = 0, | |
max_gap_extends: int = -1, | |
max_amplicon_size: int = 1000, | |
max_gap_opens: int = 1, | |
max_gap_extends: int = -1, | |
max_amplicon_size: int = 1000, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@tfenne any reason why not? Or should we keep this at 0 for backwards compatibility?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like the humans are having a chat. I'll hop back into my burrow for now. If you need me again, just tag @coderabbitai in a new comment, and I'll come hopping out!
max_gap_opens=max_gap_opens, | ||
max_gap_extensions=max_gap_extends, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fix parameter name inconsistency
Parameter names differ between constructor and BWA initialization.
- max_gap_extensions=max_gap_extends,
+ max_gap_extends=max_gap_extends,
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
max_gap_opens=max_gap_opens, | |
max_gap_extensions=max_gap_extends, | |
max_gap_opens=max_gap_opens, | |
max_gap_extends=max_gap_extends, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@tfenne I do think we should normalize the parameter names across the two signatures. "max gap extensions" appears more commonly in the repo:
prymer/prymer/offtarget/bwa.py
Lines 197 to 198 in 385e1b1
MAX_GAP_EXTENSIONS: int = -1 | |
"""The default maximum number of gap extensions allowed in the full query sequence""" |
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #95 +/- ##
=======================================
Coverage 96.67% 96.67%
=======================================
Files 26 26
Lines 1712 1712
Branches 333 333
=======================================
Hits 1655 1655
Misses 31 31
Partials 26 26 ☔ View full report in Codecov by Sentry. 🚨 Try these New Features:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Outside diff range and nitpick comments (1)
tests/offtarget/test_bwa.py (1)
Line range hint
218-240
: Add test cases for indel distance behaviorCurrent tests don't verify indel distance constraints.
Add test cases:
@pytest.mark.parametrize( "indel_distance, should_map", [ (2, False), # Too close to end (3, True), # Minimum allowed distance (4, True), # Above minimum ] ) def test_indel_distance_constraints(ref_fasta: Path, indel_distance: int, should_map: bool) -> None: bases = _PERFECT_BASES[:indel_distance] + "TTT" + _PERFECT_BASES[indel_distance:] with BwaAlnInteractive(ref=ref_fasta, max_hits=1, min_indel_to_end_distance=3) as bwa: result = bwa.map_one(query=bases, id="test") assert (result.hit_count > 0) == should_map
@@ -222,6 +222,7 @@ def test_map_single_hit( | |||
max_hits=1, | |||
max_mismatches=5, | |||
max_gap_opens=1, | |||
min_indel_to_end_distance=5, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Value mismatch with PR objective
PR states minimum distance should be 3, but test uses 5.
- min_indel_to_end_distance=5,
+ min_indel_to_end_distance=3,
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
min_indel_to_end_distance=5, | |
min_indel_to_end_distance=3, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Confirming that this is working for me in a project that uses the OffTargetDetector
.
max_gap_opens=max_gap_opens, | ||
max_gap_extensions=max_gap_extends, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@tfenne I do think we should normalize the parameter names across the two signatures. "max gap extensions" appears more commonly in the repo:
prymer/prymer/offtarget/bwa.py
Lines 197 to 198 in 385e1b1
MAX_GAP_EXTENSIONS: int = -1 | |
"""The default maximum number of gap extensions allowed in the full query sequence""" |
max_gap_opens: int = 0, | ||
max_gap_extends: int = -1, | ||
max_amplicon_size: int = 1000, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@tfenne any reason why not? Or should we keep this at 0 for backwards compatibility?
I did some playing with ntthal, and it appears to that the when calculating Tms of duplexes, an indel closer than 3bp from the end of the primer causes it to see the rest of the primer as non-annealed. In some cases it can be more (depending on sequence, it may not see the remainder as annealed after an indel 4-5bp in), but it seems consistent that it needs at least 3bp past the indel to anneal bases. Hence setting
min_indel_to_end_distance: int = 3
.