New dseq manu bjorn rebased #484

manulera · 2025-11-11T18:52:36Z

Same as #483, but rebased. Please continue developing on this one @BjornFJohansson

Hi @BjornFJohansson, look at the last commit, where I fixed the looped function; it now passes the tests. I created the draft PR so we can discuss here.

I like the changes to assembly2, I think they make things clearer, and the overriding of the PCR assembly function makes a lot of sense.

I wonder if this bit from assembly2 could be turned into a function (strands_anneal or something), or some other way to test for reverse complementarity:

        seq_u = loc_u.extract(f_u).seq
        seq_v = loc_v.extract(f_v).seq
        # instead of testing for identity we test if seq_u and seq_v anneal
        anneal = all(basepair_dict.get(x, y) for x, y in zip(str(seq_u), str(seq_v)))
        if not anneal:
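
A helper along these lines might look like the sketch below. It is only an illustration: the name `strands_anneal` comes from the suggestion above, while `BASEPAIRS` is a hypothetical, minimal stand-in for the pairing table in pydna (which covers many more symbols). The second strand is assumed to be written 3'->5' so that positions line up, and each `(x, y)` pair is looked up as a single key.

```python
# Hypothetical minimal pairing table, for illustration only (not pydna's real table).
BASEPAIRS = frozenset(
    {("A", "T"), ("T", "A"), ("G", "C"), ("C", "G"), ("A", "U"), ("U", "A")}
)


def strands_anneal(seq_u: str, seq_v: str) -> bool:
    """Return True if seq_u and seq_v pair at every position.

    Assumption: seq_v is given 3'->5', so position i of seq_u
    faces position i of seq_v.
    """
    if len(seq_u) != len(seq_v):
        return False
    return all((x, y) in BASEPAIRS for x, y in zip(seq_u.upper(), seq_v.upper()))


print(strands_anneal("GATC", "CTAG"))  # every position pairs
print(strands_anneal("GATC", "GATC"))  # identical strands do not pair
```

Testing for annealing rather than identity is what would let a U on one strand face an A on the other, which an equality check would reject.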

…mplement_table, _complement_table, to_watson_table, to_crick_table, to_N, to_5tail_table, to_3tail_table, to_full_sequence, bp_dict

2. new Dseq.__init__ with the same arguments as before, but data is now stored in Bio.Seq.Seq._data
3. altered Dseq.quick classmethod
4. watson, crick and ovhg are methods decorated with @property
5. New method to_blunt_string, which returns the string of the watson strand if the underlying Dseq object is blunt.
6. Old __getitem__ replaced
7. New __repr__ method
8. new looped method
9. new __add__ method

… imports at the top. Some tests involved strands that did not anneal perfectly; these have been corrected.

…ytestrings
2. user method that removes U and leaves an empty site.
3. cast_to_ds_right, cast_to_ds_left methods, these are *not* fill_in methods as they do not rely on a polymerase.
4. New melt method, useful for USER cloning etc.
5. reimplemented apply_cut method

… utils. This should fix U in primers

…XME indicating a large change in behaviour.

…e x and y has meaning in the new Dseq implementation. (line 1074)
2. The expected result in test_pcr_assembly_uracil should be AUUAggccggTTOO.
3. Removed numbers at start and end of some sequences. This could be discussed.
4. Four instances of FIXME: The assert below fails in the Sanity check on line 770 in assembly2, but gives the expected result.

…he check for internal splits in init

function dsbreaks is called from pydna.alphabet in __init__
simplified code overall, function get_parts from pydna.alphabet used in several places
simpler looped method using get_parts and __add__
improved error message from __add__

Copilot

Pull Request Overview

This PR introduces a major refactoring of pydna's DNA sequence representation system, implementing a new "dsIUPAC" alphabet (dscode) to better handle double-stranded DNA with overhangs, single-stranded regions, and USER enzyme treatment. The changes enable new molecular cloning techniques like USER cloning while maintaining backward compatibility.

Key changes:

New alphabet system with dscode symbols representing base pairs and single-stranded regions
Refactored Dseq class with improved internal representation and new methods for DNA manipulation
Enhanced support for sticky ends, melting, and enzymatic treatments (USER, T4, mung bean nuclease)
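
To make the idea of a single-string alphabet concrete, here is a toy sketch. The symbols are invented for illustration and are not pydna's dscode: uppercase letters stand for a base pair (named by its watson base), lowercase letters for a base present only on the watson strand, and the digits 1-4 for G, A, T, C present only on the crick strand. The two translation tables are assumptions about how such a code could be split back into strands using `str.maketrans`.

```python
# Toy alphabet, hypothetical symbols only (not pydna's dscode).
PAIRED = "GATC"       # full base pair; the symbol names the watson base
WATSON_ONLY = "gatc"  # base on watson, nothing on crick
CRICK_ONLY = "1234"   # nothing on watson; 1, 2, 3, 4 mean G, A, T, C on crick

SYMBOLS = PAIRED + WATSON_ONLY + CRICK_ONLY
TO_WATSON = str.maketrans(SYMBOLS, "GATC" + "GATC" + "    ")
TO_CRICK = str.maketrans(SYMBOLS, "CTAG" + "    " + "GATC")


def strands(code: str) -> tuple[str, str]:
    """Return (watson 5'->3', crick 3'->5'), with spaces at empty positions."""
    return code.translate(TO_WATSON), code.translate(TO_CRICK)


watson, crick = strands("gaTC34")
print(watson)
print(crick)
```

With one symbol per position, a staggered duplex is just a string, so slicing and concatenation can reuse ordinary string operations while the overhangs stay recoverable from the symbols themselves.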

Reviewed Changes

Copilot reviewed 22 out of 22 changed files in this pull request and generated 27 comments.

File	Description
src/pydna/alphabet.py	New module defining dscode alphabet with base pair dictionaries and translation tables
src/pydna/dseq.py	Major refactoring of Dseq class with new internal representation and manipulation methods
src/pydna/utils.py	Added anneal_from_left function and updated complement logic
src/pydna/assembly2.py	Updated assembly logic to use new Dseq methods (cast_to_ds_, exo1_)
src/pydna/amplify.py	Improved primer annealing detection using new alphabet system
src/pydna/dseqrecord.py	Updated looped() method to handle features properly with sticky ends
tests/test_new.py	New test file for dscode representations
tests/test_USERcloning.py	Complete rewrite for USER enzyme cloning
tests/test_module_dseq.py	Extensive test updates for new Dseq behavior

src/pydna/design.py

tests/test_module_assembly2.py

src/pydna/dseq.py

tests/test_module_dseq.py

src/pydna/assembly2.py

tests/test_module_dseqrecord.py

tests/test_new.py

manulera · 2025-11-18T12:55:00Z

Put string in ValueError. Co-authored-by: Copilot

remove unused import Co-authored-by: Copilot

removed unused import Co-authored-by: Copilot

removed commented-out code. Co-authored-by: Copilot

remove unused import Co-authored-by: Copilot

codecov · 2025-12-10T14:44:48Z

Codecov Report

❌ Patch coverage is 87.22003% with 97 lines in your changes missing coverage. Please review.

Files with missing lines	Patch %	Lines
src/pydna/dseq.py	80.25%	64 Missing and 13 partials ⚠️
src/pydna/design.py	81.57%	7 Missing and 7 partials ⚠️
src/pydna/alphabet.py	99.03%	1 Missing and 1 partial ⚠️
src/pydna/dseqrecord.py	91.66%	2 Missing ⚠️
src/pydna/assembly2.py	96.15%	0 Missing and 1 partial ⚠️
src/pydna/ligate.py	0.00%	0 Missing and 1 partial ⚠️

@@            Coverage Diff             @@
##           master     #484      +/-   ##
==========================================
- Coverage   93.67%   92.57%   -1.10%     
==========================================
  Files          40       41       +1     
  Lines        4740     5183     +443     
  Branches      669      723      +54     
==========================================
+ Hits         4440     4798     +358     
- Misses        243      306      +63     
- Partials       57       79      +22
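
The percentages in the report can be reproduced from the raw counts. The sketch below assumes that a line counts as covered only when it is fully hit, so partial lines count against coverage; that assumption is consistent with the figures shown here, and the function name is made up for illustration.

```python
def coverage(hits: int, misses: int, partials: int) -> float:
    """Percent of lines fully hit; partial lines count as not covered (assumption)."""
    return 100 * hits / (hits + misses + partials)


base = coverage(hits=4440, misses=243, partials=57)  # master
head = coverage(hits=4798, misses=306, partials=79)  # this PR

# Per-file (missing, partial) counts from the patch table above.
patch_gaps = [(64, 13), (7, 7), (1, 1), (2, 0), (0, 1), (0, 1)]
uncovered = sum(m + p for m, p in patch_gaps)

print(f"{base:.2f}% -> {head:.2f}% ({head - base:+.2f}%), {uncovered} patch lines uncovered")
```

This prints `93.67% -> 92.57% (-1.10%), 97 patch lines uncovered`, matching the diff and the 97 lines reported as missing coverage.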

Files with missing lines	Coverage Δ
src/pydna/__init__.py	`59.52% <100.00%> (+11.03%)`	⬆️
src/pydna/amplify.py	`98.62% <100.00%> (-0.02%)`	⬇️
src/pydna/opencloning_models.py	`98.31% <ø> (ø)`
src/pydna/seq.py	`74.54% <100.00%> (-1.53%)`	⬇️
src/pydna/seqrecord.py	`86.69% <ø> (ø)`
src/pydna/utils.py	`88.84% <100.00%> (+0.04%)`	⬆️
src/pydna/assembly2.py	`96.39% <96.15%> (-0.30%)`	⬇️
src/pydna/ligate.py	`82.85% <0.00%> (ø)`
src/pydna/alphabet.py	`99.03% <99.03%> (ø)`
src/pydna/dseqrecord.py	`92.80% <91.66%> (-0.43%)`	⬇️
... and 2 more

…ot finding suitable A-T pairs across the junction.

string. Added a Dseq.getparts method. Changed all calls to the getparts function to use the method. removed transcribe, translate methods (now in pydna.seq.Seq)

BjornFJohansson added 30 commits November 11, 2025 18:26

updated seq class: use base class slicing, added full_sequence property

5f3eb2e

Added override of Bio.Restriction.FormattedSeq._table

8f4dd7c

Added dicts and tables _ambiguous_dna_complement, _keys, _values, _co…

3dc3f48

…mplement_table, _complement_table, to_watson_table, to_crick_table, to_N, to_5tail_table, to_3tail_table, to_full_sequence, bp_dict

Dseq + empty string now returns the Dseq obj unchanged. collected all…

0bc3678

… imports at the top. Some tests involved strands that did not anneal perfectly; these have been corrected.

_annealing_positions new implementation using _iupac_compl_regex from…

e7eceae

… utils. This should fix U in primers

fixed fill_left and fill_right and a FIXME

c48646d

fixed initiation

6ac6149

deleted

7eecce4

anneal_from_left function and more regexes and tables

04a5ea4

updated test for USER cloning

9c7689d

moved all imports to the beginning. Changed some tests. There is a FI…

d523aff

…XME indicating a large change in behaviour.

removed if __name__ == __main__

e938341

removed main test and moved imports to the top

d801d55

Updated docstrings in Dseq class for clarity, work in progress

45705d3

fix doctests

8e79a54

removed main chunk

89e6b60

moved import

c73970e

removed reference to .length property

42cd797

removed reference to .length property

39ea20b

removed code regarding the alphabet, now in the alphabet module

a092087

broke out the __repr__ code to a function for clarity, reintroduced t…

27f5e78

…he check for internal splits in init

alphabet related code in src/pydna/alphabet.py

f998c25

mostly comments

98416f4

Commented out code to be removed.

c061519

Only check for start of error message.

b551729

Clearer names for some dicts

417e62e

Copilot started reviewing on behalf of manulera November 18, 2025 11:44

Copilot finished reviewing on behalf of manulera November 18, 2025 11:45

Copilot AI reviewed Nov 18, 2025

manulera mentioned this pull request Nov 18, 2025

Manu patch #489

Merged

manulera and others added 13 commits December 4, 2025 12:06

switch from namedtuple to dataclass

4ac6857

remove FIXMEs

cf756a9

add getitem to DseqParts dataclass

9a18eea

new function anneal_strands

d470d0e

use anneal_strands function for sanity check

bd18833

Update src/pydna/dseq.py

ee5e31c

Put string in ValueError. Co-authored-by: Copilot

Update tests/test_module_dseqrecord.py

69d7ffe

remove unused import Co-authored-by: Copilot

Update tests/test_module_dseqrecord.py

5fe3701

removed unused import Co-authored-by: Copilot

Update tests/test_module_dseqrecord.py

e80c0ce

removed unused import Co-authored-by: Copilot

Update src/pydna/dseqrecord.py

f1e0596

removed commented-out code. Co-authored-by: Copilot

Update tests/test_new.py

d349a1c

remove unused import Co-authored-by: Copilot

code review

2dce246

fixed doctests

2c25575

BjornFJohansson added 11 commits December 11, 2025 07:48

More docstrings.

2ab4a01

put this test into test_module_dseq.py

8cc1a65

added tests for Dseq

88eb8aa

finished the user_assembly_design function so that it can deal with n…

39add7c

…ot finding suitable A-T pairs across the junction.

added uracil to mw calc

594d95a

test for dseq.find over the origin

b342674

moved transcribe, translate methods to pydna.seq.Seq

c75245e

Mod the CircularBytes.find method so that it also accepts

bc981d8

string. Added a Dseq.getparts method. Changed all calls to the getparts function to use the method. removed transcribe, translate methods (now in pydna.seq.Seq)

changed doctests so they generate ProteinSeq

55ee50e

speed up test collection for pytest

9306741

lock

24575af


3 participants
