PAD2020_GenomicSequence_to_Species_Cluster

This project is my very first GitHub upload :-D The project uses a dynamic programming approach to get the job done. It will ultimately have four functions, according to the following description:

P1.py reads a file and extracts the labels and DNA sequences from it as a list of tuples, where the first element of each tuple is a label and the second is a DNA sequence.
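To illustrate the list-of-tuples output, here is a minimal sketch. The input format is not stated above, so this assumes FASTA-style text (a `>` line carries the label, the following lines carry the sequence). The name `parse_sequences` is made up for this example and is not taken from P1.py.

```python
def parse_sequences(text):
    """Return a list of (label, sequence) tuples from FASTA-style text.

    Assumption: labels start with '>' and a sequence may span several lines.
    """
    records = []
    label = None
    chunks = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new label closes the previous record, if there was one.
            if label is not None:
                records.append((label, "".join(chunks)))
            label = line[1:].strip()
            chunks = []
        elif label is not None:
            chunks.append(line.upper())
    if label is not None:
        records.append((label, "".join(chunks)))
    return records


example = ">Human\nACGT\nTTGA\n>Mouse\nACGA\n"
print(parse_sequences(example))
# [('Human', 'ACGTTTGA'), ('Mouse', 'ACGA')]
```

To read from disk, the text could be loaded first with `open(path).read()` and passed to the same function.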

P2.py aligns the sequences in pairs, where gaps may be introduced based on a scoring calculation. The aligned sequences are saved in a dictionary whose keys are tuples of integers identifying the sequence pair and whose values are tuples of strings containing the two aligned DNA sequences.
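The pairwise step could look roughly like the sketch below. It uses a Needleman-Wunsch style global alignment, which is one common dynamic programming method and may differ from what P2.py actually does. The scores (`match=1`, `mismatch=-1`, `gap=-2`) and the names `align_pair` and `align_all` are assumptions chosen for illustration.

```python
from itertools import combinations


def align_pair(a, b, match=1, mismatch=-1, gap=-2):
    """Globally align two strings; return them with '-' marking gaps."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    # Fill the table: each cell is the best of diagonal, up, and left moves.
    for i in range(1, rows):
        for j in range(1, cols):
            step = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + step,
                score[i - 1][j] + gap,
                score[i][j - 1] + gap,
            )
    # Trace back from the bottom-right corner to rebuild the alignment.
    out_a, out_b = [], []
    i, j = len(a), len(b)
    while i > 0 or j > 0:
        step = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + step:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i -= 1
            j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def align_all(records):
    """Map (i, j) index pairs to tuples of aligned sequences."""
    return {
        (i, j): align_pair(records[i][1], records[j][1])
        for i, j in combinations(range(len(records)), 2)
    }


records = [("s0", "ACGT"), ("s1", "AGT"), ("s2", "ACGA")]
for pair, aligned in align_all(records).items():
    print(pair, aligned)
```

The keys are index pairs such as `(0, 1)` and each value holds the two gapped strings, which matches the dictionary layout described above.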

P3.py calculates a distance matrix (a list of lists of floats) in which each aligned DNA pair produces one entry.
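The distance formula is not given above, so the sketch below assumes the simplest option: the fraction of aligned columns that differ, with gaps counted as differences. The names `pair_distance` and `distance_matrix` are hypothetical, and the input dictionary is written by hand just for the example.

```python
def pair_distance(aligned_a, aligned_b):
    """Fraction of alignment columns where the two sequences differ."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        return 0.0
    diffs = sum(1 for x, y in zip(aligned_a, aligned_b) if x != y)
    return diffs / len(aligned_a)


def distance_matrix(alignments, n):
    """Build a symmetric n x n list of lists of floats from aligned pairs."""
    matrix = [[0.0] * n for _ in range(n)]
    for (i, j), (a, b) in alignments.items():
        d = pair_distance(a, b)
        matrix[i][j] = d
        matrix[j][i] = d  # distance is the same in both directions
    return matrix


# Hand-written example input: keys are sequence index pairs.
alignments = {
    (0, 1): ("ACGT", "A-GT"),
    (0, 2): ("ACGT", "ACGA"),
    (1, 2): ("A-GT", "ACGA"),
}
for row in distance_matrix(alignments, 3):
    print(row)
# [0.0, 0.25, 0.25]
# [0.25, 0.0, 0.5]
# [0.25, 0.5, 0.0]
```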

P4.py takes the distance matrix and a list of labels and builds a binary tree, written as a string with parentheses.
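One way to get from a distance matrix to a parenthesised binary tree is to repeatedly merge the two closest clusters. The sketch below assumes average linkage (UPGMA-style) for updating distances after each merge; the clustering rule used in P4.py is not stated, so treat this as an illustration only. The name `build_tree` is made up for the example.

```python
def build_tree(matrix, labels):
    """Merge the closest clusters until one remains; return a nested string."""
    if not labels:
        raise ValueError("at least one label is required")
    n = len(labels)
    # Each cluster stores its parenthesised name and how many leaves it holds.
    clusters = {i: (labels[i], 1) for i in range(n)}
    # Distances are keyed by (smaller id, larger id).
    dist = {(i, j): matrix[i][j] for i in range(n) for j in range(i + 1, n)}
    next_id = n
    while len(clusters) > 1:
        a, b = min(dist, key=dist.get)  # closest pair of clusters
        name_a, size_a = clusters.pop(a)
        name_b, size_b = clusters.pop(b)
        # Average linkage: size-weighted mean of the distances to a and b.
        new_dists = {}
        for c in clusters:
            d_ac = dist[(min(a, c), max(a, c))]
            d_bc = dist[(min(b, c), max(b, c))]
            new_dists[(c, next_id)] = (size_a * d_ac + size_b * d_bc) / (size_a + size_b)
        # Drop every distance that involved the two merged clusters.
        dist = {k: v for k, v in dist.items() if a not in k and b not in k}
        dist.update(new_dists)
        clusters[next_id] = ("(" + name_a + "," + name_b + ")", size_a + size_b)
        next_id += 1
    return next(iter(clusters.values()))[0]


matrix = [
    [0.0, 0.1, 0.5],
    [0.1, 0.0, 0.6],
    [0.5, 0.6, 0.0],
]
print(build_tree(matrix, ["A", "B", "C"]))
# (C,(A,B))
```

Every merge joins exactly two clusters, so the result is always a binary tree, and the nesting of the parentheses shows which species were grouped first.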

This project was really fun to work on and kept giving me valuable insights into the world of programming.