Commit 4dde447

solved pdst
1 parent 4b3c2f5 commit 4dde447

3 files changed: 199 additions, 0 deletions

resources/rosalind_pdst.txt

Lines changed: 153 additions & 0 deletions
@@ -0,0 +1,153 @@
+>Rosalind_8299
+GCCCTCGTGCTCGACATTGCAAGCGACTTTAAGCCCGTAGCGTTGGCTACATCGGACATC
+GTTTCTTCTTAGCGCACGGGATGAGGCCAGATTTTGCATCCTTATTGCGGCACGTAGTAG
+CGATTATCTTATCGCTATCACGTCCGTTTCTCACGGCATATGTCCTATACTCTAGCGATT
+TTGACGAGACTCCTGTATGCCACGCCCTGGGACGTCCACGTACCTTGCCGAAAAATCGGT
+CCAGCCAAGATTGCTTTCCACGAGCTTTAACCGAGCCCAATTAGTAATGGCAGCCAACAC
+AAACTATTAACGGAAGTAGGCCTTCACTAGACCTTACCAACGAGACGGAATTGAACCCGC
+GTAGGGCCGTTAAATCTCGAAGACCTCGCGGTGCCGATAGGGGGCGGATGACGCTTCTGA
+CGCCGCGCATGCGCAAGAGGATTCCCCGTCAAGAAGTAAGGATACCTTGAATCACTCCCA
+TTGGCCGGGGTTGGTTGGTTCGCTTCGTTGACCCAGAGCAACGTGCTAGGACTATAAGGG
+AAGCATCTACCCGACATGAGTACTATATGTTGCGAGTGCATCGTGGTAGTCGGGCCCTTG
+TAATCTTTCGCCGAGGAAGAATACATACAGCAGCAGGTTATTTTAGGCTCCGCAGTCTCT
+CAATACCGAAAGCAGAGGCTAGGGCCCTATTAGATGGCGGCGTGACGGTTCCATTGTGCC
+AAACCGCGGCATGGTGGCGCGGATCCTTGGCACGTGGAGTGGGCGTTGGTTTACATTGCT
+GTGATGATGTATTAGTTCCAATGGAGTGAGGATCGCTCATCCGTTCCATGACTCTTTCAA
+ACACTATAGGACTCGGCCTGTATCCAGACAACCACAGAGACTTTAGGGCATTTGGGGTGG
+TCGC
+>Rosalind_2906
+TTTGCGATACTGGGTTTGAAGTCCACGTGATGCCGCTACTTTAGAAGTCGATCAGCGCAT
+CCCTCGCTACAGACGGAGTGTCAGAGCAGGGTCTGGAGAATGCACTAAGACGGTGCGTGG
+GAGTCCTGTCCTTGTTATTCTTAGTATGTCCTTAAGTAGCCATTCTAGCCATAAAAGGTT
+TGGTCTACGACGCCGTGTGGAATGGCTTAAGTTACGTCTGCAATACTCCGCGCGAATTCC
+GCAAGGAACGCTAGCTTTTAAATTTCCTCCGACCGTCCAATGGCTCACGACATAACGGCC
+AAGCTTATAACGAAGACGGGTTTACCTGTAGTGTTTGTCCAGGTGCAGGGACATACGCCG
+GTGGATCCCTAGAAATCTCACAAACTCATAAGGGGATCCTGCTTTTGAAACCCCGCAGGG
+GTCCGTTATTGGGCATCAGCAATAGTCCAGGGGTAACTTGGATATCCGTGATTATTTCCA
+TCGGTTACGGACCGTTTATTAAGGGAGCCGCTCCTGCTTGGCCTGGGGGGGGTGAAGAGG
+ATGTACACCTTGCTTGCTGAGTTTGAAGACTCATACAGTATTATGACCGTCCCACTCCGT
+TGTCGCCTTGCTAGAACAGGATACGAATAGAAATGACTTCCTTTACACTTATTGCGCTCG
+CGTATTTGCGGGGAGAGTCTGGAGCCCTATTAACGGGTATTACAGAAGTCCACGTCTAAG
+ACACAACGACATAATTGCATCCCCATTGCGCATAAACAATATGACCGAAATTGATTTATT
+GAAACGGCTTGTCGCTCGCCCCGATAGGGTATGCGTCCAGAACGGTCGTATTTATGTTAA
+ACACTCTGCGAATTGCGTCGAGATGAAGACGCTGCGAAGACTGCCGAACACCGAATGCGT
+CCAC
+>Rosalind_3049
+TTTATGGTACTTGATTTGGATACTTCGCGGCAGCCCTACTTTTGAATCCGATCAGTGTGT
+TTATTGCTACGGACGGAGTGTCAGGTTGGGGCCTTCATATTACACTAAGGCAGCGCAGGG
+GAGTCCTGTCATCGCTATCATTAAGAGTTCCTTTAGTAGCCAATCTGGCTAGTAAAATTT
+TAGCATACGAGGGCGCATAGAACGGTATGTACTAGGTCTACGGTTTTTCCTGCTTATTCC
+ACAAGAGGCATTGGCTCTGTTATCTCCTCCGATCGCCCCGGGGCTCAGGGAATAACGGTC
+AATCCTACAATGAAAACGTATGTGCCCGCAGTGATCGTCAGGACACTGGGACATTCACTG
+GTCTGCTGCCATAAACCTTTCAGACCGGGATGGGACTCCTGCCTCTCAAACCGCCCAGTG
+GCGCGGCGTTTCGCATCAGCAACAACTCAGAAGTAACCCGAATATACGTGGTTGTATGCA
+CCGTTAACGGGCCGCTTACTAAGAGTACCTTTCCGGCTTAGTCCGGAAGAGGCTGAGAAG
+ATGTACACCTTACTTGCTAAAATCGGCGGTTCGTACAGCGTCATGACCGACCCCCCCCGC
+CGTTGCCTTTCTATGACAAGACACGATCAGAAAAGATTTCCGTTAGGTCCATTGCACCCG
+CGTATTCGTGAGAGGAAGCTGGGGCTTTACTAGCAAGTACCACAAAGGTCCACTTCCAGG
+CCGCAACAACGTAATCGAGTCTCCATCGTGACTGAGCGCTGTAGCCAGAGTTAACTTATC
+GGGATGGGTTGTCACTTGACCCAATGGGGTATGCGTTCGGGCCGGTCGTGTTTGAACTAG
+ACACCCTATAAATCGCGTCAGGACGGCGAGATTGAGAAGGCTGCCGAACACTAAATGCCT
+GCAC
+>Rosalind_3431
+ACTGTCGTACTCGGTTCGAAACCCGGGTTGAACTACTAGTCGGGAACCCCAGTGGAGCCC
+GTTTTCTCGCAGGGGGAGCGTCAAAGCGGGATTTGAACTATGTACTCAGGCATCGAATGG
+CGGCTCAGTTGCTGTCATCGCATGTATGTCATCTAACGGTTATCCTGCACGCAGAAGATT
+TTGTCAAGACTGCCGTATACAGTGTTCTAAGTCACCGGTGCACTCTCCCGAGAGGATGAC
+ATGGCGAAAACCGGTTACTACATGCCCTAGCCGGATCCAACGAGTAACGATGGACAGCAC
+AAGCTGTTGACGAATACGGACCCTTCTTAAGCGCTACTACAGTAACGGAATTATATACCG
+GTGGGACCTCCGCGGTCTTATGGCTTCATGGCGTCAATATGGAACTGATAACTCAGAAAA
+GTCCGTTTCTGGGCAGCATAACCAGCCGATAAGTGACACGGATGCCTTGAGTCTCCTCCA
+TTGATTGGAGATGCTCGATTCGTGTAGCCGAATCGGGATAACCTGTGAAGGGTAAAAAAG
+ATGCACGCCCGTGCCATGAACGCTACATATTTATGCTACTTCGTGACCATCCGCCCCGTC
+TAGTGCCCCGCTAGGATAGGACATAAGTAGGAGTAATTTCTTTCAAGCTCCTCGCGCCTC
+TGCACTTGTAAAGAAAGACTGGAATCCTGTTAATTGACGTCACTGCAATTCACTCATGGA
+ATGCAACAGGGTGATAGTATGGTCACTACGCGTAAAGAACGTGAGTCAGTTCGTGTTGTT
+ACGGCGACGTAACATGTATTCCGCTAAGGTGATCGCCCAGCTCTCTCAAGTTTATCTCAA
+ACACTCTTTGGATTGGCCTGTCTTTAGGGCTCTACAGAAACTTTGGAGCGCCTAACGTGG
+CTAC
+>Rosalind_8244
+ACTCCCGTGCTCGATACTGAGACCGACTTAAAACGCGAATCGTGAATTACATTGGACACC
+GTATTCTCCTAGGGCGTAGGCCGAGGCGAGATTCGGACTACGTACTGAGGCACGTAATAG
+CGACTCTGTTACCGTCATCGCATCCGTGTTGTCTGACAATTGTCCTACGCCCAAGAGATT
+TTGACGAGACTGCTGTATGCAGCGTTCTCAGACATCCATGCGCTTTGCCGAAAGATCGGC
+CCAGCGAAAATTAGTTGCCACGTGCCCTAGCCAGATTCAACTAGTAATGGTAGACAGCAC
+GAGCTATTGACGGAAGCGGGCCTTTCTTAAACGTTACCAACGAAACGAAATTGAATACTA
+GTGGGACCGCCGAATTCTCAAGGCCTCACGGTGTCAATAGGGGACGGATAACCCTTCCAA
+CGCCGCGTTTGGGCAACATAATCTCCCAATATGAGGCAAGGATATCTCGAGCTACCTCCA
+TTGGTGCAGGATGCTTGATTCGTGTTGCTGACTCAGGATAACGTATGAGGGCTAAAAAGG
+AAGCGTCTACTTGACATGATTACTACATATTGCTATTACTTTGTGGTCGTCCGCCCCGTC
+TAGTCTTCCGCCGAGGTAGGGTACAAACAGCAGTAAGTTCTTTTAAGCTCCCCGGGTCCC
+TGCTCTTGTAAACAAAGGCCGGAGTCCTGTTAGCTAACGGCACGGCGGTTCAATCATGTA
+AAGCATCAGGGTGATGGCATGGATCATTGGTACGAGGAGTGGAAGTTAGTTCGTATTGCT
+GTGATAGTGTATTATTCTCCGTAGTATGAGGATCGCTCAGCCCCTCCACGTTTATCTCAG
+ACGCTCTCCGGTTCGGCCTGTCTTCAAGTCCCTACAGAAACTGTAGAACACCTAGCATGG
+CGAC
+>Rosalind_3616
+ACTGTTGTGACCGATTCAGAACCTGGTTTAAACTACCAATTGGCTATCCCAAACCAACTC
+GTGTCGTCACAATTGAAGCGACGGGCCGAGATCTAAACTGTGTTCTCAGGCGTGGGATGG
+CGGCGCAGTGGATACCATCGCGTGTATGTCATCTAGAACTTAGTCTGCATGCAGAAGAGT
+TCGGTTAGACCGCCGTATGCAATATTCTAAGCCACCGCTACGTTTTCCTGAGACACCTTA
+ATGACAAAGATCGGTTCCTGCATGTCCTAACCGGACCCATTGGATAGCGTTGGTCAGCAC
+AAGCTTGTGATAAATACGTACCCTTCTTAAGCACAATTCCAGTAAAGGAATTATGGACTC
+ATGGGACCTCCATTGTCTTATGGTTTCATGCCGTTAAGATGAGGCCGATAATCTGGAAAA
+GGATGTTTTTGGGCCGTACAGTCTGCGGACAAGTGACACGGATGCCCAGAGTCTAGCTGC
+TTGATTAGCCCTGCTCTATTGGCTTGACTAAACCAGGTGTAACTGCGATCAGTGAGAGAC
+ATGCGTCCTCGTGTGATAAAGTCTATGTATTGATACTACCCCGTTGCCATCCGGCCCGCA
+TGATATGCCGCTAGTATAGGATATAAGTAGGAGCACTTTATCTTAATCCCCTCGCGCCTC
+TGCACTTGTGGGGAACTATTGGAATCCTATTAATTGACTTCACCGTGGCTTACTTATGAA
+GAGCGCCAGAATGGTAGCACGGACAGGACTCGTAAAAGGCGTGAGTCAGTTCGCGGTACT
+GTGCCGTTGCGGGATGTATTTCGCTAAGGTGTTAGCCCTACTTCCGTGATCTTATCCCGA
+ACGTTCACTGAATTGGGCTGTCTTTAGGGCTCTACGGAAACGTTGGAGCGCCCGACGTTG
+GTAC
+>Rosalind_3072
+ACTTTCGTGCTTGACACTGTCATCGATCTAATAAGCGAATCGCGAATTGCATTGGACACT
+GTATTTTCCTAAGTTGTTGGAAGAAGCGAGATTGAGAATACGTACTGATGCATGAGAGAG
+TCAGCCTGCGACTGTCATCTCATCCGGGTTGTCTGACAACTGCCTTACACCCAGAATATT
+TCGGTGCGACTAGGATATATAACGTGCCCAAACATCCATGCGATTTGCTAAAACGTCTAC
+CCGGCGATAATTAGTCGCCGCGTGTCCTGGACATATTCTGTTACCAAAGGTAAACCGTAA
+GAGCTATTGACAGCAGCGGGCCCCCCTCAAATGTTTCCAATGGTAAAAAATTGGACACGA
+GTGGGACCGTCGAATGCCCAGAGCCTCTAGTTGTCAGTAAGAAACGAATAACACTTCTAA
+CTCCGGGCTCGGGCAACAGAATTTTCCGATAAAAGGCAGGGCCCATTCGGGGTACCTCAA
+TCGGTGCAGGATGCTTGATCCGTGTGGCTGGCAAAGGGTAATGTAAGAGAGTTGGAGATA
+AAGTACTTGTTCGATATGAGTACTACATATTGCCGTTCCGTTGTAGTTGTCCGCTCCGTT
+GTGTCTGCTGTTGAAGTCAGATCCAATCGGTAGTAAACTCCTGAAACTCCTTAGGGCTCT
+TGTTCTTGTAAGCAAATGCCGTGGCATCGTTCGCTATCGGTGTGGCGGTTCAATCATGTA
+AGACATCACGGTGATTGCGTGAATCGTTAGCACGGGAAGCGTGAATTAACCCGCATCGCT
+GTCCTAATGTATTATTCTCCGTGGTTGGGGGATCCCCCAGCTTCAGCATCTTTATCTCGG
+ACGTTGTCCGGTTCAGCCCGCCATCAAGCTCCTATACAAACTGTAGAACACCTCGCAAGA
+CGAA
+>Rosalind_9371
+TTTGTGAGCATGGGTTTGGAGTACACCTGCCGCTGCTAATCGAGAAGTTGATCAGGGGGT
+CCCTCGCTTCGGTCGGAACGTTAGGGCAGGGCTCGGATGTTACACTAACAGGATATGTGG
+AAGTTATGTCCCCGTTATTCATAGTGCGCTCTCAAGAATCCGTTTTCGCCACAAAAAGTC
+TGGTTTACGACGTCATGTGGGGTGGTCAAAGTAACGTCTGCAGTACCGCGCACGACTCCT
+ATAAGGAATACTGACTACCGGTACGCCTCATCACGTCCAGCGGCCCATGATCTAACGGCC
+AGGCTGATGACGGAGGCGGGCTTACTCGCGGTGTCTGTTGGGACGCAAGCATCCCCGTCG
+TTGTATCCTTAGAGATCTCACGAACTTATTAAGAGACCCCGGTTATGAGATCCCACAGGG
+ATTAGTTGTTATGCTTTAGCGAAATTCCAGTTGGAATCTCGATACCAGTGATGCCTTTCA
+ACGGCTACGGACTGTGTATTAAGTGGGCTGCTTTTACTTAGCTTGGGGGGGGTGATGGGG
+GTGCATACCGTACTCGCTTAGGTTCAGGACTTATACAATACTGCGAACGTCCCATTTTCT
+TGTCGTCTCAGTAGAATAGGGCGCAAACGGAACTGACTCCTTTAACACCTATCGCGCCCA
+AGTGTAGGCAGAGAGGAATCGGAGTCCTATTGACGGCTATTATAGGAAACCATATTTAAG
+ACACAGCGACTTTATTGTATACCCGTTACGGGTAGACAATATGACCGAAATCGACTCGTC
+GGGATGACCTGTCGCTCACCCCGACACAGTGGAAGTCCAGGTAGGTCGTGCTTATGTCGA
+ATTCTGTGTGAGATGCGCCAAGAAACAGACGTTGCAGGGCGGGCTGAACACCTAGCGCGT
+CCAT
+>Rosalind_5657
+ACTGCGACACCGAGTTCGAAACCCGCGTGAAGCCACTAGTTTGGAATCCCAGTAGCACCC
+ACTTCTTCGCAGGGGGAGCGTCAAAGCAAAGTTTGAAGAATGCATTCAGACGTTGAGTGG
+GAGTTCAGTTCTTGTCATTCTTTATATGTCACTAAGCAGTTATTCTAGCCATAGAAGATT
+TGGTCTAGATTGCCGTATACAACGTTTCAAGCTACGGGTGCAGTATTCCGCAAGGGTAGC
+GTGAAGAAAGCCGGTTACTACATTTCCTGGGTCGACCTAACGGCTCACGATGGAGAGCGC
+AAGCTGATGCCGAAAACGGGCCCTTCTGTAGCGCTAATGCAGGTACGGGATCATACGCCG
+GTGGAACCTTGACGATCTTACAGCTTCACAGGGGGAACTTGGATTTGAAATCTCGAAAAA
+GTCCGTTTCTAGACAGCAGCAATAACCCAAGAATGACTCGGACATCTGGGGTTACTCCCA
+TCGATTGCAGATCGTTTATTCGCGGACCCGGACCTGGTTAACCTGCGGGGGGTGGAGAGG
+ATGCACACCCAGCTCGCTGACGTTGCACATCTGTACAGCATCATGACCGTCCGACTCTTC
+TGACGCTTCGCTAGAACAGGACATGGGTGGGAATGACTCCCTTTACGCTCGTGGCGCTCC
+CGTATTCACAGGGAAAGTCCAGAGCCCTGTTAAGTGGTATCACAGAGGTTCACACCTAGA
+ATGCAACAGTGCAATTGCACCGCCATTACGCATAAAGAACATGACCCAGGTTGAGTTATT
+GTAGTGACGTGTCGCACATCCCACTAGGGTAATCGTCCGGCACGGTCAAATTTATTTTAG
+ACACTTTGCGAATTGAGCCGTCATGAAGGCTCTGTAAAGACTGTGAGGCACCGAATGTGA
+CCAC

resources/rosalind_pdst_out.txt

Lines changed: 9 additions & 0 deletions
@@ -0,0 +1,9 @@
+0.00000 0.63827 0.67146 0.49558 0.31969 0.55752 0.47345 0.67699 0.58960
+0.63827 0.00000 0.34956 0.47788 0.56195 0.56969 0.61283 0.32080 0.29978
+0.67146 0.34956 0.00000 0.58739 0.63053 0.63053 0.65929 0.51659 0.49779
+0.49558 0.47788 0.58739 0.00000 0.31195 0.30531 0.47345 0.55752 0.32633
+0.31969 0.56195 0.63053 0.31195 0.00000 0.46239 0.28761 0.60730 0.49336
+0.55752 0.56969 0.63053 0.30531 0.46239 0.00000 0.55531 0.62832 0.48230
+0.47345 0.61283 0.65929 0.47345 0.28761 0.55531 0.00000 0.65708 0.57080
+0.67699 0.32080 0.51659 0.55752 0.60730 0.62832 0.65708 0.00000 0.49004
+0.58960 0.29978 0.49779 0.32633 0.49336 0.48230 0.57080 0.49004 0.00000

src/rosalind/pdst.clj

Lines changed: 37 additions & 0 deletions
@@ -0,0 +1,37 @@
+(ns rosalind.pdst
+  (:require [clojure.string :as str]
+            [clojure.java.io :as io]
+            [rosalind.core :as r]
+            [rosalind.fasta :as fasta]))
+
+(defn p-distance
+  [s1 s2]
+  (let [length (count s1)
+        nr-diff (->> (map vector s1 s2)            ; pair up characters by position
+                     (remove (fn [[a b]] (= a b))) ; drop matching pairs, keep mismatches
+                     (count))]
+    (/ nr-diff length)))                           ; fraction of positions that differ
+
+(defn distance-matrix
+  [strings]
+  (->> (for [s1 strings]
+         (for [s2 strings]
+           (p-distance s1 s2)))
+       (map (fn [line] (->> line
+                            (map #(->> % (float) (format "%.5f")))))))) ; 5 decimal places
+
+(defn format-matrix
+  [matrix]
+  (->> matrix
+       (map (partial str/join " "))                ; one row per line, space-separated
+       (str/join "\n")))
+
+#_(->> "rosalind_pdst.txt"
+       io/resource
+       io/reader
+       line-seq
+       fasta/parse-fasta
+       (map :seq)
+       distance-matrix
+       format-matrix
+       (spit "resources/rosalind_pdst_out.txt"))
