Final Report: AncestralClust Algorithm

Author: Yu Sun
Last update 2021.04.30
Github: https://github.com/YuSunwisc/Phylo563_FinalProject
Contact: [email protected]

1. Introduction

This project simulates ancestral sequence reconstruction starting from a small number of randomly selected samples. Clustering is a fundamental task in the analysis of nucleotide sequences. Traditional clustering methods have mostly focused on high-speed clustering of highly similar sequences, but a state-of-the-art clustering method that starts from fewer samples may also lead to a good result. Here we run a "homemade" version of the AncestralClust algorithm on ssRNA data.

2. Theoretical Background

The main idea comes from a recent paper on a new algorithm, AncestralClust, which is available as a paper and on GitHub. The idea is nice and clear, but due to an unresolved technical issue, the algorithm could not run on my local computer. I therefore reimplemented the whole idea using a combination of Shell, R and Python.

3. Data

In this project we analyzed approximately 8 GB of ssRNA data downloaded from the National Library of Medicine (NIH) NCBI Virus: Severe acute respiratory syndrome coronavirus 2 data hub website. The data took the form of a single fasta file with 298,871 nucleotide sequences, each around 30,000 bp long. Due to the limited power of my local computer, this report only shows the result for around 3k nucleotide sequences, of which 1.5% (around 50 sequences) are chosen for the initial MSA. There are two reasons for this:

  1. MSA is one of the most time-consuming steps in a clustering problem (the other is tree reconstruction, whose cost depends on the method and model you choose, e.g. MP, ML, distance, etc.). A typical example of running 3k sequences on a local computer is shown below. This result comes from a macOS 10.15.7 platform with a 1.8 GHz Dual-Core Intel Core i5 processor and 8 GB of 1600 MHz DDR3 memory. Spending 28 hours on the first step of a program without getting a result is clearly unreasonable.
$ bash muscle.sh /Users/guestadmin/Desktop/563/Phylo563_FinalProject/data/sequence_3k.fasta 

MUSCLE v3.8.31 by Robert C. Edgar

http://www.drive5.com/muscle
This software is donated to the public domain.
Please cite: Edgar, R.C. Nucleic Acids Res 32(5), 1792-97.

sequence_3k 3000 seqs, max length 29915, avg  length 29693
00:19:44    540 MB(6%)  Iter   1  100.00%  K-mer dist pass 1
00:19:44    540 MB(6%)  Iter   1  100.00%  K-mer dist pass 2
28:21:21  3049 MB(35%)  Iter   1   91.43%  Align node   
  2. Even with a slightly larger random sample (e.g. 100 sequences) for the first step, it is easy to hit a Bus error 10 in Shell or R, which indicates that memory is limited. This result comes from running this repository on a macOS 10.15.5 platform with a 2.9 GHz Dual-Core Intel Core i5 processor and 8 GB of 1867 MHz DDR3 memory. This is why we choose only 50 sequences for the initial step.

For more details, please check the data README.

4. Data Analysis Procedure

Step-by-step instructions are described below.

STEP 0. Preparation

First download the fasta data into the data folder. Your fasta file must contain all desired sequences in one single fasta, with a header of the form >XXXX for each sequence on a separate line. Example:

>NHOHUHI.01
AGCTTGCAAGCATGC
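
Before running the pipeline, it can help to confirm that the input really follows this layout. The following is a minimal sketch of such a check, not part of the repository scripts; the function name `read_fasta` is a hypothetical helper introduced only for illustration.

```python
def read_fasta(lines):
    """Parse FASTA text lines into a list of (header, sequence) pairs.

    Hypothetical helper for illustration; it is not one of the repository scripts.
    """
    records = []
    header, chunks = None, []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there is one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        else:
            if header is None:
                raise ValueError("sequence data found before the first '>' header")
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


# The example record shown above.
example = [">NHOHUHI.01", "AGCTTGCAAGCATGC"]
records = read_fasta(example)
print(records)
```

For a real file, the same function could be called on an open file handle, for example `read_fasta(open(path))`, where `path` points at your own fasta file.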

Then change to the script folder and run

bash 0_preparation.sh $1

in your terminal. $1 is your desired initial cluster size. This step installs Python and all required packages, and also generates all the empty .txt and .fasta files in data for later use.

STEP 1. Randomly select initial data and run MUSCLE MSA

Stay in script folder, run

python 1_1_3k.py
python 1_2_initial_sequence_50.py

or replace the python command with py or python3, depending on your platform. These two scripts should give you the sequence_3k.fasta and initail_sequence_50.fasta files: the reduced-size data, and another fasta with only 50 randomly selected sequences. Then run

bash 1_3_muscle.sh $1

$1 is the FULL path of your input file. For example, you should see

bash muscle.sh /Users/shuqi/Desktop/563/Phylo563_FinalProject/data/example_input.fasta 

MUSCLE v3.8.31 by Robert C. Edgar

http://www.drive5.com/muscle
This software is donated to the public domain.
Please cite: Edgar, R.C. Nucleic Acids Res 32(5), 1792-97.

example_MSA_input 2 seqs, max length 29903, avg  length 29877
00:00:00      2 MB(0%)  Iter   1  100.00%  K-mer dist pass 1
00:00:00      2 MB(0%)  Iter   1  100.00%  K-mer dist pass 2
00:00:40   970 MB(11%)  Iter   1  100.00%  Align node       
00:00:40   970 MB(11%)  Iter   1  100.00%  Root alignment

This should give you initail_sequence_50_MSA.fasta as an aligned fasta file. If you run into any privacy issue, please follow the comments in 1_3_muscle.sh.
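
The random selection in this step can be sketched as follows. This is only an illustration under stated assumptions: the actual logic of 1_1_3k.py and 1_2_initial_sequence_50.py may differ, the names `subsample` and `to_fasta` are hypothetical, and the toy records stand in for the real sequences.

```python
import random


def subsample(records, k, seed=0):
    """Return k records chosen uniformly at random, without replacement.

    Hypothetical helper; a fixed seed makes the draw reproducible.
    """
    rng = random.Random(seed)
    return rng.sample(records, min(k, len(records)))


def to_fasta(records):
    """Format (header, sequence) pairs as FASTA text, one sequence per line."""
    return "".join(f">{header}\n{sequence}\n" for header, sequence in records)


# Toy stand-in for the reduced data set of about 3k sequences.
reduced = [(f"seq{i}", "ACGT" * 3) for i in range(3000)]

# Draw the 50 sequences used for the initial MSA.
initial = subsample(reduced, 50, seed=42)
fasta_text = to_fasta(initial)
print(len(initial), "sequences selected")
```

Sampling without replacement guarantees that no sequence appears twice in the initial set, which matters because duplicated headers would later make the cluster files ambiguous.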

STEP 2. Prepare R environment and run AncestralClust algorithm

We strongly suggest installing the R environment outside the terminal. If you have never installed R on your system and you fully understand your system, change to the script folder and run

brew install r

If you have any issue with the subsequent package installation, please check the comments in 2_1_AncestralClust.r. Then run 2_1_AncestralClust.r, either from your R environment or from the terminal. The details of 2_1_AncestralClust.r can be broken down into the following steps:

    1. It uses the alignment file initail_sequence_50_MSA.fasta to generate a tree with the Neighbor Joining algorithm.
    2. It prunes the c-1 longest edges from this tree, where c is your desired initial cluster number.
    3. It picks a representative, called the "centroid", for each cluster, and stores the headers of all centroids in initial_clust_i_centroid.txt, where i is the index of the initial cluster. It also puts all members of each cluster into initial_clust_i.txt. The choice of centroid can be creative: the original paper uses a MAP estimator to get the ancestor. In this repository, due to limited computation, I choose the sequence at the center of the subtree, which is also the one farthest from the other clusters on average, as the centroid.
    4. It calls the Python environment from R. By calling 2_2_write_new_fasta.py in R, we are able to extract all centroid sequences from the original data sequence_3k.fasta, and we put these centroid sequences into initial_clust_i.fasta. Then we run over all the unsampled points in the original data file sequence_3k.fasta and put each sample together with a centroid into a fasta file [new_sample_j.fasta](/data/pairwise\ with\ centroids/i/new_sample_j.fasta), where j is the index of the sample in the original sequence file and i is the centroid. In our example below, this creates 15k pairwise fasta files in the [data/pairwise with centroids](data/pairwise with centroids) folders, but it makes the later pairwise comparison much faster.
    5. It does a pairwise alignment and compares the distances between a new sample and each centroid, then finds the centroid with the shortest distance to the sample and writes the new point into the corresponding initial_clust_i.txt file.
    6. It calls Python and, through 2_3_final_clust.py, cleans up the redundant cluster members. All final cluster members should be in final_clust_i.txt.
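
Two of the steps above, cutting the c-1 longest edges and assigning a new sample to its nearest centroid, can be sketched in a few lines of Python. This is a simplified illustration, not the code in 2_1_AncestralClust.r: the tree is given as a plain edge list, the names `leaf_clusters`, `hamming` and `nearest_centroid` are hypothetical, and a Hamming distance on equal-length strings stands in for the pairwise-alignment distance used in the repository.

```python
from collections import defaultdict


def leaf_clusters(edges, leaves, c):
    """Cut the c-1 longest edges of a tree and group the leaves by component.

    edges  : list of (node_a, node_b, branch_length) tuples
    leaves : list of leaf (sequence) names
    """
    n_cut = max(0, min(c - 1, len(edges)))
    kept = sorted(edges, key=lambda e: e[2])[: len(edges) - n_cut]
    adjacency = defaultdict(set)
    for a, b, _ in kept:
        adjacency[a].add(b)
        adjacency[b].add(a)

    leaf_set, seen, clusters = set(leaves), set(), []
    for leaf in leaves:
        if leaf in seen:
            continue
        component, stack = set(), [leaf]
        while stack:  # depth-first walk over the remaining edges
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adjacency[node] - component)
        seen |= component
        clusters.append(sorted(component & leaf_set))
    return clusters


def hamming(a, b):
    """Count mismatching positions; a stand-in for an alignment-based distance."""
    return sum(x != y for x, y in zip(a, b))


def nearest_centroid(sequence, centroids):
    """Return the name of the centroid with the smallest distance to sequence."""
    return min(centroids, key=lambda name: hamming(sequence, centroids[name]))


# Toy tree: internal nodes u and v, leaves A to D, one long internal edge.
edges = [("A", "u", 0.1), ("B", "u", 0.2), ("u", "v", 1.5),
         ("C", "v", 0.1), ("D", "v", 0.3)]
clusters = leaf_clusters(edges, ["A", "B", "C", "D"], c=2)
print(clusters)

centroids = {"clust_1": "ACGTACGT", "clust_2": "TTTTACGA"}
assigned = nearest_centroid("ACGTACGA", centroids)
print(assigned)
```

Note that a component left without any leaf after the cut is silently dropped in this sketch, so the number of leaf clusters can be smaller than c.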

5. Result

We use 5 as our initial cluster size. The initial MSA process takes about 2-3 hours for 50 sequences:

initail_sequence_50 50 seqs, max length 29903, avg  length 29829
00:00:01      9 MB(0%)  Iter   1  100.00%  K-mer dist pass 1
00:00:01      9 MB(0%)  Iter   1  100.00%  K-mer dist pass 2
00:23:10  1699 MB(20%)  Iter   1  100.00%  Align node       
00:23:10  1701 MB(20%)  Iter   1  100.00%  Root alignment
00:46:35  1710 MB(20%)  Iter   2  100.00%  Refine tree   
00:46:36  1710 MB(20%)  Iter   2  100.00%  Root alignment
00:46:36  1710 MB(20%)  Iter   2  100.00%  Root alignment
01:35:46  1710 MB(20%)  Iter   3  100.00%  Refine biparts
02:23:25  1710 MB(20%)  Iter   4  100.00%  Refine biparts

and we get the initial NJ tree as

The final clusters are as below

Clust 1: 651 in total (Centroid: FR993099.1)
FR993099.1
OB993918.1
MW276569.1
MW667185.1
MW992425.1
OB999714.1
MW925412.1
MW896268.1
OB999166.1
OA971449.1
LR884125.1
MW191328.1
MW987352.1
MZ028843.1
MW972902.1
MW720024.1
MW114981.1
MT973403.1
MW831571.1
MW782912.1
MW896540.1
MW930918.1
MW751334.1
MW673111.1
FR998939.1
MW668966.1
MW153554.1
MW669409.1
MW156891.1
MW153403.1
MW749420.1
MW782896.1
MW987194.1
FR998631.1
MW666444.1
LR884082.1
MW521687.1
MW490773.1
MW965069.1
MT520516.1
MW539792.1
MW751121.1
OA983813.1
MT772483.1
MW617633.1
MW276723.1
MW065201.1
MT642386.1
MW575563.1
FR992936.1
MW666231.1
MW994660.1
MW134192.1
MW864402.1
MW206036.1
MT831289.1
MW864049.1
MW494079.1
OA964871.1
MW642024.1
MT706385.1
FR994412.1
MW719889.1
MZ007239.1
MW972640.1
MW626208.1
MW831563.1
MT772506.1
MW586696.1
MW816399.1
MT396246.2
MW728852.1
OA968488.1
MW763286.1
MZ013001.1
MW834654.1
MT772456.1
MW923953.1
MW518820.1
MW967623.1
MT973127.1
MW332709.1
MW733251.1
MW685583.1
MW586506.1
MW276822.1
MW714012.1
FR991803.1
MW687532.1
FR993084.1
MT750022.1
MW578012.1
MW763416.1
MW738273.1
MW877084.1
MW321305.1
FR991338.1
MW969468.1
MZ010151.1
MW583360.1
MW972589.1
MW890621.1
MW914533.1
MW633919.1
MZ022143.1
MW700692.1
MW928200.1
MZ000651.1
OA964950.1
MT412244.1
OA964658.1
MW891655.1
MW728817.1
MZ024652.1
MW666307.1
MW849171.1
MW930453.1
OA965801.1
MW666414.1
MW035526.1
MW223108.1
MW505233.1
MW831580.1
MW749473.1
MW796209.1
MW549582.1
FR998951.1
MT451629.1
MT635201.1
MW916432.1
MT703961.1
MW733409.1
MT750112.1
MZ009877.1
MW882492.1
MW928303.1
MW702940.1
MW504609.1
MW818108.1
MT412200.1
MW932067.1
FR994612.1
MT972787.1
MW891483.1
MW974097.1
MW519820.1
MW993817.1
MW685611.1
MT614521.1
MW905198.1
MW764076.1
MW876935.1
MW185396.1
MW925464.1
MW973722.1
MW848250.1
LR883171.1
MW804752.1
MW403762.1
OB998471.1
OA965838.1
MW460661.1
FR994893.1
MW751417.1
MT706371.1
MW704751.1
MW698153.1
MW064346.1
MZ030247.1
FR994753.1
MW184480.1
FR997073.1
MW583203.1
HG999315.1
MW734128.1
MW581613.1
MW986382.1
MW877000.1
MW419984.1
MW793275.1
FR998216.1
MW009055.1
MW750811.1
MW782786.1
MT646071.1
MT632828.1
MT706248.1
FR990933.1
MW808417.1
MZ006802.1
MW863924.1
MW277003.1
FR994903.1
FR992797.1
MW993808.1
MW972964.1
MT706376.1
OD911181.1
MW902895.1
OD912703.1
MW750065.1
MW805042.1
FR998623.1
MZ046937.1
MW816980.1
MW154873.1
MT928962.1
MZ036876.1
MW944636.1
MW191062.1
OA965815.1
MW647970.1
MW065210.1
OD898393.1
MW035374.1
MW834786.1
MT627254.1
MW460599.1
MZ023781.1
MW972368.1
MW782895.1
MW276571.1
MW880541.1
MW825530.1
MW864416.1
OD912411.1
MW840848.1
FR998495.1
MW065092.1
MW065065.1
OD910489.1
MW696983.1
MZ039118.1
MW505189.1
MW910495.1
OD911759.1
MW567474.1
OD910293.1
MW549224.1
MW793172.1
MW792688.1
MW928364.1
MZ013314.1
MW671638.1
MW779363.1
MW190546.1
MW064534.1
FR994485.1
MT750345.1
OA964940.1
MW190874.1
MW733404.1
MW292632.1
MW758809.1
MT451798.1
MW816387.1
HG998863.1
MW896143.1
OD909683.1
MW735236.1
OA984882.1
MW975939.1
MW505032.1
MW903974.1
MW114976.1
MW911421.1
MW505341.1
MW940528.1
MW749552.1
OA965991.1
OD906959.1
MW782844.1
MW864404.1
MT467250.1
MW403659.1
FR997071.1
MW959977.1
MW155409.1
MW666897.1
MW994974.1
LR878144.1
MW925303.1
OD910759.1
MW064568.1
MW403561.1
MW763818.1
MW590923.1
MW184083.1
MW645592.1
MW890597.1
MW154960.1
MT970526.1
MW849114.1
MW566898.1
MW720341.1
MW666754.1
MW849957.1
MZ048120.1
MW558340.1
MW673198.1
MW708495.1
MT706295.1
FR994300.1
MZ024759.1
MW667092.1
MW863970.1
MW054149.1
MW471658.1
MW972950.1
MZ032603.1
MW719927.1
MW973321.1
MW522420.1
MW064454.1
FR999282.1
MW992734.1
HG999926.1
MW972506.1
HG997207.1
MW795999.1
MW739819.1
MW804966.1
MW555840.1
MW039104.1
MW707658.1
FR993418.1
MZ023290.1
FR999689.1
FR994831.1
MW668975.1
MT856687.1
MW925190.1
HG999938.1
MW719684.1
MT913047.1
MT886440.1
MT772495.1
MW882536.1
MW666496.1
OD912679.1
MW671720.1
MT846547.1
HG996998.1
MZ020723.1
MW065405.1
MT997704.1
MT886398.1
MW707714.1
MW892133.1
MW986892.1
MW855153.1
OD911741.1
MT880778.1
OD910225.1
MZ012161.1
MW064651.1
MW728488.1
MW739891.1
MW807971.1
MW446829.1
MW792625.1
FR998923.1
MT929144.1
MW714028.1
MW708881.1
MW491172.1
MW565152.1
MW626414.1
MW921756.1
MT831373.1
MW681210.1
MW495150.1
MW184889.1
OB988806.1
FR996066.1
OA999941.1
MW698182.1
MW702204.1
MW877035.1
MW738918.1
FR996176.1
MW751387.1
MW667270.1
MW902607.1
MZ012270.1
FR995662.1
OD910482.1
MW733281.1
FR991424.1
MW183969.1
MW673563.1
FR996693.1
MW923431.1
MW987208.1
MT937319.1
MW945317.1
MW773407.1
MW825383.1
MW673101.1
OA964870.1
MW064780.1
MW617802.1
MW825391.1
MW805114.1
MW592611.1
MW696344.1
FR996276.1
MT951976.1
MW547440.1
MW064678.1
OA994294.1
MT970778.1
MW522481.1
MW403623.1
MW795986.1
MW763382.1
LC547527.1
MW849035.1
MW669092.1
MT451291.1
MW773241.1
FR994820.1
FR994521.1
MZ002658.1
OB997759.1
MW689799.1
MW667328.1
MW715522.1
MW941251.1
MW505244.1
MW549983.1
MW796025.1
MW763917.1
MW817003.1
MT795875.1
MT827818.1
OC996736.1
OD908304.1
MW873958.1
OB995842.1
MW905732.1
MW932713.1
MW816463.1
FR991470.1
FR998773.1
MW751346.1
MW906884.1
MW914458.1
MW084472.1
MW669437.1
OA981948.1
MW743299.1
MW810358.1
MW840804.1
MW963217.1
MW547507.1
FR993904.1
FR995804.1
MW673041.1
MT627305.1
FR997963.1
FR999355.1
FR996585.1
FR998266.1
FR996589.1
MW153906.1
OD908010.1
MW964767.1
MW064392.1
MW925623.1
MW866029.1
MW930118.1
MT627623.1
OA966001.1
MW190527.1
MT232688.1
MW749317.1
MW669715.1
MW868404.1
MW780797.1
MZ003555.1
MW671611.1
MT971713.1
MW923530.1
MT759588.1
MW877309.1
MW986774.1
MW860728.1
MW667522.1
FR994595.1
MT750124.1
MW813538.1
MW421941.1
HG997820.1
MT750041.1
MW804733.1
MW693492.1
MZ028530.1
FR990252.1
MW763977.1
MW763803.1
FR999428.1
OB989711.1
MW522478.1
MW731314.1
MW332670.1
MW771889.1
MW725754.1
MW975192.1
MW064502.1
MZ046874.1
MW751374.1
MZ010525.1
FR995071.1
MW923090.1
MW808067.1
MZ032636.1
MW778639.1
MW844284.1
MW869880.1
MW665686.1
FR999617.1
MW484835.1
MW966200.1
MW931598.1
MW704415.1
MW751208.1
MW864297.1
MW837449.1
FR995631.1
MW524947.1
MW064900.1
MW341840.1
MW403535.1
HG997233.1
MW617859.1
MW793077.1
MW280168.1
MW722230.1
MZ012607.1
MW720054.1
MW047018.1
MW893608.1
MW834986.1
MW893791.1
MW749607.1
MW065018.1
FR998581.1
MW795990.1
OD911305.1
MW973052.1
OB992862.1
MW735936.1
MW792838.1
FR994760.1
MW673327.1
MW666887.1
MW869265.1
MW184128.1
MW916714.1
MW923414.1
MW773863.1
MT467238.1
OB998375.1
MW699205.1
FR994833.1
OB986370.1
FR999820.1
MW852074.1
MW972598.1
MW696288.1
MW894335.1
MW860409.1
MW751611.1
MW326515.1
MZ038777.1
MW662141.1
MW796037.1
MW332671.1
MT577599.1
OD906886.1
OA972213.1
MW821865.1
LR963413.1
MW666399.1
FR994658.1
MW065197.1
MW994772.1
FR993954.1
MZ039315.1
FR996963.1
MW064874.1
MT496975.1
MW738901.1
MT890241.1
MW565541.1
OD912654.1
MZ032408.1
MW848887.1
MW893596.1
MW669485.1
MT973210.1
MW840451.1
MW156267.1
OA966046.1
MW023463.1
MT557570.1
MW751513.1
MW739606.1
OA966226.1
MW720453.1
MW734590.1
MT772559.1
MW882002.1
MW741669.1
MW944449.1
FR994373.1
MT632575.1
MW705466.1
MZ032506.1
MZ035531.1
MW191198.1
MW702442.1
FR991406.1
MW276654.1
MT970011.1
MW687488.1
OA964809.1
MT971553.1
MW701190.1
MT971008.1
MW064572.1
LR814190.1
MW793229.1
OD911314.1
MW728821.1
MT971146.1
OB998245.1
MW825912.1
FR994600.1
MW751240.1
MW681184.1
MW403701.1
MW684254.1
MW928384.1
MW545455.1
MW035492.1
MW930580.1
MW697926.1
MW518197.1
MZ029462.1



Clust 2: 204 in total (Centroid: MW783273.1)
MW893732.1
MW974486.1
MW783273.1
MW778567.1
MW986524.1
MT451657.1
MW759611.1
MW906855.1
MW896364.1
MW772012.1
FR996362.1
MW185643.1
MZ010868.1
MW549795.1
MW771962.1
MW156565.1
MT969998.1
MW986567.1
MW958134.1
MW850269.1
MW513511.1
MW863948.1
MW277128.1
MW993273.1
MW365023.1
MW276439.1
MW808677.1
MW581594.1
MW053827.1
MZ010713.1
MZ025891.1
MW565603.1
MW905139.1
MW934222.1
MW882792.1
MW779233.1
MW586537.1
MW767333.1
MW986494.1
MW817274.1
MW053877.1
MW715555.1
MW197488.1
MZ044595.1
MW694255.1
MW080312.1
MW631918.1
MZ024464.1
LR883413.1
MW991108.1
MT811473.1
FR993448.1
MW153755.1
MT811519.1
MT972471.1
MW851381.1
MW578052.1
MW942279.1
MT969552.1
MW808710.1
MW942524.1
MW406789.1
MT831444.1
MW154109.1
MW777631.1
MT683403.1
MW590796.1
MW157176.1
MW635955.1
MW986229.1
FR996733.1
MW783235.1
MW583310.1
MT811706.1
FR996046.1
MT971315.1
MT810884.1
MW715250.1
MW280492.1
MW586471.1
MW586144.1
MW993671.1
LR883161.1
MT972595.1
MZ002341.1
FR996466.1
MZ002402.1
MZ013580.1
MW276232.1
MW993665.1
MZ023780.1
MZ024189.1
MW519829.1
MW995119.1
MW994222.1
MW750942.1
MT843239.1
MW972286.1
MW276741.1
MW990456.1
MZ046629.1
MZ003019.1
MW155775.1
MW941819.1
MW700462.1
MW968455.1
MW990850.1
MW894485.1
MW994990.1
MT451470.1
MW154213.1
MT459841.1
MW578018.1
MW411906.1
LR878033.1
MZ038710.1
MW964190.1
MW778367.1
MW975388.1
MW540244.1
MT969691.1
MW974910.1
MW660891.1
MW185721.1
MZ006901.1
MZ011932.1
MW586633.1
MW406679.1
MT810823.1
MW870111.1
MW870045.1
MW369424.1
MW901636.1
MW846050.1
MW767239.1
MW994208.1
MW539848.1
MW583140.1
MZ029362.1
MW653636.1
MW868364.1
MW994094.1
MW582234.1
MW851538.1
MW783336.1
MT981108.1
MZ013386.1
MW906618.1
MW752499.1
MW626510.1
MW771922.1
MW778013.1
MW571147.1
MT971574.1
MW835119.1
MW844310.1
MW863803.1
MT973334.1
MW154730.1
MW777420.1
MT786866.1
MZ001013.1
MT810560.1
MZ039247.1
MZ039752.1
MT444558.1
MW777904.1
LR878053.1
MT956776.1
MT675954.1
MW796729.1
MT846453.1
MW084443.1
LR883707.1
MW777491.1
MT810981.1
MW771790.1
MW519803.1
FR993134.1
MW577917.1
MT919774.1
LR883446.1
MW985391.1
MW808674.1
MW993828.1
MT810772.1
MT632515.1
MZ012992.1
MT969716.1
MW750989.1
MW974903.1
MZ002831.1
MW645874.1
MZ011258.1
MW153727.1
MW913821.1
MZ034229.1
FR991085.1
MW677965.1
MW566938.1
MW772212.1
MW540179.1
MZ002881.1
MW519666.1



Clust 3: 847 in total (Centroid: OB986922.1)
OA970084.1
OB986922.1
OA968108.1
OA981410.1
OA981670.1
MZ029421.1
LR881550.1
MT970955.1
OB999160.1
OC996200.1
OA983334.1
OB998596.1
OA989794.1
HG999354.1
OC999017.1
OD899533.1
MW963667.1
MT656010.1
OA972056.1
MW976480.1
OC997521.1
OA982396.1
OB987834.1
OA995246.1
MW944056.1
OA990660.1
OD907694.1
FR992813.1
OA978031.1
MW912679.1
OA970462.1
OB983714.1
OB988756.1
MW421990.1
OB983702.1
MW739313.1
OB982675.1
OB997244.1
OD900943.1
OD899576.1
MW877103.1
FR990923.1
MW185172.1
OB999370.1
OB990729.1
OC996245.1
OC997431.1
MW923066.1
OA991203.1
OA992200.1
OB996644.1
MW708266.1
OA984435.1
MZ037400.1
OA981591.1
OA997317.1
OB994226.1
OA971258.1
LR884339.1
OD898244.1
OC998469.1
OA973541.1
MW932558.1
MW985057.1
OB982560.1
MW731097.1
MW991990.1
OA969165.1
OA981934.1
OA971433.1
OD900202.1
OA968179.1
OB992100.1
OB986072.1
MW714581.1
OA999985.1
OB987957.1
LR881776.1
MW758865.1
OA989897.1
MT482116.1
MW596033.1
OA967011.1
OB989853.1
MZ023541.1
OA990770.1
OC998301.1
OC996779.1
OB983901.1
OB989174.1
OA978210.1
OB997707.1
MW462662.1
OB999179.1
OA997607.1
OD909929.1
OA974081.1
OA997673.1
MW184503.1
OD907282.1
OB988588.1
OC996809.1
MT326081.1
OB992296.1
MW965803.1
OA969132.1
MZ007249.1
MW933689.1
MW850271.1
MT628089.1
OB990147.1
OA974173.1
OB992699.1
OA995512.1
OA970815.1
OB991500.1
OD907187.1
OA998693.1
OB990709.1
OA979573.1
OA972411.1
OA978814.1
OA966174.1
OA998432.1
MW993321.1
OA969446.1
OB987007.1
OA982174.1
OB991265.1
OC996156.1
MW792754.1
MW708602.1
OC997390.1
OB988996.1
MT821549.1
OB982720.1
MW778278.1
MW975639.1
OA973711.1
MW892369.1
MW731426.1
OD901252.1
LR824450.1
OB983979.1
OA994448.1
MW702716.1
MW709239.1
MW493860.1
MZ023014.1
OA980436.1
OA981178.1
OA979406.1
OA991221.1
OA981877.1
OD900223.1
LR884261.1
MW136857.1
OA999812.1
OA972359.1
OB989635.1
OA977421.1
MW969428.1
OB982504.1
OD900064.1
OD899434.1
MT375451.1
MW911021.1
OD898626.1
MW321377.1
MZ028489.1
OB997453.1
MT642104.1
OA990457.1
OB986077.1
MW321251.1
OB994207.1
OB985973.1
MW973102.1
OA995839.1
MW850021.1
MT956915.1
MW944932.1
OD912844.1
OB984867.1
OA972642.1
OA978771.1
OA967414.1
OB984241.1
OD900118.1
OA971923.1
OA991785.1
OC999128.1
OD912203.1
OD909097.1
OB988823.1
OB993182.1
OD898351.1
OD899718.1
MZ034006.1
MW648269.1
MW941271.1
OA973017.1
MW894600.1
MW730983.1
MT510643.1
OD900282.1
OA998605.1
MW519766.1
MW848151.1
OD900611.1
OB996282.1
MW993902.1
MW648055.1
OB995863.1
MW767328.1
MW938561.1
MW792702.1
OB986711.1
OB992503.1
OA989671.1
OB987928.1
MW155587.1
MW942602.1
OA981665.1
OC998538.1
OA992945.1
MW912844.1
OB999288.1
OB996029.1
MW708117.1
OD908078.1
MW912929.1
MW156413.1
OA980380.1
OA968482.1
MW577761.1
OA980054.1
MW861387.1
OD901073.1
OA990438.1
MW752040.1
OB984459.1
OB984903.1
OA975821.1
OA991859.1
OB999313.1
OD899280.1
LR862481.1
MZ035003.1
MW992840.1
OB991914.1
OB989706.1
MW808276.1
MW494314.1
OC996025.1
OA982221.1
OA973440.1
MZ029205.1
OC998529.1
OC997374.1
OA982422.1
MW975089.1
OC995779.1
OA990021.1
MW749440.1
OA974968.1
OA994439.1
MW749991.1
OA981060.1
OB995511.1
LR991997.1
OA989665.1
OB988576.1
OA996112.1
OC999453.1
OA964494.1
OD898662.1
OA998256.1
MW566941.1
MW976911.1
MT970886.1
OA982545.1
OB999277.1
MW673495.1
OB986230.1
OA971671.1
OA994596.1
OA981329.1
OA981212.1
OB989854.1
MW944129.1
OA978734.1
OA995515.1
MW933361.1
OB999103.1
OA999405.1
OB991055.1
OB982428.1
MW933878.1
OA968019.1
OB983433.1
MW586259.1
OB995447.1
OD910977.1
MW966366.1
MW778821.1
MW184328.1
MW550431.1
MW763111.1
OA997508.1
OB985790.1
OB982591.1
OB983999.1
OB991256.1
OD909773.1
OA984691.1
MZ028832.1
OB985159.1
OA969391.1
OA990735.1
MW985743.1
OA981386.1
OB985602.1
OA993251.1
MW709056.1
OA974275.1
MZ033392.1
OD909784.1
OA997537.1
OA964083.1
MW485289.1
OC999344.1
OA984986.1
OB991212.1
MW758632.1
MW796806.1
MW422039.1
OC997511.1
OA977842.1
OA982983.1
MW836876.1
MZ008267.1
OA990608.1
MZ011936.1
OA994081.1
OB991998.1
OC999139.1
OB987208.1
OA967686.1
OB992132.1
OA995029.1
MW154710.1
OB999407.1
OA990600.1
MW241168.1
OA969614.1
OB985217.1
MZ046909.1
OC997357.1
OA972115.1
MT642261.1
OA989593.1
OB997321.1
OB981930.1
MW596201.1
OB992769.1
MW812733.1
OB996341.1
OC996219.1
MZ029778.1
LR881613.1
OD907717.1
MW685632.1
OA999418.1
MW689115.1
OA999585.1
MT259275.1
LR824204.1
OB986268.1
MW932489.1
MW702274.1
MW550398.1
OD911037.1
MT558706.1
OA992707.1
MW578103.1
MZ028744.1
MW040510.1
MZ035574.1
OA993555.1
OD900633.1
MW931325.1
OD910068.1
OA974379.1
MW974420.1
OA996093.1
MW166163.1
OA993034.1
OA982904.1
OA991528.1
OA991215.1
MW911066.1
MW913203.1
MW731028.1
OD898369.1
MW901905.1
OB998593.1
HG999041.1
OD911287.1
OA970761.1
OB992830.1
OA973318.1
OA964235.1
OA984200.1
OA982958.1
MW796812.1
OA990232.1
LR881861.1
OB982107.1
MW776767.1
OB994520.1
OA991855.1
MW246610.1
MW942456.1
LR881927.1
OB988553.1
OA975578.1
OB984185.1
OA982610.1
MW777897.1
MW958096.1
OC999940.1
MW766623.1
LR824249.1
OA996376.1
OA982091.1
OD901232.1
OC995918.1
MW708868.1
OC996154.1
OA976484.1
OA991474.1
LR898967.1
OB999748.1
OB992514.1
OA989682.1
OA998103.1
OA992041.1
OB984863.1
MW482937.1
OA979304.1
OB991399.1
MW578051.1
OA984384.1
OA974557.1
OD907172.1
OA978406.1
OA991439.1
OB986428.1
MW771895.1
OC996178.1
OD898766.1
MW577059.1
OA992296.1
MW992406.1
MW565742.1
MW891278.1
OA994628.1
OB984851.1
OA984697.1
MW777934.1
MW779264.1
OA975570.1
OB998522.1
OB996762.1
MW708660.1
OA975891.1
MW968070.1
MZ033067.1
OA978351.1
OB990174.1
OA968850.1
MW910976.1
FR990346.1
HG994224.1
OB994430.1
OB989879.1
OB997409.1
OB985590.1
MZ023231.1
MW430965.1
OC997285.1
MW883355.1
OD910632.1
OD910375.1
MW702098.1
OA979906.1
MW731444.1
MW984686.1
OA977127.1
OC996041.1
MW915375.1
OB992657.1
MZ048122.1
MW994634.1
MW704408.1
OB993129.1
MW767315.1
OA971243.1
OA977338.1
OB985927.1
MT163737.1
OD899650.1
OA984119.1
OA982170.1
MW403691.1
OA971004.1
MW617537.1
OA992483.1
OA984364.1
OB997367.1
MW994981.1
MW276572.1
MW934035.1
MW349041.1
OA967920.1
MT994957.1
OD907067.1
MW758641.1
OB993407.1
OA970475.1
MW793023.1
OA994044.1
MT385416.1
OD908044.1
OA992528.1
OB989479.1
OA980263.1
OB986349.1
OA983083.1
MW963645.1
OA998754.1
LR860637.1
OA992225.1
OA999356.1
OB995308.1
MW184683.1
MW916014.1
MW708599.1
MW773947.1
OA972294.1
OA970797.1
OA991142.1
MZ020653.1
OA967302.1
OA996644.1
OA978597.1
MW674836.1
OB985834.1
MT703970.1
OA973908.1
MW976091.1
OD912266.1
MW702764.1
OA969971.1
OD899469.1
OB993240.1
OB996561.1
OB998829.1
MW988150.1
OA991307.1
OB983093.1
OA982430.1
OA975808.1
OA983676.1
OA995119.1
OB995824.1
OA977965.1
OD909038.1
OB995745.1
OA979646.1
OB999271.1
MW963634.1
OB988062.1
MW700640.1
OB994481.1
OA974443.1
MW989857.1
MW934149.1
MW877153.1
MW593708.1
OB985308.1
MW779192.1
LR882186.1
OB989604.1
MW881131.1
OA985040.1
MW491095.1
OD908236.1
MW184897.1
MW778662.1
MW933714.1
OA984292.1
MZ008759.1
MT772089.1
OB991063.1
OA989904.1
MT834188.1
MW766630.1
OB989376.1
MW202153.1
MW963282.1
OA970693.1
MW825892.1
OB996708.1
OA991582.1
MW738984.1
OD900144.1
LR962970.1
OA975426.1
OB991668.1
OC998615.1
OA967190.1
OA983755.1
OB989396.1
MW896223.1
MW846007.1
OA997965.1
OA983037.1
OA995380.1
OB989318.1
OD900021.1
OA984581.1
MT821736.1
OB998328.1
MZ013690.1
OA980791.1
OA994818.1
MW941076.1
OA976583.1
OD908651.1
MT886306.1
OA999461.1
OC997079.1
OC998055.1
MW969324.1
MZ009872.1
OB993754.1
MZ033315.1
LR824208.1
MW913299.1
OD909567.1
MW565472.1
MW976423.1
OA990178.1
OD907700.1
OA969085.1
OA994019.1
OA972923.1
MW963205.1
OA973220.1
MW943848.1
OB991062.1
MW505067.1
OB998764.1
OB994334.1
HG997102.1
OD898174.1
OA998409.1
OB996833.1
MW969215.1
LR882404.1
OB982294.1
MW966885.1
OA997241.1
MW565736.1
OD910590.1
OA995418.1
OB986617.1
OB988028.1
OA969466.1
MT972513.1
OB993012.1
OA967039.1
MT627415.1
OA993967.1
FR989829.1
OB986157.1
OA981139.1
OA977773.1
OD910579.1
OB985858.1
OC997647.1
OB991367.1
OC997225.1
OA996409.1
OC998548.1
OA968853.1
OA971332.1
OB995295.1
OB984217.1
MW909987.1
OA974971.1
OB996489.1
OB995287.1
OA982616.1
MW964552.1
MT843301.1
MW708153.1
OA984863.1
MZ034851.1
MT972999.1
MW792741.1
OD908157.1
MT451393.1
OA977776.1
OA983639.1
MZ006975.1
OD911290.1
OC997298.1
OA992054.1
OA980087.1
MW969246.1
MW708600.1
MW708929.1
MW964139.1
MW629369.1
OA991056.1
OB991311.1
OB992405.1
OC997028.1
OD909242.1
OB987806.1
MW863699.1
OA979835.1
OB998695.1
OA970998.1
OA978886.1
OC997987.1
OA994971.1
MW944310.1
MW969402.1
OB997188.1
MW708260.1
MT612314.1
OA979329.1
OA981412.1
MW778813.1
OA973821.1
OB992430.1
OB993774.1
MW933192.1
OA980098.1
OB981879.1
MW909853.1
OA992162.1
OA975344.1
OB987419.1
OA967458.1
OA968384.1
OB997242.1
OD912246.1
MW036038.1
OA999798.1
OA999292.1
MZ022726.1
OA974502.1
MW990270.1
MZ001193.1
OA980668.1
OC996021.1
MW972784.1
MW942967.1
LR882314.1
OA979267.1
MW963832.1
OC998147.1
OA997851.1
OC997687.1
MW518188.1
OB991944.1
OA990479.1
MT628122.1
OA967456.1
MW708888.1
OB992306.1
OA999143.1
MW731043.1
OA980455.1
OA972767.1
OA970460.1
LR898662.1
OA995238.1
OB991559.1
OD899567.1
OA977991.1
OA979057.1
OA972666.1
OC998438.1
OA998463.1
MW763196.1
MW909818.1
OA968921.1
OC996311.1
OD909437.1
MW966652.1
OD907705.1
OA976356.1
MW156684.1
MT972080.1
OA984626.1
MW932497.1
OD900909.1
MW467465.1
OB999415.1
OB994469.1
MZ033914.1
OA989943.1
OA994990.1
OB986584.1
OA998767.1
MW454721.1
OA979991.1
OB988253.1
OA969204.1
OB999591.1
OA992019.1
MZ008594.1
OA993605.1
OB986880.1
OA968877.1
MW629522.1
OA979029.1
HG994325.1
OA968764.1
OA998395.1
MT831299.1
OA999598.1
OB999190.1
MT730115.1
OA994392.1
OA980289.1
OA975334.1
MW695378.1
MW911130.1
OB992456.1
MW738434.1




Clust 4: 833 in total (Centroid: MW914205.1)
MW643973.1
MW635936.1
MW369424.1
MW964190.1
MW873958.1
MT506682.1
MW851307.1
MW973306.1
MW882728.1
MW190874.1
MW991776.1
MW156267.1
MW689115.1
MZ024183.1
MW932480.1
MW993147.1
MW932383.1
MW796729.1
MW914205.1
OB999714.1
OB986157.1
OB995447.1
OA981665.1
OB988576.1
OB993182.1
OA998432.1
MW617537.1
FR991228.1
MW156006.1
MT969791.1
MT831413.1
OA979991.1
FR998581.1
OA970797.1
MW065092.1
MZ035574.1
FR994753.1
MW636673.1
MW858974.1
MW639498.1
MW966648.1
MW634878.1
MW277364.1
MW855802.1
MW734965.1
MT831195.1
MW913767.1
MW736078.1
FR990431.1
MZ003428.1
MW984911.1
MZ008690.1
MT966184.1
OA974254.1
MW639981.1
MW964156.1
FR991650.1
MW172737.1
MW635771.1
MT412232.1
MW635880.1
MW986722.1
MW923145.1
MZ006891.1
MW565165.1
MW902183.1
MT940498.1
MW959527.1
MW959982.1
MT831834.1
MW861027.1
MW634662.1
MW548329.1
MZ034125.1
MW707662.1
MW640178.1
MW644384.1
MW705494.1
MZ022858.1
MW183988.1
MW634992.1
MW931507.1
MW999983.1
OB998102.1
MW600463.1
MZ047394.1
MW986657.1
MW972969.1
MW636360.1
MW871302.1
MW549570.1
MZ025247.1
MT325584.1
OA965900.1
MZ033984.1
MW904972.1
MZ024077.1
MW639184.1
MW634734.1
MT576563.1
MW638294.1
MW706981.1
MT973458.1
MW705246.1
MW942135.1
MW779661.1
MW637209.1
MT577611.1
MZ029143.1
MZ004903.1
MW286629.1
MZ000505.1
MT973447.1
MW701238.1
MW685770.1
MZ000455.1
MZ001498.1
MW778567.1
MW634725.1
MZ038697.1
MW812874.1
MW643641.1
MZ036216.1
MW635248.1
MT846524.1
MW923045.1
MW705377.1
MW903685.1
MW578167.1
MW882502.1
MW864393.1
MW859297.1
MW586324.1
MW921625.1
MW831667.1
MW693597.1
MW853568.1
MW739230.1
MW904002.1
MT511695.1
MW896354.1
MW891905.1
MW635601.1
MW596094.1
MW693442.1
MZ000144.1
MW134165.1
MW987502.1
MW738829.1
MW634044.1
MW598431.1
MW904497.1
MW876950.1
MT972899.1
MZ013872.1
MW893816.1
MW483114.1
MW600616.1
MW539751.1
MW813551.1
MW858971.1
MW813482.1
MW921696.1
MW157123.1
MW639637.1
MW637491.1
MZ037554.1
MW638608.1
7CXM_I
MW634690.1
MW643662.1
MW813769.1
MW578124.1
MW974232.1
MW637816.1
MW634264.1
MW816564.1
MW672680.1
MT969600.1
MT598157.1
MW593485.1
MW986302.1
OB993918.1
MW638914.1
MW903501.1
MW942077.1
MZ038948.1
MW838439.2
OA964690.1
MW941486.1
MT880884.1
MW639423.1
MW634896.1
MW903906.1
MW941699.1
MW067729.1
MW968441.1
MW705398.1
MW640261.1
MW593278.1
MZ032524.1
MT259256.1
FR991691.1
MW865936.1
MW972424.1
MW911983.1
MW134158.1
MT911813.1
MW560583.1
MW809196.1
MT967931.1
MZ013034.1
MW485280.1
MW725825.1
MW871035.1
MW896636.1
MT679160.1
MW693043.1
MW812868.1
MW871289.1
MW320795.1
MW792882.1
MW880340.1
MW913674.1
MT827245.1
MW276582.1
MW915966.1
MW913866.1
MW635945.1
MW704945.1
MZ011410.1
MW660931.1
MW134317.1
MW913051.1
MW053697.1
MZ022677.1
FR995105.1
MW700730.1
MT911534.1
MW440388.1
MW902703.1
MT711875.1
MW990941.1
MW617838.1
MW780962.1
MW638883.1
MW286688.1
MZ032653.1
MW626814.1
MW593013.1
MW637562.1
MW637438.1
MW555595.1
MZ001286.1
MT642312.1
MW901956.1
MW202168.1
MW707092.1
MW813654.1
MW700344.1
MW991824.1
MW639587.1
MW638446.1
FR993473.1
MZ003827.1
MW735606.1
MW636547.1
MZ002925.1
MW540165.1
MW280494.1
MW860664.1
MW813239.1
MW706272.1
MW969277.1
MW640294.1
MW550690.1
MW634164.1
MW578164.1
MW636537.1
MW942643.1
MW860044.1
MW934087.1
MW578181.1
MW084384.1
MW903025.1
MZ038514.1
MW156214.1
MW859653.1
MW960469.1
MW420003.1
MW907343.1
MW864744.1
MT642381.1
MW739907.1
MW181496.1
MW640957.1
MZ013785.1
OB984671.1
FR991273.1
MW640140.1
MW704356.1
MW914317.1
MW976409.1
MW523778.1
MW660653.1
MW812514.1
MZ007188.1
MW963653.1
MW992917.1
MT970596.1
MT641741.1
MZ035362.1
MW993688.1
MW521399.1
FR990767.1
MW707108.1
MZ039723.1
MW635622.1
LR883245.1
MW958152.1
MW644094.1
MW577904.1
MW935737.1
MW915200.1
MW680612.1
MT520441.1
MW836840.1
MW780206.1
MW870552.1
MW870747.1
MZ006900.1
MW799681.1
MW913531.1
MW942060.1
MW519765.1
MW966051.1
MW706430.1
MW184924.1
MW643516.1
MW725605.1
MW965829.1
MT972767.1
MW966915.1
MZ006287.1
MW134446.1
MZ039006.1
MW910849.1
MW739877.1
MW903825.1
MW914593.1
MZ012661.1
FR992414.1
MW975976.1
MW796694.1
MW639245.1
MW986713.1
MW813020.1
MW642861.1
MW640586.1
MW734164.1
MW641894.1
MZ034544.1
MW812971.1
MW641738.1
MW943774.1
MW636142.1
MT873809.1
MZ026885.1
FR990972.1
MT971406.1
MW084556.1
MW985256.1
FR991159.1
MW735398.1
MW053862.1
MW705428.1
MW991766.1
MW634812.1
MT345828.1
MW626343.1
MW309096.1
FR991581.1
MW991200.1
MW021470.1
MT820463.1
FR993164.1
MZ033742.1
MW813003.1
MW735585.1
MW910014.1
MW555926.1
MW808451.1
MW750900.1
MW640486.1
MW894303.1
MT831791.1
MW642832.1
MW735474.1
MW642639.1
MW796606.1
MW986506.1
MW779901.1
MZ012779.1
MW639501.1
MT811644.1
MW865432.1
MW735208.1
MW912350.1
MW184179.1
MW707941.1
MW365042.1
MZ015504.1
MW586250.1
MW635022.1
MW913436.1
FR993232.1
MW912289.1
MZ037465.1
MW865470.1
MW974551.1
MZ012596.1
MW944565.1
MW715171.1
MW705802.1
MW799664.1
MW640739.1
MT612281.1
MW771791.1
MW635303.1
MZ000222.1
MW634569.1
MW780805.1
MW779326.1
MW863992.1
MZ037473.1
MW751127.1
MW985935.1
MW626203.1
MW870379.1
MT972279.1
MW894024.1
MW637728.1
MW641856.1
MZ013256.1
OD898953.1
MW739176.1
MZ039227.1
MW869956.1
MW631911.1
MW706836.1
MW035479.1
FR989830.1
OB985789.1
FR995571.1
OA967604.1
MW944159.1
MP929159.1
MW560592.1
MW885887.1
MW706269.1
MW147513.1
MW626014.1
MT972334.1
MW974026.1
MT451096.1
MZ025045.1
MW812700.1
MW903056.1
OD909397.1
MW590778.1
MW240725.1
FR990723.1
MW590389.1
MW781311.1
MW593827.1
MW894550.1
MW595912.1
MW907776.1
MW130879.1
MW406557.1
MW705870.1
MW903642.1
MW967334.1
MT972512.1
MW707175.1
MW639831.1
MW643479.1
MW908489.1
MW276265.1
MT745631.1
MW968541.1
FR994446.1
MW941645.1
MW704747.1
MW493764.1
MW181470.1
MT345874.1
MW454390.1
MW860817.1
MW777538.1
MW704746.1
MW860753.1
MW912526.1
MT439301.1
MW964015.1
MT350264.1
MW991825.1
MW974374.1
MW734231.1
MW991865.1
MT970903.1
MW964970.1
MW276441.1
MW809238.1
MZ036107.1
MW861146.1
MW644053.1
LR883540.1
MW586391.1
MW944325.1
MW903931.1
MW593287.1
MW279317.1
MW642681.1
MZ022124.1
MW643886.1
MZ021071.1
MZ028900.1
MW564813.1
MW859499.1
MW813148.1
MW736701.1
MT892987.1
MW902759.1
MW634815.1
MW903494.1
MZ001412.1
MW866440.1
MW860826.1
MZ024908.1
MW694018.1
MW540308.1
MZ025846.1
MW680404.1
MW635504.1
MW687328.1
MW871406.1
MW858513.1
MZ006320.1
MW986341.1
MZ000081.1
MW640301.1
MW963505.1
MW973000.1
MW639614.1
MT641731.1
MW945141.1
MW894561.1
MW816559.1
MW943189.1
MW972370.1
MW643978.1
MW859863.1
MW780329.1
MZ022207.1
MW941216.1
MW564979.1
MW967377.1
MW406525.1
FR992201.1
MW706524.1
MW799777.1
MZ036620.1
OA980492.1
MT834727.1
MW638943.1
MW715534.1
MW707832.1
MW973289.1
MW739830.1
MZ006875.1
MW848038.1
MW702644.1
MW813897.1
MW912939.1
MZ000575.1
MW593721.1
MW365136.1
MW115065.1
MW491117.1
MW191249.1
MZ035413.1
FR993454.1
MW365200.1
MT940482.1
MW707356.1
MW586159.1
MW933895.1
MW941450.1
MW780118.1
MZ038204.1
MW693862.1
MW635219.1
MZ035288.1
MW896460.1
MW474144.1
MW705063.1
MZ039261.1
MW640070.1
MW780626.1
MW577966.1
MW985231.1
MW707761.1
MW706794.1
MZ038406.1
MW540237.1
MW635540.1
MW813267.1
MZ010036.1
MT970807.1
MW735464.1
MZ021157.1
MW640724.1
MW290948.1
MW636215.1
MT970947.1
MW565686.1
MW904164.1
MW521583.1
MW643231.1
MZ007304.1
MW639285.1
MW866276.1
LR883879.1
MZ048013.1
MW930160.1
MW812654.1
MZ009391.1
MZ038216.1
MW861190.1
MW006556.1
MW903289.1
MW491063.1
MW577884.1
MW912561.1
MW766889.1
MW858598.1
MW565540.1
MW707348.1
MW933497.1
MT451251.1
MW643447.1
MW696362.1
MW933881.1
MZ012996.1
MW986807.1
MW890988.1
MW629446.1
MW894075.1
MW851778.1
MW436754.1
FR990863.1
MW944024.1
MW905078.1
MW813577.1
MZ021216.1
MZ012565.1
MT412164.1
MW966627.1
MW591207.1
MZ036624.1
MW813433.1
OA976275.1
MW944490.1
MW521487.1
MW644122.1
MZ001844.1
MW277387.1
MW942573.1
MW154797.1
MW638720.1
MW871137.1
MW578192.1
MW991740.1
MW704620.1
MW859025.1
MT972407.1
MW870995.1
MT451648.1
MW707544.1
MW932872.1
MW693214.1
MW735263.1
MW626178.1
MT831556.1
MW944659.1
MW449298.1
MT890327.1
MW524922.1
MW866062.1
MT821666.1
MW850967.1
OA979739.1
MW643062.1
MW986524.1
MW887976.1
FR993327.1
MZ007042.1
MW942840.1
FR993081.1
MW637543.1
MW586030.1
MW642813.1
MW752054.1
MT506201.1
MW933072.1
MW635683.1
MW822027.1
MZ011864.1
MW706489.1
MW635959.1
MW893981.1
MW687240.1
MW865813.1
MW861317.1
MW276627.1
MW190290.1
MW909529.1
MW641043.1
MW870989.1
MW565195.1
MW987856.1
MZ030482.1
MW644012.1
MW967101.1
MW808515.1
MW166126.1
MW365456.1
OD912170.1
MW698798.1
MT679166.1
MW987748.1
MW734556.1
MW973653.1
MW643522.1
MZ022752.1
MW902179.1
MW641554.1
MZ009933.1
MZ039086.1
MW912209.1
MW525028.1
MZ038593.1
MW990596.1
MW738139.1
MW639954.1
MW640509.1
MW813100.1
MZ024872.1
MW641916.1
MW767579.1
MW735358.1
MW860513.1
MT969455.1
MW631759.1
MZ002267.1
MW975673.1
MW779879.1
MT843740.1
MW706780.1
MW859307.1
MW591208.1
MW599460.1
MW909250.1
MZ023771.1
MW369376.1
MW861016.1
MZ025595.1
MW914413.1
MW759620.1
MW277266.1
MW540017.1
MW780390.1
MW907519.1
MT325595.1
MW673574.1
MW635397.1
MW813101.1
MW903161.1
MW942393.1
MT834707.1
MW700404.1
MW813515.1
MZ012106.1
MW644393.1
MW816605.1
LR992125.1
MW637495.1
MW565433.1
MW639848.1
MW704969.1
MT614453.1
MW941455.1
MW736108.1
MW643635.1
MT612287.1
MW870729.1
MW640241.1
MT972694.1
MW705518.1
MW796801.1
MW634750.1
MW700794.1
MW915224.1
MW639633.1
MZ036764.1
MZ010724.1
MZ035453.1
MW593174.1
MW987106.1
OA978585.1
MW157170.1
MW645995.1
MW639355.1
MW809216.1
MZ038147.1
MW813713.1
MW796726.1
MW906719.1
MW637672.1
MW155741.1
MT834651.1
MW781088.1




Clust 5: 380 in total (Centroid: MW868813.1)
MW868813.1
MT940498.1
MZ030598.1
MW976556.1
MW995062.1
MW913894.1
MW721490.1
OB990175.1
MW844178.1
MZ030254.1
MW975734.1
MW411632.1
MW989326.1
MW586404.1
MW894605.1
MZ009169.1
MW844817.1
MW865064.1
MW894257.1
MT972278.1
MW988823.1
MZ008575.1
MW933737.1
MW869914.1
MW907633.1
MZ036324.1
MW893732.1
MW943383.1
MW748234.1
MW738110.1
MW420560.1
MZ013560.1
MW684560.1
MW909939.1
MW758987.1
MZ001904.1
MW908597.1
MW653480.1
MT844026.1
FR991813.1
MW153940.1
MW907120.1
MZ024183.1
MW449383.1
MW738777.1
MW555882.1
LR883118.1
MW908604.1
MZ047073.1
MW868980.1
MW783326.1
MW986728.1
MW054004.1
MW909957.1
MW975712.1
MW906423.1
MW844682.1
MZ038566.1
MW420094.1
MW420482.1
MW913370.1
MW704459.1
MW976326.1
MW738667.1
MW993621.1
MW813460.1
MZ032559.1
MW930672.1
MW988880.1
MW914365.1
MW153320.1
MZ006878.1
MW783645.1
MW626758.1
MW155077.1
MW912201.1
OB989137.1
MW932660.1
MW869425.1
OB991599.1
MW966757.1
MW052649.1
MW905855.1
MW865639.1
MW906444.1
MW865127.1
MW994381.1
MZ001033.1
MW689758.1
MW673421.1
MZ003272.1
MZ030328.1
MW865000.1
MW906196.1
MZ009592.1
MW944763.1
MW869835.1
MW905594.1
OA983934.1
MW865646.1
MW154212.1
MW964905.1
MW891209.1
MZ001041.1
MW988722.1
MW864970.1
MZ021846.1
MW505842.1
MZ021161.1
MZ009371.1
MW865035.1
MW773576.1
MZ009704.1
MW914544.1
MW904696.1
MZ038699.1
MW904416.1
MW941859.1
MW558312.1
MZ029276.1
MW767519.1
MZ000695.1
MW206171.1
OD901126.1
MW286755.1
OA966368.1
MW842227.1
OD907593.1
MW914265.1
MW990839.1
MW911967.1
LR883213.1
MW454616.1
MW994947.1
MW965944.1
MW645773.1
MZ009906.1
MT499172.1
MZ012469.1
MW844782.1
MW993076.1
MW944776.1
MW906191.1
MW848237.1
FR992455.1
MZ002144.1
MW993023.1
MW420484.1
MW944180.1
MW932512.1
MW905007.1
MW968943.1
MW932383.1
MW988782.1
MW420170.1
MW842069.1
FR992890.1
MW921842.1
MW891327.1
MW702267.1
MW964935.1
MZ025621.1
MW969023.1
MW964488.1
MZ013347.1
MW914534.1
MT970875.1
MW904483.1
MZ038494.1
LR878297.1
MW988957.1
MW783594.1
MW904918.1
MW705196.1
MW911237.1
MZ001497.1
MW869837.1
OB995275.1
MW976563.1
MZ009047.1
MW989241.1
MW905845.1
MW989218.1
MW985956.1
MW989043.1
FR999924.1
MW909149.1
MW932408.1
MW154408.1
OA998533.1
MZ001831.1
MW992403.1
MW966994.1
MW904895.1
MW473696.1
MW906230.1
MW932068.1
MW523849.1
MW972330.1
FR992575.1
MW852033.1
MW735424.1
MW912051.1
FR998352.1
MZ010257.1
MW153261.1
MW420602.1
MW991776.1
MW942154.1
MW565746.1
MW941929.1
MW975130.1
MW868575.1
MW964555.1
MT969791.1
MW153171.1
MW912579.1
MW869542.1
MW914247.1
MZ048058.1
MW988473.1
MT969932.1
MW910741.1
MW153088.1
MZ048190.1
MZ001367.1
MW913710.1
MW513669.1
MW966827.1
MW779216.1
MW907222.1
MW623391.1
MZ002175.1
MZ038642.1
MZ036476.1
MW988668.1
MT972929.1
MW156820.1
FR991228.1
MW729296.1
FR994072.1
MW154546.1
MT971940.1
MW454536.1
MW976181.1
OA966070.1
MW565539.1
MW830981.1
MW923579.1
MW869183.1
MW773573.1
MZ034115.1
MZ037205.1
MW914053.1
MW842004.1
MZ030188.1
MW910810.1
FR991096.1
MW973390.1
MT969651.1
MW964793.1
MW932377.1
MW892276.1
MW830961.1
MW991666.1
MW864015.1
MZ038918.1
MW906319.1
MW865193.1
MW420197.1
MW891776.1
MW913195.1
MZ030092.1
OB984947.1
MW689915.1
MW777893.1
MW844622.1
MZ001433.1
MW749909.1
MW912157.1
MW702088.1
MW869751.1
MW922999.1
MW673062.1
MW910005.1
MZ001081.1
MZ000395.1
MW155088.1
MZ048104.1
FR992563.1
MW904287.1
LR883031.1
MW932242.1
MW539759.1
MW906385.1
MW540340.1
MW738424.1
MZ003658.1
MW904607.1
MW986778.1
MZ002781.1
FR999625.1
MW593045.1
MW420072.1
MW937195.1
MW738397.1
MZ013019.1
MW420439.1
MW545505.1
MW157185.1
MW880648.1
7CTT_Q
OA965559.1
OC997341.1
MZ009021.1
MW719503.1
MZ030409.1
MT520274.1
MW844734.1
OA980261.1
MW913477.1
MZ033889.1
MW892246.1
OA998695.1
MW851646.1
MZ045010.1
FR990551.1
MW959507.1
MW909435.1
MW908438.1
MT612200.1
MW524903.1
MW420766.1
MZ030577.1
MZ037502.1
MW959460.1
MW941633.1
MW989142.1
MW154676.1
MW943471.1
MW994846.1
MW719486.1
MW739845.1
MW975777.1
FR996991.1
MZ044549.1
MW868574.1
MW868916.1
MW865697.1
MW851910.1
MW156108.1
MW494288.1
MW689710.1
MZ036741.1
MZ022083.1
MW992197.1
MW988863.1
MW891393.1
MZ001345.1
MW904613.1
MZ033839.1
OB992168.1
OC996618.1
MZ002079.1
MZ021761.1
MW988681.1
MW986045.1
MW868997.1
MW054000.1
OA997278.1
MW835084.1
MT972324.1
MW783523.1
MZ007283.1
MZ021883.1
MW873966.1
OC999870.1
MW912393.1
MW975371.1
MW989225.1

6. Conclusion

From the results above, AncestralClust appears to be a relatively fast clustering algorithm, and it gives a reasonably convincing clustering result: the cluster sizes are relatively even, which we take as an indication of good clusters.
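The evenness claim can be checked directly against the cluster listing. The following is a minimal sketch, not part of the original pipeline: it assumes the input text contains only the listing in the format shown in this report (a `Clust N: K in total (Centroid: ID)` header followed by one ID per line), and the function names `parse_clusters` and `size_summary` are hypothetical. It counts the members of each cluster and reports the coefficient of variation of the sizes, where a smaller value means more even clusters.

```python
import re
from statistics import mean, pstdev

# Header format as it appears in this report, e.g.
# "Clust 5: 380 in total (Centroid: MW868813.1)"
HEADER = re.compile(r"^Clust\s+(\d+):\s+(\d+)\s+in total\s+\(Centroid:\s*(\S+)\)$")


def parse_clusters(text):
    """Parse a cluster listing into {cluster_id: {declared, centroid, members}}.

    Assumption: `text` holds only the listing, so every non-blank,
    non-header line after a header is a member ID of that cluster.
    """
    clusters = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        match = HEADER.match(line)
        if match:
            current = int(match.group(1))
            clusters[current] = {
                "declared": int(match.group(2)),
                "centroid": match.group(3),
                "members": [],
            }
        elif current is not None:
            clusters[current]["members"].append(line)
    return clusters


def size_summary(clusters):
    """Return the observed sizes, their mean, and the coefficient of variation."""
    sizes = [len(c["members"]) for c in clusters.values()]
    avg = mean(sizes)
    return {"sizes": sizes, "mean": avg, "cv": pstdev(sizes) / avg}


if __name__ == "__main__":
    # Made-up IDs for illustration only; they are not real accessions.
    sample = """Clust 1: 3 in total (Centroid: A.1)
A.1
B.1
C.1

Clust 2: 1 in total (Centroid: D.1)
D.1
"""
    parsed = parse_clusters(sample)
    for cid, info in parsed.items():
        # Flag any cluster whose declared size differs from the counted size.
        ok = info["declared"] == len(info["members"])
        print(cid, info["centroid"], len(info["members"]), "ok" if ok else "MISMATCH")
    print(size_summary(parsed))
```

Comparing the declared size in each header with the counted number of members also gives a quick consistency check on the listing itself.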

7. Future discussion

Although AncestralClust works, our recent analysis shows that on some simplified models and in extreme cases the method can be wrong: some nucleotide sequences are assigned to the wrong clusters.

In the future, I will focus on two aspects. First, for a fixed dataset, I will try to show when this method works and when it gives unreliable clusters. Second, I will try to estimate how likely those extreme events are in different datasets. I will also compare the method with other clustering methods.
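One simple way to quantify how much two clusterings of the same sequences disagree (for example, two runs with different random initial samples, or this method against another one) is the Rand index, the fraction of sequence pairs on which the two clusterings agree. The sketch below is an illustration under that assumption and was not part of the analysis in this report; the function name `rand_index` and the toy labels are hypothetical.

```python
from itertools import combinations


def rand_index(labels_a, labels_b):
    """Rand index between two clusterings given as {sequence_id: cluster_label}.

    A pair of sequences counts as an agreement when both clusterings put
    the pair in the same cluster, or both put it in different clusters.
    Only sequence IDs present in both clusterings are compared.
    """
    ids = sorted(set(labels_a) & set(labels_b))
    if len(ids) < 2:
        raise ValueError("need at least two shared sequence IDs")
    agree = 0
    total = 0
    for x, y in combinations(ids, 2):
        same_a = labels_a[x] == labels_a[y]
        same_b = labels_b[x] == labels_b[y]
        if same_a == same_b:
            agree += 1
        total += 1
    return agree / total


if __name__ == "__main__":
    # Toy labels for illustration only.
    run_1 = {"s1": 1, "s2": 1, "s3": 2, "s4": 2}
    run_2 = {"s1": "x", "s2": "x", "s3": "x", "s4": "y"}
    print(rand_index(run_1, run_2))  # 3 of the 6 pairs agree -> 0.5
```

The pairwise loop is quadratic in the number of sequences, which is still manageable for the roughly 3k sequences used here (about 4.5 million pairs).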

8. Hardware and software

  • Platform: macOS 10.15.5 (19F101)

  • Processor: 2.9 GHz Dual-Core Intel Core i5

  • Memory: 8 GB 1867 MHz DDR3

  • R: R version 4.0.5 (2021-03-31) -- "Shake and Throw"; Copyright (C) 2021 The R Foundation for Statistical Computing; Platform: x86_64-apple-darwin17.0 (64-bit)

  • Python: Python 3.9.4 (v3.9.4:1f2e3088f3, Apr 4 2021, 12:32:44) [Clang 6.0 (clang-600.0.57)] on darwin