Description
While working on the wrapper, I tried to retrieve the groups associated with a kmer color, but I couldn't find a direct way.
Here is what I understand so far; please correct me if I'm wrong.
After indexing, we have a kDataFrame that maps each key (hashVal) to a value (kmerOrder). We can then get the color associated with a kmer through the getKmerColumnValue function, e.g. getKmerColumnValue("color", hashVal):
kProcessor/include/kProcessor/kDataFrame.hpp, line 414 in 6fa6857:
    T getKmerColumnValue(const string& columnName,uint64_t kmer);
Or by kmer order:
kProcessor/include/kProcessor/kDataFrame.hpp, line 421 in 6fa6857:
    T getKmerColumnValueByOrder(const string& columnName,uint64_t kmerOrder);
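To make my understanding concrete, here is a toy model of the layout described above. It uses plain Python containers only and is not kProcessor's implementation: the names `kmer_to_order`, `color_column`, `get_color` and `get_color_by_order` are hypothetical, the values are made up, and orders start at 0 purely for convenience.

```python
# Toy model only: plain Python containers standing in for a kDataFrame.
# All names and values here are hypothetical, not part of kProcessor.

# key (hashVal) -> value (kmerOrder)
kmer_to_order = {1001: 0, 1002: 1, 1003: 2}

# The "color" column, indexed by kmerOrder.
color_column = [7, 7, 9]


def get_color_by_order(kmer_order):
    """Look up the color directly by kmer order."""
    return color_column[kmer_order]


def get_color(kmer_hash):
    """Look up the color by hash: hash -> order -> column value."""
    return get_color_by_order(kmer_to_order[kmer_hash])


if __name__ == "__main__":
    for kmer_hash in kmer_to_order:
        print(kmer_hash, get_color(kmer_hash))
```

If this picture is right, the two functions quoted above differ only in whether the hash-to-order step happens inside the call.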
Now I have the color. Is there an easy way to get from the color to its group IDs (color -> group_IDs) through the kDataFrame?
Here's the corresponding Python code for this.
import kProcessor as kp

kf_map = kp.kDataFramePHMAP(21)
fasta_file = "seq.fa"
names_file = "seq.fa.names"
kp.index(kf_map, {"kSize": 21}, fasta_file, 1, names_file)

print(f"total size: {kf_map.size()}")
print(f"Column names: {kf_map.getColumnNames()}")

# Collect the color of every kmer in the kDataFrame, keyed by kmer hash.
hash_to_color = dict()
it = kf_map.begin()
while it != kf_map.end():
    kmer_hash = it.getHashedKmer()
    kmer_color = kf_map.getKmerColumnValue_int("color", kmer_hash)
    hash_to_color[kmer_hash] = kmer_color
    it.next()

print("kmer to colors")
for _hash, color in hash_to_color.items():
    print(f"hash({_hash}) : color({color})")
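To show the result I am after, here is a minimal sketch in plain Python that does not call kProcessor. The `color_to_groups` table is hypothetical: I do not know how, or whether, the kDataFrame exposes it, which is the question. The sample values in `hash_to_color` are made up and stand in for the dictionary built above.

```python
from collections import defaultdict

# Stand-in for the hash_to_color dict built from the kDataFrame (made-up values).
hash_to_color = {1001: 1, 1002: 1, 1003: 2}

# Hypothetical color -> group_IDs table; how to obtain this from the
# kDataFrame is exactly what is being asked.
color_to_groups = {1: [0], 2: [0, 1]}


def groups_for_kmer(kmer_hash):
    """Resolve a kmer hash to its group IDs via its color."""
    color = hash_to_color[kmer_hash]
    return color_to_groups.get(color, [])


# Inverting hash_to_color needs no extra API: color -> list of kmer hashes.
color_to_hashes = defaultdict(list)
for kmer_hash, color in hash_to_color.items():
    color_to_hashes[color].append(kmer_hash)

if __name__ == "__main__":
    for kmer_hash in hash_to_color:
        print(f"hash({kmer_hash}) : groups({groups_for_kmer(kmer_hash)})")
```

With such a table available, resolving a kmer to its groups would be two dictionary lookups; the missing piece is only where the color -> group_IDs mapping lives.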