-
Notifications
You must be signed in to change notification settings - Fork 8
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Test suite? #74
Comments
Yes, I am working in on a test suite here: https://github.com/refgenie/refget?tab=readme-ov-file#compliance-testing-of-sequence-collections-api But are you talking about a testing of the API? It seems more likely you may be looking for a test that just compares a digest for a collection, is that right? Also, in other news; I've also written a rust helper library that will compute seqcol digests. I didn't (and am not planning to) implement the web API in rust, but if all you need is the digest computations in rust, then you might be able to use that... It will be available in the v0.2.0 release of gtars planned for later this month. |
I have tried to implement the seqcol calculations independently, and these are the results I get for the test examples at level 1: {
'base.fa': {
'names': 'Fw1r9eRxfOZD98KKrhlYQNEdSRHoVxAG',
'lengths': 'cGRMZIb3AVgkcAfNv39RN7hnT5Chk7RX',
'sequences': '0uDQVLuHaOZi1u76LjV__yrVUIz9Bwhr',
'name_length_pairs': 'B9MESWM8k-hK_OeQK8bZNAG74pLY0Ujq',
'sorted_name_length_pairs': 'zjM1Ie9m0zFbqsAnZ6jAJSXuFpKTr40J'
},
'different_names.fa': {
'names': 'lrCv6NNXom7AC9tKFWqhcLLZsrcgJIqq',
'lengths': 'cGRMZIb3AVgkcAfNv39RN7hnT5Chk7RX',
'sequences': '0uDQVLuHaOZi1u76LjV__yrVUIz9Bwhr',
'name_length_pairs': 'm88geMfgGBZ7VpYgQKCB4P9z-mhKJ-nj',
'sorted_name_length_pairs': '1FQEGOQQ-m0NmZ0R-eeJEfnH1ayqJQ0T'
},
'different_order.fa': {
'names': 'dOAOfPGkf3wAf3CUsbjVTKhY9Wq2DL6f',
'lengths': 'x5qpE4FtMkvlwpKIzvHs3a02Nex5tthp',
'sequences': '7t6Ulz6OeUWu6FBxntbvFKOl8w3icl2h',
'name_length_pairs': 'a6JbVltjGqj5fEr01M0qjqCmlLLQ_P7N',
'sorted_name_length_pairs': 'zjM1Ie9m0zFbqsAnZ6jAJSXuFpKTr40J'
},
'pair_swap.fa': {
'names': 'gSWbV6khfIsnlQTyw1PmlQ8G7VRfIWbU',
'lengths': 'cGRMZIb3AVgkcAfNv39RN7hnT5Chk7RX',
'sequences': '0uDQVLuHaOZi1u76LjV__yrVUIz9Bwhr',
'name_length_pairs': 'yjUFKuKCURANxHar4JDF5ABOn6FJ-T8m',
'sorted_name_length_pairs': 'rL5OQOnFba8yyz7lS-0-hgZvwcQsiajN'
},
'subset.fa': {
'names': 'iyNUhtfR0TALytlmxK1Zx1_q3frkZyAd',
'lengths': '7-_HdxYiRf-AJLBKOTaJUdxXrUkIXs6T',
'sequences': '3ZP38SZcoc9wN7jsRyNSP9mQ1a3TUoUF',
'name_length_pairs': 'b_TLfweI2gfClgj57gcTFtEOMJ_daWd4',
'sorted_name_length_pairs': 'AvGYsdgtJpTLKYil3eLddJxwrE5OfKhE'
},
'swap_wo_coords.fa': {
'names': 'QX5ur-faw5nXis8HXUK2kMxgY5MTGVRn',
'lengths': 'cGRMZIb3AVgkcAfNv39RN7hnT5Chk7RX',
'sequences': '0uDQVLuHaOZi1u76LjV__yrVUIz9Bwhr',
'name_length_pairs': 'suXpFjcxpyUDOkBgNEakNEXtLlyxtjJr',
'sorted_name_length_pairs': 'zjM1Ie9m0zFbqsAnZ6jAJSXuFpKTr40J'
}
} Top level digests are fine: {
'base.fa': 'XZlrcEGi6mlopZ2uD8ObHkQB1d0oDwKk',
'different_names.fa': 'QvT5tAQ0B8Vkxd-qFftlzEk2QyfPtgOv',
'different_order.fa': 'Tpdsg75D4GKCGEHtIiDSL9Zx-DSuX5V8',
'pair_swap.fa': 'UNGAdNDmBbQbHihecPPFxwTydTcdFKxL',
'subset.fa': 'sv7GIP1K0qcskIKF3iaBmQpaum21vH74',
'swap_wo_coords.fa': 'aVzHaGFlUDUNF2IEmNdzS_A8lCY0stQH'
} But the [
{
"name": "base.fa",
"digest": "XZlrcEGi6mlopZ2uD8ObHkQB1d0oDwKk",
"sorted_name_length_pairs_digest": "wwE4PUok50YyEF2Ne8BBA5__zk92CZH8"
},
{
"name": "different_names.fa",
"digest": "QvT5tAQ0B8Vkxd-qFftlzEk2QyfPtgOv",
"sorted_name_length_pairs_digest": "BJx73ZGp4zfRKaf6i6j9cJH02tWW5PFb"
},
{
"name": "different_order.fa",
"digest": "Tpdsg75D4GKCGEHtIiDSL9Zx-DSuX5V8",
"sorted_name_length_pairs_digest": "wwE4PUok50YyEF2Ne8BBA5__zk92CZH8"
},
{
"name": "pair_swap.fa",
"digest": "UNGAdNDmBbQbHihecPPFxwTydTcdFKxL",
"sorted_name_length_pairs_digest": "2prbncALoH4cFy1Nhlu_e3all-VVuUZr"
},
{
"name": "subset.fa",
"digest": "sv7GIP1K0qcskIKF3iaBmQpaum21vH74",
"sorted_name_length_pairs_digest": "NQw5zO172a6q9OHBGwOrZCWl1pzyx94s"
},
{
"name": "swap_wo_coords.fa",
"digest": "aVzHaGFlUDUNF2IEmNdzS_A8lCY0stQH",
"sorted_name_length_pairs_digest": "wwE4PUok50YyEF2Ne8BBA5__zk92CZH8"
}
] Also, there are no
The issue might very well be on my side, as I have no other digests to compare with either. I did try to calculate the main example in the specification, but I also realized that the digests of the
|
Btw: I created the seqcols based on the first three columns in the |
Another TODO:
|
we're matching on all the level1 digests except
|
Rob and I are starting to use |
@mikelove A demo of this would be great once you have it working. Sorry for spamming you with notifications. Decided to take over this issue with a tangential issue instead of creating a new one, but didn't consider the notifications. |
No worries, good for me to watch along. |
@sveinugu : Can you point me at where the data are that you are using for you testing? I'd be interested in cross-checking our rust implementation as well, to ensure that everyone is getting the same answers before we really start relying on the |
See also:
|
@nsheff I think the issue is here: https://github.com/refgenie/refget/blob/9feb67191a48c16e627140733938c8fa6c7886c8/refget/models.py#L203 Basically, it seems you have not reverted from the |
Sounds right, I don't think I ever went back and changed that. I'll update that. great, we're almost there, then. |
Great. I also added |
Thanks for the help! The separate implementation I've made lives in the To that end, I've also made a crate Let me know if you have any feedback, thoughts or suggestions on either of the above! |
I’m trying to build a (minimal) seqcol implementation in rust here. Right now, the interface and API are as minimal as can be for what we need (building a seqcol digest from a SAM header for our long read quantification tool, oarfish). However, it may be generally useful to the community to have such a library in rust as that language continues to quickly gain adoption in bioinformatics.
to this end, is there some sort of “test suite” against which one can run an implementation to check compatibility with the reference implementation?
Thanks!
Rob
Cc @mikelove
The text was updated successfully, but these errors were encountered: