add checkm2 #6542
Excellent timing: one of my users just asked for the tool :)
I could contribute a data manager.
remove dbkey column, rename tables, and add working output assertions as comment
Good from my side.
#The <version> column indicates the checkm2 version that generated the database
#
#diamond_db_1.0.2 Diamond database 1.0.2 /mnt/galaxyIndices/Checkm2_database/uniref100.KO.1.dmnd
Is this really a diamond DB?
If so, this is interesting ... should we have a general Diamond location file and DM, with some tag for different tools?
I think so. And I agree that it would be interesting.
But it would be good to know and store the diamond version that was used to generate it, right? That seems difficult to find out from the sources. The tool just downloads the latest version from Zenodo (and I could not even find the link). Let me check if diamond dbinfo could help.
Nice:
> diamond dbinfo -d uniref100.KO.1.dmnd
diamond v2.0.4.142 (C) Max Planck Society for the Advancement of Science
Documentation, support and updates available at http://www.diamondsearch.org
Database format version = 3
Diamond build = 142
Sequences = 6518230
Letters = 2584051404
Should we do this? Add columns tool, db_format_version, diamond_build?
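A minimal sketch of how a data manager could extract those values from the `diamond dbinfo` output shown above. The function name `parse_dbinfo` and the mapping onto `db_format_version` and `diamond_build` are illustrative assumptions, not code from this PR; the sketch relies only on the `key = value` lines in that output.

```python
import re


def parse_dbinfo(text):
    """Collect 'key = value' pairs from `diamond dbinfo` output (hypothetical helper)."""
    info = {}
    for line in text.splitlines():
        # Banner lines without '=' are skipped; values are single tokens.
        match = re.match(r"^\s*([^=]+?)\s*=\s*(\S+)\s*$", line)
        if match:
            info[match.group(1)] = match.group(2)
    return info


# Output copied from the comment above.
sample = """diamond v2.0.4.142 (C) Max Planck Society for the Advancement of Science
Documentation, support and updates available at http://www.diamondsearch.org
Database format version = 3
Diamond build = 142
Sequences = 6518230
Letters = 2584051404"""

info = parse_dbinfo(sample)

# Assumed column names, following the proposal in this thread.
row = {
    "db_format_version": info["Database format version"],
    "diamond_build": info["Diamond build"],
}
print(row)
```

In a real data manager the text would come from running `diamond dbinfo -d <database>` rather than from a pasted string.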
Do we need diamond_build? But yes, we should do that :)
Thanks!
@astrovsky01 do you think you can work on this?
Yes. I had to get something up and running somewhere I could test it, but I just got it going.
After some digging, I found that CheckM2 doesn't actually work with all diamond databases. It has an internal checksum to make sure the database is the specific one fetched by its database download command.
As such, I think that while it would be good to have the general Diamond DB data manager, having a specific one for CheckM2 is also a good idea.
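To illustrate the kind of check described here, the following is a rough sketch of verifying a database file against a known digest. It is not CheckM2's actual code: the use of SHA-256, the helper names `file_sha256` and `is_expected_database`, and the demo file contents are all assumptions made for illustration.

```python
import hashlib
import os
import tempfile


def file_sha256(path, chunk_size=1 << 20):
    """Hash a file in chunks so large databases need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_expected_database(path, expected_hex):
    """Return True only if the file's digest equals the expected one (hypothetical helper)."""
    return file_sha256(path) == expected_hex.lower()


# Demo on a throwaway file standing in for a .dmnd database.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"not a real diamond database")
    demo_path = tmp.name

expected = hashlib.sha256(b"not a real diamond database").hexdigest()
matches = is_expected_database(demo_path, expected)
mismatch = is_expected_database(demo_path, "0" * 64)
os.remove(demo_path)
```

A check of this shape would explain why an arbitrary Diamond database is rejected even when its format is valid, which supports keeping a CheckM2-specific data manager.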
FOR CONTRIBUTOR: