Step 10. Mapping Samples - excessive execution time? #909
Comments
Related: I had my job bumped on our HPC, and it looks like progress is not preserved during this step, so it restarted at sample 1.
Is that the exact command you used? Note that you need to add
Hey, this is the exact command:
SqueezeMeta.pl -p SqueezeExtBins -m coassembly -extbins /dRepGroupsPB-Final/MAGs
The output shows the job being distributed across all the threads; it just seems to take a very long time for each sample. I am at sample 30 of 54 now after 4 days of running on step 10 alone.
As an update, the workflow did complete, but Step 10 took about 5 days, compared to 10 hours previously with v. 1.6.3, when external bins were not provided. All other things would have been the same, I believe.
Hum... this is weird, because providing external bins should not interfere with step 10. External bins are not used there. Any chance this could be a behavior related to your system? For instance, lots of I/O load?
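One rough way to check the I/O hypothesis is to time a write and a read of a temporary file on the file system that holds the project directory. The sketch below is only an illustration: the helper name `write_read_throughput` and the 64 MB default are assumptions, not part of SqueezeMeta, and the read figure may be optimistic because the operating system can serve it from its page cache.

```python
import os
import tempfile
import time


def write_read_throughput(directory: str, size_mb: int = 64) -> tuple[float, float]:
    """Return (write_MB_per_s, read_MB_per_s) for a temporary file in `directory`."""
    chunk = os.urandom(1024 * 1024)  # 1 MiB of incompressible data
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        start = time.perf_counter()
        with os.fdopen(fd, "wb") as handle:
            for _ in range(size_mb):
                handle.write(chunk)
            handle.flush()
            os.fsync(handle.fileno())  # ask the OS to push the data to storage
        write_seconds = time.perf_counter() - start

        # Note: this read may be served from the page cache, not the disk.
        start = time.perf_counter()
        with open(path, "rb") as handle:
            while handle.read(1024 * 1024):
                pass
        read_seconds = time.perf_counter() - start
    finally:
        os.remove(path)  # always clean up the temporary file
    return size_mb / write_seconds, size_mb / read_seconds


if __name__ == "__main__":
    # Placeholder location: replace with the directory that holds the project.
    target = tempfile.gettempdir()
    write_rate, read_rate = write_read_throughput(target, size_mb=8)
    print(f"write: {write_rate:.1f} MB/s, read: {read_rate:.1f} MB/s")
```

Running it a few times, both on the shared storage and on node-local scratch if available, gives a crude comparison; very low or highly variable numbers on the shared storage would be consistent with an I/O bottleneck rather than a problem in the mapping step itself.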
It's possible. They have had some issues with the file storage system on our HPC cluster, which caused a write error that previously killed this step a couple of days in. However, isn't the mapping step using the external bins as the contigs for mapping, as there is no assembly step?
Yes, exactly. Using external bins should have no influence on mapping times; those depend only on the number of contigs and the number of reads.
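Since mapping time is said to depend on the number of contigs, it can be useful to count the contigs in the external bins and compare that with the earlier assembly. The following is a minimal sketch, assuming the bins are uncompressed FASTA files in a single directory; the function name `count_contigs` and the list of file extensions are hypothetical choices for illustration.

```python
from pathlib import Path


def count_contigs(fasta_dir: str, extensions=(".fa", ".fasta", ".fna")) -> tuple[int, int]:
    """Return (n_contigs, total_bases) over all FASTA files in `fasta_dir`."""
    n_contigs = 0
    total_bases = 0
    for path in sorted(Path(fasta_dir).iterdir()):
        if path.suffix.lower() not in extensions:
            continue  # skip anything that does not look like a FASTA file
        with path.open() as handle:
            for line in handle:
                if line.startswith(">"):
                    n_contigs += 1  # each header line starts a new contig
                else:
                    total_bases += len(line.strip())
    return n_contigs, total_bases
```

Calling `count_contigs` on the directory passed to -extbins, and on the contigs of the earlier run, would show whether the two reference sets are of comparable size.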
Closing due to lack of activity, but let us know if you get any new insights here.
Hello,
I executed the SqueezeMeta 1.70beta8 workflow on 98 MAGs using the -extbins flag. It appears the sqm_counter read mapping step is taking an excessively long time: about 5 hours per paired file of ~20 million reads, using 64 threads on a fairly new AMD EPYC HPC cluster. I have 52 samples, so the whole step will take a very long time if this trend continues. Any suggestions on troubleshooting would be appreciated, unless this is expected behaviour?
Using this same short read set with SQM 1.6.3:
SqueezeMeta.pl -p SqueezeCoassemblyNF_eukaryotes --euk -taxbinmode "s+c" -m coassembly -b 50 -c 500 -binners "maxbin,metabat2,concoct"
the mapping step took about 10 hours out of ~3 days for the entire workflow.
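As a back-of-the-envelope check on the figures above, the sketch below projects the total mapping time from the per-sample rate. It assumes every sample takes about as long as those observed so far and that samples are mapped one after another; the function name `projected_hours` is illustrative only.

```python
def projected_hours(hours_per_sample: float, n_samples: int) -> float:
    """Total wall-clock hours if samples are mapped one by one at a constant rate."""
    return hours_per_sample * n_samples


# Figures reported in this thread.
new_total = projected_hours(5.0, 52)  # 1.70beta8 with -extbins: ~5 h per sample
old_total = 10.0                      # 1.6.3: ~10 h for the whole mapping step

print(f"Projected mapping time: {new_total:.0f} h (~{new_total / 24:.1f} days)")
print(f"Slowdown relative to 1.6.3: ~{new_total / old_total:.0f}x")
```

With these numbers the projection is 260 hours, or roughly 10.8 days, about 26 times the mapping time reported for the 1.6.3 run.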