Step 10. Mapping Samples - excessive execution time? #909

MicroSeq · 2024-11-21T14:56:00Z

Hello,

I ran the SqueezeMeta 1.70beta8 workflow on 98 MAGs using the -extbins flag. The sqm_counter read mapping step appears to be taking an excessively long time: about 5 hours per paired file of ~20 million reads, using 64 threads on a fairly new AMD Epyc HPC cluster. I have 52 samples, so this will take a very long time if the trend continues. Any suggestions for troubleshooting would be appreciated, unless this is expected behaviour?

For comparison, I previously ran this same short-read set with SQM 1.6.3 using:

SqueezeMeta.pl -p SqueezeCoassemblyNF_eukaryotes --euk -taxbinmode "s+c" -m coassembly -b 50 -c 500 -binners "maxbin,metabat2,concoct"

In that run, the mapping step took about 10 hours out of ~3 days for the entire workflow.

MicroSeq · 2024-11-22T16:02:16Z

On a related note, my job was bumped on our HPC, and it looks like progress is not preserved during this step, so it has restarted at sample 1.

fpusan · 2024-11-29T06:18:01Z

Is that the exact command you used? Note that you need to add -t 64 for SqueezeMeta to actually use the 64 threads.

MicroSeq · 2024-11-29T15:01:45Z

> Is that the exact command you used? Note that you need to add -t 64 for SqueezeMeta to actually use the 64 threads.

Hey, this is the exact command:

SqueezeMeta.pl -p SqueezeExtBins -m coassembly -extbins /dRepGroupsPB-Final/MAGs -f /Squeeze -b 25 -s samples.tsv -t $SLURM_CPUS_PER_TASK --restart

The output shows the job being distributed across all the threads; it just seems to take a very long time for each sample. I am at sample 30 of 54 now, after 4 days of running on step 10 alone.

MicroSeq · 2024-12-02T15:25:55Z

As an update, the workflow did complete, but Step 10 took about 5 days, compared to 10 hours previously in v. 1.6.3 when external bins were not provided. I believe everything else was the same.

jtamames · 2024-12-03T09:54:35Z

Hum... this is weird, because providing external bins should not interfere with step 10. External bins are not used there. Any chance this could be a behavior related to your system? For instance lots of I/O load?
Best,
J

MicroSeq · 2024-12-03T20:30:30Z

> Hum... this is weird, because providing external bins should not interfere with step 10. External bins are not used there. Any chance this could be a behavior related to your system? For instance lots of I/O load?

It's possible. There have been some issues with the file storage system on our HPC cluster, which caused a write error that previously killed this step a couple of days in.

However, isn't the mapping step using the external bins as the contigs for mapping, since there is no assembly step?

fpusan · 2024-12-19T09:20:09Z

Yes, exactly. Using external bins should have no influence on mapping times; only the number of contigs and the number of reads should matter.
It could be that the file system was slower in the second run, hence the excessive time in sqm_counter, but it's hard to tell. If the filesystem is causing the difference between the two runs, you should see lower CPU usage in the second run than in the first. Not sure if you can check that...
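
One way to make that comparison concrete, as a rough sketch rather than anything SqueezeMeta provides: on a Slurm cluster, `sacct -j <jobid> --format=JobID,Elapsed,TotalCPU,AllocCPUS` reports the wall time, the CPU time actually consumed, and the allocated cores for a finished job. The ratio TotalCPU / (Elapsed × AllocCPUS) gives an approximate CPU efficiency, and a much lower value for the slow run would be consistent with the job waiting on I/O. The function names and the example values below are hypothetical and are not taken from either run.

```python
def slurm_time_to_seconds(value: str) -> float:
    """Convert a Slurm duration ('D-HH:MM:SS', 'HH:MM:SS' or 'MM:SS.mmm') to seconds."""
    days = 0
    if "-" in value:
        day_part, value = value.split("-", 1)
        days = int(day_part)
    parts = [float(p) for p in value.split(":")]
    # Pad missing leading fields (hours, then minutes) with zeros.
    while len(parts) < 3:
        parts.insert(0, 0.0)
    hours, minutes, seconds = parts
    return days * 86400 + hours * 3600 + minutes * 60 + seconds


def cpu_efficiency(total_cpu: str, elapsed: str, alloc_cpus: int) -> float:
    """Fraction of the allocated core time that was spent on CPU work."""
    available = slurm_time_to_seconds(elapsed) * alloc_cpus
    if available == 0:
        raise ValueError("elapsed time and allocated CPUs must be non-zero")
    return slurm_time_to_seconds(total_cpu) / available


if __name__ == "__main__":
    # Hypothetical sacct values for a 5-day job on 64 cores.
    eff = cpu_efficiency(total_cpu="80-00:00:00", elapsed="5-00:00:00", alloc_cpus=64)
    print(f"CPU efficiency: {eff:.0%}")
```

Running the same calculation on the job IDs of both runs (if the accounting records are still available) would show whether the cores were mostly idle during the slow run.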

fpusan · 2025-02-18T08:55:11Z

Closing due to lack of activity, but let us know if you get any new insights here.

fpusan closed this as completed Feb 18, 2025
