sqm_mapper: reuse mapped reference genome possible? #905

Open
dutchscientist opened this issue Nov 15, 2024 · 5 comments

@dutchscientist

I am using sqm_mapper.pl to clean host sequences out of samples, in the current case with a combined human and pig genome as the reference. That works great, but every time I start a new batch of samples, sqm_mapper.pl rebuilds the reference index from the fasta file, which takes quite a bit of time for a 5.4 GB haploid sequence.

I had a look at the options with sqm_mapper.pl -h and don't see an option to use an already generated reference. Is it possible to do that? If not, I would suggest adding it to a future version. It is a really useful tool :)

@jtamames
Owner

jtamames commented Dec 3, 2024

Hello
Rename your sqm_mapper.pl to sqm_mapper.pl.original
Then copy the attached new sqm_mapper.pl script (gunzip it first) to the sqm_mapper.pl directory (utils)
The script should now accept an option "-idx <path>", where path is the location of your previously created references.
Let me know if it works
Best,
J

sqm_mapper.pl.gz
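
The replacement steps above can be sketched as a small shell function. This is only a sketch under assumptions: the function name install_patched_mapper and the example paths are hypothetical, and the location of the utils directory depends on your SqueezeMeta installation.

```sh
# install_patched_mapper UTILS_DIR PATCH_GZ   (hypothetical helper)
#   UTILS_DIR: directory that holds sqm_mapper.pl (assumed, adjust to your install)
#   PATCH_GZ:  the downloaded sqm_mapper.pl.gz attachment
install_patched_mapper() {
    local utils_dir="$1"
    local patch_gz="$2"
    local target="$utils_dir/sqm_mapper.pl"

    # Refuse to overwrite a backup from an earlier run.
    if [ -e "$target.original" ]; then
        echo "backup already exists: $target.original" >&2
        return 1
    fi

    # Step 1: keep the original script under a new name.
    mv "$target" "$target.original" || return 1

    # Step 2: decompress the attachment into place.
    # -c writes to stdout, so the downloaded .gz file is left untouched.
    gunzip -c "$patch_gz" > "$target" || return 1

    # Make the new script executable, since the redirect creates a plain file.
    chmod +x "$target"
}
```

Example call (both paths are placeholders): install_patched_mapper /path/to/SqueezeMeta/utils ./sqm_mapper.pl.gz. To roll back, move sqm_mapper.pl.original back over sqm_mapper.pl.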

@dutchscientist
Author

You're a star! I will give that a try, much appreciated.

@dutchscientist
Author

Unfortunately it does not work. sqm_mapper.pl still demands a -r input and then starts building a new index, but now in the folder where it was started instead of the name/temp folder where it did so previously.

This was run first:
sqm_mapper.pl -r human_pig_genome.fna -s Trento_decontamination_part3.samples -f raw -o sqm_mapper_human_pig_trento3 --filter -t 20

And then to re-use the index:
sqm_mapper.pl -r human_pig_genome.fna -idx human_pig_genome_indexed -s Trento_decontamination_part3.samples -f raw -o sqm_mapper_human_pig_trento3_reuse --filter -t 20

With the latter command, it still starts indexing, now using the name of the folder, but in the directory where the command was launched.

Just to check, would the fna file need to be in the folder with the index files?
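
For scripting batches, the two invocations above differ only in the -idx option, so a small wrapper can decide which form to use. The following is a minimal sketch, assuming a version of sqm_mapper.pl that honors -idx (the patched script in this thread reportedly did not yet). The function name mapper_cmd is hypothetical, and it only prints the command instead of running it.

```sh
# mapper_cmd REFERENCE SAMPLES OUTDIR [INDEX_DIR]   (hypothetical helper)
# Prints a sqm_mapper.pl command line, adding -idx only when INDEX_DIR
# exists as a directory. The remaining options mirror the commands above.
# Note: arguments are printed unquoted, so paths with spaces are not handled.
mapper_cmd() {
    local reference="$1"
    local samples="$2"
    local outdir="$3"
    local index_dir="${4:-}"
    local idx_opt=""

    # Only request index reuse if the directory is actually there.
    if [ -n "$index_dir" ] && [ -d "$index_dir" ]; then
        idx_opt=" -idx $index_dir"
    fi

    printf 'sqm_mapper.pl -r %s%s -s %s -f raw -o %s --filter -t 20\n' \
        "$reference" "$idx_opt" "$samples" "$outdir"
}
```

Called as mapper_cmd human_pig_genome.fna Trento_decontamination_part3.samples sqm_mapper_human_pig_trento3_reuse human_pig_genome_indexed, it prints the second command above when the index folder exists, and the first form (without -idx) when it does not.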

@jtamames
Owner

jtamames commented Dec 4, 2024

Ok, I will check it. Quick fixes often don't work :)
Sorry about that
Best,
J

@dutchscientist
Author

No worries, really appreciate your efforts!
