Question about error in grid multiplex-(ERR): bowtie2-align died with signal 6 (ABRT) #19

Open
mutantjoo0 opened this issue Aug 4, 2020 · 4 comments

Comments

@mutantjoo0

Hello,

I got the error (ERR): bowtie2-align died with signal 6 (ABRT) while running grid multiplex on my dataset. I installed GRiD in a conda environment and downloaded the databases. Running grid single and grid multiplex with the example dataset was successful.
The error only occurs when I run grid multiplex with my own dataset. My inputs are MAGs from DAS_tool in fa format, which I converted to fq files. I tested both seqtk (seqtk seq bin.fa > bin.fq -F30) and bbmap (reformat.sh in=bin.fa out=bin1.fq -qfake=30) to prepare the input files for GRiD. Below is a summary of my commands and the errors.
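
One way to sanity-check the converted files before handing them to GRiD is to count the records and report the longest sequence. This is only a sketch: the helper name fq_summary is hypothetical, and it assumes unwrapped FASTQ (exactly four lines per record).

```shell
#!/usr/bin/env bash
# Hypothetical helper: summarize a FASTQ file with 4 lines per record.
# Prints: <records> <longest_sequence_length> <records_with_seq/qual_length_mismatch>
fq_summary() {
    awk 'NR % 4 == 2 { n++; len = length($0); if (len > max) max = len }
         NR % 4 == 0 { if (length($0) != len) bad++ }
         END { printf "%d %d %d\n", n, max, bad }' "$1"
}

# Usage (file name taken from the post above):
#   fq_summary bin.fq
```

A nonzero third field would mean the quality line and the sequence line differ in length for at least one record.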

##running grid multiplex with seqtk-converted fq files:
#work dir/input: ./input_fa_fq_seqtk/
#db (-d): ../../grid_db/comperehensive_db/
#output (-o): ../20200803_multiplex_output_fq_seqtk
#run grid multiplex:
$ grid multiplex -r . -e fq -d ../../grid_db/comperehensive_db/ -o ../20200803_multiplex_output_seqtk -c 0.2 -p -n 10

(grid) -bash-4.2$ grid multiplex -r . -e fq -d ../../grid_db/comperehensive_db/ -o ../20200803_multiplex_output_seqtk -c 0.2 -p -n 10
multiplex option activated
/mnt/research/Reguera_Kashefi_Lab/JYL/grid_otic_MAGs/test/input_fa_fq_seqtk is present directory
/mnt/ufs18/rs-002/Reguera_Kashefi_Lab/JYL/grid_otic_MAGs/test/input_fa_fq_seqtk is the reads directory
/mnt/ufs18/rs-002/Reguera_Kashefi_Lab/JYL/grid_otic_MAGs/test/20200803_multiplex_output_seqtk is the output directory
/mnt/ufs18/rs-002/Reguera_Kashefi_Lab/JYL/grid_otic_MAGs/grid_db/comperehensive_db is GRiD database directory
coverage cutoff is 0.2
 ################ Checking for dependencies ########
parallel found
R found
bowtie2 found
seqtk found
samtools found
bedtools found
bamtools found
blastn found
pathoscope found
mosdepth found
All required packages found
 ################ Checking for required R libraries ########
R libraries ok
Output directory ok
terminate called after throwing an instance of 'std::bad_alloc'
  what():  std::bad_alloc
(ERR): bowtie2-align died with signal 6 (ABRT)

##running grid multiplex with reformat-converted fq files:
#work dir: /input_Std_fq_converted
#db (-d): ../../grid_db/comperehensive_db/
#output (-o): ../20200803_multiplex_output_fq_reformat
#run grid multiplex:
$ grid multiplex -r . -e fq -d ../../grid_db/comperehensive_db/ -o ../20200803_multiplex_output_reformat -c 0.2 -p -n 10

(grid) -bash-4.2$ grid multiplex -r . -e fq -d ../../grid_db/comperehensive_db/ -o ../20200803_multiplex_output_reformat -c 0.2 -p -n 10
multiplex option activated
/mnt/research/Reguera_Kashefi_Lab/JYL/grid_otic_MAGs/test/input_Std_fq_converted is present directory
/mnt/ufs18/rs-002/Reguera_Kashefi_Lab/JYL/grid_otic_MAGs/test/input_Std_fq_converted is the reads directory
/mnt/ufs18/rs-002/Reguera_Kashefi_Lab/JYL/grid_otic_MAGs/test/20200803_multiplex_output_reformat is the output directory
/mnt/ufs18/rs-002/Reguera_Kashefi_Lab/JYL/grid_otic_MAGs/grid_db/comperehensive_db is GRiD database directory
coverage cutoff is 0.2
 ################ Checking for dependencies ########
parallel found
R found
bowtie2 found
seqtk found
samtools found
bedtools found
bamtools found
blastn found
pathoscope found
mosdepth found
All required packages found
 ################ Checking for required R libraries ########
R libraries ok
Output directory ok
terminate called after throwing an instance of 'std::bad_alloc'
  what():  std::bad_alloc
(ERR): bowtie2-align died with signal 6 (ABRT)

#output(error):

total 65K
-rw-r----- 1 leejooy5 Reguera_Kashefi_Lab 255 Aug  3 16:42 reads_1596487374.txt
-rw-r----- 1 leejooy5 Reguera_Kashefi_Lab 41K Aug  3 16:42 Std_maxbin2_bin.10.BOWTIE_database.00.fa.main.sam

I also tried running grid multiplex with the environ_specific_database (skin and stool) and by submitting a job (sbatch), but got the same errors.
Please let me know if you need more detail. Thank you for your time and support.

Best,
Joo-Young

@aemiol

aemiol commented Aug 5, 2020

Hi Joo-Young,
The error appears to be memory-related. Confirm that you have sufficient memory and disk space on your machine. You can also try a subset of the reads to see whether you get the same error. Please note that the intermediate SAM files can be quite large.
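
The two checks above could be sketched roughly as follows. The function names show_resources and fq_head are hypothetical, and fq_head assumes unwrapped FASTQ with four lines per record.

```shell
#!/usr/bin/env bash
# Sketch: inspect available resources, then build a small test input.

# Report free memory (if free(1) is available) and free disk space
# for the directory that will hold the output.
show_resources() {
    local out_dir="${1:-.}"
    if command -v free >/dev/null 2>&1; then
        free -h
    fi
    df -h "$out_dir"
}

# Print only the first N records of a FASTQ file (4 lines per record).
# For a random subset instead, seqtk offers: seqtk sample -s100 in.fq 10000
fq_head() {
    local in_fq="$1" n_records="$2"
    head -n "$(( n_records * 4 ))" "$in_fq"
}

# Usage:
#   show_resources ../20200803_multiplex_output_seqtk
#   fq_head bin.fq 1000 > subset/bin.fq
```

Running grid multiplex on the reduced input shows whether the failure depends on input size.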

Cheers,
Tunde

@mutantjoo0
Author

Hi Tunde,

Thank you for your response.
I had read that bowtie2 can run into memory problems and was worried that this error might have a similar cause. I previously submitted a job running grid multiplex, shown below, and it failed.

#!/bin/bash --login

###### SBATCH lines for resource request ######
#SBATCH --time=4:00:00
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=20
#SBATCH --mem=40G
#SBATCH --job-name=test_comp

###### command lines for job running ######
cd /mnt/research/Reguera_Kashefi_Lab/JYL/grid_otic_MAGs/test

conda activate grid

grid multiplex -r input -e fq -o sb_output_comprehensive -d ../grid_db/comprehensive_db -c 0.2 -p -m

slurm.output:

(base) -bash-4.2$ cat slurm-28299.out
multiplex option activated
/mnt/research/Reguera_Kashefi_Lab/JYL/grid_otic_MAGs/test is present directory
/mnt/ufs18/rs-002/Reguera_Kashefi_Lab/JYL/grid_otic_MAGs/test/input is the reads directory
/mnt/ufs18/rs-002/Reguera_Kashefi_Lab/JYL/grid_otic_MAGs/test/sb_output_comprehensive is the output directory
/mnt/ufs18/rs-002/Reguera_Kashefi_Lab/JYL/grid_otic_MAGs/grid_db/comprehensive_db is GRiD database directory
coverage cutoff is 0.2
 ################ Checking for dependencies ########
parallel found
R found
bowtie2 found
seqtk found
samtools found
bedtools found
bamtools found
blastn found
pathoscope found
mosdepth found
All required packages found
 ################ Checking for required R libraries ########
R libraries ok
Output directory ok
cat: /mnt/ufs18/rs-002/Reguera_Kashefi_Lab/JYL/grid_otic_MAGs/grid_db/comprehensive_db/bowtie.txt: No such file or directory
[main_samview] fail to open "Std_maxbin2_bin.10.*.main.sam" for reading.

I am not sure whether this error is memory-related. Contrary to the error message, the bowtie.txt file is present in the directory. FYI, /grid_db/comperehensive_db/ contains 224 files totaling 155G.

(base) -bash-4.2$ ls -lh ../grid_db/comperehensive_db/bowtie.txt
-rw-r----- 1 leejooy5 Reguera_Kashefi_Lab 814 Apr 22  2018 ../grid_db/comperehensive_db/bowtie.txt
(base) -bash-4.2$ ls -1 ../grid_db/comperehensive_db/ | wc -l
224
(base) -bash-4.2$ du -sh ../grid_db/comperehensive_db/
155G    ../grid_db/comperehensive_db/

I submitted two more jobs with more CPU and memory allocated, as follows. Could you let me know if these resources are not enough for running grid multiplex?

50 CPU, 50G memory:

###### SBATCH lines for resource request ######
#SBATCH --time=10:00:00
#SBATCH --nodes=1-5
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=50
#SBATCH --mem=50G
#SBATCH --job-name=Std_Cdb

###### command lines for job running ######
cd /mnt/research/Reguera_Kashefi_Lab/JYL/grid_otic_MAGs/test

conda activate grid

grid multiplex -r input_Std_fq_converted -e fq -o Std_output_Cdb_20200806 -d ../grid_db/comprehensive_db -c 0.2 -p -m

50 CPU, 100G memory:

#!/bin/bash --login

###### SBATCH lines for resource request ######
#SBATCH --time=10:00:00
#SBATCH --nodes=1-5
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=50
#SBATCH --mem=100G
#SBATCH --job-name=Std_Cdb_mem100

###### command lines for job running ######
cd /mnt/research/Reguera_Kashefi_Lab/JYL/grid_otic_MAGs/test

conda activate grid

grid multiplex -r input_Std_fq_converted -e fq -o Std_output_Cdb_20200806_mem100 -d ../grid_db/comprehensive_db -c 0.2 -p -m

Also, are there recommended specifications for running grid multiplex, for example minimum CPU and memory requirements? Thank you for your support.
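
On a Slurm cluster, the peak memory a finished job actually used can be compared with what was requested. The sketch below assumes job accounting is enabled on the cluster. The wrapper name job_usage is hypothetical, and 28299 is the job ID from the slurm output above.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper around sacct: show requested memory (ReqMem)
# next to peak resident memory (MaxRSS) for one job.
job_usage() {
    local job_id="$1"
    if command -v sacct >/dev/null 2>&1; then
        sacct -j "$job_id" --format=JobID,JobName,ReqMem,MaxRSS,Elapsed,State
    else
        echo "sacct not found; run this on a Slurm login node" >&2
        return 127
    fi
}

# Usage:
#   job_usage 28299
```

A MaxRSS close to ReqMem, or a State of OUT_OF_MEMORY, would point to the memory request being too small.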

Cheers
Joo-Young

@aemiol

aemiol commented Aug 7, 2020

Hi, there is a typo in your path to the GRiD DB.
In your first comment, the path was "/mnt/ufs18/rs-002/Reguera_Kashefi_Lab/JYL/grid_otic_MAGs/grid_db/comperehensive_db", whereas now the path is "/mnt/ufs18/rs-002/Reguera_Kashefi_Lab/JYL/grid_otic_MAGs/grid_db/comprehensive_db".

Confirm the spelling of "comprehensive" as it appears in your directory name.
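
A small pre-flight check in the job script would catch this kind of path mismatch before bowtie2 starts. This is a sketch only: check_grid_db is a hypothetical helper, and bowtie.txt is the file named in the error message above.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight check: confirm that the database directory
# exists and contains bowtie.txt.
check_grid_db() {
    local db_dir="$1"
    if [ ! -d "$db_dir" ]; then
        echo "missing directory: $db_dir" >&2
        return 1
    fi
    if [ ! -f "$db_dir/bowtie.txt" ]; then
        echo "missing file: $db_dir/bowtie.txt" >&2
        return 2
    fi
    echo "ok: $db_dir"
}

# Usage, with both spellings from this thread:
#   check_grid_db ../grid_db/comprehensive_db
#   check_grid_db ../grid_db/comperehensive_db
```

Whichever spelling prints "ok" is the one to pass to -d.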

Another suggestion, to reduce runtime: use the -n flag to increase the number of threads used by bowtie (e.g. -n 20).
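
The sbatch scripts earlier in the thread request --cpus-per-task but do not pass -n. One possible way to keep the two in sync is to read the value Slurm exports. In this sketch, grid_threads and GRID_DB are hypothetical names, with GRID_DB standing for the verified database path.

```shell
#!/usr/bin/env bash
# Sketch: choose the value for -n from the Slurm allocation.
# SLURM_CPUS_PER_TASK is set by Slurm when --cpus-per-task is requested;
# fall back to 1 when running outside a job.
grid_threads() {
    echo "${SLURM_CPUS_PER_TASK:-1}"
}

# In the job script, the command from the thread would then read:
#   grid multiplex -r input_Std_fq_converted -e fq -o Std_output_Cdb_20200806 \
#       -d "$GRID_DB" -c 0.2 -p -m -n "$(grid_threads)"
```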

Tunde

@mutantjoo0
Author

Hi Tunde,

Thank you for your support. I have submitted corrected scripts to re-run the jobs and will post here once they complete.

Thanks,
Joo-Young


2 participants