Peter Maxwell edited this page Apr 2, 2015 · 58 revisions

Details

The Basic Local Alignment Search Tool (BLAST) finds regions of local similarity between sequences. The program compares nucleotide or protein sequences to sequence databases and calculates the statistical significance of matches. BLAST can be used to infer functional and evolutionary relationships between sequences as well as help identify members of gene families.

Usage

Database: /share/db/blast/db/

The databases are located in /share/db/blast/db and are updated on the 1st of each month. Please ensure that none of your jobs are accessing the databases at that time. If you need database access during an update, copy the relevant files to your home directory before the update and reference the copy instead.
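The copy-before-update approach described above could look something like the following sketch. The function name `snapshot_blast_db` and the destination directory `$HOME/blastdb` are hypothetical, chosen only for illustration. It assumes that all files of a database share a common file-name prefix such as `nt`, and that the taxonomy files start with `taxdb`, as in the job script further down this page.

```shell
#!/bin/bash
# Sketch: copy one BLAST database (plus taxdb files) into a private directory.
# snapshot_blast_db is a hypothetical helper, not part of BLAST or the cluster.
snapshot_blast_db() {
    local src=$1 dest=$2 db=$3
    mkdir -p "$dest" || return 1
    # Assumption: every file of the database starts with the prefix "$db".
    cp "$src"/"$db"* "$dest"/ || return 1
    # Taxonomy files are treated as optional; copy them only if present.
    local f
    for f in "$src"/taxdb*; do
        [ -e "$f" ] && cp "$f" "$dest"/
    done
    return 0
}

# Example (run before the 1st of the month):
#   snapshot_blast_db /share/db/blast/db "$HOME/blastdb" nt
#   export BLASTDB="$HOME/blastdb"
```

Pointing `BLASTDB` at the copy lets a job refer to the database by name (for example `-db nt`) without touching the shared directory. Databases such as nt are large, so it is worth checking that the destination has enough space first.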

Example SLURM jobs

multi-core

Job description
#!/bin/bash
#SBATCH --job-name       BLAST
#SBATCH --cpus-per-task  16               # or 12 or 24
#SBATCH --time           02:30:00         # Allow 100 CPU hrs / GB of query sequences.
#SBATCH --mem-per-cpu    5G               # Include allowance for RAM disk containing database.
#SBATCH --constraint     avx              # Avoid the slow "bigmem" machines.

# Takes one argument, the FASTA file of query sequences.
QUERIES=$1
FORMAT="6 qseqid qstart qend qseq sseqid sgi sacc sstart send sseq staxids sscinames scomnames sblastnames sskingdoms stitle length evalue bitscore"
BLASTOPTS="-evalue 0.05 -max_target_seqs 10"
BLASTAPP=blastn
DB=nt
#BLASTAPP=blastx
#DB=nr

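# Keep the database in RAM, which is important for databases over 4 GB on the shared GPFS filesystem.
# $BLASTDB and $SHM_DIR are not set in this script; they are assumed to come from the job environment.
cp "$BLASTDB"/{$DB,taxdb}* "$SHM_DIR"/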
export BLASTDB=$SHM_DIR

# Single node multithreaded BLAST. Results are written next to the query file as <queries>.<db>.<app>.
srun $BLASTAPP $BLASTOPTS -db $DB -query "$QUERIES" -outfmt "$FORMAT" -out "$QUERIES.$DB.$BLASTAPP" -num_threads $SLURM_CPUS_PER_TASK
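
The `--time` comment in the script above gives a rule of thumb of 100 CPU hours per GB of query sequences. One rough way to turn that into a wall-clock request is sketched below. `estimate_walltime` is a hypothetical helper, 1 GB is taken as 10^9 bytes, and dividing the CPU hours evenly across cores assumes BLAST scales linearly with thread count, which is optimistic. Treat the result as a starting point rather than a guarantee.

```shell
#!/bin/bash
# Sketch: estimate a --time value from the query file size, using the rule of
# thumb of 100 CPU hours per GB of query sequences.
# estimate_walltime is a hypothetical helper; linear scaling over CPUs is an assumption.
estimate_walltime() {
    local bytes=$1 cpus=$2
    # CPU seconds = 100 h * 3600 s/h * bytes / 10^9; wall seconds = CPU seconds / cpus.
    local wall_s=$(( bytes * 100 * 3600 / 1000000000 / cpus ))
    # Round up to whole minutes, with a floor of one minute.
    local mins=$(( (wall_s + 59) / 60 ))
    (( mins < 1 )) && mins=1
    printf '%02d:%02d:00\n' $(( mins / 60 )) $(( mins % 60 ))
}

# Example: query file size in bytes, 16 CPUs as in the script above.
#   estimate_walltime "$(wc -c < queries.fa)" 16
```

Under this rule, the script's `--time 02:30:00` with 16 CPUs corresponds to roughly 0.4 GB of queries. Options given on the `sbatch` command line override the `#SBATCH` lines, so something like `sbatch --time=04:10:00 blast.sl queries.fa` (where `blast.sl` is a hypothetical file name for the script above) adjusts the limit without editing the script.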