BLAST: Basic Local Alignment Search Tool
NCBI's BLAST is included in the Rocks Bio Roll and in the GCG package on all HPC nodes. It is among the most widely used sequence similarity search tools, and finds similarities between protein or nucleotide query sequences and sequence databases, including translated searches such as a protein query against a translated nucleotide database.
Using BLAST on HPC
The Rocks implementation of BLAST is described in the Bio Roll documentation and through the BLAST --help command.
The same BLAST databases are available to the Rocks and GCG versions of BLAST:
RefSeq RNA and Protein (rs_rna and rs_protein)
The bio.sh or bio.csh initialization script sets the environment variable $BLASTDB to point to these databases. Once it has been run, launch the program with a command along the following lines, preferably from within a MOAB job submission:
blastall -d uniprot -p blastp -i test.pep.fsa -o test.pep.blastp
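As a minimal sketch of the pattern above, the following shell function checks that $BLASTDB is set and then prints the matching blastall command, naming the output file after the query and the program. The function name blast_cmd and its argument order are assumptions made for illustration; they are not part of BLAST or the Bio Roll.

```shell
#!/bin/bash
# blast_cmd is a hypothetical helper for illustration; it is not part of
# BLAST or the Bio Roll. It only prints a command line, it does not run it.
blast_cmd() {
    local db="$1" prog="$2" query="$3"

    # bio.sh / bio.csh are expected to set BLASTDB; stop early if it is missing.
    if [ -z "${BLASTDB:-}" ]; then
        echo "BLASTDB is not set; run bio.sh or bio.csh first" >&2
        return 1
    fi

    # Name the output after the query, replacing its last extension
    # with the program name (test.pep.fsa -> test.pep.blastp).
    local out="${query%.*}.${prog}"
    echo "blastall -d $db -p $prog -i $query -o $out"
}
```

With $BLASTDB set, calling blast_cmd uniprot blastp test.pep.fsa prints the command shown above, which can then be pasted into a job script.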
The following sample MOAB submission script uses tBLASTn to search a protein query against the Genome Survey Sequence (gss) section of GenBank:
#!/bin/bash
# See for other MOAB msub options:
# http://www.clusterresources.com/
#MOAB -j oe
#MOAB -m abe
#MOAB -N tBLASTn
source /usr/local/profile.d/bio.sh
blastall -d gss -p tblastn -i $HOME/test/test.pep.fsa \
    -o $HOME/test/test.pep.tblastn
Note: the backslash (\) continuation character does not work within MOAB scripts; it is used here only to fit the command on the page. In an actual script, the entire command must be on one line.
Consider using BLAST inside GCG for ease of use, unless you need multiple nodes or scripting capability.
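One way to keep the blastall command on a single line, as the note above requires, is to put the long paths in shell variables. The sketch below reuses the paths and options from the sample script. The variable names QUERY and OUT, the run helper, and the DRY_RUN switch are assumptions added for illustration. DRY_RUN defaults to 1 here, so the script only prints the command; set DRY_RUN=0 in a real job to execute it.

```shell
#!/bin/bash
#MOAB -j oe
#MOAB -m abe
#MOAB -N tBLASTn

# Short variables stand in for the long paths, so the blastall call
# below fits on one line without a backslash continuation.
QUERY="$HOME/test/test.pep.fsa"
OUT="$HOME/test/test.pep.tblastn"

# Hypothetical helper: print the command when DRY_RUN=1 (the default
# in this sketch), otherwise execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# On the cluster, bio.sh sets $BLASTDB (path taken from the sample script).
# The readability check lets the sketch be tried where the file is absent.
if [ -r /usr/local/profile.d/bio.sh ]; then
    source /usr/local/profile.d/bio.sh
fi

run blastall -d gss -p tblastn -i "$QUERY" -o "$OUT"
```

Printing the command first makes it easy to confirm that the paths expand as intended before the job is submitted.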