MEME: Expectation Maximization for de novo motif discovery
MEME is a suite of programs for discovering previously unknown motifs in sequence datasets, and for manipulating and searching with profiles of those motifs. The two main programs, MEME itself and MAST (known as MotifSearch in GCG), are included in the Wisconsin Package and may be easier to run from that interface.

Using MEME on HPC


All MEME programs are located in "/opt/Bio/meme/bin", and a MEME environment variable script is sourced by the biouser initialization script. MEME and MAST themselves are far simpler to run from within the Wisconsin Package. However, de novo motif discovery on large datasets can be very time consuming, so in many instances you will want MEME's parallel implementation, "meme_p", from that directory.

Parallel MEME was compiled with GNU OpenMPI, so that environment must be sourced in your MOAB script. The MEME initialization script must be sourced in the MOAB script as well. MEME requires FASTA-format input and writes its output to stdout. Here's an example MOAB script:

#!/bin/bash
#MOAB -l nodes=16
#MOAB -j oe
#MOAB -m abe
#MOAB -N MEME-OPENMPI
source /usr/local/profile.d/openmpi-gnu.sh
. /opt/Bio/meme/etc/meme.sh
mpirun /opt/Bio/meme/bin/meme_p $HOME/working/combined.fsa \
-p 16 -dna -maxsize 1000000 > $HOME/working/combined.meme_p

Note that the backslash continuation character does not work within MOAB scripts; it is used here only to show that the entire command must be on one line. This script runs MEME on 16 processors (-p 16) using the training dataset "combined.fsa", specifies that the sequences are DNA (-dna) with a maximum dataset size of 1,000,000 characters (-maxsize), and writes its output to a file named "combined.meme_p". See the MEME and MAST man pages for additional options and command-line examples.
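
Because a queued job can wait some time before it starts, it may be worth checking the dataset size against the -maxsize value before submitting. The following is a minimal sketch, not part of MEME: the function names "fasta_chars" and "fits_maxsize" are hypothetical, and it assumes that -maxsize is compared against the number of sequence characters, excluding FASTA header lines and line breaks.

```shell
#!/bin/bash
# Hypothetical helpers (not shipped with MEME).

# Print the number of sequence characters in a FASTA file,
# skipping header lines (those starting with ">") and all whitespace.
fasta_chars() {
    grep -v '^>' "$1" | tr -d '[:space:]' | wc -c | tr -d ' '
}

# Return success (exit status 0) if the file's sequence characters
# do not exceed the limit given as the second argument.
# Assumption: this mirrors how -maxsize is applied.
fits_maxsize() {
    local chars
    chars=$(fasta_chars "$1")
    [ "$chars" -le "$2" ]
}
```

For the example above, running `fits_maxsize "$HOME/working/combined.fsa" 1000000 && echo "within limit"` on the login node would confirm that the dataset fits the limit used in the script.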