Motiph scans a DNA multiple sequence alignment for occurrences of given motifs, taking into account the phylogenetic tree relating the sequences. Note: any sequence consisting entirely of gaps is removed from the alignment. The core of the algorithm is a routine (site_likelihood) that scores a single column of the multiple alignment using a given evolutionary model and a given phylogenetic tree. The alignment column provides the observed nucleotides at the leaves of the tree, and the routine sums over the unobserved nucleotides at the internal nodes, weighting the nucleotide at the root by the equilibrium distribution of the evolutionary model (Felsenstein's pruning algorithm). The tree must be a maximum likelihood tree, of the kind generated by DNAml from Phylip or by FastDNAml. Branch lengths in the tree are converted to conditional probabilities using the specified evolutionary model.
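The column-scoring step described above can be illustrated with a minimal sketch of Felsenstein's pruning algorithm. This is not Motiph's actual site_likelihood code. It assumes the Jukes-Cantor model (uniform equilibrium distribution) as a stand-in for whichever evolutionary model is specified, a hypothetical tree representation in which a leaf is a sequence name and an internal node is a list of (child, branch_length) pairs, and that a gap within a column is treated as missing data. All function names are illustrative.

```python
import math

NUCS = "ACGT"

def jc69_transition(t):
    """Jukes-Cantor substitution probabilities for branch length t
    (expected substitutions per site). Assumed model, for illustration."""
    e = math.exp(-4.0 * t / 3.0)
    same = 0.25 + 0.75 * e   # probability the nucleotide is unchanged
    diff = 0.25 - 0.25 * e   # probability of each specific change
    return [[same if i == j else diff for j in range(4)] for i in range(4)]

def partial_likelihoods(node, column):
    """Return L[x] = P(observed leaves below node | node carries NUCS[x])."""
    if isinstance(node, str):            # leaf: look up the observed nucleotide
        obs = column[node]
        if obs == "-":                   # assumption: gap = missing data
            return [1.0] * 4
        return [1.0 if n == obs else 0.0 for n in NUCS]
    result = [1.0] * 4
    for child, branch_length in node:    # internal node: combine the children
        p = jc69_transition(branch_length)
        child_l = partial_likelihoods(child, column)
        for x in range(4):
            # Sum over the unobserved nucleotide y at the child.
            result[x] *= sum(p[x][y] * child_l[y] for y in range(4))
    return result

def site_likelihood_sketch(tree, column, equilibrium=(0.25, 0.25, 0.25, 0.25)):
    """Likelihood of one alignment column: root partial likelihoods
    weighted by the equilibrium distribution."""
    root = partial_likelihoods(tree, column)
    return sum(pi * l for pi, l in zip(equilibrium, root))

# Hypothetical three-species tree and one alignment column.
tree = [("human", 0.1), ([("mouse", 0.2), ("rat", 0.2)], 0.3)]
column = {"human": "A", "mouse": "A", "rat": "G"}
print(site_likelihood_sketch(tree, column))
```

Each branch length is turned into a 4x4 matrix of conditional probabilities before the sum, which mirrors the last sentence of the description. A different evolutionary model would replace jc69_transition and the equilibrium vector, leaving the recursion unchanged.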
Parent program: meme
MEME is a tool for discovering motifs in a group of related DNA or protein sequences. MEME takes as input a group of DNA or protein sequences and outputs as many motifs as requested up to a user-specified statistical confidence threshold. MEME uses statistical modeling techniques to automatically choose the best width, number of occurrences, and description for each motif.