MAFFT aligns sequences together with homologs automatically collected from SwissProt via NCBI BLAST. The accuracy of an alignment of a few distantly related sequences improves considerably when they are aligned together with their close homologs. The reason for the improvement is probably the same as for PSI-BLAST: the close homologs supply additional information, such as the positions of highly conserved residues and the positions with many gaps. According to Katoh et al. (2005), adding close homologs improves accuracy by about 10%, which is comparable to the improvement obtained by incorporating structural information for a pair of sequences. Mafft-homologs works like this:
1. Collect a number (50 by default) of close homologs (E=1e-10 by default) of the input sequences.
2. Align the input sequences and homologs all together using the L-INS-i strategy.
3. Remove the homologs.
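Steps 1 and 3 above can be illustrated with a minimal Python sketch. It is not the mafft-homologs implementation: the function names, the (identifier, E-value) hit format, and the choice to treat the E-value threshold as inclusive are assumptions made for illustration. The BLAST search and the L-INS-i alignment itself (step 2) are left out.

```python
from typing import Dict, List, Tuple


def select_close_homologs(
    hits: List[Tuple[str, float]],
    max_homologs: int = 50,        # default number of homologs stated above
    evalue_cutoff: float = 1e-10,  # default E-value threshold stated above
) -> List[str]:
    """Step 1 (sketch): keep the best-scoring hits that pass the E-value cutoff.

    `hits` is a hypothetical list of (sequence_id, evalue) pairs, e.g. parsed
    from a BLAST report. Treating the cutoff as inclusive is an assumption.
    """
    passing = [hit for hit in hits if hit[1] <= evalue_cutoff]
    passing.sort(key=lambda hit: hit[1])  # smallest E-value first
    return [seq_id for seq_id, _ in passing[:max_homologs]]


def remove_homologs(
    alignment: Dict[str, str], input_ids: List[str]
) -> Dict[str, str]:
    """Step 3 (sketch): keep only the rows of the original input sequences.

    `alignment` maps sequence_id -> aligned (gapped) sequence. Columns are
    left untouched, so rows keep the gaps introduced by the homologs.
    """
    return {seq_id: alignment[seq_id] for seq_id in input_ids}


if __name__ == "__main__":
    hits = [("P1", 1e-50), ("P2", 1e-5), ("P3", 1e-12)]
    print(select_close_homologs(hits))

    aligned = {"query1": "MK-LV", "query2": "MKALV", "P1": "MKAL-"}
    print(remove_homologs(aligned, ["query1", "query2"]))
```

Sorting by E-value before truncating means that, when more than 50 hits pass the cutoff, the closest homologs are the ones retained; whether the real tool orders hits this way is not stated in the description above.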
Parent program: mafft
MAFFT is a multiple sequence alignment program for amino acid or nucleotide sequences that uses the fast Fourier transform and a simplified scoring system to rapidly identify homologous regions between sequences. MAFFT reduces the CPU time required for multiple sequence alignment and improves alignment accuracy, even for sequences with large insertions or extensions and for distantly related sequences of similar length. It offers a range of multiple alignment methods, including L-INS-i (accurate; for alignments of < 200 sequences) and FFT-NS-2 (fast; for alignments of < 30,000 sequences). The type of input sequences (amino acid or nucleotide) is recognized automatically.
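As a rough illustration of choosing between the two methods named above, the following Python sketch builds a mafft command line from the number of input sequences. The helper name and the idea of switching purely on sequence count are assumptions; the size limits are the ones quoted above, and the flags used are the standard MAFFT options for these strategies (`--localpair --maxiterate 1000` for L-INS-i, `--retree 2` for FFT-NS-2). The sketch only constructs the command and does not run it.

```python
from typing import List


def mafft_command(n_sequences: int, input_path: str) -> List[str]:
    """Build a mafft argument list based on the number of input sequences.

    Hypothetical helper: < 200 sequences -> L-INS-i (accurate),
    < 30,000 sequences -> FFT-NS-2 (fast). Larger inputs are rejected
    because the description above gives no guidance for them.
    """
    if n_sequences < 1:
        raise ValueError("need at least one sequence")
    if n_sequences < 200:
        # L-INS-i: iterative refinement with local pairwise alignment
        return ["mafft", "--localpair", "--maxiterate", "1000", input_path]
    if n_sequences < 30000:
        # FFT-NS-2: fast progressive method
        return ["mafft", "--retree", "2", input_path]
    raise ValueError("no method suggested for 30,000 or more sequences")


if __name__ == "__main__":
    # The resulting list could be passed to subprocess.run(..., stdout=...)
    # on a system where mafft is installed; mafft writes the alignment
    # to standard output.
    print(" ".join(mafft_command(35, "input.fasta")))
```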