Mgaps clusters MUMs based on their diagonals and the separation between them. It was introduced into the MUMmer pipeline to better handle large-scale rearrangements and duplications. Unlike gaps, mgaps is a full clustering algorithm capable of generating multiple groups of consistently ordered matches. Clustering is controlled by a set of parameters, including the minimum cluster size and the maximum gap between matches. Only matches that were included in a cluster appear in the output, so adjusting the parameters can filter out many of the spurious matches, leaving only the larger areas of conservation between the input sequences. The major advantage of mgaps is its ability to identify these 'islands' of conservation. This frees the user from the single longest increasing subsequence (LIS) constraint of the gaps program and allows for the identification of large-scale rearrangements, duplications, gene families and so on.
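The idea of clustering by diagonal and separation can be illustrated with a minimal Python sketch. This is not the mgaps implementation. The function name cluster_matches, the parameters max_gap, max_diag_diff and min_cluster_len, and the (ref_start, qry_start, length) tuple format are all assumptions made for illustration. Here a match's diagonal is taken to be its reference start minus its query start, and separation is measured along the reference only.

```python
from collections import defaultdict


def cluster_matches(matches, max_gap=1000, max_diag_diff=5, min_cluster_len=100):
    """Group matches that lie close together and on nearly the same diagonal.

    matches: list of (ref_start, qry_start, length) tuples (hypothetical format).
    Returns a list of clusters, each a list of matches sorted by ref_start.
    """
    matches = sorted(matches)
    parent = list(range(len(matches)))  # union-find forest over match indices

    def find(i):
        # Follow parent links to the root, compressing the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, (r1, q1, n1) in enumerate(matches):
        for j in range(i + 1, len(matches)):
            r2, q2, _ = matches[j]
            gap = r2 - (r1 + n1)  # separation along the reference
            if gap > max_gap:
                break  # sorted by ref_start, so later matches are even farther
            diag_diff = abs((r2 - q2) - (r1 - q1))
            if diag_diff <= max_diag_diff:
                parent[find(j)] = find(i)  # merge the two clusters

    groups = defaultdict(list)
    for i, m in enumerate(matches):
        groups[find(i)].append(m)

    # Drop clusters whose total matched length is too small (spurious matches).
    return [g for g in groups.values()
            if sum(length for _, _, length in g) >= min_cluster_len]


example = [
    (100, 100, 50), (200, 201, 60),                     # island near diagonal 0
    (5000, 300, 40), (5100, 400, 150), (5300, 600, 80),  # rearranged island
    (9000, 9000, 20),                                    # isolated short match
]
clusters = cluster_matches(example)
```

With these example values the sketch keeps two clusters, one for each island, and discards the isolated short match. Because both islands are reported, a rearranged region is not lost the way it would be if only a single ordered chain of matches were retained.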
Parent program: MUMmer
MUMmer, named for the maximal unique matches (MUMs) it computes, is a bioinformatics software system for sequence alignment. It is based on the suffix tree data structure and is one of the fastest and most efficient systems available for this task, which allows it to be applied to very long sequences. It has been widely used for comparing genomes to one another. In recent years it has become a popular tool for comparing genome assemblies, allowing scientists to determine how an assembly has changed after more DNA sequence is added or after a different genome assembly program is run.