Show-coords parses the delta alignment output of NUCmer and PROmer and displays summary information for each alignment, such as position and percent identity. It is the most commonly used tool for analyzing delta files.
The -b option alters the output table to display only the location of the aligning regions, not their identity, direction, frame, etc. Also, for protein data, the -b option will collapse all overlapping frames and list a single encompassing region. -B switches the output format to 'btab' (Blast tablature), a tab-delimited table with a different layout from the standard show-coords format. The coverage information added with the -c option is equal to the length of the alignment divided by the length of the sequence. The -k option will select the 'best' reading frame by choosing the alignment that is longest, or that has the highest percent identity and is within 75% of the length of the longest alignment; only alignments that overlap each other by more than 50% of their length will be considered for knockout. The -T option differs from the -B option in that it retains the normal ordering of output columns. The output of the -d option for NUCmer data will appear under the FRM column, just like the reading frame information from PROmer data. The -o annotations will appear in the final column of the output. The descriptions are given relative to the reference sequence, e.g. END means the overlap is on the end of the reference sequence and CONTAINED means the reference sequence is contained by the query sequence.
The -c and -l options are useful when comparing two sets of assembly contigs, in that they help determine whether an alignment spans an entire contig or is just a partial hit to a different sequence. The -b option is useful when the user wishes to identify syntenic regions between two genomes but is not particularly interested in the actual alignment similarity or appearance.
This option also disregards match orientation, so it should not be used if this information is needed.
The -g option comes in handy when comparing sequences that share a linear alignment relationship, that is, sequences with no rearrangements. Large insertions, deletions and gaps can then be identified by the break between two adjacent alignments in the output. If more than one global alignment shares the same score, one of them is picked at random to display. This is useful when mapping repetitive reads to a finished sequence.
When run with the -B option, the output consists of 21 tab-delimited columns, as follows:
(1) query sequence ID
(2) date of alignment
(3) length of query sequence
(4) alignment type
(5) reference file
(6) reference sequence ID
(7) start of alignment in the query
(8) end of alignment in the query
(9) start of alignment in the reference
(10) end of alignment in the reference
(11) percent identity
(12) percent similarity
(13) length of alignment in the query
(14) 0 for compatibility
(15) 0 for compatibility
(16) NULL for compatibility
(17) 0 for compatibility
(18) strand of the query
(19) length of the reference sequence
(20) 0 for compatibility
(21) 0 for compatibility
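As a rough illustration of the column layout above, the following Python sketch splits one btab line into named fields and computes query coverage in the sense described for the -c option (alignment length divided by sequence length). The field names, the helper functions parse_btab_line and query_coverage, and the sample line are assumptions made up for this example; none of them is part of MUMmer, and real output may differ in details such as the date or strand spelling.

```python
# Hypothetical field labels for the 21 btab columns, in the order listed above.
BTAB_FIELDS = (
    "query_id", "date", "query_length", "alignment_type", "reference_file",
    "reference_id", "query_start", "query_end", "ref_start", "ref_end",
    "percent_identity", "percent_similarity", "query_aligned_length",
    "compat_14", "compat_15", "compat_16", "compat_17",
    "query_strand", "reference_length", "compat_20", "compat_21",
)

# Columns this sketch converts to numbers; everything else stays a string.
INT_FIELDS = ("query_length", "query_start", "query_end", "ref_start",
              "ref_end", "query_aligned_length", "reference_length")
FLOAT_FIELDS = ("percent_identity", "percent_similarity")


def parse_btab_line(line):
    """Split one tab-delimited btab line into a dict keyed by BTAB_FIELDS."""
    columns = line.rstrip("\r\n").split("\t")
    if len(columns) != len(BTAB_FIELDS):
        raise ValueError(
            f"expected {len(BTAB_FIELDS)} columns, got {len(columns)}")
    record = dict(zip(BTAB_FIELDS, columns))
    for name in INT_FIELDS:
        record[name] = int(record[name])
    for name in FLOAT_FIELDS:
        record[name] = float(record[name])
    return record


def query_coverage(record):
    """Fraction of the query covered: alignment length / query length."""
    return record["query_aligned_length"] / record["query_length"]


# Invented sample line for illustration only (not real show-coords output).
sample_line = "\t".join([
    "contig_1", "Jan 01 2000", "1000", "NUCMER", "ref.fasta", "chr1",
    "1", "500", "2001", "2500", "98.50", "98.50", "500",
    "0", "0", "NULL", "0", "Plus", "50000", "0", "0",
])

if __name__ == "__main__":
    rec = parse_btab_line(sample_line)
    print(rec["query_id"], rec["reference_id"], query_coverage(rec))
```

Keeping the compatibility columns as named placeholders, rather than dropping them, means the column count check still catches truncated or malformed lines.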
Parent program: MUMmer
MUMmer, or 'Maximal Unique Matches', is a bioinformatics software system for sequence alignment. It is based on the suffix tree data structure and is one of the fastest and most efficient systems available for this task, which allows it to be applied to very long sequences. It has been widely used for comparing genomes to one another. In recent years it has also become a popular tool for comparing genome assemblies, which allows scientists to determine how a genome has changed after adding more DNA sequence or after running a different genome assembly program.