ESTScan is a program that can detect coding regions in DNA sequences, even if they are of low quality. ESTScan will also detect and correct sequencing errors that lead to frameshifts. ESTScan is not a gene prediction program, nor is it an open reading frame detector. Its strength is that it does not require an open reading frame to detect a coding region. As a result, the program may miss a few translated amino acids at either the N or the C terminus, but will detect coding regions with high selectivity and sensitivity. ESTScan takes advantage of the bias in hexanucleotide usage found in coding regions relative to non-coding regions. This bias is formalized as an inhomogeneous 3-periodic fifth-order Hidden Markov Model (HMM). Additionally, the HMM of ESTScan has been extended to allow insertions and deletions when these improve the coding region statistics.
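The idea of a 3-periodic fifth-order model can be illustrated with a small sketch. This is not ESTScan's code: the function names (train, log_prob, coding_score, best_frame), the add-one pseudocount, and the toy training sequences are all assumptions made for illustration, and the sketch leaves out the HMM decoding and the insertion/deletion states entirely. It only shows how conditioning each base on the five preceding bases, with a separate table per codon position, yields a log-odds score that favors the correct reading frame.

```python
import math
from collections import defaultdict

ORDER = 5      # fifth-order: each base is conditioned on the 5 preceding bases
BASES = "ACGT"

def train(seqs, periodic):
    """Count (5-base context, next base) pairs.

    With periodic=True one table is kept per codon position (3-periodic);
    training sequences are assumed to start at codon position 0.
    """
    n_tables = 3 if periodic else 1
    tables = [defaultdict(lambda: defaultdict(int)) for _ in range(n_tables)]
    for seq in seqs:
        for i in range(ORDER, len(seq)):
            tables[i % n_tables][seq[i - ORDER:i]][seq[i]] += 1
    return tables

def log_prob(tables, context, base, phase, pseudo=1.0):
    """Smoothed log P(base | context) in the table for this codon phase."""
    row = tables[phase % len(tables)].get(context, {})
    total = sum(row.values()) + pseudo * len(BASES)
    return math.log((row.get(base, 0) + pseudo) / total)

def coding_score(seq, coding, noncoding, frame):
    """Log-odds of seq under the coding model (read in `frame`) vs. non-coding."""
    score = 0.0
    for i in range(ORDER, len(seq)):
        ctx, base = seq[i - ORDER:i], seq[i]
        score += log_prob(coding, ctx, base, i - frame)
        score -= log_prob(noncoding, ctx, base, 0)
    return score

def best_frame(seq, coding, noncoding):
    """Return the frame offset (0, 1 or 2) with the highest log-odds score."""
    return max(range(3), key=lambda f: coding_score(seq, coding, noncoding, f))

# Toy training data (hypothetical, far too small for real use).
coding_model = train(["ATGGCTGAA" * 20], periodic=True)
noncoding_model = train(["ACGTTGCAAGCTTAGC" * 10], periodic=False)

query = "G" + "ATGGCTGAA" * 5   # coding pattern shifted by one base
print(best_frame(query, coding_model, noncoding_model))
```

In a real setting the tables would be estimated from large sets of annotated coding and non-coding sequences for the organism of interest, which is the role the score matrices play for ESTScan.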
Parent program: estscan
ESTScan is a program that can recognize potential coding regions in poor-quality sequences, reconstruct these coding regions in their proper reading frame, and discriminate between ESTs with coding potential and those derived from non-coding regions. Using ESTScan requires score matrices that reflect the codon preferences of the organisms under study. These matrices can be built with scripts found in the estscan-3.0 tarball or in the estscan-devel RPM packages.