Estimate_abundance analyzes the CLARK results of a metagenomic sample and reports, for each identified target, its lineage and the count and proportion of objects assigned to it (with or without the count of identified reads). The script can also apply filtering conditions, based on the confidence score or gamma score of assignments, to obtain a stricter estimate. The output format of estimate_abundance.sh is CSV.
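The counting and filtering idea described above can be illustrated with a minimal Python sketch. It is not the script's actual implementation and does not read CLARK's result files: the function name `estimate_abundance`, the `(target, confidence)` input pairs, and the `min_confidence` parameter are all hypothetical, and computing the proportion over the retained assigned objects only is an assumption made for illustration.

```python
from collections import Counter


def estimate_abundance(assignments, min_confidence=0.0):
    """Return {target: (count, proportion_percent)} for retained assignments.

    `assignments` is an iterable of (target, confidence) pairs, where a
    target of None marks an unassigned object. This input layout is a
    hypothetical stand-in, not CLARK's real result format.
    """
    # Keep only assigned objects whose confidence meets the threshold.
    kept = [target for target, confidence in assignments
            if target is not None and confidence >= min_confidence]
    counts = Counter(kept)
    total = sum(counts.values())
    if total == 0:
        return {}
    # Proportion is taken over the retained assigned objects (an assumption).
    return {target: (n, 100.0 * n / total) for target, n in counts.items()}


if __name__ == "__main__":
    sample = [
        ("Escherichia coli", 0.90),
        ("Escherichia coli", 0.60),
        ("Bacillus subtilis", 0.95),
        (None, 0.0),
    ]
    for target, (count, pct) in estimate_abundance(sample, 0.8).items():
        print(f"{target},{count},{pct:.2f}")
```

Raising the hypothetical `min_confidence` threshold discards low-confidence assignments before counting, which is the sense in which filtering gives a stricter estimate.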
Parent program: CLARK
CLARK is a versatile, fast, and accurate sequence classification tool, especially useful for metagenomics and genomics applications. CLARK uses a novel approach to classify metagenomic reads at the species or genus level with high accuracy and high speed. Extensive experimental results on various metagenomic samples show that CLARK's classification accuracy is better than or comparable to that of the best state-of-the-art tools, and that it is significantly faster than any of its competitors. In its fastest single-threaded mode, CLARK classifies about 32 million metagenomic short reads per minute with high accuracy. CLARK can also classify BAC clones or transcripts to chromosome arms and centromeric regions.