0 citations
359 runs

geneCoverage

By Kostikova A., Last update 1494698999

geneCoverage description

The geneCoverage tool provides a simple way to estimate how many unique species have been sequenced for each unique gene or gene product within a given taxon range. As output it produces a CSV file listing all product names (genes, transcripts, etc.), their reference sequences, and counts of unique and non-unique species. Once this CSV file is ready, it is simple to count the frequencies of all product names and to obtain a reference sequence for each product. The tool parses the NCBI databases for all sequences deposited within a given taxon range: for example, you can restrict the search to a species, genus, family, order, etc. A particularly useful application of this tool is large phylogeny reconstruction. Imagine you have sequenced some of the species within a given group and would now like to reconstruct a large phylogeny for the entire family. To do this, you need to know which genes have been sequenced most often across the entire family and which reference sequences can be used for BLAST download of each gene or gene product. This script allows you to collect this information easily. Normally, the output of this script is used as input for the geneCoverage2fasta script.
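The per-product counting described above can be illustrated with a minimal Python sketch. This is not the geneCoverage source code: the function names, the CSV column names, and the choice of the first accession seen as the reference sequence are assumptions made for illustration. "Non-unique species" is interpreted here as the total number of records per product, which is also an assumption. The records and accessions are hypothetical placeholders, not NCBI data.

```python
import csv
import io
from collections import defaultdict


def summarize_coverage(records):
    """Aggregate (product, species, accession) tuples per product name.

    Returns one dict per product with a reference accession, the number of
    unique species, and the total number of records (assumed meaning of
    "non-unique species").
    """
    species_by_product = defaultdict(set)
    total_by_product = defaultdict(int)
    reference = {}
    for product, species, accession in records:
        species_by_product[product].add(species)
        total_by_product[product] += 1
        # Assumption: the first accession seen serves as the reference.
        reference.setdefault(product, accession)
    rows = [
        {
            "product": product,
            "reference": reference[product],
            "unique_species": len(species_by_product[product]),
            "total_records": total_by_product[product],
        }
        for product in species_by_product
    ]
    # Most widely sequenced products first; ties broken by product name.
    rows.sort(key=lambda row: (-row["unique_species"], row["product"]))
    return rows


def to_csv(rows):
    """Render the summary rows as CSV text (hypothetical column names)."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=["product", "reference", "unique_species", "total_records"],
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    # Hypothetical records: (product name, species, accession placeholder).
    example = [
        ("rbcL", "Genus alpha", "ACC001"),
        ("rbcL", "Genus beta", "ACC002"),
        ("rbcL", "Genus alpha", "ACC003"),
        ("matK", "Genus alpha", "ACC004"),
    ]
    print(to_csv(summarize_coverage(example)))
```

Sorting by the unique-species count puts the best-covered genes at the top, which is the information needed when choosing markers for a family-level phylogeny.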


Parent program: geneCoverage
