bowtie2-build can index reference genomes of any size. For genomes shorter than about 4 billion nucleotides, bowtie2-build builds a 'small' index using 32-bit numbers in various parts of the index. For longer genomes, it builds a 'large' index using 64-bit numbers. Small indexes are stored in files with the .bt2 extension, and large indexes are stored in files with the .bt2l extension. The user need not worry about whether a particular index is small or large; the wrapper scripts automatically build and use the appropriate index.
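The small-versus-large decision described above can be sketched in a few lines of Python. This is only an illustration, not the tool's own logic: the helper names genome_length and index_kind are hypothetical, and the cutoff is assumed to be the largest unsigned 32-bit value, since the text only says "about 4 billion".

```python
# Assumption: "about 4 billion" is approximated by the largest value that
# fits in an unsigned 32-bit integer. The real tool may use a different cutoff.
SMALL_INDEX_LIMIT = 2**32 - 1


def genome_length(fasta_lines):
    """Count sequence characters in an iterable of FASTA lines.

    Header lines (starting with '>') and blank lines are skipped.
    """
    total = 0
    for line in fasta_lines:
        line = line.strip()
        if not line or line.startswith(">"):
            continue
        total += len(line)
    return total


def index_kind(n_nucleotides, limit=SMALL_INDEX_LIMIT):
    """Return 'small' (32-bit) or 'large' (64-bit) for a given genome length."""
    return "small" if n_nucleotides < limit else "large"


if __name__ == "__main__":
    toy_fasta = [">chr1", "ACGTACGT", "ACGT", ">chr2", "GGCC"]
    n = genome_length(toy_fasta)
    print(n, index_kind(n))
```

In practice this check is never needed by hand, because the wrapper scripts make the choice automatically; the sketch only shows why the 32-bit limit matters.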
Parent program: bowtie
Bowtie2 is one of the most popular and efficient sequencing read aligners. It maps reads to a reference genome or other long reference sequences, indexing the reference with an FM index (Full-text index in Minute space), a compact data structure based on the Burrows-Wheeler transform that keeps the memory footprint small. Bowtie2 supports gapped, local, and paired-end alignment modes, whereas its predecessor Bowtie finds only ungapped, end-to-end alignments. It is particularly good at aligning reads from about 50 up to hundreds or thousands of characters, and particularly good at aligning to relatively long (e.g. mammalian) genomes. For the human genome, its memory footprint is typically around 3.2 GB.
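To make the Burrows-Wheeler transform mentioned above more concrete, here is a minimal Python sketch on a toy string. It is a naive textbook version that sorts all rotations, which is far too slow and memory-hungry for real genomes and is not how Bowtie2 builds its index. The function names bwt and inverse_bwt are hypothetical, and the '$' end marker is assumed not to occur in the input text.

```python
def bwt(text, sentinel="$"):
    """Naive Burrows-Wheeler transform: last column of the sorted rotations.

    Assumes the sentinel does not occur in text and sorts before every other
    character (true for '$' versus letters in ASCII).
    """
    s = text + sentinel
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(rotation[-1] for rotation in rotations)


def inverse_bwt(transformed, sentinel="$"):
    """Naive inverse: rebuild the sorted rotation table one column at a time."""
    table = [""] * len(transformed)
    for _ in range(len(transformed)):
        # Prepend the transformed column to every row, then re-sort.
        table = sorted(c + row for c, row in zip(transformed, table))
    # The original string is the rotation that ends with the sentinel.
    original = next(row for row in table if row.endswith(sentinel))
    return original[:-1]


if __name__ == "__main__":
    transformed = bwt("banana")
    print(transformed)               # annb$aa
    print(inverse_bwt(transformed))  # banana
```

The transform tends to group identical characters together and is reversible, which is what lets an FM index stay compact while still supporting fast substring search.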