Ancient DNA is generally preserved as short fragments, comparable in length to the reads produced by modern sequencers. It can therefore be useful to merge the overlapping paired-end reads of a short fragment into a single sequence. A dedicated tool called BBMerge can help with this task.
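To make the idea of merging concrete, here is a minimal Python sketch of overlap-based merging. It is not BBMerge's actual algorithm: the function names, the min_overlap and max_mismatches parameters, and the example fragment are all assumptions made for illustration. It also assumes the fragment is at least as long as each read and that adapters have already been removed.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}
    return "".join(complement[base] for base in reversed(seq.upper()))


def merge_pair(forward, reverse, min_overlap=10, max_mismatches=0):
    """Merge a read pair when the end of `forward` overlaps the start of the
    reverse-complemented `reverse` read. Return None if no overlap is found.

    Hypothetical helper for illustration only, not BBMerge's method.
    """
    reverse_rc = reverse_complement(reverse)
    longest = min(len(forward), len(reverse_rc))
    # Try the longest candidate overlap first, then shorter ones.
    for overlap in range(longest, min_overlap - 1, -1):
        tail = forward[-overlap:]
        head = reverse_rc[:overlap]
        mismatches = sum(1 for a, b in zip(tail, head) if a != b)
        if mismatches <= max_mismatches:
            # Keep the forward read and append the non-overlapping remainder.
            return forward + reverse_rc[overlap:]
    return None


# A made-up 30 bp fragment sequenced with two 20 bp reads (10 bp overlap).
fragment = "ACGTTGCATGCCGATAGCTTAGGCTAACGT"
read_1 = fragment[:20]
read_2 = reverse_complement(fragment[-20:])
merged = merge_pair(read_1, read_2)
```

In this toy case the two 20 bp reads share 10 bases, so the merged sequence reconstructs the full fragment. Real mergers also weigh base qualities when resolving mismatches, which this sketch ignores.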
In this tutorial we will work with reads from a thousand-year-old tuberculosis strain, which was studied in this research. You can download the original files here, but we will use a small subset of these reads to make processing faster. You can download the smaller files using this link.
Log into the InsideDNA application, navigate to the Files tab and create a folder called Ancient tuberculosis.
Upload the files with reads into this folder.
We will use the fastqc tool to check the quality of our raw data. To run it, navigate to the Terminal tab, connect to the virtual Terminal and enter the following command:
isub -t fastqc -c 4 -r 3.6 -e "/srv/dna_tools/fastqc/fastqc /data/userXXX/Ancient_tuberculosis/R1.fastq /data/userXXX/Ancient_tuberculosis/R2.fastq -o /data/userXXX/Ancient_tuberculosis/"
Here and below, replace XXX with your own user ID, which you can find in the header of the Terminal tab.
Press Enter to submit your task.
This task will produce fastqc reports for both read files and save them in the Ancient tuberculosis folder. You can monitor the progress of your task in the Tasks folder.
When fastqc has finished, go to the Files tab, open the Ancient tuberculosis folder and download the report files: R1_fastqc.html and R2_fastqc.html. Open these files in any browser and explore their content.
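The per-base quality plots in these reports are built from the quality string stored with each read. As a rough sketch of what those values mean, the snippet below decodes a quality string, assuming the Phred+33 encoding that the -phred33 option in the trimmomatic command later in this tutorial refers to. The function names and the example string are made up for illustration and are not part of fastqc.

```python
def phred33_scores(quality):
    """Convert a Phred+33 encoded quality string into integer scores."""
    return [ord(symbol) - 33 for symbol in quality]


def mean_quality(quality):
    """Return the mean Phred score of one read's quality string."""
    scores = phred33_scores(quality)
    if not scores:
        raise ValueError("empty quality string")
    return sum(scores) / len(scores)


def error_probability(score):
    """Probability that a base call with this Phred score is wrong."""
    return 10 ** (-score / 10)


# Made-up quality string: 'I' encodes score 40, '5' encodes 20, '!' encodes 0.
example_quality = "IIII55!!"
example_scores = phred33_scores(example_quality)
```

A score of 20 corresponds to a 1 in 100 chance of a wrong base call, and a score of 40 to 1 in 10,000.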
The main problem with these reads is contamination with adapter sequences, which we need to remove. We will use the trimmomatic tool for this purpose. The tool also separates paired reads from unpaired ones, which are often present even in files obtained from paired-end sequencing. The bbmerge tool needs only reads that have a pair.
To remove the adapters we need a file with Illumina TruSeq adapters for paired-end reads, which you can download here. Upload this file into the Ancient tuberculosis folder.
Enter the following command for the trimmomatic tool into the Terminal:
isub -t trimmomatic -c 4 -r 3.6 -e "java -jar /srv/dna_tools/trimmomatic_0.33/trimmomatic-0.33.jar PE -threads 4 -phred33
When the task is done, you will find the files with processed reads in the Ancient tuberculosis folder. We will use the ones labelled “paired” as input for the bbmerge tool.
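Before merging, it can be worth confirming that the two “paired” files really contain the same reads in the same order. The Python sketch below is one hedged way to do that. It assumes well-formed four-line FASTQ records and read names that may end in /1 or /2. The function names and the in-memory example records are hypothetical and are not part of trimmomatic or bbmerge.

```python
import io
from itertools import zip_longest


def read_ids(handle):
    """Yield the read name from each four-line FASTQ record."""
    for line_number, line in enumerate(handle):
        if line_number % 4 != 0:
            continue  # only header lines carry the read name
        header = line.rstrip("\n")
        if not header.startswith("@"):
            raise ValueError("line %d is not a FASTQ header" % (line_number + 1))
        fields = header[1:].split()
        name = fields[0] if fields else ""
        # Drop an optional /1 or /2 mate suffix so both mates compare equal.
        if name.endswith(("/1", "/2")):
            name = name[:-2]
        yield name


def pairs_in_sync(handle_r1, handle_r2):
    """Return True if both files list the same read names in the same order."""
    for id_1, id_2 in zip_longest(read_ids(handle_r1), read_ids(handle_r2)):
        if id_1 != id_2:
            return False
    return True


# Tiny in-memory example; real use would pass open file handles instead.
r1_text = "@read1/1\nACGT\n+\nIIII\n@read2/1\nTTGA\n+\nIIII\n"
r2_text = "@read1/2\nACGT\n+\nIIII\n@read2/2\nTCAA\n+\nIIII\n"
in_sync = pairs_in_sync(io.StringIO(r1_text), io.StringIO(r2_text))
```

If the check returns False, one file has a missing or extra read, or the order differs, and the pair of files should not be given to a merging tool as they are.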
To run bbmerge, enter the following command into Terminal:
isub -t bbmerge -c 4 -r 3.6 -e "/srv/dna_tools/bbmap-36.02/bbmerge.sh
When this task is finished, the file merged.fastq in the Ancient tuberculosis folder will contain the merged reads. You can view the merging statistics by running the following command in the Terminal:
As you can see, in our case about 70% of the input reads were successfully merged, which is a rather good result.
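The merge rate can also be cross-checked by hand: count the records in merged.fastq and divide by the number of read pairs given to bbmerge. The Python sketch below shows the arithmetic, assuming well-formed four-line FASTQ records. The function names and the tiny in-memory example are illustrative assumptions, not output from bbmerge.

```python
import io


def count_fastq_records(handle):
    """Count records in a FASTQ stream, assuming four lines per record."""
    lines = sum(1 for line in handle if line.strip())
    if lines % 4 != 0:
        raise ValueError("line count is not a multiple of four")
    return lines // 4


def merge_rate(merged_reads, input_pairs):
    """Return the percentage of input read pairs that were merged."""
    if input_pairs <= 0:
        raise ValueError("input_pairs must be positive")
    return 100.0 * merged_reads / input_pairs


# Made-up example: 2 merged records out of 4 input pairs.
merged_text = "@m1\nACGT\n+\nIIII\n@m2\nTTGA\n+\nIIII\n"
merged_count = count_fastq_records(io.StringIO(merged_text))
example_rate = merge_rate(merged_count, 4)
```

For real data, pass open file handles for merged.fastq and for one of the paired input files, since each record in a paired input file represents one pair.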
Well done, you have now learned some basics of processing ancient reads!