Team III Genome Assembly Group

From Compgenomics 2020
Revision as of 14:30, 19 February 2020 by Dkundnani3

Introduction/Background

[put picture of final pipeline here]

Lectures

Background/Strategy: Team3 Genome Assembly Background and Strategy

Results: Team3 Genome Assembly Results

Quality Control/Trimming

For quality control, we compared two tools, FastQC and fastp. We first confirmed that the two programs report the same information when run on the same FASTQ files, and then compared how each report presents it. FastQC color-codes its per-base sequence quality graphs, which are the plots we expect to rely on most in this project. fastp, however, generates interactive graphs and has the option to write its results only to a JSON file, which speeds it up further. Even without that option, fastp runs significantly faster than FastQC.
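To make concrete what a per-base sequence quality graph summarizes, here is a minimal Python sketch that computes the mean Phred score at each read position. It is not how FastQC or fastp implement the plot; it assumes Phred+33 encoding and strict four-line FASTQ records, and the function name `per_base_mean_quality` and the two example reads are made up for illustration.

```python
import io
from typing import Iterable, List


def per_base_mean_quality(fastq_lines: Iterable[str], offset: int = 33) -> List[float]:
    """Return the mean Phred score at each read position.

    Assumes four-line FASTQ records and Phred+33 quality encoding.
    """
    sums: List[int] = []    # running total of Phred scores per position
    counts: List[int] = []  # number of reads that cover each position
    for i, line in enumerate(fastq_lines):
        if i % 4 != 3:  # the quality string is the 4th line of each record
            continue
        for pos, ch in enumerate(line.rstrip("\r\n")):
            if pos == len(sums):  # first read long enough to reach this position
                sums.append(0)
                counts.append(0)
            sums[pos] += ord(ch) - offset
            counts[pos] += 1
    return [s / c for s, c in zip(sums, counts)]


# Two hypothetical reads: 'I' encodes Q40, '5' encodes Q20, '!' encodes Q0.
example = io.StringIO(
    "@read1\nACGT\n+\nIIII\n"
    "@read2\nACG\n+\nI5!\n"
)
means = per_base_mean_quality(example)
print(means)  # [40.0, 30.0, 20.0, 40.0]
```

The same function accepts an open file handle, for example `per_base_mean_quality(open("reads.fastq"))`, where `reads.fastq` is a placeholder name.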

Next, we compared fastp's trimming features with those of Trimmomatic. Our data did not contain adapters, so we did not need an additional tool such as Cutadapt to remove them. We showed that most, if not all, of Trimmomatic's trimming features can be replicated in fastp. fastp has the added advantage of combining quality control and trimming into a single step, which improves the speed and usability of our pipeline.
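As an illustration of the single-step approach, the sketch below builds a paired-end fastp command that trims and writes the QC report in one call. The file names, thresholds, and the helper name `build_fastp_command` are placeholders, not the settings our group used. The flag mapping follows the fastp documentation, which describes `--cut_front` as similar to Trimmomatic's LEADING, `--cut_right` as similar to SLIDINGWINDOW, and `--length_required` as playing the role of MINLEN. Disabling adapter trimming is an assumption here, based only on the data containing no adapters.

```python
import subprocess
from typing import List


def build_fastp_command(
    in1: str,
    in2: str,
    out1: str,
    out2: str,
    report_prefix: str,
    min_quality: int = 20,  # illustrative threshold, not a recommendation
    window_size: int = 4,   # illustrative window, as in SLIDINGWINDOW:4:20
    min_length: int = 50,   # illustrative minimum read length
    threads: int = 4,
) -> List[str]:
    """Build a fastp call that does trimming and QC reporting in one step."""
    return [
        "fastp",
        "--in1", in1, "--in2", in2,
        "--out1", out1, "--out2", out2,
        "--disable_adapter_trimming",            # assumption: reads carry no adapters
        "--cut_front",                           # roughly Trimmomatic LEADING
        "--cut_right",                           # roughly Trimmomatic SLIDINGWINDOW
        "--cut_window_size", str(window_size),
        "--cut_mean_quality", str(min_quality),
        "--length_required", str(min_length),    # roughly Trimmomatic MINLEN
        "--json", report_prefix + ".json",
        "--html", report_prefix + ".html",
        "--thread", str(threads),
    ]


cmd = build_fastp_command(
    "sample_1.fq.gz", "sample_2.fq.gz",
    "sample_1.trimmed.fq.gz", "sample_2.trimmed.fq.gz",
    "sample_fastp",
)
print(" ".join(cmd))

# To actually run it (requires fastp on PATH and the input files to exist):
# subprocess.run(cmd, check=True)
```

Passing the command as a list to `subprocess.run` avoids shell quoting problems with unusual file names.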

Assembly Tools

Reference-based vs. de novo assembly

Parameters considered when selecting assembly tool

Tools considered

ABySS

MaSuRCA

SPAdes

SKESA

StriDe

Post-Assembly Validation

QUAST

BUSCO

References