Team III Genome Assembly Group

From Compgenomics 2020
Amaddala3 (talk | contribs)

Revision as of 19:27, 1 February 2020

Introduction/Background

[put picture of final pipeline here]

Lectures

Background/Strategy: Media:Team3_genome_assembly_1.pdf

Quality Control/Trimming

For quality control, we compared two tools, FastQC and fastp. We first verified that the two programs report the same information when run on identical FASTQ files, and then compared how each presents that information in its report. FastQC color-codes its per-base sequence quality graphs, which are the graphs we expect to rely on most in this project. However, fastp generates interactive graphs and can be set to write its results only to a JSON file, which speeds it up further. In addition, fastp runs significantly faster than FastQC.
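To make concrete what a per-base sequence quality graph summarizes, the sketch below computes the mean Phred score at each read position from FASTQ text. It is only an illustration of the metric, not how FastQC or fastp compute it internally; the function name `per_base_mean_quality`, the example reads, and the assumptions of four-line FASTQ records with Phred+33 encoding are ours.

```python
def per_base_mean_quality(fastq_text, phred_offset=33):
    """Return the mean Phred quality at each read position.

    Assumes plain four-line FASTQ records (header, sequence, '+', quality)
    and Phred+33 encoding; both are assumptions of this sketch.
    """
    lines = fastq_text.strip().splitlines()
    columns = []  # columns[i] collects the quality scores seen at position i
    # The quality string is the fourth line of every four-line record.
    for i in range(3, len(lines), 4):
        for pos, ch in enumerate(lines[i].strip()):
            if pos == len(columns):
                columns.append([])
            columns[pos].append(ord(ch) - phred_offset)
    return [sum(col) / len(col) for col in columns]


# Two made-up reads of different lengths, for illustration only.
example = "@r1\nACGT\n+\nIIII\n@r2\nACG\n+\n5+!\n"
means = per_base_mean_quality(example)
print(means)
```

Running the same calculation on a file before and after trimming gives a quick numeric check of whether the quality profile changed as expected.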

Afterwards, we compared fastp's trimming features with those of Trimmomatic. Our data did not contain adapters, so we did not need an additional tool such as Cutadapt to remove them. We showed that most, if not all, of Trimmomatic's trimming features can be replicated in fastp. In addition, fastp combines quality control and trimming into a single step, which makes our pipeline faster and easier to use.
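As a rough illustration of how Trimmomatic-style steps might be expressed in fastp, the sketch below builds a fastp command line for paired-end reads. The helper name `fastp_trim_command`, the file names, the paired-end layout, and all threshold values are hypothetical and are not the settings we used. The approximate mapping assumed here is LEADING to `--cut_front` with a window size of 1, SLIDINGWINDOW to `--cut_right`, and MINLEN to `--length_required`. Check the flags against the installed fastp version before relying on them.

```python
import shlex


def fastp_trim_command(r1, r2, out1, out2, leading=3, window=4,
                       window_quality=20, min_length=36, threads=4):
    """Build (but do not run) a fastp command that approximates
    Trimmomatic's LEADING, SLIDINGWINDOW and MINLEN steps.

    All default values are illustrative assumptions.
    """
    return [
        "fastp",
        "-i", r1, "-I", r2,           # paired-end inputs (assumed layout)
        "-o", out1, "-O", out2,       # trimmed outputs
        "--disable_adapter_trimming",  # assumption: reads carry no adapters
        # Roughly LEADING:<leading>: trim low-quality bases from the 5' end.
        "--cut_front",
        "--cut_front_window_size", "1",
        "--cut_front_mean_quality", str(leading),
        # Roughly SLIDINGWINDOW:<window>:<window_quality>.
        "--cut_right",
        "--cut_right_window_size", str(window),
        "--cut_right_mean_quality", str(window_quality),
        # Roughly MINLEN:<min_length>: drop reads that end up too short.
        "--length_required", str(min_length),
        "--thread", str(threads),
        # QC reports are written in the same run as the trimming.
        "--json", "fastp.json",
        "--html", "fastp.html",
    ]


cmd = fastp_trim_command("sample_1.fq.gz", "sample_2.fq.gz",
                         "trimmed_1.fq.gz", "trimmed_2.fq.gz")
print(shlex.join(cmd))
```

The command is returned as a list so that it can be passed directly to `subprocess.run(cmd, check=True)` once fastp is installed, without shell quoting problems.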

Assembly Tools

Reference-based vs. de novo assembly

Parameters considered when selecting assembly tool

Tools considered

ABySS

MaSuRCA

SPAdes

SKESA

StriDe

Post-Assembly Validation

QUAST

BUSCO

References