Team II Comparative Genomics Group
Revision as of 13:26, 2 March 2020
Team 2: Comparative Genomics
Team Members: Kara Keun Lee, Courtney Astore, Kristine Lacek, Ujani Hazra, Jayson Chao
Introduction
What is Comparative Genomics?
Once genomes are fully assembled and annotated, outbreak analysis can begin via comparative genomics. Generally, metadata obtained from gene prediction and annotation can be used to map the relatedness of multiple isolates. Combined with epidemiological data, a given outbreak can be traced back to a particular source (patient zero) and tracked to determine which strains are outbreak isolates and which are sporadic cases. Furthermore, phenotypic features such as virulence, antibiotic resistance, and pathogenicity can be determined. Compiling these data allows recommendations to be made regarding human impact, treatment strategy, and management methods to limit further spread.
Objectives
- Identify kinds of strains (outbreak vs. sporadic)
- Construct phylogeny demonstrating which isolates are related and which differ
- Determine source of outbreak
- Map virulence and antibiotic resistance features of outbreak isolates
- Compile recommendations for outbreak response and treatment
Overview of Techniques
Phylogenomic analysis offers many ways to classify similarities and differences across genomes. Our approach draws on tools from three different techniques:
Hierarchical Clustering
MLST
SNP-based
SNP stands for Single Nucleotide Polymorphism: a single position (locus) in the genome at which the base differs between isolates, typically with two or three alternative bases observed. As SNPs accumulate through de novo mutations and are passed down through generations, comparing a given isolate's SNPs to those of other isolates and of a reference genome allows the phylogenetic distance between samples to be estimated. Tools have been developed to compare bases position by position (SNP calling) and to build matrices that quantify relatedness between samples based on shared SNPs.
Generalized Algorithm Overview:
- Pre-processing and read cleaning
- Mapping
- SNP calling against reference genome
- Phylogeny generation based on SNP profiles
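The last two steps above can be illustrated with a minimal Python sketch. It is a toy illustration, not the pipeline or tools used by the team: it assumes the reads have already been cleaned, mapped, and reduced to one consensus sequence per isolate, aligned to the reference at equal length. The function names (`snp_profile`, `snp_distance`, `distance_matrix`) and the example sequences are hypothetical.

```python
from itertools import combinations

VALID_BASES = set("ACGT")


def snp_profile(reference, sample):
    """Return {position: base} for every site where sample differs from reference.

    Assumption: both sequences are already aligned and of equal length.
    Sites with an ambiguous base (e.g. N) in either sequence are skipped,
    so they are treated as matching the reference.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    return {
        pos: s_base
        for pos, (r_base, s_base) in enumerate(zip(reference, sample))
        if r_base != s_base and r_base in VALID_BASES and s_base in VALID_BASES
    }


def snp_distance(profile_a, profile_b):
    """Count sites at which two SNP profiles disagree."""
    sites = set(profile_a) | set(profile_b)
    return sum(1 for pos in sites if profile_a.get(pos) != profile_b.get(pos))


def distance_matrix(reference, samples):
    """Build a symmetric pairwise SNP distance matrix as nested dicts."""
    profiles = {name: snp_profile(reference, seq) for name, seq in samples.items()}
    names = sorted(profiles)
    matrix = {a: {b: 0 for b in names} for a in names}
    for a, b in combinations(names, 2):
        d = snp_distance(profiles[a], profiles[b])
        matrix[a][b] = matrix[b][a] = d
    return matrix


if __name__ == "__main__":
    # Hypothetical toy data, not real isolates.
    reference = "ACGTACGTAC"
    samples = {
        "iso1": "ACGAACGTAC",
        "iso2": "ACGAACGTTC",
        "iso3": "TCGTACGTAC",
    }
    for name, row in distance_matrix(reference, samples).items():
        print(name, row)
```

A matrix like this could then be passed to a tree-building method such as neighbor joining. Isolates separated by few SNPs would cluster together as candidate outbreak strains, while distant isolates would be more consistent with sporadic cases.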