Team I Webserver Group
Members: Devishi Kesar, Shuheng Gan, Winnie Zheng, Priya Narayanan, Aaron Pfennig
Introduction
Background
Objective
- Provide a comprehensive, automated platform to analyze E. coli isolates in order to predict virulence factors and outbreak clusters
- Functionalities of the webserver:
- Identify virulence factors and antimicrobial resistance, and support outbreak response, for the provided isolates
- Allow data upload at each step of the outlined pipeline
- Visualize findings in a comprehensible way
- Design
- Intuitive usage
- Provide only essential options
WebServer
- Structure
- Access to Webserver
Here is the link to access our webserver:
Functionalities
Genome Assembly
- Performs de-novo assembly with FastQ files as input
- Runs the following tools by default:
- fastp: read pre-processing
- Unicycler: genome assembly
- Options:
- Perform read preprocessing
- k-mer size
- SPAdes as an alternative assembly method
- The input FastQ files must be paired-end reads
- Output: FASTA file
- Visualisation: QUAST output
- For more details, visit: Team1_Genome_Assembly
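The default and optional tools listed above can be sketched as a small command builder. This is only an illustration, not the webserver's actual code: the function name, output file names, and directory layout are hypothetical, and it is an assumption that the k-mer option is passed through to the chosen assembler. The flags shown (`-i/-I/-o/-O` for fastp, `-1/-2/-o/--kmers` for Unicycler, `-1/-2/-o/-k` for SPAdes, `-o` for QUAST) are the tools' standard command-line options.

```python
from pathlib import Path


def build_assembly_commands(reads_1, reads_2, out_dir, preprocess=True,
                            assembler="unicycler", kmers=None):
    """Return the command lines (as argument lists) for one paired-end isolate.

    Hypothetical helper: it only builds the commands and does not run them.
    """
    out = Path(out_dir)
    commands = []

    if preprocess:
        # fastp trims and filters both mates of the paired-end reads
        trimmed_1 = str(out / "trimmed_1.fastq.gz")
        trimmed_2 = str(out / "trimmed_2.fastq.gz")
        commands.append(["fastp", "-i", reads_1, "-I", reads_2,
                         "-o", trimmed_1, "-O", trimmed_2])
        reads_1, reads_2 = trimmed_1, trimmed_2

    asm_dir = out / assembler
    if assembler == "unicycler":
        cmd = ["unicycler", "-1", reads_1, "-2", reads_2, "-o", str(asm_dir)]
        if kmers:
            cmd += ["--kmers", ",".join(map(str, kmers))]
        assembly = asm_dir / "assembly.fasta"   # Unicycler's final assembly
    elif assembler == "spades":
        cmd = ["spades.py", "-1", reads_1, "-2", reads_2, "-o", str(asm_dir)]
        if kmers:
            cmd += ["-k", ",".join(map(str, kmers))]
        assembly = asm_dir / "contigs.fasta"    # SPAdes contigs
    else:
        raise ValueError(f"unknown assembler: {assembler}")
    commands.append(cmd)

    # QUAST report used for the visualisation step
    commands.append(["quast.py", str(assembly), "-o", str(out / "quast")])
    return commands
```

Each returned list could be passed to `subprocess.run(cmd, check=True)` on a machine where the tools are installed.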
Gene Prediction
- Gene finding in assembled isolates (from FastQ input) or in a provided FASTA file
- Runs the following tools by default:
- CDS: Prodigal
- tRNA: Aragorn
- rRNA: barrnap
- Options:
- GeneMarkS-2 as alternative tool for CDS predictions
- tRNAscan-SE as alternative tool for tRNA predictions
- RNAmmer as alternative tool for rRNA predictions
- Output: *.gff file, *_cds.fna file, *_protein.faa file and *_rna.fna file
- For more details, visit: Team1_Gene_Prediction
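As a rough sketch of how the three default tools could be invoked to produce the output files listed above, the helper below builds one command per tool. The function name and the intermediate `*_trna.fna` and `*_rrna.fna` file names are hypothetical; how the webserver merges the RNA predictions into `*_rna.fna` and into the final `*.gff` is not described here, so that step is left out.

```python
def build_gene_prediction_commands(assembly, prefix):
    """Return default-tool command lines and the expected output file names.

    Hypothetical helper: it only builds the commands and does not run them.
    """
    outputs = {
        "gff": f"{prefix}.gff",
        "cds": f"{prefix}_cds.fna",
        "protein": f"{prefix}_protein.faa",
        "rna": f"{prefix}_rna.fna",  # merged tRNA + rRNA; merging not shown
    }
    commands = [
        # Prodigal: CDS predictions as GFF plus nucleotide and protein FASTA
        ["prodigal", "-i", assembly, "-f", "gff", "-o", outputs["gff"],
         "-d", outputs["cds"], "-a", outputs["protein"]],
        # Aragorn: tRNA genes only (-t), sequences in FASTA format only (-fo)
        ["aragorn", "-t", "-fo", "-o", f"{prefix}_trna.fna", assembly],
        # barrnap: bacterial rRNA; GFF is written to stdout, sequences to --outseq
        ["barrnap", "--kingdom", "bac", "--outseq", f"{prefix}_rrna.fna", assembly],
    ]
    return commands, outputs
```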
Functional Annotation
- Obtain functional information about predicted genes
- Input: FASTA file
- Clustering tool: USEARCH
- Output: centroid.fasta
- Homology Tools:
- General annotation: InterProScan, eggNOG-mapper
- Antibiotic resistance genes: DeepARG
- Ab initio tools:
- Signal Peptides: SignalP 5.0
- Transmembrane Proteins: TMHMM
- CRISPR sites: PILER-CR
- Output: *.tsv file
- For more details, visit: Team1_Functional_Annotation
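Clustering before annotation means the homology tools only need to run on the centroid sequences. One possible way to carry the centroid annotations back to every clustered gene is sketched below. It assumes USEARCH was run with `-cluster_fast` and a `-uc` cluster file in addition to `-centroids`; whether the webserver does this is not stated here, and both function names are hypothetical.

```python
from collections import defaultdict


def clusters_from_uc(uc_lines):
    """Map each centroid label to the labels of all sequences in its cluster.

    Reads the tab-separated USEARCH .uc format: field 1 is the record type,
    field 9 the query label, field 10 the target (centroid) label.
    """
    clusters = defaultdict(list)
    for line in uc_lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 10:
            continue
        record_type, query, target = fields[0], fields[8], fields[9]
        if record_type == "S":      # seed: the sequence is its own centroid
            clusters[query].append(query)
        elif record_type == "H":    # hit: the query joins the target's cluster
            clusters[target].append(query)
        # "C" records only summarise a cluster and are skipped
    return dict(clusters)


def propagate_annotations(clusters, centroid_annotations):
    """Give every cluster member the annotation of its centroid, if any."""
    return {
        member: centroid_annotations[centroid]
        for centroid, members in clusters.items()
        if centroid in centroid_annotations
        for member in members
    }
```

With a file on disk, `clusters_from_uc(open("clusters.uc"))` would work the same way, since the function only iterates over lines.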
Comparative Genomics
- Comparison of genomic features of input files to identify outbreak clusters
- Input: FASTA file, Prodigal training file (for chewBBACA)
- Tools used:
- MUMmer 4.0
- chewBBACA
- kSNP 3.0
- FigTree
- Options:
- Parsimony, maximum-likelihood and neighbour-joining trees as options for kSNP
- k-mer size option for kSNP
- Output: .tsv file (for chewBBACA, MUMmer), .png (kSNP)
- Visualisation: phylogenetic tree for identified SNPs, phylogenetic tree for MLST, graph for epidemiological data visualisation
- For more details, visit: Team1_Comparative_Genomics
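To illustrate how an allele table can point to an outbreak cluster, the sketch below counts pairwise allele differences between isolates. This is one common approach and not necessarily what the webserver does. It assumes a tab-separated table with one row per isolate and one column per locus, as in chewBBACA's allele-call output; treating `INF-` calls as regular alleles and all other non-numeric codes (for example `LNF`) as missing is also an assumption. Both function names are hypothetical.

```python
import csv
import io
from itertools import combinations


def parse_allele(value):
    """Return an integer allele ID, or None for a missing or uncallable locus."""
    value = value.strip()
    if value.startswith("INF-"):   # assumed: newly inferred allele, keep its ID
        value = value[4:]
    return int(value) if value.isdigit() else None


def allelic_distances(tsv_text):
    """Count differing loci for every pair of isolates in an allele table.

    Loci that are missing in either isolate are ignored for that pair.
    """
    rows = list(csv.reader(io.StringIO(tsv_text), delimiter="\t"))
    profiles = {
        row[0]: [parse_allele(v) for v in row[1:]]
        for row in rows[1:] if row      # rows[0] is the header line
    }
    distances = {}
    for a, b in combinations(sorted(profiles), 2):
        distances[(a, b)] = sum(
            1 for x, y in zip(profiles[a], profiles[b])
            if x is not None and y is not None and x != y
        )
    return distances
```

Isolates separated by only a few allele differences would be candidates for the same outbreak cluster; any cut-off used for that decision would have to be chosen and validated separately.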
Method
How to build the web server
Webserver Demo
- Choice One: Running General Pipeline
- Choice Two: Running each step separately
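The two choices can be pictured as selecting either the remaining steps from an entry point or a single step. The sketch below is a minimal illustration, assuming the steps run in the order of the Functionalities sections above; the step names and the function are hypothetical and are not taken from the webserver's code.

```python
# Assumed step order, following the Functionalities sections of this page
PIPELINE_STEPS = [
    "genome_assembly",
    "gene_prediction",
    "functional_annotation",
    "comparative_genomics",
]


def steps_to_run(start="genome_assembly", single_step=False):
    """Return the steps to execute for an upload made at the given step."""
    if start not in PIPELINE_STEPS:
        raise ValueError(f"unknown step: {start}")
    if single_step:
        # Choice Two: run only the selected step
        return [start]
    # Choice One: run the selected step and everything after it
    return PIPELINE_STEPS[PIPELINE_STEPS.index(start):]
```

For example, `steps_to_run()` would select all four steps, while `steps_to_run("gene_prediction", single_step=True)` would select gene prediction only.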