Team II Webserver Group

From Compgenomics 2020

Revision as of 16:26, 21 April 2020

Members: Paarth Parekh, Shivam Sharma, Sooyoun Oh, Jayson Chao, Hanchen Wang

Introduction

Background

  • Purpose: 
    • Investigate an unknown outbreak pathogen using raw genome sequence data from the Centers for Disease Control and Prevention (CDC) foodborne illness surveillance outbreak investigations
  • Goal:
    • Create a predictive web server that automates the characterization of Campylobacter jejuni isolates and makes recommendations for outbreak control.

Objective

  • Assemble the input reads
  • Analyze the assembly and predict annotated genes
  • Identify the strain and display it in a phylogenetic tree (or heatmap)
  • Calculate the distance from the strains in the existing database
  • Profile virulence factors and antimicrobial resistance
  • Visualize results in an effective manner

Design Goals

  • Mobile friendly
  • Easy to use
  • Minimal

Basic Pipeline Structure

This is the basic diagram of our pipeline. It shows the input that each stage takes and the output it produces.

Framework

Django: Back-end development connects the server side of our pipeline and database with the browser. We used Django, a Python web framework, because it scales by adding hardware at any level and can handle large amounts of traffic. It is also easy to implement and lets the developer focus on each piece of functionality separately, without getting into the underlying complexities.

Why Django?

  • Compatibility with Python code: Django easily incorporates the backbone scripts from the other groups.
  • Database integration: Django has built-in support for many popular databases, while PHP must use outside packages to handle databases.
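  • Security: Django includes built-in protection against common attacks (SQL injection, cross-site scripting, CSRF), which plain PHP leaves to the developer.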

  • Database accessibility: Django has an ORM system, which makes database manipulation easier than using SQL.
  • Scalability: Django is designed for bigger projects than Flask.
  • Community support: Django has a larger following, and it is easier to find troubleshooting support.

Front End

For front-end programming we used:

  • Bootstrap, a popular framework for building responsive websites
  • HTML 5 doctype (the latest design and development standard)
  • CSS stylesheet: style of website
  • JavaScript plugin support (jQuery): alerts, buttons, dropdowns, tooltips

Database

Django can connect to MySQL, SQLite, and PostgreSQL. We use SQLite for our database because it is lightweight and does not need a separate database server (unlike MySQL).
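As a rough illustration of this choice, a Django settings file selects SQLite through its DATABASES setting. The snippet below is a minimal sketch, not our actual settings file: the project root and the database file name are assumptions.

```python
from pathlib import Path

# Assumed project root; a real settings.py usually derives this from __file__.
BASE_DIR = Path(".").resolve()

# SQLite needs only a file path: no host, port, user, or password.
DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.sqlite3",
        "NAME": BASE_DIR / "db.sqlite3",  # hypothetical file name
    }
}
```

Switching to MySQL or PostgreSQL would mean changing ENGINE and adding connection details, which is the extra server setup that SQLite avoids.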


Features

Genome Assembly

  • Performs de novo assembly with FASTQ files as input
  • Runs the following tools:
    • fastp: read pre-processing
    • SPAdes: genome assembly
  • The input FASTQ files must be paired-end reads
  • For information on the tools, visit: Team2_Genome_Assembly
  • Outputs a FASTA file
  • Visualisation: QUAST output
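The assembly steps listed above could be chained roughly as follows. This is a sketch under stated assumptions, not the web server's actual code: the function name, directory layout, and trimmed-read file names are hypothetical, and only the basic paired-end options of each tool are used.

```python
from pathlib import Path


def assembly_commands(reads_1, reads_2, work_dir):
    """Return the fastp, SPAdes, and QUAST command lines for one paired-end sample."""
    work = Path(work_dir)
    trimmed_1 = work / "trimmed_R1.fastq.gz"  # hypothetical file names
    trimmed_2 = work / "trimmed_R2.fastq.gz"
    spades_dir = work / "spades"
    contigs = spades_dir / "contigs.fasta"  # SPAdes writes its contigs here

    # fastp: -i/-I are the paired inputs, -o/-O the trimmed outputs.
    fastp = ["fastp", "-i", str(reads_1), "-I", str(reads_2),
             "-o", str(trimmed_1), "-O", str(trimmed_2)]
    # SPAdes: -1/-2 are the paired reads, -o the output directory.
    spades = ["spades.py", "-1", str(trimmed_1), "-2", str(trimmed_2),
              "-o", str(spades_dir)]
    # QUAST: assembly FASTA followed by the report directory.
    quast = ["quast.py", str(contigs), "-o", str(work / "quast")]
    return [fastp, spades, quast]


commands = assembly_commands("sample_R1.fastq.gz", "sample_R2.fastq.gz", "job42")
for cmd in commands:
    print(" ".join(cmd))
```

Each list can be passed to subprocess.run(cmd, check=True) in order, so a failing step stops the job before the next tool starts.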

Gene Prediction

  • Finds genes in isolates assembled by Genome Assembly or in a user-provided FASTA file
  • Runs the following tools:
    • GeneMarkS-2 or Prodigal for CDS prediction
    • ARAGORN for tRNA prediction
    • Barrnap for rRNA prediction
  • For more information on the tools, visit: Team2_Gene_Prediction
  • Outputs:
    • For CDS: *.gff file, *.fna file, *.faa file
    • For tRNA: *.fa file
    • For rRNA: *.gff file, *.fa file
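To show how the CDS and rRNA outputs listed above map onto tool invocations, here is a hedged sketch for Prodigal and Barrnap only (the GeneMarkS-2 and tRNA steps are left out). The function names and output prefixes are hypothetical, and the Barrnap call assumes a version that supports the --outseq option.

```python
def prodigal_command(assembly, prefix):
    """Prodigal call that writes the three CDS outputs (*.gff, *.fna, *.faa)."""
    return ["prodigal", "-i", assembly, "-f", "gff",
            "-o", f"{prefix}.gff",   # gene coordinates
            "-d", f"{prefix}.fna",   # nucleotide sequences of the genes
            "-a", f"{prefix}.faa"]   # protein translations


def barrnap_command(assembly, prefix):
    """Barrnap call for bacterial rRNA. The GFF is printed to stdout, so the
    caller must redirect it to a file; --outseq is assumed to be available."""
    return ["barrnap", "--kingdom", "bac",
            "--outseq", f"{prefix}_rRNA.fa", assembly]


cds = prodigal_command("contigs.fasta", "sample_cds")
rrna = barrnap_command("contigs.fasta", "sample")
print(" ".join(cds))
print(" ".join(rrna))
```

When running the Barrnap command with subprocess.run, passing an open file as stdout captures the *.gff output.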

Website Architecture

  • Server
    • We use Nginx as a reverse proxy in front of Gunicorn for our predictive web server.
    • Gunicorn is suited to Python-based web applications and interacts directly with our Django project.
    • Nginx sits on the outer layer, interacts directly with clients, and manages security protocols.
    • Nginx handles large files and manages the server load efficiently.
  • Async Structure for Long Processes
    • Celery (Python) is an asynchronous task/job queue, ideal for running long jobs in the background and updating the user once the job is done. Celery integrates with Django and also supports efficient error handling.
    • Email: We use SendGrid, a cloud-based platform, to send an email to the user once their job is finished, through Django's wrappers around the SMTP protocol.


  • Webpage Workflow

This is the entire workflow of our webpage, with the blue indicator showing the parts of the pipeline stored in our database.
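The asynchronous pattern described under "Async Structure for Long Processes" (queue the job, return to the user immediately, send an email when it finishes) can be sketched with the Python standard library alone. This is a stand-in, not our Celery or SendGrid code: a thread pool plays the role of the Celery worker, the email is only composed and never sent, and all function names, addresses, and paths are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from email.message import EmailMessage

# Stand-in for the Celery worker pool: a single background thread.
executor = ThreadPoolExecutor(max_workers=1)


def run_pipeline(job_id):
    """Hypothetical long-running job; the real pipeline would run here."""
    return f"results/{job_id}.zip"


def build_notification(recipient, job_id, result_path):
    """Compose the 'job finished' email; sending is left to the mail backend."""
    msg = EmailMessage()
    msg["To"] = recipient
    msg["From"] = "noreply@example.org"  # placeholder sender address
    msg["Subject"] = f"Job {job_id} finished"
    msg.set_content(f"Your results are ready: {result_path}")
    return msg


def submit_job(job_id, recipient):
    """Queue the job and return at once; the email is built when the job ends."""
    def task():
        result_path = run_pipeline(job_id)
        return build_notification(recipient, job_id, result_path)
    return executor.submit(task)


future = submit_job("job42", "user@example.org")
message = future.result()  # blocks only here, for demonstration
print(message["Subject"])
executor.shutdown()
```

In the deployed server the web request would return right after submitting the job, and the worker process, not the request handler, would send the message.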


Access to Webserver

Here is the link to access our web server: Cabunicrisis-Team2_webserver.

Here is our final presentation for Webserver: File:Team-2 Web Server Final.pdf
