Southwest-University and EEB BootCamp 2020: Difference between pages

<center>'''Biomedical Genomics'''</center>
<center>Bioinformatics Boot Camp for Ecology & Evolution: '''Pathogen Evolutionary Genomics'''</center>
<center>July 8-19, 2019</center>
<center>Thursday, Aug 6, 2020, 2 - 3:30pm</center>
<center>'''Instructor:''' Weigang Qiu, Ph.D.<br>Professor, Department of Biological Sciences, City University of New York, Hunter College & Graduate Center<br>Adjunct Faculty, Department of Physiology and Biophysics,
<center>'''Instructors:''' Dr Weigang Qiu & Ms Saymon Akther</center>
Institute for Computational Biomedicine, Weill Cornell Medical College</center>
<center>'''Office:''' B402 Belfer Research Building, 413 East 69th Street, New York, NY 10021, USA</center>
<center>'''Email:''' weigang@genectr.hunter.cuny.edu</center>
<center>'''Lab Website:''' http://diverge.hunter.cuny.edu/labwiki/</center>
<br>
<center>
<center>'''Host''': Shunqin Zhu (祝顺琴), Ph.D.<br>Associate Professor, School of  Life Science, South West University</center>
----
[[File:Lp54-gain-loss.png|300px|thumbnail|Figure 1. Gains & losses of host-defense genes among Lyme pathogen genomes ([https://www.ncbi.nlm.nih.gov/pubmed/24704760 Qiu & Martin 2014])]]
==Course Overview==
Welcome to Biomedical Genomics, a computer workshop for advanced undergraduates and graduate students. A genome is the total genetic content of an organism. Driven by breakthroughs such as the decoding of the first human genome and next-generation DNA-sequencing technologies, the biomedical sciences are undergoing a rapid and irreversible transformation into a highly data-intensive field.
 
Genome information is revolutionizing virtually all aspects of the life sciences, including basic research, medicine, and agriculture. Meanwhile, using genomic data requires life scientists to be familiar with concepts and skills in biology, computer science, and data analysis.
 
This workshop introduces the computational analysis of genomic data through hands-on exercises based on published studies.
 
The prerequisites for this course are college-level courses in molecular biology, cell biology, and genetics. Introductory courses in computer programming and statistics are preferred but not strictly required.
 
==Learning goals==
By the end of this course successful students will be able to:
* Describe next-generation sequencing (NGS) technologies & contrast them with traditional Sanger sequencing
* Explain applications of NGS technology including pathogen genomics, cancer genomics, human genomic variation, transcriptomics, meta-genomics, epi-genomics, and microbiome.
* Visualize and explore genomics data using RStudio
* Replicate key results using a raw data set produced by a primary research paper
 
==Web Links==
* Install R base: https://cloud.r-project.org
* Install RStudio (Desktop version): http://www.rstudio.com/download
* Download: [http://www.r4all.org/books/datasets R datasets]
* A reference book: [https://r4ds.had.co.nz/ R for Data Science (Wickham & Grolemund)]
 
==Quizzes and Exams==
Student performance will be evaluated by attendance, five assignments, two quizzes, a mid-term exam, and a final presentation:
* Attendance: 50 pts
* Assignments: 5 x 10 = 50 pts
* Quizzes: 2 x 25 pts = 50 pts
* Mid-term: 50 pts
* Final presentation: 50 pts
Total: 250 pts
 
==Course Schedule==
{| class="wikitable"
{| class="wikitable"
|-
|-
! Date & Hour !! Tutorials !! Assignment !! Quiz & Exam
! Lyme Disease (Borreliella) !! CoV Genome Tracker !! Coronavirus evolution
|-
|-
| July 8 (Mon), 8:40-12:10 || Introduction; R Tutorial I;
| [[File:Lp54-gain-loss.png|300px|thumbnail| Gains & losses of host-defense genes among Lyme pathogen genomes (Qiu & Martin 2014)]] ||  
[[File:R-part-1-small.pdf|thumbnail|Lecture slides]]
[[File:Cov-screenshot-1.png|300px|thumbnail| [http://cov.genometracker.org/ Haplotype network] ]]
||  
Assignment #1 (create a Word document including scripts & graphs, i.e., compile your work into a lab report; due tomorrow)
* Install R/RStudio and the "tidyverse" package on your own computer
* Recreate Script 1 & Mini-Practical
* Show help page for function "seq"
* Download dataset
** Create a new folder (e.g., Desktop/rtutor)
** Create a sub-folder (e.g., Desktop/rtutor/data/)
** Download from http://www.r4all.org/the-book/datasets
** Save to the sub-folder
** Unzip the file
 
  ||
|-
| July 9 (Tu), 8:40-12:10 || NGS; R Tutorial II ||
Assignment #2
* List pros & cons of Sanger vs NGS
* Compare accuracy, read length, and error rate between Illumina and PacBio
* Describe sequence information captured with each of the following file formats: FASTA, FASTQ, SAM, VCF
* Wide vs Tall data frames
* Variable names (informative, case sensitive)
* Read file
||  
||  
|-
  [[File:Cov-screenshot-2.png|300px|thumbnail| Spike protein alignment ]]
| July 10 (Wed), 8:40-12:10 || Microbiome I; R Tutorial III ||
Assignment #3
|| Quiz I
|-
| July 11 (Thur), 8:40-12:10 || Microbiome II; R Tutorial IV ||
Assignment #4
||
|-
| July 12 (Fri), 8:40-12:10 ||  || || Mid-term Exam
|-
| Weekend || Break
|-
| July 15 (Mon), 8:00-12:10 || Transcriptome; R Tutorial V ||
Assignment #5
  ||
|-
| July 16 (Tu), 8:00-12:10 || Proteome ||
||
|-
| July 17 (Wed), 8:00-12:10 || Genomics I ||
|| Quiz II
|-
| July 18 (Thur), 8:00-12:10 || Genomics II  || ||
|-
| July 19 (Fri), 8:00-12:10|| Presentations
|}
|}
</center>
----
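Assignment #2 in the schedule above asks what information the FASTA and FASTQ formats capture. As a minimal sketch (not part of the course materials), the Python below reads both formats from toy in-memory text. It assumes four-line FASTQ records with Phred+33 quality encoding; the function names and the toy records are hypothetical. The point it illustrates is that FASTA stores only an identifier and a sequence, whereas FASTQ adds one quality score per base.

```python
from io import StringIO


def read_fasta(handle):
    """Yield (identifier, sequence) pairs; sequences may span several lines."""
    name, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            # Keep only the first word of the header as the identifier.
            name, chunks = line[1:].split()[0], []
        else:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def read_fastq(handle):
    """Yield (identifier, sequence, scores) from four-line FASTQ records."""
    while True:
        header = handle.readline().strip()
        if not header:
            break
        seq = handle.readline().strip()
        handle.readline()  # the '+' separator line carries no new information
        qual = handle.readline().strip()
        # Assumption: Phred+33 encoding, so '!' is 0 and 'I' is 40.
        scores = [ord(ch) - 33 for ch in qual]
        yield header[1:].split()[0], seq, scores


# Toy records, invented for illustration only.
fasta_text = ">seq1 toy record\nACGT\nACGT\n>seq2\nTTGA\n"
fastq_text = "@read1\nACGT\n+\nIIII\n@read2\nTTGA\n+\n!!5I\n"

fasta_records = list(read_fasta(StringIO(fasta_text)))
fastq_records = list(read_fastq(StringIO(fastq_text)))

for name, seq in fasta_records:
    print(name, seq)
for name, seq, scores in fastq_records:
    print(name, seq, scores)
```

SAM and VCF build on the same idea: SAM adds alignment coordinates for each read, and VCF records only the positions where samples differ from a reference. Real data sets are better handled with an established parser than with hand-written readers like these.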


==Papers & Datasets==
==Case studies from Qiu Lab==
{| class="wikitable sortable"
* [http://borreliabase.org Comparative genomics of worldwide Lyme disease pathogens]
|-
* [http://cov.genometracker.org Covid-19 Genome Tracker]  
! Omics Application !! Paper link !! Data set !! NGS Technology
 
|-
==Bioinformatics Tools & Learning Goals==
| Microbiome || [https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0193652 Rimoldi_etal_2018_PlosOne] || [https://doi.org/10.1371/journal.pone.0193652.s004 S1 Dataset] || 16S rDNA amplicon sequencing
* BpWrapper: commandline tools for sequence, alignment, and tree manipulations (based on BioPerl).
|-
** [https://github.com/bioperl/p5-bpwrapper Github Link]
| Transcriptome || [https://science.sciencemag.org/content/350/6264/1096 Wang_etal_2015_Science] || Tables S2 & S4 || RNA-Seq
** [https://bmcbioinformatics.biomedcentral.com/articles/10.1186/s12859-018-2074-9/figures/1 Flowchart from publication]
|-
* Haplotype network with TCS [https://pubmed.ncbi.nlm.nih.gov/11050560/ PubMed link]
| Transcriptome & Regulome || [https://bmcmedgenomics.biomedcentral.com/articles/10.1186/s12920-019-0477-8 Nava_etal_2019_BMCGenomics] || Tables S2 & S3 || RNA-Seq & ChIP-Seq
* Web-interactive visualization with [http://D3js.org D3js]
|-
** [https://github.com/sairum/tcsBU Github link]
| Proteome || [https://www.ncbi.nlm.nih.gov/pubmed/28232952 Qiu_etal_2017_NPJ] || (to be posted) || SILAC
** [https://cibio.up.pt/software/tcsBU/index.html Web tool]
|-
** [https://academic.oup.com/bioinformatics/article/32/4/627/1744448 Paper]
| Population genomics (Lyme) || [https://jcm.asm.org/content/56/11/e00940-18.long Di_etal_2018_JCM] || [https://github.com/weigangq/ocseq Data & R codes] || Amplicon sequencing (antigen locus)
 
|-
==Tutorial==
| Population genomics/GWAS (Human) || [https://science.sciencemag.org/content/351/6274/737.long Simonti_etal_2016_Science] || [https://science.sciencemag.org/highwire/filestream/673591/field_highwire_adjunct_files/1/aad2149-Simonti-SM.Table.S2.xlsx Table S2] || Whole-genome sequencing (WGS); [http://www.internationalgenome.org/ 1000 Genomes Project (IGSR)]
* 2-2:30: Introduction to pathogen phylogenomics
|-
* 2:30-2:45: data pre-processing with BpWrapper
| TB surveillance || [https://jcm.asm.org/content/53/7/2230 Brow_etal_2015]  || [https://www.ebi.ac.uk/ena/data/view/PRJEB9206 Sequence Archives]|| Whole-genome sequencing (WGS)
* 2:45-3:00: build haplotype network with TCS
|-
* 3:00-3:15: interactive visualization with tcsBU
* 3:15-3:30: Q & A
|}

Revision as of 06:50, 26 July 2020

Bioinformatics Boot Camp for Ecology & Evolution: Pathogen Evolutionary Genomics
Thursday, Aug 6, 2020, 2 - 3:30pm
Instructors: Dr Weigang Qiu & Ms Saymon Akther
Email: weigang@genectr.hunter.cuny.edu
Lab Website: http://diverge.hunter.cuny.edu/labwiki/
Lyme Disease (Borreliella) CoV Genome Tracker Coronavirus evolution
Gains & losses of host-defense genes among Lyme pathogen genomes (Qiu & Martin 2014)
Spike protein alignment

Case studies from Qiu Lab

Bioinformatics Tools & Learning Goals

Tutorial

  • 2-2:30: Introduction to pathogen phylogenomics
  • 2:30-2:45: data pre-processing with BpWrapper
  • 2:45-3:00: build haplotype network with TCS
  • 3:00-3:15: interactive visualization with tcsBU
  • 3:15-3:30: Q & A
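
The pre-processing and network-building steps in the tutorial can be illustrated with a small sketch. The Python below is not BpWrapper or TCS. It shows only the idea behind the input to a haplotype network: identical aligned sequences are collapsed into unique haplotypes, and the number of differing sites is counted for each pair. TCS itself uses statistical parsimony to decide which haplotypes to connect, and that step is not reproduced here. The isolate names, sequences, and function names are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def collapse_haplotypes(aligned):
    """Group sample names by identical aligned sequence (one group per haplotype)."""
    groups = defaultdict(list)
    for name, seq in aligned.items():
        groups[seq.upper()].append(name)
    return dict(groups)


def hamming(a, b):
    """Count the sites at which two aligned sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for x, y in zip(a, b) if x != y)


def pairwise_differences(haplotypes):
    """Return {(hap1, hap2): number of differing sites} for every haplotype pair."""
    return {(a, b): hamming(a, b) for a, b in combinations(sorted(haplotypes), 2)}


# Toy alignment, invented for illustration only.
toy_alignment = {
    "isolate_A": "ACGTACGT",
    "isolate_B": "ACGTACGT",
    "isolate_C": "ACGTACGA",
    "isolate_D": "TCGTACGA",
}

haplotypes = collapse_haplotypes(toy_alignment)
distances = pairwise_differences(haplotypes)

for seq, samples in haplotypes.items():
    print(seq, len(samples), samples)
for (a, b), d in distances.items():
    print(a, b, d)
```

In a network drawing, each unique haplotype becomes a node sized by its sample count, and pairs that differ at a single site are the natural candidates for direct links.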