Southwest-University

From QiuLab
Revision as of 21:30, 14 July 2019 by imported>Weigang (→‎Course Schedule)
Biomedical Genomics
July 8-19, 2019
Instructor: Weigang Qiu, Ph.D.
Professor, Department of Biological Sciences, City University of New York, Hunter College & Graduate Center
Adjunct Faculty, Department of Physiology and Biophysics, Institute for Computational Biomedicine, Weill Cornell Medical College
Office: B402 Belfer Research Building, 413 East 69th Street, New York, NY 10021, USA
Email: weigang@genectr.hunter.cuny.edu
Lab Website: http://diverge.hunter.cuny.edu/labwiki/


Host: Shunqin Zhu (祝顺琴), Ph.D.
Associate Professor, School of Life Science, Southwest University

Figure 1. Gains & losses of host-defense genes among Lyme pathogen genomes (Qiu & Martin 2014)

Course Overview

Welcome to Biomedical Genomics, a computer workshop for advanced undergraduates and graduate students. A genome is the total genetic content of an organism. Driven by breakthroughs such as the decoding of the first human genome and next-generation DNA-sequencing technologies, biomedical sciences are undergoing a rapid and irreversible transformation into a highly data-intensive field.

Genome information is revolutionizing virtually all aspects of the life sciences, including basic research, medicine, and agriculture. Meanwhile, using genomic data requires life scientists to be familiar with concepts and skills in biology, computer science, and data analysis.

This workshop introduces computational analysis of genomic data through hands-on exercises based on published studies.

The prerequisites for the course are college-level courses in molecular biology, cell biology, and genetics. Introductory courses in computer programming and statistics are preferred but not strictly required.

Learning goals

By the end of this course successful students will be able to:

  • Describe next-generation sequencing (NGS) technologies & contrast them with traditional Sanger sequencing
  • Explain applications of NGS technology, including pathogen genomics, cancer genomics, human genomic variation, transcriptomics, metagenomics, epigenomics, and microbiome analysis
  • Visualize and explore genomics data using RStudio
  • Replicate key results using a raw data set produced by a primary research paper

Web Links

Quizzes and Exams

Student performance will be evaluated by attendance, five assignments, two open-book quizzes, a take-home mid-term, and a final presentation:

  • Attendance: 50 pts
  • Assignments: 5 x 10 pts = 50 pts
  • Open-book Quizzes: 2 x 25 pts = 50 pts
  • Take-home Mid-term: 50 pts
  • Final presentation: 50 pts

Total: 250 pts

Course Schedule

Date & Hour Tutorials & Lectures Assignment Quiz & Exam
July 8 (Mon), 8:40-12:10 Introduction; R Tutorial I;

Assignment #1 (create a Word document including scripts & graphs, i.e., compile your work into a lab report; due tomorrow)

  • Install R/RStudio and the "tidyverse" package on your own computer
  • Recreate Script 1 & Mini-Practical
  • Show help page for function "seq"
  • Download dataset
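
The setup steps in Assignment #1 can be sketched as follows. This is a minimal sketch, not the official solution: the installation and loading lines are commented out so the script can be re-run safely, and the example call to seq() (with the variable name odd_numbers) is an illustration that does not come from the course material.

```r
# One-time setup: uncomment and run once to install the package collection.
# install.packages("tidyverse")

# Load the tidyverse at the start of each R session (after installing it).
# library(tidyverse)

# Show the help page for the function "seq".
# In RStudio, typing ?seq at the console opens the same page.
help("seq")

# Try the function using its documented from/to/by arguments.
# "odd_numbers" is an illustrative name, not part of the assignment.
odd_numbers <- seq(from = 1, to = 9, by = 2)
print(odd_numbers)
```

Copying both the commands and their output into the lab report covers the "scripts & graphs" requirement for this part.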
July 9 (Tu), 8:40-12:10 R Tutorial II
File:R-part-2.pdf
Lecture slides

Assignment #2

  • The following is a portion of the Mycobacterium growth dataset (kindly shared by Aswad from Dr Xie's lab). It shows OD (optical density) values. Transform this table ("wide" format) into the "tall/tidy" format (use pen & paper; no need to use RStudio or any computer program):
Hour | Control | Gene | Control.with.Arg | Gene.with.Arg
0 | 0.06 | 0.022 | 0.031 | 0.01
4 | 0.087 | 0.102 | 0.082 | 0.081
8 | 0.113 | 0.185 | 0.086 | 0.135
  • In RStudio, read the dataset from the file "FlowerColourVisits.csv" and save it into an object named "flower"
    • Show head, tail, dimension of the data frame "flower"
    • Show data summary with "summary" & "glimpse" commands. Which column is a categorical data type?
    • Select the column named "colour"
    • Select rows from the 3rd to the 20th
    • Select the 3rd, 10th, and 20th rows
    • Select only the rows that have the colour "red" (hint: use colour == "red")
    • Create a new column, named "logVisit", that is log(1+number.of.visit)
    • Sort the "flower" data by the column "number.of.visit"
    • Perform the following data transformation using the chaining operator (i.e., "%>%"): Select rows from the 3rd to the 20th, then filter by colour of "red", and then show head
    • Obtain the mean number of visits for each colour as a group (hint: use "group_by" & "summarise")
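
After working out the wide-to-tall transformation by hand, the result can be checked in R. The sketch below is one possible approach, assuming the tidyr and dplyr packages (both part of the tidyverse) are installed. The object names od_wide, od_tall, and od_means and the new column names "Condition", "OD", and "mean.OD" are illustrative choices, not names from the course material. The last step also shows the chaining operator with "group_by" & "summarise", the same pattern the "flower" exercise asks for.

```r
library(tidyr)
library(dplyr)

# Wide format: one row per time point, one column per growth condition.
# Values are copied from the OD table in Assignment #2.
od_wide <- data.frame(
  Hour             = c(0, 4, 8),
  Control          = c(0.06, 0.087, 0.113),
  Gene             = c(0.022, 0.102, 0.185),
  Control.with.Arg = c(0.031, 0.082, 0.086),
  Gene.with.Arg    = c(0.01, 0.081, 0.135)
)

# Tall/tidy format: one row per observation (Hour x Condition).
# "Condition" and "OD" are assumed column names chosen for this sketch.
od_tall <- od_wide %>%
  pivot_longer(cols = -Hour, names_to = "Condition", values_to = "OD")

print(od_tall)

# Chaining example: mean OD for each condition as a group.
od_means <- od_tall %>%
  group_by(Condition) %>%
  summarise(mean.OD = mean(OD))

print(od_means)
```

The tall table has 12 rows (3 time points x 4 conditions) and three columns. Each OD value sits in its own row, which is the layout that ggplot2 and the dplyr verbs expect.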
July 10 (Wed), 8:40-12:10 R Tutorial III
File:R-part-3.pdf
Lecture slides

Assignment #3

Task Graph
Use the "iris" dataset to reproduce the plot shown at right (Hint: load data with data(iris))
Iris-1.png
Use the "flower" dataset (see Assignment #2 on how to load data) to reproduce the plot shown at right
Flower-1.png
Quiz I
July 11 (Thur), 8:40-12:10 Take-home mid-term (50 pts):
July 12 - 14 (Fri, Sat & Sun) (Weekend break; No class)
July 15 (Mon), 8:00-12:10 Case Study 1. Fish microbiome Assignment #4
July 16 (Tu), 8:00-12:10 Case Study 2. Transcriptome Assignment #5
July 17 (Wed), 8:00-12:10 Case Study 3. Lyme Disease Quiz II
July 18 (Thur), 8:00-12:10 Final presentations (4 slides, 5 minutes)
  1. Slide 1. (1 min; 5 pts). Introduction: background, question, & significance
  2. Slide 2. (1 min; 10 pts). Material & Methods: sample size, replicates, control, sequencing technology, software tools, statistical analysis
  3. Slide 3. (2 min; 15 pts). Results: a graph with title, legends, caption, and main R commands
  4. Slide 4. (1 min; 5 pts). Discussion, conclusion & questions
  5. Slide style (10 pts). Use more figures, fewer words; no need for complete sentences
  6. Presentation style (5 pts). Speak loudly, slowly, and clearly. Do not read from slides.

Papers & Datasets

Omics Application | Paper link | Data set | NGS Technology
Microbiome | Rimoldi_etal_2018_PlosOne | | 16S rDNA amplicon sequencing
Transcriptome | Wang_etal_2015_Science | Tables S2 & S4 | RNA-Seq
Transcriptome & Regulome | Nava_etal_2019_BMCGenomics | Tables S2 & S3 | RNA-Seq & ChIP-Seq
Proteome | Qiu_etal_2017_NPJ | (to be posted) | SILAC
Population genomics (Lyme) | Di_etal_2018_JCM | Data & R codes | Amplicon sequencing (antigen locus)
Population genomics/GWAS (Human) | Simonti_etal_2016_Science | Table S2 | Whole-genome sequencing (WGS); 1000 Genomes Project (IGSR)
TB surveillance | Brow_etal_2015 | Sequence Archives | Whole-genome sequencing (WGS)