BIOL200 2013: Difference between revisions

From QiuLab
imported>Cmartin
#List and describe the key steps of constructing a phylogenetic tree.
#Why do we use 18S rRNA information for yeast and 16S for prokaryotes?  Could we use other molecules as phylogenetic markers?  What constitutes a “good” phylogenetic marker for building a tree of life?
 
#'''Bonus Question'''
#*Define 16S “phylo-species” and “metagenomics”. Describe how PCR amplification and sequencing of 16S rRNA molecules from environmental microbial samples (e.g., sea water, soil, human gut, hot springs) can be used to define species composition of an environment.

Of course, the ID portion is itself not standardized, and the sequence can also be an amino acid sequence. For simplicity, let's assume that in the ID field, you have a "Strain" name followed by a "protein" name, separated by an underscore (_). You will write a program to read a FASTA file with the ID format described above, and a nucleotide sequence. For both novice-level and experienced-level programmers, your program will:
 
# Pick out the strain name, the protein name, and the nucleotide sequence.
# Calculate the length of each sequence.
# Calculate the GC content (in percent) of each sequence.
# Calculate the percent composition of each nucleotide (base composition).
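
As a starting point for steps 1 and 2, here is a minimal Perl sketch of reading FASTA records into a hash keyed by ID and reporting each length. The sub name read_fasta and the two sample records are made up for illustration; a real run would open the course FASTA file instead of the in-memory string.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: read FASTA text from a filehandle and return a hash
# that maps each ID (the first word after '>') to its full sequence.
sub read_fasta {
    my ($fh) = @_;
    my (%seq_for, $id);
    while (my $line = <$fh>) {
        $line =~ s/\s+$//;               # strip the newline and trailing whitespace
        if ($line =~ /^>(\S+)/) {        # header line starts a new record
            $id = $1;
            $seq_for{$id} = '';
        }
        elsif (defined $id) {
            $seq_for{$id} .= uc $line;   # a sequence may span several lines
        }
    }
    return %seq_for;
}

# Made-up records for illustration; they are not from the course data file.
my $fasta = ">B31_ospA\nATGAAA\nAAATAT\n>N40_ospB\natgcgc\n";
open my $fh, '<', \$fasta or die "Cannot open in-memory FASTA: $!";
my %seq_for = read_fasta($fh);
close $fh;

for my $id (sort keys %seq_for) {
    printf "%s\tlength %d\n", $id, length $seq_for{$id};
}
```

To read a file on disk, replace the in-memory open with an ordinary open on the file's path; the rest stays the same.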
 
'''Novice-level task:'''
 
Your program will just print the above information '''for all sequences''', in a readable form. Sample output could be:
<pre>Strain: B31
Protein: ospA
Seq Length: 819
GC content: 33.58%
Base composition: A 42.98 %, T 23.44 %, C 14.77 %, G 18.80 %</pre>
If your percentages have more than 2 decimal places, '''that's OK.'''
 
'''Experienced-level task:'''
 
The only difference from the novice-level task is that your program will '''ask the user for the name of a strain and protein, separated by an underscore''' (e.g., B31_ospA). Once given that input, it will print the same output as above, but only for the sequence with that ID. If the ID doesn't exist, the program will say so and exit. Your program will '''continue to ask the user for a sequence ID''' until the user types 'quit' or gives an invalid sequence ID. You can do this with a while loop.
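
One possible shape for that while loop is sketched below. The sub name lookup_loop, the sample sequences, and the simulated input are all made up for illustration, and the sketch prints only the sequence length to stay short; the full assignment output would go in the same place. Input and output filehandles are passed in so the loop can be tried without typing.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: prompt for sequence IDs until 'quit', end of input,
# or an unknown ID. Returns the number of IDs that were found.
sub lookup_loop {
    my ($in, $out, %seq_for) = @_;
    my $found = 0;
    while (1) {
        print {$out} "Enter a sequence ID (strain_protein) or 'quit': ";
        my $id = <$in>;
        last unless defined $id;          # end of input
        $id =~ s/^\s+|\s+$//g;            # trim the newline and stray spaces
        last if $id eq 'quit';
        if (!exists $seq_for{$id}) {
            printf {$out} "No sequence with ID '%s'; exiting.\n", $id;
            last;
        }
        printf {$out} "%s: length %d\n", $id, length $seq_for{$id};
        $found++;
    }
    return $found;
}

# Made-up sequences and simulated user input, for illustration only.
my %seq_for = (B31_ospA => 'ATGAAAAAATAT', N40_ospB => 'ATGCGC');
my $typed = "B31_ospA\nN40_ospC\nB31_ospA\n";
open my $in, '<', \$typed or die "Cannot open simulated input: $!";
my $count = lookup_loop($in, \*STDOUT, %seq_for);
print "Lookups answered: $count\n";
```

In the real program the call would be lookup_loop(\*STDIN, \*STDOUT, %seq_for), so the prompts go to the terminal and the IDs come from the keyboard.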
 
'''Notes'''
 
Calculating the GC content and the base composition is easy if you make use of the tr (transliterate) function as described at the bottom of page 232, and divide the result by the sequence length. GC content is just the sum of total G and C nucleotides, divided by the sequence length. I do want '''percents''', so remember to multiply the results by 100 and to append a '%' at the end.
 
Getting the strain name and the protein name separately can be accomplished with the split() function (check new slides or search on the internet).
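
The two notes above can be sketched together as follows. This is only one way to do it: the sub name seq_stats and the sample ID and sequence are made up for illustration. It relies on the fact that tr with an empty replacement list leaves the string unchanged and returns the number of characters it matched.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return the length, GC percent, and the percent of
# A, T, C and G (in that order) for one nucleotide sequence.
sub seq_stats {
    my ($seq) = @_;
    my $len = length $seq;
    return (0, 0, 0, 0, 0, 0) if $len == 0;   # avoid dividing by zero
    my $n_a = ($seq =~ tr/Aa//);              # tr returns the match count
    my $n_t = ($seq =~ tr/Tt//);
    my $n_c = ($seq =~ tr/Cc//);
    my $n_g = ($seq =~ tr/Gg//);
    my $gc  = 100 * ($n_g + $n_c) / $len;
    return ($len, $gc,
            100 * $n_a / $len, 100 * $n_t / $len,
            100 * $n_c / $len, 100 * $n_g / $len);
}

# Made-up ID and sequence, for illustration only.
my $id  = 'B31_ospA';
my $seq = 'ATGCATGCAAAATTTTGGCC';

# split breaks the ID at the underscore into strain and protein names.
my ($strain, $protein) = split /_/, $id, 2;

my ($len, $gc, $pa, $pt, $pc, $pg) = seq_stats($seq);
print "Strain: $strain\n";
print "Protein: $protein\n";
print "Seq Length: $len\n";
printf "GC content: %.2f%%\n", $gc;
printf "Base composition: A %.2f %%, T %.2f %%, C %.2f %%, G %.2f %%\n",
    $pa, $pt, $pc, $pg;
```

The limit of 2 in split keeps any further underscores inside the protein name rather than discarding them, which is an assumption about how such IDs should be read.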
 
You will test your program with the file /data/yoda/b/student.accounts/bio425_2011/data/Borrelia_osp.dna.fasta as input. You don't have to include the file itself with your homework, but I do still want you to copy the program output and submit it with your assignment.
 
Again, the program cannot use any outside dependencies or modules such as BioPerl (supposing you know how to use it). Besides that, you can implement it however you like. If you know about references, '''it is possible to do this assignment without using them.'''
|}
|}



Revision as of 18:58, 4 March 2013

EXPERIMENT # 4

BIOL 200 Cell Biology II LAB, Spring 2013

Hunter College of the City University of New York

Course information

Instructors: TBD

Class Hours: Room TBD HN; TBD

Office Hours: Room 830 HN; Thursdays 2-4pm or by appointment

Contact information:

  • Dr. Weigang Qiu: weigang@genectr.hunter.cuny.edu, 1-212-772-5296


Experiment #4

The Tree of Life and Molecular Identification of Microorganisms

Objective

To classify microorganisms and determine their relatedness using molecular sequences.

LAB REPORT GRADING GUIDE

CELL BIO II Experiment #4:

  • INTRODUCTION 1 point :
 Statement of objectives or aims of the experiment in the student’s own words.
 (not to be copied from the Lab Manual)
  • MATERIALS AND METHODS 0 points :
 This should be a brief synopsis and must include any changes or deviations 
 from the procedures outlined in the Lab Manual. Specify which organisms were 
 used to create the phylogram.
  • RESULTS 4 points :
 A printout of the phylogram will suffice.
  • DISCUSSION 4 points :
 Responses to discussion questions.
  • SUMMARY/CONCLUSION 1 point :
 Two-sentence summary of your findings.
  • REFERENCES 1 point :
 Credit is given for pertinent references obtained from sources other than the Lab Manual.
 This point is in addition to the 10 for the lab report.

INTRODUCTION

MATERIALS

  • Required hardware: Computer

Table 1

Volume 1A (Gram-negative bacteria)

Escherichia coli

ACCESSION #174375

Helicobacter pylori

ACCESSION #402670

Salmonella typhi

ACCESSION #2826789

Serratia marcescens

ACCESSION #4582213

Treponema pallidum

ACCESSION #176249

Additional species: Agrobacterium tumefaciens, Bordetella pertussis, Thermus aquaticus, Yersinia pestis, Borrelia burgdorferi. (Note: To search for unlisted 16S sequences, type keywords such as “yersinia AND 16S [gene]” in the NCBI GenBank search box.)

Volume 1B (Rickettsias and endosymbionts)

Bartonella bacilliformis

ACCESSION #173825

Chlamydia trachomatis

ACCESSION #2576240

Rickettsia rickettsii

ACCESSION #538436

Additional species: Coxiella burnetii, Thermoplasma acidophilum

Volume 2A (Gram-positive bacteria)

Bacillus subtilis

ACCESSION #8980302

Deinococcus radiodurans

ACCESSION #145033

Staphylococcus aureus

ACCESSION #576603

Additional species: Bacillus anthracis, Clostridium botulinum, Lactobacillus acidophilus, Streptococcus pyogenes

Volume 2B (Mycobacteria and nocardia)

Mycobacterium haemophilum

ACCESSION #406086

Mycobacterium tuberculosis

ACCESSION #3929878

Additional species: Mycobacterium bovis, Nocardia orientalis

Volume 3A (Phototrophs, chemolithotrophs, sheathed bacteria, gliding bacteria)

Anabaena sp.

ACCESSION #39010

Cytophaga latercula

ACCESSION #37222646

Nitrobacter winogradskyi

ACCESSION #402722

Additional species: Heliothrix oregonensis, Myxococcus fulvus, Thiobacillus ferrooxidans

Volume 3B (Archaebacteria)

Methanococcus jannaschii

ACCESSION #175446

Thermotoga subterranea

ACCESSION #915213

Additional species: Desulfurococcus mucosus, Halobacterium salinarium, Pyrococcus woesei

Volume 4 (Actinomycetes)

Actinomyces bowdenii

ACCESSION #6456800

Actinomyces neuii

ACCESSION #433527

Actinomyces turicensis

ACCESSION #642970

Eukaryotic representative (used as outgroup for rooting the phylogenetic tree)

Saccharomyces cerevisiae

ACCESSION #172403

ANALYSIS

DISCUSSION

March 5

March 12

March 19

  • REVIEW Session for MID-TERM EXAMS

March 26

  • MID-TERM

April 2

April 9

April 16

  • Topic: Relational Database and SQL
  • Tutorial: the Borrelia Genome Database
  • Homework: SQL-embedded PERL

April 23

NO CLASSES (Spring recess)

April 30

May 7

  • Chapter 6 (Gene Expression) & Chapter 8 (Proteomics)
  • Tutorial: Array Data Visualization and Analysis (Micro-Array Analysis Slides)
  • Homework: Data Analysis using R

May 14

  • Chapter 7. Protein Structure Prediction

May 21

  • Final Project Due (TBA)

Useful Links

Unix Tutorials

Perl Help

  • Professor Stewart Weiss has taught CSCI132, a UNIX and Perl class. His slides go into much greater detail and are an invaluable resource. They can be found on his course page here.
  • Perl documentation is available at perldoc.perl.org. You can also run the perldoc command followed by a function name (with the -f option, e.g., perldoc -f substr) or a Perl module name (e.g., perldoc Bio::Seq) to get similar results without leaving the terminal.

Bioperl

SQL

R Project

  • Install location and instructions for Windows
  • Install location and instructions for Mac OS X
  • For users of Ubuntu/Debian:
sudo apt-get install r-base-core
  • For users of Fedora/Red Hat:
su -
yum install R

Utilities

Other Resources


© Weigang Qiu, Hunter College, Last Update Jan 2013