Mini-Tutorals: Difference between revisions

From QiuLab
imported>Saymon
imported>Weigang

Revision as of 12:27, 2 July 2015

bp-utils: bioseq

  • Use accession "CP002316.1" to retrieve the GenBank file from NCBI. Save the output (in GenBank format) to a file named "cp002316.gb".
bioseq -f "CP002316.1" -o'genbank' > cp002316.gb
  • Using the above file as input, extract the FASTA sequence of each gene and save the output to a new file called "cp002316.fas". Use this file for the following questions.
bioseq -i "genbank" -F cp002316.gb > cp002316.fas
  • Count the number of sequences.
bioseq -n cp002316.fas
  • In a single command, pick the first 10 sequences and find their lengths.
bioseq -p "order:1-10" cp002316.fas | bioseq -l
  • In a single command, pick the third and seventh sequences from the file and do the 3-frame translation. Which reading frame is correct for each sequence? Specify.
bioseq -p "order:3,7" cp002316.fas | bioseq -t3
  • Find the base composition of the last two sequences
bioseq -p "order:25-26" cp002316.fas | bioseq -c
  • Pick the sequence with id "BbuJD1_B11|8784|9302|1" and count the number of codons present in this sequence
bioseq -p "id:BbuJD1_B11|8784|9302|1" cp002316.fas | bioseq -C
  • Delete the last 10 sequences from the file and save the output to cp002316-v2.nuc
bioseq -d "order:17-26" cp002316.fas > cp002316-v2.nuc
  • In a single command, pick the first sequence, extract nucleotides 50-110, and make the reverse complement of that sub-sequence
bioseq -p "order:1" cp002316.fas | bioseq -s "50,110" | bioseq -r
  • In a single command, get the first 100 nucleotides of all the sequences present in the file and do 1-frame translation of all sub-sequences.
bioseq -s "1,100" cp002316.fas | bioseq -t1
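
As a rough cross-check on the bioseq answers above, a few of the same operations (counting sequences, listing lengths, reverse-complementing) can be sketched with standard shell tools. This is only an illustration under stated assumptions: the toy FASTA records and the helper names count_seqs, seq_lengths and revcomp are hypothetical, not part of bp-utils, and the sketch handles plain A/C/G/T sequences only.

```shell
# Toy FASTA file (hypothetical data, not taken from cp002316.gb)
tmp_fas=$(mktemp)
cat > "$tmp_fas" <<'EOF'
>seq1
ATGGCTAAAGGT
>seq2
ATGCCCTTTGGGTAA
>seq3
ATGAAATAG
EOF

# Count sequences: one header line (starting with ">") per record
count_seqs() {
    grep -c '^>' "$1"
}

# Print "id<TAB>length" for each record, summing wrapped sequence lines
seq_lengths() {
    awk '/^>/ { if (id != "") print id "\t" len; id = substr($0, 2); len = 0; next }
         { len += length($0) }
         END { if (id != "") print id "\t" len }' "$1"
}

# Reverse-complement a plain DNA string (A/C/G/T only, no ambiguity codes)
revcomp() {
    printf '%s\n' "$1" | tr 'ACGTacgt' 'TGCAtgca' |
        awk '{ out = ""; for (i = length($0); i >= 1; i--) out = out substr($0, i, 1); print out }'
}

count_seqs "$tmp_fas"
seq_lengths "$tmp_fas"
revcomp "ATGGCTAAAGGT"
```

If the counts or lengths from these helpers disagree with bioseq -n or bioseq -l on the same file, the usual suspects are stray blank lines or non-sequence characters in the FASTA file.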

bp-utils: bioaln

  • Go to /home/shared/LabMeetingReadings/Test-Data and find the sequence alignment file “bioaln_tutorial.aln”. Name the format of the alignment file. Use it to answer all the questions below.
  • Find the length of the alignment.
  • Count the number of the sequences present in the alignment.
  • How do you convert this alignment into PHYLIP format? Save the output.
  • Pick “seq2, seq5, seq7, seq10” from the alignment and calculate their average percent identity.
  • Get an alignment slice from “50-140” and find the average identities of the slice for sliding windows of 25.
  • Extract conserved blocks from the alignment.
  • Find the unique sequences and list their ids.
  • Extract third sites from the alignment and show only variable sites in match view.
  • Remove the gaps and show the final alignment in codon view for an alignment slice “1-100”.
  • Add a 90% consensus sequence and then show the final alignment in match plus codon view for an alignment slice “20-80”. (Hint: First try match view followed by codon view)
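
To make "alignment length" and "percent identity" concrete, here is a minimal sketch in plain shell and awk on a toy alignment. Everything in it is an assumption for illustration: the three aligned sequences are made up, the helper names aln_length and pct_identity are hypothetical, and columns with a gap in either sequence are simply skipped, which may differ from how bioaln defines identity.

```shell
# Toy aligned sequences (hypothetical; equal length, "-" marks a gap)
aln_fas=$(mktemp)
cat > "$aln_fas" <<'EOF'
>seq1
ATGGCTAAAGGT
>seq2
ATGGCAAAAGGT
>seq3
ATG---AAAGGA
EOF

# Alignment length = number of columns in the first aligned sequence
aln_length() {
    awk '/^>/ { n++; next } n == 1 { len += length($0) } END { print len }' "$1"
}

# Percent identity between two aligned strings.
# Assumption: columns where either sequence has a gap are not compared.
pct_identity() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        for (i = 1; i <= length(a); i++) {
            x = substr(a, i, 1); y = substr(b, i, 1)
            if (x == "-" || y == "-") continue
            cmp++
            if (x == y) same++
        }
        printf "%.1f\n", (cmp ? 100 * same / cmp : 0)
    }'
}

aln_length "$aln_fas"
pct_identity "ATGGCTAAAGGT" "ATGGCAAAAGGT"
pct_identity "ATGGCTAAAGGT" "ATG---AAAGGA"
```

An average percent identity for a set of sequences would then be the mean of pct_identity over all pairs in the set; a sliding-window version would apply the same comparison to successive column ranges.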

BLAST+: search ("google") for homologs/pairwise alignment

Programs for producing multiple alignments

MUSCLE

CLUSTALW

MAFFT

T-Coffee

Programs for producing phylogeny & phylogenetic analysis

FastTree

PHYLIP

MrBayes

RAxML

PhyloNet

R packages for phylogenetics

APE

phangorn

phytools

Population genetics

ms: coalescent simulation

SFS: forward simulation