Mini-Tutorials: Difference between revisions

Revision as of 22:01, 19 June 2015

bp-utils: bioseq

Use accession "CP002316.1" to retrieve the GenBank file from NCBI. Save the output (in GenBank format) to a file named "cp002316.gb".

bioseq -f "CP002316.1" -o'genbank' > cp002316.gb

Use the above file as input, extract the FASTA sequence of each gene, and save the output to a new file called "cp002316.fas". Use this file for the following questions.

bioseq -i "genbank" -F cp002316.gb > cp002316.fas

Count the number of sequences.

bioseq -n cp002316.fas
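As a sanity check on the count, the FASTA header lines can be tallied with standard tools. The following is a minimal sketch that does not use bioseq; the function name `count_fasta_records` and the file `demo.fas` are hypothetical, and it assumes every record starts with exactly one header line beginning with `>`.

```shell
# Count FASTA records by counting header lines (lines starting with ">").
# count_fasta_records is a hypothetical helper, not part of bp-utils.
count_fasta_records() {
    grep -c '^>' "$1"
}

# Tiny demo file with made-up sequences (not taken from CP002316.1).
printf '>seq1\nATGC\n>seq2\nATGCATGC\nGGCC\n>seq3\nTTAA\n' > demo.fas
count_fasta_records demo.fas   # prints 3
```

If this number disagrees with the bioseq count for the same file, the file most likely contains malformed or duplicated header lines.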

In a single command, pick the first 10 sequences and find their length

bioseq -p "order:1-10" cp002316.fas | bioseq -l
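The reported lengths can be cross-checked without bioseq. The sketch below is an assumption-laden illustration: `fasta_lengths` and `demo_len.fas` are hypothetical names, the ID is taken to be the header text up to the first whitespace, and sequences may be wrapped over several lines.

```shell
# Print "id<TAB>length" for each FASTA record, summing wrapped sequence lines.
# fasta_lengths is a hypothetical helper, not part of bp-utils.
fasta_lengths() {
    awk '
        /^>/ { if (id != "") print id "\t" len   # flush the previous record
               id = substr($1, 2); len = 0; next }
             { len += length($0) }               # accumulate sequence characters
        END  { if (id != "") print id "\t" len } # flush the last record
    ' "$1"
}

# Tiny demo file with made-up sequences (not taken from CP002316.1).
printf '>seq1\nATGC\n>seq2\nATGCATGC\nGGCC\n' > demo_len.fas
fasta_lengths demo_len.fas | head -n 10   # first 10 records only
```

Piping through `head -n 10` mirrors the "first 10 sequences" selection, since the helper emits one line per record.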

In a single command, pick the third and seventh sequences from the file and do the 3-frame translation. Which reading frame is correct for both? Specify.

bioseq -p "order:3,7" cp002316.fas | bioseq -t3

Find the base composition of the last two sequences

bioseq -p "order:25-26" cp002316.fas | bioseq -c

Pick the sequence with id "BbuJD1_B11|8784|9302|1" and count the number of codons present in this sequence

bioseq -p "id:BbuJD1_B11|8784|9302|1" cp002316.fas | bioseq -C

Delete the last 10 sequences from the file and save the output to cp002316-v2.nuc

bioseq -d "order:17-26" cp002316.fas > cp002316-v2.nuc
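Deleting the last 10 of 26 records is the same as keeping the first 16, which gives a way to verify the result with standard tools. This is a minimal sketch, not the bioseq implementation; `keep_first_records` and `demo_keep.fas` are hypothetical names.

```shell
# Keep only the first N records of a FASTA file.
# $1 = FASTA file, $2 = number of leading records to keep.
# keep_first_records is a hypothetical helper, not part of bp-utils.
keep_first_records() {
    awk -v n="$2" '/^>/ { seen++ } seen <= n' "$1"
}

# Tiny demo file with made-up sequences (not taken from CP002316.1).
printf '>seq1\nATGC\n>seq2\nGGCC\n>seq3\nTTAA\n' > demo_keep.fas
keep_first_records demo_keep.fas 2   # prints the seq1 and seq2 records
```

For the exercise file, `keep_first_records cp002316.fas 16` should contain the same records as cp002316-v2.nuc, assuming the input holds 26 sequences.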

In a single command, pick the first sequence, extract nucleotides 50-110, and reverse-complement the sub-sequence

bioseq -p "order:1" cp002316.fas | bioseq -s "50,110" | bioseq -r
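To see what the sub-sequence and reverse-complement steps do, the same transformation can be sketched on a short string with standard tools. This assumes 1-based, inclusive coordinates (an assumption about how the "50,110" range is interpreted) and an unambiguous A/C/G/T alphabet; `revcomp_range` is a hypothetical helper and `rev` is assumed to be installed.

```shell
# Reverse-complement a sub-sequence.
# $1 = sequence string, $2 = start (1-based), $3 = end (inclusive).
# revcomp_range is a hypothetical helper, not part of bp-utils.
revcomp_range() {
    printf '%s\n' "$1" |
        cut -c "$2-$3" |          # extract the sub-sequence
        rev |                     # reverse it
        tr 'ACGTacgt' 'TGCAtgca'  # complement each base
}

revcomp_range "AAACCCGGGTTT" 2 7   # sub-sequence AACCCG, prints CGGGTT
```

Reversing and complementing are independent steps, so their order in the pipeline does not affect the result.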

In a single command, get the first 100 nucleotides of all the sequences present in the file and do 1-frame translation of all sub-sequences.

bioseq -s "1,100" cp002316.fas | bioseq -t1

@@ Line 1: / Line 1: @@
 ==bp-utils: bioseq==
-* Use accession "CP002316" to retrieve the Genbank file from NCBI. Save the output (in genbank format) to a file named as "cp002316.gb"
+* Use accession "CP002316.1" to retrieve the Genbank file from NCBI. Save the output (in genbank format) to a file named as "cp002316.gb".
 <div class="toccolours mw-collapsible">
 <syntaxhighlight lang="bash">
@@ Line 7: / Line 7: @@
 </div>
 * Use the above file as input, extract the FASTA sequence of each gene, and save the output to a new file called "cp002316.fas". Use this file for the following questions.
-* Count the number of sequences
+<div class="toccolours mw-collapsible">
+<syntaxhighlight lang="bash">
+bioseq -i "genbank" -F cp002316.gb > cp002316.fas
+</syntaxhighlight>
+</div>
+* Count the number of sequences.
+<div class="toccolours mw-collapsible">
+<syntaxhighlight lang="bash">
+bioseq -n cp002316.fas
+</syntaxhighlight>
+</div>
 * In a single command, pick the first 10 sequences and find their length
-* In a single command, pick the third and seventh sequences from the file and do the 3-frame translation. Which reading frame is the correct or both? Specify
+<div class="toccolours mw-collapsible">
+<syntaxhighlight lang="bash">
+bioseq -p "order:1-10" cp002316.fas | bioseq -l
+</syntaxhighlight>
+</div>
+* In a single command, pick the third and seventh sequences from the file and do the 3-frame translation. Which reading frame is correct for both? Specify.
+<div class="toccolours mw-collapsible">
+<syntaxhighlight lang="bash">
+bioseq -p "order:3,7" cp002316.fas | bioseq -t3
+</syntaxhighlight>
+</div>
 * Find the base composition of the last two sequences
+<div class="toccolours mw-collapsible">
+<syntaxhighlight lang="bash">
+bioseq -p "order:25-26" cp002316.fas | bioseq -c
+</syntaxhighlight>
+</div>
 * Pick the sequence with id "BbuJD1_B11|8784|9302|1" and count the number of codons present in this sequence
+<div class="toccolours mw-collapsible">
+<syntaxhighlight lang="bash">
+bioseq -p "id:BbuJD1_B11|8784|9302|1" cp002316.fas | bioseq -C
+</syntaxhighlight>
+</div>
 * Delete the last 10 sequences from the file and save the output to cp002316-v2.nuc
+<div class="toccolours mw-collapsible">
+<syntaxhighlight lang="bash">
+bioseq -d "order:17-26" cp002316.fas > cp002316-v2.nuc
+</syntaxhighlight>
+</div>
 * In a single command, pick the first sequence, then get the 50-110 nucleotides and make reverse complement of the sub-sequences
+<div class="toccolours mw-collapsible">
+<syntaxhighlight lang="bash">
+bioseq -p "order:1" cp002316.fas | bioseq -s "50,110" | bioseq -r
+</syntaxhighlight>
+</div>
 * In a single command, get the first 100 nucleotides of all the sequences present in the file and do 1-frame translation of all sub-sequences.
+<div class="toccolours mw-collapsible">
+<syntaxhighlight lang="bash">
+bioseq -s "1,100" cp002316.fas | bioseq -t1
+</syntaxhighlight>
+</div>
