---
title: "Annotation practical"
author: "Liz Batty"
date: '`r format(Sys.time(), "%d %B, %Y")`'
---
<style>
body {
text-align: justify}
</style>
```{r settings, include = FALSE}
switch <- FALSE
```
# Learning objectives {-}
To run annotation on a genome and understand the output files.
# Exercise 1 {-}
First we need to update a program on the servers. Log in and run the following command:
<pre class="bash">
brew upgrade && brew upgrade -v tbl2asn
</pre>
This will take a couple of minutes.
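Once the upgrade has finished, it can be worth confirming that the programs used in this practical are visible on your `PATH`. The following is only a minimal sketch: `have_tool` is a helper name invented here for illustration, not part of brew or prokka.

```shell
# Hypothetical helper: succeeds if the named command can be found on the PATH.
have_tool() {
    command -v "$1" >/dev/null 2>&1
}

# Report on the two programs this practical relies on.
for tool in prokka tbl2asn; do
    if have_tool "$tool"; then
        echo "$tool: found at $(command -v "$tool")"
    else
        echo "$tool: not found on PATH"
    fi
done
```

If either program is reported as not found, ask a demonstrator before continuing.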
Now we will run the [prokka](https://github.com/tseemann/prokka) program, which we learned about in the lecture. We will annotate the file `contigs.fasta`, which contains the assembled *Staphylococcus aureus* contigs from yesterday's assembly practical.
<pre class="bash">
prokka --outdir s_aureus --prefix s_aureus --cpus 4 --mincontiglen 500 --locustag saureus contigs.fasta
</pre>
This will take a couple of minutes. When it is complete, the output files will be in the `s_aureus` directory. There will be 10 output files.
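The `--mincontiglen 500` option in the command above tells prokka to skip contigs shorter than 500 bp. To get a feel for how many contigs that leaves, a rough sketch such as the one below could be used. The function name `count_long_contigs` is made up for this example, and it assumes the cutoff is inclusive (contigs of exactly the minimum length are kept).

```shell
# Hypothetical helper: count FASTA records whose sequence is at least MIN bases long.
# Usage: count_long_contigs FASTA_FILE MIN
count_long_contigs() {
    awk -v min="$2" '
        # A header line closes the previous record and starts a new one.
        /^>/ { if (seen && len >= min) n++; len = 0; seen = 1; next }
        # Sequence lines: strip whitespace, then add to the running length.
             { gsub(/[ \t\r]/, ""); len += length($0) }
        # Do not forget the final record in the file.
        END  { if (seen && len >= min) n++; print n + 0 }
    ' "$1"
}
```

For example, `count_long_contigs contigs.fasta 500` would print the number of contigs of at least 500 bp in the assembly.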
Look at the .txt file, which has summary statistics about the annotation.
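Because we passed `--outdir s_aureus --prefix s_aureus`, the summary should be at `s_aureus/s_aureus.txt`. One possible way to print it is sketched below. `show_summary` is a name chosen for this example only, and plain `cat` or `less` on the file works just as well.

```shell
# Hypothetical helper: print the summary file for a given output directory and prefix.
# Usage: show_summary OUTDIR PREFIX
show_summary() {
    summary="$1/$2.txt"
    if [ -f "$summary" ]; then
        cat "$summary"
    else
        # Send the message to stderr and signal failure to the caller.
        echo "No summary file at $summary" >&2
        return 1
    fi
}
```

For this practical the call would be `show_summary s_aureus s_aureus`.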
***Q1. How many coding sequences were predicted?***
<textarea id="name" name="name" cols="100" rows="3" placeholder="You can input your answer here...">
</textarea>
***Q2. Which file contains the protein sequences for every predicted coding sequence?***
<textarea id="name" name="name" cols="100" rows="3" placeholder="You can input your answer here...">
</textarea>
Now we will visualise the output of `prokka` using a tool called Artemis. From here we will take you through the Artemis visualisation with a live demo.
<script>
$(".toggle").click(function() {
$(this).toggleClass("open");
});
</script>