Annotation of /FigTutorial/why_use_the_SEED.html

1 : overbeek 1.1 <h1>Why Use the SEED (a VERY basic tutorial)?</h1>
2 :    
3 :     <h2>Introduction</h2>
4 :    
5 :     Many of us think that the SEED is a very rich environment for studying
6 :     genomic data. Indeed, we think that it offers many features
7 :     unavailable through other systems. However, up to now it has always
8 :     been viewed as a system that was almost impossible to use without
9 :     extensive guidance. In this tutorial, I argue that there is actually
10 :     a very small subset of the overall functionality that is very useful,
11 :     and that subset can be learned in a very short time with relatively
12 :     little effort. The functionality that you need to learn involves four
13 :     steps:
14 :     <ol>
15 :     <li><b>Finding a specific gene</b>. Suppose that you know a gene that
16 :     you wish to study. You may have a gene name from a research article,
17 :     an ID from a genomic database, or a piece of sequence. Whatever the
18 :     starting point, you need to learn how to locate the SEED protein page
19 :     corresponding to the gene or protein you are interested in.
20 :     Hopefully, we can convey the basic steps that will get you there in
21 :     about 10 minutes of tutorial or less.
22 :    
23 :     <li><b>Finding similar genes that occur in clusters on prokaryotic
24 :     genomes</b>. Functionally related genes tend to cluster on
25 :     prokaryotic genomes. In most prokaryotes 50% or more of the genes are
26 :     clustered with related genes. For any gene that you wish to study,
27 :     either it will occur in a cluster, or there will be a corresponding
28 :     gene in another genome that does occur in a cluster (this is, of
29 :     course, an overstatement; but it is essentially true). Once you have
30 :     <i>located the gene/protein in the SEED</i>, the next step is to
31 :     <i>find the relevant clusters</i>. It should take only about 5
32 :     minutes to learn how to do that.
33 :    
34 :     <li><b>Getting a display that shows the relevant clusters in a number
35 :     of genomes</b>. Once you have a cluster that includes a set of
36 :     functionally related genes, you need to get a visual overview of
37 :     different versions of this cluster as it exists in other sequenced
38 :     genomes. It should take less than five minutes for you to figure out
39 :     how to do this.
40 :    
41 :     <li><b>Finally, you need to study these clusters in the visual
42 :     display</b>. This is an endlessly satisfying experience, so it is
43 :     pointless to think of a minimal time required to perform the task.
44 :    
45 :     </ol>
46 :     There are many, many things that you cannot do with just these four
47 :     steps, but the functionality provided (locating relevant clusters of
48 :     genes) is a capability that is far more important than you might
49 :     realize, and this is the easiest way to do it.
50 :     <br><br>
51 :     In the rest of this tutorial, we will cover these four steps.
52 :    
53 :     <h2>Step 1: Finding the Gene/Protein You Want to Study</h2>
54 :    
55 :     Go to the <a href=http://theseed.uchicago.edu/FIG/index.cgi target=tutorial>initial page of the SEED</a>.
56 :     <br><br>
57 :     First, fill in your ID. Use something of the form <b>master:FirstL</b>,
58 :     where "FirstL" should be your first name and the first initial of your
59 :     last name. You can use anything you wish, but do try to make it
60 :     descriptive and unique.
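As an illustration of the <b>master:FirstL</b> convention, here is a minimal Python sketch that builds such an ID from a full name. The function name and the example name are made up for this tutorial; the SEED itself only asks you to type the ID into the form.

```python
def make_seed_user_id(full_name: str) -> str:
    """Build an ID of the form master:FirstL from a full name.

    Hypothetical helper for illustration only; it is not part of the SEED.
    """
    parts = full_name.split()
    if len(parts) < 2:
        raise ValueError("expected at least a first name and a last name")
    first, last = parts[0], parts[-1]
    # First name, followed by the first initial of the last name.
    return f"master:{first.capitalize()}{last[0].upper()}"


print(make_seed_user_id("Jane Doe"))
```

If two people would end up with the same ID, add something extra by hand to keep it unique.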
61 :     <br><br>
62 :     If you have one or more keywords (e.g., <b>dnaK</b> or
63 :     <b>gi|23016701</b>), you put them in the <b>Search Pattern:</b> field
64 :     and click on <b>Search</b>.
65 :     <br><br>
66 :     If you get a list of matched <i>protein-encoding genes</i>, you can
67 :     take any of the links to a specific gene that meets your criteria.
68 :     <br><br>
69 :     Do this now for <b>gi|23016701</b>, and verify that you can get to the
70 :     gene/protein page.
71 :     <br><br>
72 :     Now suppose that you wanted to find <i>dnaK</i> in <i>Bacillus
73 :     subtilis</i>.
74 :     To do this, fill in the search pattern with <i>dnaK</i>, select the
75 :     organism using the pull-down menu, and click on <b>Search genome
76 :     selected below</b>.
77 :     <br><br>
78 :     Verify that you can actually get to the gene/protein page for
79 :     <i>dnaK</i>.
80 :     <br><br>
81 :     Now, suppose that you have a piece of DNA or protein sequence, and you
82 :     wish to find the genes within a genome that contain the same or
83 :     similar sequences. You can do this quite simply. First, paste your
84 :     sequence into the provided text window. Then select <b>blastp</b> if the
85 :     provided sequence was a protein sequence or <b>blastn</b> if the
86 :     provided sequence was DNA. Finally, select the organism you wish
87 :     to search from the pull-down menu.
88 :     Then click on <b>Search for Matches</b>.
89 :     You should get blast output, with links set to get you to the desired
90 :     gene/protein page.
91 :     <br><br>
92 :     You should now verify that "NDAERQATKDAGKIAGLEVERIINEPTAAALAYGLDKT" could be used to
93 :     locate <i>dnaK</i> in <i>Bacillus subtilis</i>.
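If you are unsure whether a sequence is protein or DNA, a rough rule of thumb is to look at its letters. The Python sketch below is only a heuristic under stated assumptions: the 90% threshold and the function name are choices made for this illustration, and the SEED does not make this decision for you.

```python
# Letters that normally appear in nucleotide sequences (N = unknown base).
DNA_LETTERS = set("ACGTUN")


def choose_blast_program(sequence: str, threshold: float = 0.9) -> str:
    """Guess whether to select blastn (DNA) or blastp (protein).

    Heuristic for illustration: if at least `threshold` of the letters are
    nucleotide codes, treat the sequence as DNA.
    """
    residues = [c for c in sequence.upper() if c.isalpha()]
    if not residues:
        raise ValueError("sequence contains no letters")
    dna_fraction = sum(c in DNA_LETTERS for c in residues) / len(residues)
    return "blastn" if dna_fraction >= threshold else "blastp"


# The fragment used in the exercise above is a protein sequence.
print(choose_blast_program("NDAERQATKDAGKIAGLEVERIINEPTAAALAYGLDKT"))
```

Very short or unusual sequences can fool a letter count like this, so check the choice by eye before clicking <b>Search for Matches</b>.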
94 :     <br><br>
95 :     That ends our 10-minute discussion of how to find the gene/protein you
96 :     are interested in. Clearly, there is much more that could be said
97 :     about how to use the SEED search facilities, but this should cover
98 :     the vast majority of your search needs.
99 :    
100 :     <h2>Step 2: Finding similar genes that occur in clusters on prokaryotic genomes</h2>
101 :    
102 :     Suppose that you have found a desired gene/protein page. We have not
103 :     told you how to interpret it. Nor do we intend to. It is a page full
104 :     of information, links, and possible services. Our strategy in this
105 :     simple tutorial is to just show you how to find <i>relevant clusters</i> of
106 :     genes, by which we mean clusters of functionally related genes that
107 :     include either the gene you are "positioned on" or a corresponding
108 :     gene in another organism.
109 :     <br><br>
110 :     First, position yourself on the gene/protein page for
111 :     <b>gi|21283241</b>.
112 :     <br><br>
113 :     The table at the top of the page describes the genes in the region of
114 :     the chromosome surrounding the gene you are positioned on
115 :     (<i>fig|196620.1.peg.1512</i>, which is the SEED ID for the gene
116 :     encoding <i>gi|21283241</i>). The entry for the gene you are
117 :     positioned on is shown in green. Just below the table is a small
118 :     graphical display of the region. The gene you are positioned on is
119 :     shown in green. Genes that are believed to be "functionally related"
120 :     (based on the fact that they occur close to each other in a number of
121 :     genomes) are shown in blue. Others are red.
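A SEED ID such as <i>fig|196620.1.peg.1512</i> can be taken apart mechanically. The Python sketch below assumes the ID reads as a genome identifier (196620.1), a feature type (peg, a protein-encoding gene), and a feature number (1512). The class and function names are invented for this example, and other ID shapes may exist that this pattern does not cover.

```python
import re
from typing import NamedTuple


class SeedId(NamedTuple):
    genome: str        # e.g. "196620.1"
    feature_type: str  # e.g. "peg"
    number: int        # e.g. 1512


# Assumed layout: fig|<genome>.<feature type>.<number>
_FIG_ID = re.compile(r"^fig\|(\d+\.\d+)\.([A-Za-z]+)\.(\d+)$")


def parse_seed_id(seed_id: str) -> SeedId:
    """Split a SEED ID into its parts (illustrative helper, not a SEED API)."""
    match = _FIG_ID.match(seed_id.strip())
    if match is None:
        raise ValueError(f"not a recognizable SEED ID: {seed_id!r}")
    return SeedId(match.group(1), match.group(2), int(match.group(3)))


print(parse_seed_id("fig|196620.1.peg.1512"))
```

Under this reading, two genes from the same genome share the genome part of their IDs.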
122 :     <br><br>
123 :     It so happens that the gene you are positioned on is in a cluster.
124 :     The cluster contains 7 genes. Each of the genes in the cluster has a
125 :     little <b>Pins</b> link to the side.
126 :     <br><br>
127 :     To find any larger clusters (occurring in other genomes) that contain
128 :     genes similar to the one you are positioned on, you can click on the
129 :     <b>CL</b> link just to the left of the gene. Which genomes contain
130 :     larger clusters? Were you able to locate the corresponding gene in
131 :     <i>Bacillus subtilis subsp. subtilis str. 168</i> or in <i>Bacillus
132 :     cereus ATCC 14579</i>? In each of those genomes the cluster is
133 :     slightly larger.
134 :     <br><br>
135 :     Note that you can find these largest clusters even when you are on a
136 :     gene that is not in a cluster (or even one from a eukaryotic genome).
137 :    
138 :     <h2>Step 3: Getting a display that shows the relevant clusters in a number of genomes</h2>
139 :    
140 :     Once you are positioned on a gene in a cluster (which may or may not
141 :     be one of the largest clusters), you should click on the <b>Pins</b>
142 :     button just to the left of the shaded green area. Try it.
143 :     <br><br>
144 :     In a separate window, you should see a portrayal of different versions
145 :     of the same (or closely related) clusters as they occur in other
146 :     genomes. The red genes are aligned in the center of the page, and
147 :     then all of the genes around this central "pin" are shown. Similar
148 :     genes will have the same color. You should be able to mouse-over
149 :     genes in the display and see the functions of the genes.
150 :     Finally, if you choose to click on the <b>Commentary</b> button,
151 :     another window will pop up containing information about each
152 :     of the colored sets of genes.
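The idea of aligning regions around a central "pin" can be sketched in a few lines of Python. This is a simplified picture, not the SEED's implementation: the function name, the gene labels, and the <code>flank</code> parameter are all assumptions made for the example, and the real display also handles strand and coloring.

```python
from typing import Dict, List, Optional


def pinned_rows(
    regions: Dict[str, List[str]],
    pins: Dict[str, str],
    flank: int = 3,
) -> Dict[str, List[Optional[str]]]:
    """Center each genome's gene list on its pin gene.

    regions maps a genome name to its genes in chromosomal order;
    pins maps a genome name to the gene that should sit in the center.
    Each returned row has 2 * flank + 1 slots, padded with None where
    the region runs out of genes.
    """
    rows: Dict[str, List[Optional[str]]] = {}
    for genome, genes in regions.items():
        center = genes.index(pins[genome])  # raises ValueError if pin is absent
        row: List[Optional[str]] = []
        for offset in range(-flank, flank + 1):
            i = center + offset
            row.append(genes[i] if 0 <= i < len(genes) else None)
        rows[genome] = row
    return rows


# Made-up gene labels, purely for illustration.
example = pinned_rows(
    {"genome A": ["a1", "a2", "a3", "a4"], "genome B": ["b1", "b2"]},
    {"genome A": "a3", "genome B": "b1"},
    flank=1,
)
for genome, row in example.items():
    print(genome, row)
```

Because every row has the pin in the same slot, neighboring genes in different genomes line up column by column, which is what makes the visual comparison easy.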
153 :    
154 :     <h2>An Exercise: Do Clusters Really Mean Anything?</h2>
155 :    
156 :     Pick a pathway from central metabolism (i.e., a pathway that you know
157 :     exists in several organisms). Then pick a gene from that pathway in an
158 :     organism that you know has the gene. Now, find the gene/protein page
159 :     corresponding to the gene.
160 :     <br><br>
161 :     Now, the question we pose is <i>"Can you now find large clusters for
162 :     the gene, and if you can, do the large clusters contain other
163 :     functional roles from the same pathway?"</i>
164 :     <br><br>
165 :     If you perform this exercise ten times, you should get a pretty
166 :     accurate feel for why we believe the study of gene clusters is of
167 :     central importance.
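One way to keep score while doing the exercise is to note the functional roles you see in a cluster and compare them with the roles of the pathway you picked. The Python sketch below does that comparison. The function name is hypothetical and the role lists are made-up inputs you would replace with what you read off the display.

```python
from typing import Iterable, List, Tuple


def pathway_overlap(
    cluster_roles: Iterable[str],
    pathway_roles: Iterable[str],
) -> Tuple[List[str], float]:
    """Return the roles shared by cluster and pathway, and the fraction
    of distinct cluster roles that belong to the pathway."""
    cluster = set(cluster_roles)
    pathway = set(pathway_roles)
    shared = sorted(cluster & pathway)
    fraction = len(shared) / len(cluster) if cluster else 0.0
    return shared, fraction


# Illustrative input only; these are not results taken from the SEED.
shared, fraction = pathway_overlap(
    ["Enolase", "Phosphoglycerate kinase", "Hypothetical protein"],
    ["Enolase", "Phosphoglycerate kinase", "Triosephosphate isomerase"],
)
print(shared, round(fraction, 2))
```

Recording this fraction for each of your ten attempts gives a simple, if crude, summary of how often large clusters agree with the pathway.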
