Revision 1.1, Mon Aug 8 21:01:50 2005 UTC, by overbeek
Branch: MAIN
part of class design

<h1>The Initial Attempt to Produce a Metabolic Reconstruction</h1>

A metabolic reconstruction refers to an attempt to infer the metabolic
machinery of an organism from the sequenced genome and available
literature.  The term was introduced by Evgeni Selkov in his early
work on the first sequenced genomes.  Selkov made available his
substantial collection of encoded metabolic pathways, and those along with
existing encodings (most notably the wonderful pathway charts created
by Gerhard Michal and distributed by Boehringer Mannheim) launched
numerous efforts to encode the metabolism of sequenced organisms.
The major effort by <a href="http://www.genome.jp/kegg/">KEGG</a> has
become perhaps the best known, and it is the one the SEED effort has
tended to use.

<p>
Different groups have created slightly differing notions of what is
meant by <i>metabolic reconstruction</i>.  Within the context of this
course, we might draw the following distinctions:
<ol>
<li>By an <b>informal metabolic reconstruction</b> we refer to
<ul>
<li>
taking the genes of an organism and dividing them into small groups
that each perform some well-defined cellular function,
<li>
identifying the overall function of each of these groups, and
<li>
attaching to each gene a list of the abstract functions implemented by
that gene.
</ul>
With informal metabolic reconstructions, it is common to include not
only metabolic subsystems (i.e., pathways), but nonmetabolic
subsystems, as well.
<li>
By a <b>formal metabolic reconstruction</b> we refer to a detailed
encoding of the metabolic reaction network of the organism.
</ol>
That is, the informal reconstruction attempts to represent as much of
the cellular machinery as possible, while the formal is usually
limited to metabolic reactions (and those reactions involving
generation or degradation of polymers are normally left out).  The
output of a formal metabolic reconstruction will include detailed
encodings of both the reactions and the compounds that appear in the
metabolic network.
These distinctions are our own and are not in common use.
We do not consider them important, but we do find them useful.
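
<p>
The three parts of an informal reconstruction listed above can be
pictured as a small data structure.  The following Python sketch is
only an illustration: the class name, the field names and the gene
ids are assumptions made for this example and are not part of the
SEED.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class InformalReconstruction:
    # subsystem name -> overall function of that group of genes
    subsystem_function: dict[str, str] = field(default_factory=dict)
    # subsystem name -> ids of the genes placed in the group
    subsystem_genes: dict[str, list[str]] = field(default_factory=dict)
    # gene id -> abstract functions implemented by that gene
    gene_functions: dict[str, list[str]] = field(default_factory=dict)

    def add_gene(self, subsystem: str, gene: str, function: str) -> None:
        """Place a gene in a subsystem and record one function for it."""
        members = self.subsystem_genes.setdefault(subsystem, [])
        if gene not in members:
            members.append(gene)
        functions = self.gene_functions.setdefault(gene, [])
        if function not in functions:
            functions.append(function)


# Toy data: the gene ids are placeholders, not real identifiers.
recon = InformalReconstruction()
recon.subsystem_function["Glycolysis"] = "conversion of glucose to pyruvate"
recon.add_gene("Glycolysis", "gene_0001", "Hexokinase (EC 2.7.1.1)")
recon.add_gene("Glycolysis", "gene_0002", "6-phosphofructokinase (EC 2.7.1.11)")
```

A gene may appear in several subsystems, which is why functions are
stored per gene rather than per subsystem.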

<p>
In this section of the course, we are asking the student to build both
an informal and a formal metabolic reconstruction for some sequenced
organism.  Clearly this is an ambitious task.  It would have been
largely impossible to do anything significant 10 years ago, but with
the new tools we believe that this effort can be quite productive as
an amazing crash course in biochemistry and microbial physiology.

<p>
Rather than break this part of the course up into weekly assignments
(at least for now), we list the detailed steps we would like the
student to work through.

<p>
We are going to suggest that each student be assigned a distinct
organism (alternatively, groups of students can work jointly on a
single organism).  We suggest choosing an organism that fulfills the
following criteria:

<ul>
<li>It should be a sequenced prokaryotic genome of small to moderate size
(450-2500 genes).

<li>It should be a genome for which metabolic reconstructions have not
already been done and are not known to be in progress.

<li>It should be in the public domain.

<li>The genome should be included in both the KEGG collection and in
the SEED collection.
</ul>
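
<p>
If you keep a table of candidate genomes, the criteria above reduce
to a simple filter.  This is a minimal sketch; the record fields and
the candidate entries are hypothetical and would have to be filled in
by hand from KEGG, the SEED and the literature.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    gene_count: int
    prokaryote: bool
    reconstruction_exists: bool  # already done, or known to be in progress
    public: bool
    in_kegg: bool
    in_seed: bool


def is_suitable(g: Candidate) -> bool:
    """Apply the four selection criteria listed above."""
    return (
        g.prokaryote
        and 450 <= g.gene_count <= 2500
        and not g.reconstruction_exists
        and g.public
        and g.in_kegg
        and g.in_seed
    )


# Placeholder candidates, invented for illustration only.
candidates = [
    Candidate("organism A", 1800, True, False, True, True, True),
    Candidate("organism B", 4300, True, False, True, True, True),   # too large
    Candidate("organism C", 900, True, True, True, True, True),     # already done
]
suitable = [g.name for g in candidates if is_suitable(g)]
```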

<h2>Steps in the Process of Developing an Informal Metabolic Reconstruction</h2>

<h3>Getting summaries of what is in the genome</h3>

First, you should get two estimates of what cellular machinery is
present in the organism:
<ol>
<li>You should get a list of the subsystems with operational variants
from a SEED installation.  Note that the subsystems and genes that you
get back may include both well-curated subsystems and
poorly-constructed subsystems.
<li>You should get colored versions of the KEGG maps (showing which
functions are believed to be present in the genome).
</ol>
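
<p>
Once you have both estimates, it helps to see where they agree and
where they differ.  The sketch below assumes that you have reduced
each estimate to a set of EC numbers (by hand or from whatever export
your installation offers); that reduction, the function name and the
toy sets are assumptions of this example.

```python
def compare_estimates(seed_ecs, kegg_ecs):
    """Split two collections of EC numbers into shared and source-specific sets."""
    seed_ecs, kegg_ecs = set(seed_ecs), set(kegg_ecs)
    return {
        "both": sorted(seed_ecs & kegg_ecs),
        "seed_only": sorted(seed_ecs - kegg_ecs),
        "kegg_only": sorted(kegg_ecs - seed_ecs),
    }


# Toy input: three glycolytic EC numbers, split arbitrarily between sources.
report = compare_estimates(
    seed_ecs=["2.7.1.1", "5.3.1.9"],
    kegg_ecs=["5.3.1.9", "2.7.1.11"],
)
for source, ecs in report.items():
    print(source, ecs)
```

Functions reported by only one source are natural entries for your
list of outstanding questions.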

<h3>Begin with the Common Machinery</h3>

There is a subset of the cellular machinery that will be present in
some form in whichever genome you picked.  The ribosomal RNA,
ribosomal proteins, tRNAs, tRNA synthetases, and so forth must all be
there.  Look through the set of subsystems that are present, decide
what aspects appear to be essential machinery relating to
transcription and translation, and begin with that.  Create a detailed
summary of which topics you have selected, which variants exist, and
which genes implement those variants.  Which rRNAs and tRNAs exist?
How many copies of the rRNA cluster exist?
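
<p>
The questions about rRNAs and tRNAs are counting questions.  As a
rough sketch, suppose you have extracted the RNA features of the
genome as (kind, label) pairs; that input format and the toy feature
list are assumptions of this example, not a SEED export format.

```python
from collections import Counter

# Three-letter codes of the 20 standard amino acids.
AMINO_ACIDS = [
    "Ala", "Arg", "Asn", "Asp", "Cys", "Gln", "Glu", "Gly", "His", "Ile",
    "Leu", "Lys", "Met", "Phe", "Pro", "Ser", "Thr", "Trp", "Tyr", "Val",
]


def summarize_rnas(features):
    """Count rRNA and tRNA features and list amino acids with no tRNA found.

    features: iterable of (kind, label) pairs such as ("rRNA", "16S")
    or ("tRNA", "Ala").
    """
    features = list(features)  # allow generators; we iterate twice
    rrna = Counter(label for kind, label in features if kind == "rRNA")
    trna = Counter(label for kind, label in features if kind == "tRNA")
    missing = [aa for aa in AMINO_ACIDS if trna[aa] == 0]
    return rrna, trna, missing


# Toy feature list, invented for illustration.
toy = [("rRNA", "16S"), ("rRNA", "23S"), ("rRNA", "5S"),
       ("rRNA", "16S"), ("tRNA", "Ala"), ("tRNA", "Gly"), ("tRNA", "Ala")]
rrna_counts, trna_counts, missing_trnas = summarize_rnas(toy)
```

An amino acid reported as missing is a question to investigate, not a
conclusion: the tRNA gene may simply not have been called.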

<h3>Studying Amino Acid Synthesis</h3>

Next, we suggest studying <i>amino acid metabolism</i> or, more narrowly,
<i>the synthesis of amino acids</i>.  Identify which of the KEGG maps
address this section of metabolism, and then which subsystems from the
SEED are relevant.  Now prepare a list of the amino acids that can be
synthesized, along with the starting point in each case.  Make sure
that you compose a detailed list of outstanding questions.

<h3>Synthesis of Nucleic Acids</h3>

We suggest that you next turn your attention to synthesis of nucleic
acids.  Locate the appropriate KEGG charts and the relevant
subsystems.  Again, summarize the situation, along with outstanding
questions.

<h3>Systematically Work Through the Central Cellular Machinery</h3>

Between the SEED hierarchy, the KEGG maps, and the metabolic
reconstructions published in genome papers, you have
numerous examples of the basic components of a functional hierarchy.
You should choose a reasonable organizational style and produce
an HTML document comprising your best effort at an informal metabolic
reconstruction.
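
<p>
If you keep your notes as a nested structure, the HTML document can
be generated rather than written by hand.  The following is a minimal
sketch using only the Python standard library; the two-level layout
(category, subsystem, genes) and the sample entries are one possible
organizational style chosen for illustration.

```python
from html import escape


def render_reconstruction(title, hierarchy):
    """Render {category: {subsystem: [gene ids]}} as a simple HTML fragment."""
    parts = [f"<h1>{escape(title)}</h1>"]
    for category, subsystems in hierarchy.items():
        parts.append(f"<h2>{escape(category)}</h2>")
        parts.append("<ul>")
        for subsystem, genes in subsystems.items():
            gene_list = ", ".join(escape(g) for g in genes)
            parts.append(f"<li><b>{escape(subsystem)}</b>: {gene_list}</li>")
        parts.append("</ul>")
    return "\n".join(parts)


# Placeholder content; the gene ids are not real identifiers.
page = render_reconstruction(
    "Informal Metabolic Reconstruction",
    {"Amino Acid Synthesis": {"Histidine biosynthesis": ["gene_0101", "gene_0102"]}},
)
print(page)
```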


<h2>The Basic Steps in Building a Formal Metabolic Reconstruction</h2>

You should begin by studying exactly how Bernhard Palsson and his team
have built formal metabolic reconstructions:
<ul>
<li><a
href="http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=pubmed&dopt=Abstract&list_uids=12952533&query_hl=1"><i>Escherichia
coli</i></a>,
<li><a
href="http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=pubmed&dopt=Abstract&list_uids=15752426&query_hl=2"><i>Staphylococcus
aureus</i></a> and 
<li><a
href="http://www.ncbi.nlm.nih.gov/entrez/query.fcgi?cmd=Retrieve&db=pubmed&dopt=Abstract&list_uids=12142428&query_hl=5"><i>Helicobacter
pylori</i></a>
</ul>

You are being asked to construct a list of several hundred reactions,
where each reaction includes precise substrates, products and
(possibly) a required enzyme.
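
<p>
A single entry in such a list might be represented as follows.  This
is a sketch rather than the format used in the papers cited above:
the class and field names are assumptions, the enzyme is recorded as
an EC number for convenience, and stoichiometric coefficients are
left out for brevity.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Reaction:
    substrates: tuple[str, ...]
    products: tuple[str, ...]
    enzyme: Optional[str] = None  # None when no enzyme is recorded

    def __str__(self) -> str:
        equation = " + ".join(self.substrates) + " -> " + " + ".join(self.products)
        return f"{equation} [{self.enzyme}]" if self.enzyme else equation


# The hexokinase reaction (EC 2.7.1.1) as an example entry.
hexokinase = Reaction(
    substrates=("ATP", "D-glucose"),
    products=("ADP", "D-glucose 6-phosphate"),
    enzyme="2.7.1.1",
)
print(hexokinase)
```

Making the record immutable (<code>frozen=True</code>) lets reactions
be stored in sets, which is convenient when the same reaction is
implied by more than one pathway.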

<h2>Begin from the Informal Metabolic Reconstruction</h2>

You should begin from the informal metabolic reconstruction and
accumulate the reactions and compounds implied by the operational
variants of the subsystems.  This can be done using
<a href="http://TheSEED.uchicago.edu/FIG/build_formal_reconstruction.cgi">
a tool to produce an initial estimate</a>.  You will be asked to
select a genome, and the result will be a list of compounds and
reactions implied by the subsystems that have been developed.
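
<p>
The accumulation step itself is easy to picture.  The sketch below is
not how the tool above is implemented; it only illustrates the idea
of taking the union of the reactions and compounds attached to each
operational subsystem.  The input layout and the toy subsystem are
assumptions of this example.

```python
def accumulate(subsystem_reactions, operational):
    """Collect distinct reactions and compounds from operational subsystems.

    subsystem_reactions: dict mapping subsystem name to a list of
        (substrates, products) pairs.
    operational: names of subsystems with an operational variant.
    """
    reactions, seen, compounds = [], set(), set()
    for name in operational:
        for substrates, products in subsystem_reactions.get(name, []):
            key = (tuple(substrates), tuple(products))
            if key in seen:
                continue  # the same reaction may occur in several subsystems
            seen.add(key)
            reactions.append(key)
            compounds.update(substrates)
            compounds.update(products)
    return reactions, sorted(compounds)


# Toy input: two glycolytic reactions; the second subsystem repeats one of them.
toy_subsystems = {
    "Glycolysis": [
        (["ATP", "D-glucose"], ["ADP", "D-glucose 6-phosphate"]),
        (["D-glucose 6-phosphate"], ["D-fructose 6-phosphate"]),
    ],
    "Sugar uptake": [
        (["ATP", "D-glucose"], ["ADP", "D-glucose 6-phosphate"]),
    ],
}
reactions, compounds = accumulate(toy_subsystems, ["Glycolysis", "Sugar uptake"])
```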
