Diff of /Sprout/SproutDBD.xml


revision 1.49, Fri May 11 06:37:38 2007 UTC revision 1.56, Tue Sep 9 21:02:10 2008 UTC
# Line 1  Line 1 
1  <?xml version="1.0" encoding="utf-8" ?>  <?xml version="1.0" encoding="utf-8" ?>
2  <Database>  <Database>
3      <Title>Sprout Genome and Subsystem Database</Title>      <Title>Sprout Genome and Subsystem Database</Title>
4        <Notes>The Sprout database contains the genetic data for all complete organisms in the SeedEnvironment.
5        The data that is not in Sprout-- attributes, similarities, couplings-- is stored on external
6        servers available to the Sprout software. The Sprout database is reloaded approximately once
7        per month. There is significant redundancy in the Sprout database because it has been
8        optimized for searching. In particular, the Feature table contains an extra copy of the
9        feature's functional role and a list of possible search terms.</Notes>
10      <Entities>      <Entities>
11          <Entity name="Genome" keyType="name-string">          <Entity name="Genome" keyType="name-string">
12              <Notes>A [i]genome[/i] contains the sequence data for a particular individual organism.</Notes>              <DisplayInfo theme="nmpdr" col="3" row="1" />
13                <Notes>A Genome contains the sequence data for a particular individual organism.</Notes>
14              <Fields>              <Fields>
15                  <Field name="genus" type="name-string">                  <Field name="genus" type="name-string">
16                      <Notes>Genus of the relevant organism.</Notes>                      <Notes>Genus of the relevant organism.</Notes>
# Line 21  Line 28 
28                      by a period and a string of digits.</Notes>                      by a period and a string of digits.</Notes>
29                  </Field>                  </Field>
30                  <Field name="access-code" type="key-string">                  <Field name="access-code" type="key-string">
31                      <Notes>The access code determines which users can look at the data relating to this genome.                      <Notes>The access code field is deprecated. Its function has been replaced by
32                      Each user is associated with a set of access codes. In order to view a genome, one of                      the account management system developed for the [[RapidAnnotationServer]].</Notes>
                     the user's access codes must match this value.</Notes>  
33                  </Field>                  </Field>
34                  <Field name="complete" type="boolean">                  <Field name="complete" type="boolean">
35                      <Notes>TRUE if the genome is complete, else FALSE</Notes>                      <Notes>TRUE if the genome is complete, else FALSE</Notes>
# Line 32  Line 38 
38                      <Notes>number of base pairs in the genome</Notes>                      <Notes>number of base pairs in the genome</Notes>
39                  </Field>                  </Field>
40                  <Field name="taxonomy" type="text">                  <Field name="taxonomy" type="text">
41                      <Notes>The taxonomy string contains the full taxonomy of the organism, while individual elements                      <Notes>The taxonomy string contains the full [[Wikipedia:taxonomy]] of the organism, while individual elements
42                      separated by semi-colons (and optional white space), starting with the domain and ending with                      separated by semi-colons (and optional white space), starting with the domain and ending with
43                      the disambiguated genus and species (which is the organism's scientific name plus an                      the disambiguated genus and species (which is the organism's scientific name plus an
44                      identifying string).</Notes>                      identifying string).</Notes>
45                  </Field>                  </Field>
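The taxonomy note above describes a semicolon-separated string with optional white space around the separators. A minimal sketch of how such a value could be split, assuming a hypothetical helper name `split_taxonomy` and a made-up sample value (neither comes from the Sprout code):

```python
import re

def split_taxonomy(taxonomy: str) -> list[str]:
    """Split a taxonomy string on semicolons, ignoring optional white space."""
    parts = re.split(r"\s*;\s*", taxonomy.strip())
    # Drop empty elements that a trailing semicolon would produce.
    return [p for p in parts if p]

# Hypothetical sample value, for illustration only.
sample = "Bacteria; Proteobacteria;Gammaproteobacteria ; Escherichia coli K12"
elements = split_taxonomy(sample)

# Per the note, the first element is the domain and the last is the
# disambiguated genus and species.
domain, organism = elements[0], elements[-1]
```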
46                  <Field name="primary-group" type="name-string">                  <Field name="primary-group" type="name-string">
47                      <Notes>The primary NMPDR group for this organism. There is always exactly one NMPDR group                      <Notes>The primary NMPDR group for this organism. There is always exactly one NMPDR group
48                      (either based on the organism name or the default value "Supporting"), whereas there can be                      per organism (either based on the organism name or the default value =Supporting=). In general,
49                      multiple named groups or even none.</Notes>                      more data is kept on organisms in NMPDR groups than on supporting organisms.</Notes>
50                  </Field>                  </Field>
51                  <Field name="group-name" type="name-string" relation="GenomeGroups">                  <Field name="contigs" type="int">
52                      <Notes>The group identifies a special grouping of organisms that would be displayed on a particular                      <Notes>Number of contigs for this organism.</Notes>
53                      page or of particular interest to a research group or web site. A single genome can belong to multiple                  </Field>
54                      such groups or none at all.</Notes>                  <Field name="pegs" type="int">
55                        <Notes>Number of [[protein encoding genes]] for this organism</Notes>
56                    </Field>
57                    <Field name="rnas" type="int">
58                        <Notes>Number of RNA features found for this organism.</Notes>
59                  </Field>                  </Field>
60              </Fields>              </Fields>
61              <Indexes>              <Indexes>
# Line 81  Line 91 
91                  </Index>                  </Index>
92              </Indexes>              </Indexes>
93          </Entity>          </Entity>
94            <Entity name="CDD" keyType="key-string">
95                <Notes>A CDD is a protein domain designator. It represents the shape of a molecular unit
96                on a feature's protein. The ID is a six-digit string assigned by the public
97                Conserved Domain Database. A CDD
98                can occur on multiple features and a feature generally has multiple CDDs.</Notes>
99            </Entity>
100          <Entity name="Source" keyType="medium-string">          <Entity name="Source" keyType="medium-string">
101              <Notes>A [i]source[/i] describes a place from which genome data was taken. This can be an organization              <Notes>A source describes a place from which genome data was taken. This can be an organization
102              or a paper citation.</Notes>              or a paper citation.</Notes>
103              <Fields>              <Fields>
104                  <Field name="URL" type="string" relation="SourceURL">                  <Field name="URL" type="string" relation="SourceURL">
105                      <Notes>URL of the paper cited or of the organization's web site. This field is optional.</Notes>                      <Notes>URL of the paper cited or of the organization's web site. This field is optional.</Notes>
106                  </Field>                  </Field>
107                  <Field name="description" type="text">                  <Field name="description" type="text">
108                      <Notes>Description the source. The description can be a street address or a citation.</Notes>                      <Notes>Description of the source. The description can be a street address or a citation.</Notes>
109                  </Field>                  </Field>
110              </Fields>              </Fields>
111          </Entity>          </Entity>
112          <Entity name="Contig" keyType="name-string">          <Entity name="Contig" keyType="name-string">
113              <Notes>A [i]contig[/i] is a contiguous run of residues. The contig's ID consists of the              <DisplayInfo theme="nmpdr" col="1" row="1" />
114                <Notes>A contig is a contiguous run of residues. The contig's ID consists of the
115              genome ID followed by a name that identifies which contig this is for the parent genome. As              genome ID followed by a name that identifies which contig this is for the parent genome. As
116              is the case with all keys in this database, the individual components are separated by a              is the case with all keys in this database, the individual components are separated by a
117              period.              period. A contig can contain over a million residues. For performance reasons, therefore,
118              [p]A contig can contain over a million residues. For performance reasons, therefore,              the contig is split into multiple pieces called sequences. The sequences
             the contig is split into multiple pieces called [i]sequences[/i]. The sequences  
119              contain the characters that represent the residues as well as data on the quality of              contain the characters that represent the residues as well as data on the quality of
120              the residue identification.</Notes>              the residue identification.</Notes>
121          </Entity>          </Entity>
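The Contig and Sequence notes describe cutting a contig into pieces whose keys are the contig ID followed by the index of the begin point, with components separated by periods. The sketch below illustrates that keying scheme only. The function name, the piece length, the sample IDs, and the choice of a zero-based begin index are all assumptions, since the diff does not state them:

```python
def split_contig(contig_id: str, residues: str, piece_len: int,
                 first_index: int = 0) -> dict[str, str]:
    """Cut residues into pieces keyed as '<contig ID>.<begin index>'."""
    if piece_len <= 0:
        raise ValueError("piece_len must be positive")
    pieces = {}
    for offset in range(0, len(residues), piece_len):
        # first_index is an assumption: the source does not say whether
        # begin points are counted from 0 or from 1.
        key = f"{contig_id}.{first_index + offset}"
        pieces[key] = residues[offset:offset + piece_len]
    return pieces

# Hypothetical genome ID "12345.1" and contig name "contigA".
pieces = split_contig("12345.1.contigA", "acgtacgtac", piece_len=4)
```

Concatenating the pieces in key order reproduces the original residue string, so only the pieces that overlap a region of interest need to be held in memory.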
122          <Entity name="Sequence" keyType="name-string">          <Entity name="Sequence" keyType="name-string">
123              <Notes>A [i]sequence[/i] is a continuous piece of a [i]contig[/i]. Contigs are split into              <Notes>A sequence is a continuous piece of a contig. Contigs are split into
124              sequences so that we don't have to have the entire contig in memory when we are              sequences so that we don't have to have the entire contig in memory when we are
125              manipulating it. The key of the sequence is the contig ID followed by the index of              manipulating it. The key of the sequence is the contig ID followed by the index of
126              the begin point.</Notes>              the begin point.</Notes>
127              <Fields>              <Fields>
128                  <Field name="sequence" type="text">                  <Field name="sequence" type="text">
129                      <Notes>String consisting of the residues. Each residue is described by a single                      <Notes>String consisting of the residues (base pairs). Each residue is described by a single
130                      character in the string.</Notes>                      character in the string.</Notes>
131                  </Field>                  </Field>
132                  <Field name="quality-vector" type="text">                  <Field name="quality-vector" type="text">
133                      <Notes>String describing the quality data for each base pair. Individual values will                      <Notes>String describing the quality data for each base pair. Individual values will
134                      be separated by periods. The value represents negative exponent of the probability                      be separated by periods. The value represents negative exponent of the probability
135                      of error. Thus, for example, a quality of 30 indicates the probability of error is                      of error. Thus, for example, a quality of 30 indicates the probability of error is
136                      10^-30. A higher quality number a better chance of a correct match. It is possible                      10^-30. A higher quality number indicates a better chance of a correct match. It is
137                      that the quality data is not known for a sequence. If that is the case, the quality                      possible that the quality data is not known for a sequence. If that is the case, the
138                      vector will contain the [b]unknown[/b].</Notes>                      quality vector will contain the string =unknown=.</Notes>
139                  </Field>                  </Field>
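The quality-vector note can be read as a small decoding rule: values separated by periods, each one the negative exponent of the error probability, with the string `unknown` standing in when no quality data exists. A sketch under those stated rules, assuming integer quality values and a hypothetical function name:

```python
def parse_quality_vector(vector: str):
    """Return per-base error probabilities, or None when quality is unknown."""
    if vector == "unknown":
        return None
    # Each value q is taken as the negative exponent of the error
    # probability, as the field note describes (q -> 10^-q).
    # Integer values are an assumption of this sketch.
    return [10.0 ** -int(q) for q in vector.split(".")]

probs = parse_quality_vector("30.20.40")
missing = parse_quality_vector("unknown")
```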
140              </Fields>              </Fields>
141          </Entity>          </Entity>
142            <Entity name="Keyword" keyType="name-string">
143                <Notes>A keyword is a word that can be used to search the feature table. This entity
144                contains the keyword's stem, its phonetic form, and the number of features that
145                can be found by searching for the word.</Notes>
146                <Fields>
147                    <Field name="stem" type="name-string">
148                        <Notes>The stem of a keyword is a normalized form that is independent of parts
149                        of speech. The actual keywords stored in the database search index are stems.</Notes>
150                    </Field>
151                    <Field name="count" type="counter">
152                        <Notes>Number of features that can be found by searching for the specified
153                        keyword.</Notes>
154                    </Field>
155                    <Field name="phonex" type="name-string">
156                        <Notes>A _phonex_ is a string that identifies the phonetic characteristics of the
157                        word stem. This can be used to find alternative spellings if a matching word is not
158                        present.</Notes>
159                    </Field>
160                </Fields>
161                <Indexes>
162                    <Index>
163                        <Notes>This index allows the user to find words by stem.</Notes>
164                        <IndexFields>
165                            <IndexField name="stem" order="ascending" />
166                        </IndexFields>
167                    </Index>
168                    <Index>
169                        <Notes>This index allows the user to find words by phonex.</Notes>
170                        <IndexFields>
171                            <IndexField name="phonex" order="ascending" />
172                            <IndexField name="count" order="descending" />
173                        </IndexFields>
174                    </Index>
175                </Indexes>
176            </Entity>
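The second Keyword index orders rows by phonex ascending and then by count descending. One plausible reading is that, among words sharing a phonetic code, the words matching the most features come first. The sketch below reproduces that ordering in memory; the class, the function, and the sample rows (including the placeholder phonex codes) are hypothetical:

```python
from typing import NamedTuple

class Keyword(NamedTuple):
    stem: str
    phonex: str
    count: int

def phonex_order(keywords):
    """Sort by phonex ascending, then by count descending."""
    return sorted(keywords, key=lambda k: (k.phonex, -k.count))

# Hypothetical rows; the phonex codes are placeholders, not real
# output of any phonetic algorithm.
rows = [
    Keyword("kinase", "K52", 40),
    Keyword("kinas", "K52", 3),
    Keyword("ligase", "L22", 12),
    Keyword("kynase", "K52", 7),
]
ordered = [k.stem for k in phonex_order(rows)]
```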
177            <Entity name="ExternalDatabase" keyType="key-string">
178                <Notes>An external database identifies a biological database surveyed by PIR International
179                as part of an effort to determine which features are essentially identical between bioinformatics
180                organizations. Each feature in the database will have zero or more corresponding IDs that are
181                captured from the PIR data. Each corresponding ID is represented in a relationship between an external
182                database and the feature itself.</Notes>
183            </Entity>
184          <Entity name="Feature" keyType="id-string">          <Entity name="Feature" keyType="id-string">
185              <Notes>A [i]feature[/i] is a part of a genome that is of special interest. Features              <DisplayInfo theme="nmpdr" col="3" row="3" />
186                <Notes>A feature (sometimes also called a "gene") is a part of a genome that is of special interest. Features
187              may be spread across multiple contigs of a genome, but never across more than              may be spread across multiple contigs of a genome, but never across more than
188              one genome. Features can be assigned to roles via spreadsheet cells,              one genome. Features can be assigned to roles via spreadsheet cells,
189              and are the targets of annotation.</Notes>              and are the targets of annotation. Each feature in the database has a unique FigId.</Notes>
190              <Fields>              <Fields>
191                  <Field name="feature-type" type="string">                  <Field name="feature-type" type="id-string">
192                      <Notes>Code indicating the type of this feature.</Notes>                      <Notes>Code indicating the type of this feature. Among the codes currently
193                  </Field>                      supported are =peg= for a [[protein encoding gene]], =bs= for a
194                  <Field name="alias" type="medium-string" relation="FeatureAlias">                      binding site, =opr= for an operon, and so forth.</Notes>
                     <Notes>Alternative name for this feature. A feature can have many aliases.</Notes>  
195                  </Field>                  </Field>
196                  <Field name="translation" type="text" relation="FeatureTranslation">                  <Field name="translation" type="text" relation="FeatureTranslation">
197                      <Notes>[i](optional)[/i] A translation of this feature's residues into character                      <Notes>_(optional)_ A translation of this feature's residues into character
198                      codes, formed by concatenating the pieces of the feature together. For a                      codes, formed by concatenating the pieces of the feature together. For a
199                      protein encoding group, this is the protein characters. For other types                      [[protein encoding gene]], the translation contains protein characters. For other types
200                      it is the DNA characters.</Notes>                      it contains DNA characters.</Notes>
201                  </Field>                  </Field>
202                  <Field name="upstream-sequence" type="text" relation="FeatureUpstream">                  <Field name="upstream-sequence" type="text" relation="FeatureUpstream">
203                      <Notes>Upstream sequence the feature. This includes residues preceding the feature as well as some of                      <Notes>Upstream sequence for the feature. This includes residues preceding the feature as
204                      the feature's initial residues.</Notes>                      well as some of the feature's initial residues.</Notes>
205                  </Field>                  </Field>
206                  <Field name="assignment" type="text">                  <Field name="assignment" type="text">
207                      <Notes>Default functional assignment for this feature.</Notes>                      <Notes>Default functional assignment for this feature.</Notes>
208                  </Field>                  </Field>
209                  <Field name="active" type="boolean">                  <Field name="active" type="boolean">
210                      <Notes>TRUE if this feature is still considered valid, FALSE if it has been logically deleted.</Notes>                      <Notes>(This field is deprecated.) TRUE if this feature is still considered valid,
211                        FALSE if it has been logically deleted.</Notes>
212                  </Field>                  </Field>
213                  <Field name="assignment-maker" type="name-string">                  <Field name="assignment-maker" type="name-string">
214                      <Notes>name of the user who made the functional assignment</Notes>                      <Notes>name of the user who made the functional assignment</Notes>
# Line 163  Line 222 
222                      functional assignment, subsystem roles, and special properties.</Notes>                      functional assignment, subsystem roles, and special properties.</Notes>
223                  </Field>                  </Field>
224                  <Field name="link" type="text" relation="FeatureLink">                  <Field name="link" type="text" relation="FeatureLink">
225                      <Notes>Web hyperlink for this feature. A feature have no hyperlinks or it can have many. The                      <Notes>Web hyperlink for this feature. A feature can have no hyperlinks or it can have many. The
226                      links are to other websites that have useful information about the gene that the feature represents, and                      links are to other websites that have useful information about the gene that the feature represents, and
227                      are coded as raw HTML, using [b]&lt;a href="[i]link[/i]"&gt;[i]text[/i]&lt;/a&gt;[/b] notation.</Notes>                      are coded as raw HTML, using &lt;a href="_link_"&gt;_text_&lt;/a&gt; notation.</Notes>
228                  </Field>                  </Field>
229                  <Field name="conservation" type="float" relation="FeatureConservation">                  <Field name="conservation" type="float" relation="FeatureConservation">
230                      <Notes>A number between 0 and 1 that indicates the degree to which this feature's DNA is                      <Notes>_(optional)_ A number between 0 and 1 that indicates the degree to which this feature's DNA is
231                      conserved in related genomes. A value of 1 indicates perfect conservation. A value less                      conserved in related genomes. A value of 1 indicates perfect conservation. A value less
232                      than 1 is a reflect of the degree to which gap characters interfere in the alignment                      than 1 is a reflection of the degree to which gap characters interfere in the alignment
233                      between the feature and its close relatives.</Notes>                      between the feature and its close relatives.</Notes>
234                  </Field>                  </Field>
235                  <Field name="essential" type="text" relation="FeatureEssential" special="property_search">                  <Field name="essential" type="text" relation="FeatureEssential" special="property_search">
# Line 192  Line 251 
251                      this field will have no values. Otherwise, it will have an epitope name and/or                      this field will have no values. Otherwise, it will have an epitope name and/or
252                      sequence, hyperlinked to the database.</Notes>                      sequence, hyperlinked to the database.</Notes>
253                  </Field>                  </Field>
254                    <Field name="location-string" type="text">
255                        <Notes>Location of the feature, expressed as a comma-delimited list of Sprout location
256                        strings. This gives us a fast mechanism for extracting the feature location. Otherwise,
257                        we have to painstakingly paste together the [[#IsLocatedIn]] records, which are themselves
258                        designed to help look for features in a particular region rather than to find the location
259                        of a feature.</Notes>
260                    </Field>
261                    <Field name="signal-peptide" type="name-string">
262                        <Notes>The signal peptide location for this feature. This is expressed as start and end
263                        numbers with a hyphen for the relevant amino acids. So, "1-22" would indicate a signal
264                        peptide at the beginning of the feature's protein and extending through 22 amino acid
265                        positions. An empty string means no signal peptide is present.</Notes>
266                    </Field>
267                    <Field name="transmembrane-map" type="text">
268                        <Notes>A map indicating which sections of a protein will be embedded in a membrane.
269                        This is expressed as a comma-separated list of start and end numbers with hyphens
270                        for the relevant amino acids. So, "10-12, 40-60" would indicate that there are two
271                        sections of the protein that become embedded in a membrane: the 10th through 12th
272                        amino acids, and the 40th through the 60th. An empty string means no
273                        transmembrane regions are known.</Notes>
274                    </Field>
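The signal-peptide and transmembrane-map notes use the same notation: hyphenated start and end positions, comma-separated when there are several, with an empty string meaning none. A minimal sketch of a parser for that notation, using the two example values quoted in the notes (the function name is an assumption):

```python
def parse_ranges(value: str) -> list[tuple[int, int]]:
    """Parse 'start-end' pairs separated by commas; empty string gives []."""
    ranges = []
    for piece in value.split(","):
        piece = piece.strip()
        if not piece:
            continue  # an empty field means no regions are recorded
        start, end = piece.split("-")
        ranges.append((int(start), int(end)))
    return ranges

signal = parse_ranges("1-22")
membrane = parse_ranges("10-12, 40-60")

# Positions are inclusive, so 10-12 covers three amino acids.
embedded = sum(end - start + 1 for start, end in membrane)
```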
275                    <Field name="similar-to-human" type="boolean">
276                        <Notes>TRUE if this feature generates a protein that is similar to one found in humans,
277                        else FALSE</Notes>
278                    </Field>
279                    <Field name="isoelectric-point" type="float">
280                        <Notes>pH in the surrounding medium at which the charge on a protein is neutral.
281                        If the pH of the medium is lower than this value, the protein will have a net
282                        positive charge. If the pH of the medium is higher, then the protein will have a
283                        net negative charge.</Notes>
284                    </Field>
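The isoelectric-point note amounts to a sign rule comparing the pH of the medium with the stored value. A small sketch of that rule, with a hypothetical function name and made-up sample numbers:

```python
def net_charge_sign(ph: float, isoelectric_point: float) -> int:
    """Return +1, 0, or -1 for the sign of the protein's net charge."""
    if ph < isoelectric_point:
        return 1   # medium more acidic than the pI: net positive charge
    if ph > isoelectric_point:
        return -1  # medium more basic than the pI: net negative charge
    return 0       # at the isoelectric point the charge is neutral

# Made-up values, for illustration only.
sign_at_neutral_ph = net_charge_sign(7.0, 9.5)
```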
285                    <Field name="molecular-weight" type="float">
286                        <Notes>Molecular weight of this feature's protein, in daltons. A weight of 0
287                        indicates that no protein is created.</Notes>
288                    </Field>
289                    <Field name="sequence-length" type="counter">
290                        <Notes>Number of base pairs in this feature.</Notes>
291                    </Field>
292                    <Field name="locked" type="boolean">
293                        <Notes>TRUE if a feature's assignment is locked. A locked feature's functional
294                        role cannot be changed by automated programs.</Notes>
295                    </Field>
296                    <Field name="in-genbank" type="boolean">
297                        <Notes>TRUE if a feature can be found in GenBank, else FALSE</Notes>
298                    </Field>
299              </Fields>              </Fields>
             <Indexes>  
                 <Index>  
                     <Notes>This index allows the user to find the feature corresponding to  
                     the specified alias name.</Notes>  
                     <IndexFields>  
                         <IndexField name="alias" order="ascending" />  
                     </IndexFields>  
                 </Index>  
             </Indexes>  
300          </Entity>          </Entity>
301          <Entity name="SynonymGroup" keyType="id-string">          <Entity name="FeatureAlias" keyType="medium-string">
302              <Notes>A [i]synonym group[/i] represents a group of features. Substantially identical features              <Notes>Alternative names for features. A feature can have many aliases. In general,
303              are mapped to the same synonym group, and this information is used to expand similarities.</Notes>              each alias corresponds to only one feature, but there are many exceptions to this rule.</Notes>
304          </Entity>          </Entity>
305          <Entity name="Role" keyType="string">          <Entity name="SproutUser" keyType="name-string">
306              <Notes>A [i]role[/i] describes a biological function that may be fulfilled by a feature.              <Notes>A user is a person who can make annotations and view data in the database. The
307              One of the main goals of the database is to record the roles of the various features.</Notes>              user object is keyed on the user's login name.</Notes>
308              <Fields>              <Fields>
309                  <Field name="EC" type="string" relation="RoleEC">                  <Field name="description" type="string">
310                      <Notes>EC code for this role.</Notes>                      <Notes>Full name or description of this user.</Notes>
311                  </Field>                  </Field>
312                  <Field name="abbr" type="name-string">                  <Field name="access-code" type="key-string" relation="UserAccess">
313                      <Notes>Abbreviated name for the role, generally non-unique, but useful                      <Notes>This field is deprecated.</Notes>
                     in column headings for HTML tables.</Notes>  
314                  </Field>                  </Field>
315              </Fields>              </Fields>
316              <Indexes>          </Entity>
317                  <Index>          <Entity name="SynonymGroup" keyType="id-string">
318                      <Notes>This index allows the user to find the role corresponding to              <Notes>A synonym group represents a group of features. Features that represent substantially
319                      an EC number.</Notes>              identical proteins or DNA sequences are mapped to the same synonym group, and this information is
320                      <IndexFields>              used to expand similarities.</Notes>
321                          <IndexField name="EC" order="ascending" />          </Entity>
322                      </IndexFields>          <Entity name="Role" keyType="string">
323                  </Index>              <DisplayInfo theme="web" col="7" row="3" />
324              </Indexes>              <Notes>A role describes a biological function that may be fulfilled by a feature.
325                One of the main goals of the database is to record the roles of the various features.</Notes>
326            </Entity>
327            <Entity name="RoleEC" keyType="string">
328                <Notes>EC code for a role.</Notes>
329          </Entity>          </Entity>
330          <Entity name="Annotation" keyType="name-string">          <Entity name="Annotation" keyType="name-string">
331              <Notes>An [i]annotation[/i] contains supplementary information about a feature. Annotations              <DisplayInfo theme="seed" col="1" row="3" />
332              are currently the only objects that may be inserted directly into the database. All other              <Notes>An annotation contains supplementary information about a feature. The most
333              information is loaded from data exported by the SEED.</Notes>              important type of annotation is the assignment of a [[functional role]]; however,
334                other types of annotations are also possible.</Notes>
335              <Fields>              <Fields>
336                  <Field name="time" type="date">                  <Field name="time" type="date">
337                      <Notes>Date and time of the annotation.</Notes>                      <Notes>Date and time of the annotation.</Notes>
# Line 251  Line 350 
350              </Indexes>              </Indexes>
351          </Entity>          </Entity>
352          <Entity name="Reaction" keyType="key-string">          <Entity name="Reaction" keyType="key-string">
353              <Notes>A [i]reaction[/i] is a chemical process catalyzed by a protein. The reaction ID              <DisplayInfo  theme="web" col="7" row="5" />
354                <Notes>A reaction is a chemical process catalyzed by a protein. The reaction ID
355              is generally a small number preceded by a letter.</Notes>              is generally a small number preceded by a letter.</Notes>
356              <Fields>              <Fields>
357                  <Field name="url" type="string" relation="ReactionURL">                  <Field name="url" type="string" relation="ReactionURL">
# Line 264  Line 364 
364              </Fields>              </Fields>
365          </Entity>          </Entity>
366          <Entity name="Compound" keyType="name-string">          <Entity name="Compound" keyType="name-string">
367              <Notes>A [i]compound[/i] is a chemical that participates in a reaction.              <DisplayInfo  theme="web" col="7" row="7" />
368                <Notes>A compound is a chemical that participates in a reaction.
369              All compounds have a unique ID and may also have one or more names.</Notes>              All compounds have a unique ID and may also have one or more names.</Notes>
370              <Fields>              <Fields>
371                  <Field name="name-priority" type="int" relation="CompoundName">                  <Field name="label" type="string">
372                      <Notes>Priority of a compound name. The name with the loweset                      <Notes>Name used in reaction display strings. This is the same as the name
373                      priority is the main name of this compound.</Notes>                      possessing a priority of 1, but it is placed here to speed up the query
374                  </Field>                      used to create the display strings.</Notes>
                 <Field name="name" type="name-string" relation="CompoundName">  
                     <Notes>Descriptive name for the compound. A compound may  
                     have several names.</Notes>  
                 </Field>  
                 <Field name="cas-id" type="name-string" relation="CompoundCAS">  
                     <Notes>Chemical Abstract Service ID for this compound (optional).</Notes>  
                 </Field>  
                 <Field name="label" type="name-string">  
                     <Notes>Name used in reaction display strings.  
                     It is the same as the name possessing a priority of 1, but it is placed  
                     here to speed up the query used to create the display strings.</Notes>  
375                  </Field>                  </Field>
376              </Fields>              </Fields>
377              <Indexes>          </Entity>
378                  <Index>          <Entity name="CompoundName" keyType="string">
379                      <Notes>This index allows the user to find the compound corresponding to              <Notes>A compound name is a common name for the chemical represented by a
380                      the specified name.</Notes>              compound.</Notes>
381                      <IndexFields>          </Entity>
382                          <IndexField name="name" order="ascending" />          <Entity name="CompoundCAS" keyType="name-string">
383                      </IndexFields>              <Notes>This entity represents the Chemical Abstract Service ID for a
384                  </Index>              compound. Each Compound has at most one CAS ID.</Notes>
                 <Index>  
                     <Notes>This index allows the user to find the compound corresponding to  
                     the specified CAS ID.</Notes>  
                     <IndexFields>  
                         <IndexField name="cas-id" order="ascending" />  
                     </IndexFields>  
                 </Index>  
                 <Index>  
                     <Notes>This index allows the user to access the compound names in  
                     priority order.</Notes>  
                     <IndexFields>  
                         <IndexField name="id" order="ascending" />  
                         <IndexField name="name-priority" order="ascending" />  
                     </IndexFields>  
                 </Index>  
             </Indexes>  
385          </Entity>          </Entity>
386          <Entity name="Subsystem" keyType="string">          <Entity name="Subsystem" keyType="string">
387              <Notes>A [i]subsystem[/i] is a collection of roles that work together in a cell. Identification of subsystems              <DisplayInfo theme="seed" col="5" row="1" />
388                <Notes>A subsystem is a collection of roles that work together in a cell. Identification of subsystems
389              is an important tool for recognizing parallel genetic features in different organisms.</Notes>              is an important tool for recognizing parallel genetic features in different organisms.</Notes>
390              <Fields>              <Fields>
391                    <Field name="version" type="int">
392                        <Notes>Version number for the subsystem. This value is incremented each time the subsystem
393                        is backed up.</Notes>
394                    </Field>
395                  <Field name="curator" type="string">                  <Field name="curator" type="string">
396                      <Notes>Name of the person currently in charge of the subsystem.</Notes>                      <Notes>Name of the person currently in charge of the subsystem.</Notes>
397                  </Field>                  </Field>
398                  <Field name="notes" type="text">                  <Field name="notes" type="text">
399                      <Notes>Descriptive notes about the subsystem.</Notes>                      <Notes>Descriptive notes about the subsystem.</Notes>
400                  </Field>                  </Field>
401                    <Field name="description" type="text">
402                        <Notes>Description of the subsystem's function in the cell.</Notes>
403                    </Field>
404                  <Field name="classification" type="string" relation="SubsystemClass">                  <Field name="classification" type="string" relation="SubsystemClass">
405                      <Notes>Classification string, colon-delimited. This string organizes the                      <Notes>Classification string, colon-delimited. This string organizes the
406                      subsystems into a hierarchy.</Notes>                      subsystems into a hierarchy.</Notes>
407                  </Field>                  </Field>
408                    <Field name="hope-curation-notes" type="text" relation="SubsystemHopeNotes">
409                        <Notes>Text description of how the scenarios were curated.</Notes>
410                    </Field>
411              </Fields>              </Fields>
412          </Entity>          </Entity>
413          <Entity name="RoleSubset" keyType="string">          <Entity name="RoleSubset" keyType="string">
414              <Notes>A [i]role subset[/i] is a named collection of roles in a particular subsystem. The              <Notes>A role subset is a named collection of roles in a particular subsystem. The
415              subset names are generally very short, non-unique strings. The ID of the parent              subset names are generally very short, non-unique strings. The ID of the parent
416              subsystem is prefixed to the subset ID in order to make it unique.</Notes>              subsystem is prefixed to the subset ID in order to make it unique.</Notes>
417          </Entity>          </Entity>
418          <Entity name="GenomeSubset" keyType="string">          <Entity name="GenomeSubset" keyType="string">
419              <Notes>A [i]genome subset[/i] is a named collection of genomes that participate              <Notes>A genome subset is a named collection of genomes that participate
420              in a particular subsystem. The subset names are generally very short, non-unique              in a particular subsystem. The subset names are generally very short, non-unique
421              strings. The ID of the parent subsystem is prefixed to the subset ID in order              strings. The ID of the parent subsystem is prefixed to the subset ID in order
422              to make it unique.</Notes>              to make it unique.</Notes>
423          </Entity>          </Entity>
424          <Entity name="SSCell" keyType="hash-string">          <Entity name="SSCell" keyType="hash-string">
425              <Notes>Part of the process of locating and assigning features is creating a spreadsheet of              <DisplayInfo theme="seed" col="5" row="3" />
426              genomes and roles to which features are assigned. A [i]spreadsheet cell[/i] represents one              <Notes>Part of the process of subsystem annotation of features
427              of the positions on the spreadsheet.</Notes>              is creating a spreadsheet of genomes and roles to which features are assigned.
428          </Entity>              A spreadsheet cell represents one of the positions on the spreadsheet.</Notes>
         <Entity name="SproutUser" keyType="name-string">  
             <Notes>A [i]user[/i] is a person who can make annotations and view data in the database. The  
             user object is keyed on the user's login name.</Notes>  
             <Fields>  
                 <Field name="description" type="string">  
                     <Notes>Full name or description of this user.</Notes>  
                 </Field>  
                 <Field name="access-code" type="key-string" relation="UserAccess">  
                     <Notes>Access code possessed by this  
                     user. A user can have many access codes; a genome is accessible to the user if its  
                     access code matches any one of the user's access codes.</Notes>  
                 </Field>  
             </Fields>  
429          </Entity>          </Entity>
430          <Entity name="Property" keyType="int">          <Entity name="Property" keyType="int">
431              <Notes>A [i]property[/i] is a type of assertion that could be made about the properties of              <Notes>A property is a type of assertion that could be made about the properties of
432              a particular feature. Each property instance is a key/value pair and can be associated              a particular feature. Each property instance is a key/value pair and can be associated
433              with many different features. Conversely, a feature can be associated with many key/value              with many different features. Conversely, a feature can be associated with many key/value
434              pairs, even some that notionally contradict each other. For example, there can be evidence              pairs, even some that notionally contradict each other. For example, there can be evidence
# Line 383  Line 455 
455              </Indexes>              </Indexes>
456          </Entity>          </Entity>
457          <Entity name="Diagram" keyType="name-string">          <Entity name="Diagram" keyType="name-string">
458              <Notes>A functional diagram describes the chemical reactions, often comprising a single              <DisplayInfo theme="web" col="7" row="1" />
459                <Notes>A functional diagram describes a network of chemical reactions, often comprising a single
460              subsystem. A diagram is identified by a short name and contains a longer descriptive name.              subsystem. A diagram is identified by a short name and contains a longer descriptive name.
461              The actual diagram shows which functional roles guide the reactions along with the inputs              The actual diagram shows which functional roles guide the reactions along with the inputs
462              and outputs; the database, however, only indicate which roles belong to a particular              and outputs; the database, however, only indicates which roles belong to a particular
463              map.</Notes>              diagram's map.</Notes>
464              <Fields>              <Fields>
465                  <Field name="name" type="text">                  <Field name="name" type="text">
466                      <Notes>Descriptive name of this diagram.</Notes>                      <Notes>Descriptive name of this diagram.</Notes>
467                  </Field>                  </Field>
468              </Fields>              </Fields>
469          </Entity>          </Entity>
         <Entity name="ExternalAliasOrg" keyType="name-string">  
             <Notes>An external alias is a feature name for a functional assignment that is not a  
             FIG ID. Functional assignments for external aliases are kept in a separate section of  
             the database. This table contains a description of the relevant organism for an  
             external alias functional assignment.</Notes>  
                 <Fields>  
                     <Field name="org" type="text">  
                         <Notes>Descriptive name of the target organism for this external alias.</Notes>  
                     </Field>  
                 </Fields>  
         </Entity>  
         <Entity name="ExternalAliasFunc" keyType="name-string">  
             <Notes>An external alias is a feature name for a functional assignment that is not a  
             FIG ID. Functional assignments for external aliases are kept in a separate section of  
             the database. This table contains the functional role for the external alias functional  
             assignment.</Notes>  
                 <Fields>  
                     <Field name="func" type="text">  
                         <Notes>Functional role for this external alias.</Notes>  
                     </Field>  
                 </Fields>  
         </Entity>  
         <Entity name="Coupling" keyType="id-string">  
             <Notes>A coupling is a relationship between two features. The features are  
             physically close on the contig, and there is evidence that they generally  
             belong together. The key of this entity is formed by combining the coupled  
             feature IDs with a space.</Notes>  
             <Fields>  
                 <Field name="score" type="int">  
                     <Notes>A number based on the set of PCHs (pairs of close homologs). A PCH  
                     indicates that two genes near each other on one genome are very similar to  
                     genes near each other on another genome. The score only counts PCHs for which  
                     the genomes are very different. (In other words, we have a pairing that persists  
                     between different organisms.) A higher score implies a stronger meaning to the  
                     clustering.</Notes>  
                 </Field>  
             </Fields>  
         </Entity>  
         <Entity name="PCH" keyType="counter">  
             <Notes>A PCH (physically close homolog) connects a clustering (which is a  
             pair of physically close features on a contig) to a second pair of physically  
             close features that are similar to the first. Essentially, the PCH is a  
             relationship between two clusterings in which the first clustering's features  
             are similar to the second clustering's features. The simplest model for  
             this would be to simply relate clusterings to each other; however, not all  
             physically close pairs qualify as clusterings, so we relate a clustering to  
             a pair of features. The key a unique ID number.</Notes>  
             <Fields>  
                 <Field name="used" type="boolean">  
                     <Notes>TRUE if this PCH is used in scoring the attached clustering,  
                     else FALSE. If a clustering has a PCH for a particular genome and many  
                     similar genomes are present, then a PCH will probably exist for the  
                     similar genomes as well. When this happens, only one of the PCHs will  
                     be scored: the others are considered duplicates of the same evidence.</Notes>  
                 </Field>  
             </Fields>  
         </Entity>  
470          <Entity name="Family" keyType="id-string">          <Entity name="Family" keyType="id-string">
471              <Notes>A family is a group of homologous PEGs believed to have the same function. Protein              <DisplayInfo theme="seed" col="5" row="5" />
472              families provide a mechanism for verifying the accuracy of functional assignments              <Notes>A family (also called a FigFam) is a group of homologous features believed to have
473              and are also used in determining phylogenetic trees.</Notes>              the same function. Families provide a mechanism for verifying the accuracy of functional
474                assignments and are also used in Rapid Annotation and in determining phylogenetic trees.</Notes>
475              <Fields>              <Fields>
476                  <Field name="function" type="text">                  <Field name="function" type="text">
477                      <Notes>The functional assignment expected for all PEGs in this family.</Notes>                      <Notes>The functional assignment expected for all PEGs in this family.</Notes>
# Line 467  Line 484 
484              </Fields>              </Fields>
485          </Entity>          </Entity>
486          <Entity name="PDB" keyType="id-string">          <Entity name="PDB" keyType="id-string">
487              <Notes>A PDB is a protein database containing information that can be used to determine              <DisplayInfo theme="web" col="3" row="5" />
488              the shape of the protein and the energies required to dock with it. The ID is the              <Notes>A PDB is a protein data bank entry containing information that can be used
489              four-character name used on the PDB web site.</Notes>              to determine the shape of the protein and the energies required to dock with it.
490                The ID is the four-character name used on the PDB web site.</Notes>
491              <Fields>              <Fields>
492                  <Field name="docking-count" type="int">                  <Field name="docking-count" type="int">
493                      <Notes>The number of ligands that have been docked against this PDB.</Notes>                      <Notes>The number of ligands that have been docked against this PDB.</Notes>
# Line 485  Line 503 
503              </Indexes>              </Indexes>
504          </Entity>          </Entity>
505          <Entity name="Ligand" keyType="id-string">          <Entity name="Ligand" keyType="id-string">
506                <DisplayInfo theme="web" col="3" row="7" />
507              <Notes>A Ligand is a chemical of interest in computing docking energies against a PDB.              <Notes>A Ligand is a chemical of interest in computing docking energies against a PDB.
508              The ID of the ligand is an 8-digit ZINC ID number.</Notes>              The ID of the ligand is an 8-digit ID number in the [[http://zinc.docking.org ZINC database]].</Notes>
509              <Fields>              <Fields>
510                  <Field name="name" type="long-string">                  <Field name="name" type="long-string">
511                      <Notes>Chemical name of this ligand.</Notes>                      <Notes>Chemical name of this ligand.</Notes>
512                  </Field>                  </Field>
513              </Fields>              </Fields>
514          </Entity>          </Entity>
515            <Entity name="CellLocation" keyType="key-string">
516                <Notes>A section of the cell in which a protein might be found. This includes the cell wall or
517                membrane, outside the cell, inside the cell, and so forth.</Notes>
518            </Entity>
519            <Entity name="Scenario" keyType="string">
520                <Notes>A scenario used to verify the validity of subsystem assignments. Each
521                scenario converts input compounds to output compounds using reactions.
522                The scenario may use all of the reactions controlled by a subsystem or only
523                some, and may also incorporate additional reactions.</Notes>
524            </Entity>
525      </Entities>      </Entities>
526      <Relationships>      <Relationships>
527            <Relationship name="Catalyzes" from="Role" to="Reaction" arity="MM">
528                <DisplayInfo theme="web" />
529                <Notes>This relationship connects a role to the reactions it catalyzes.
530                The purpose of a role is to create proteins that trigger certain
531                chemical reactions. A single reaction can be triggered by many roles,
532                and a role can trigger many reactions.</Notes>
533            </Relationship>
534            <Relationship name="ExcludesReaction" from="Scenario" to="Reaction" arity="MM">
535                <Notes>This relationship connects a scenario to reactions of the parent
536                subsystem that do not participate in it.</Notes>
537            </Relationship>
538            <Relationship name="IncludesReaction" from="Scenario" to="Reaction" arity="MM">
539                <Notes>This relationship connects a scenario to reactions that participate
540                in it but are not part of the parent subsystem.</Notes>
541            </Relationship>
542            <Relationship name="HasScenario" from="Subsystem" to="Scenario" arity="MM">
543                <Notes>This relationship connects a subsystem to the scenarios used to
544                validate it.</Notes>
545            </Relationship>
546            <Relationship name="IsInputFor" from="Compound" to="Scenario" arity="MM">
547                <Notes>This relationship connects a scenario to its input compounds.</Notes>
548            </Relationship>
549            <Relationship name="IsOutputOf" from="Compound" to="Scenario" arity="MM">
550                <Notes>This relationship connects a scenario to its output compounds.</Notes>
551                <Fields>
552                    <Field name="auxiliary" type="boolean">
553                        <Notes>TRUE if this is an auxiliary output compound, FALSE if it is a
554                        main output compound.</Notes>
555                    </Field>
556                </Fields>
557            </Relationship>
558            <Relationship name="IsOnDiagram" from="Scenario" to="Diagram" arity="MM">
559                <Notes>This relationship connects a scenario to related diagrams.</Notes>
560            </Relationship>
561            <Relationship name="IsPossiblePlaceFor" from="CellLocation" to="Feature" arity="MM">
562                <Notes>This relationship connects a feature with the various places in a cell that the feature
563                might be found. The confidence factor is included as intersection data.</Notes>
564                <Fields>
565                    <Field name="confidence" type="float">
566                        <Notes>Confidence that the protein will be found in this location, expressed as a
567                        value from 0 to 10.</Notes>
568                    </Field>
569                </Fields>
570            </Relationship>
571            <Relationship name="IsPresentOnProteinOf" from="CDD" to="Feature" arity="MM">
572                <Notes>This relationship connects a feature to its CDD protein domains. The
573                match score is included as intersection data.</Notes>
574                <Fields>
575                    <Field name="score" type="float">
576                        <Notes>This is the match score between the feature and the CDD. A
577                        lower score is a better match.</Notes>
578                    </Field>
579                </Fields>
580                <FromIndex>
581                    <IndexFields>
582                        <IndexField name="score" order="ascending" />
583                    </IndexFields>
584                </FromIndex>
585            </Relationship>
586            <Relationship name="IsIdentifiedByCAS" from="Compound" to="CompoundCAS" arity="MM">
587                <Notes>Relates a compound's CAS ID to the compound itself. Every CAS ID is
588                associated with a compound, and some are associated with two compounds, but not
589                all compounds have CAS IDs.</Notes>
590            </Relationship>
591            <Relationship name="IsIdentifiedByEC" from="Role" to="RoleEC" arity="MM">
592                <Notes>Relates a role to its EC number. Every EC number is associated with a
593                role, but not all roles have EC numbers.</Notes>
594            </Relationship>
595            <Relationship name="IsAliasOf" from="FeatureAlias" to="Feature" arity="MM">
596                <Notes>Connects an alias to the feature it represents. Every alias connects
597                to at least 1 feature, and a feature connects to many aliases.</Notes>
598            </Relationship>
599            <Relationship name="HasCompoundName" from="Compound" to="CompoundName" arity="MM">
600                <Notes>Connects a compound to its names. A compound generally has several
601                names.</Notes>
602                <Fields>
603                    <Field name="priority" type="int">
604                        <Notes>Priority of this name, with 1 being the highest priority, 2
605                        the next highest, and so forth.</Notes>
606                    </Field>
607                </Fields>
608                <FromIndex>
609                    <Notes>This index enables the application to view the names of a compound
610                    in priority order.</Notes>
611                    <IndexFields>
612                        <IndexField name="priority" order="ascending" />
613                    </IndexFields>
614                </FromIndex>
615            </Relationship>
616          <Relationship name="IsProteinForFeature" from="PDB" to="Feature" arity="MM">          <Relationship name="IsProteinForFeature" from="PDB" to="Feature" arity="MM">
617                <DisplayInfo caption="Is Protein\nFor Feature" theme="web" />
618              <Notes>Relates a PDB to features that produce highly similar proteins.</Notes>              <Notes>Relates a PDB to features that produce highly similar proteins.</Notes>
619              <Fields>              <Fields>
620                  <Field name="score" type="float">                  <Field name="score" type="float">
# Line 525  Line 644 
644              </FromIndex>              </FromIndex>
645          </Relationship>          </Relationship>
646          <Relationship name="DocksWith" from="PDB" to="Ligand" arity="MM">          <Relationship name="DocksWith" from="PDB" to="Ligand" arity="MM">
647                <DisplayInfo caption="Docks With" theme="web" />
648              <Notes>Indicates that a docking result exists between a PDB and a ligand. The              <Notes>Indicates that a docking result exists between a PDB and a ligand. The
649              docking result describes the energy required for the ligand to dock with              docking result describes the energy required for the ligand to dock with
650              the protein described by the PDB. A lower energy indicates the ligand has a              the protein described by the PDB. A lower energy indicates the ligand has a
# Line 533  Line 653 
653              <Fields>              <Fields>
654                  <Field name="reason" type="id-string">                  <Field name="reason" type="id-string">
655                      <Notes>Indication of the reason for determining the docking result.                      <Notes>Indication of the reason for determining the docking result.
656                      A value of [b]Random[/b] indicates the docking was attempted as a part                      A value of =Random= indicates the docking was attempted as a part
657                      of a random survey used to determine the docking characteristics of the                      of a random survey used to determine the docking characteristics of the
658                      PDB. A value of [b]Rich[/b] indicates the docking was attempted because                      PDB. A value of =Rich= indicates the docking was attempted because
659                      a low-energy docking result was predicted for the ligand with respect                      a low-energy docking result was predicted for the ligand with respect
660                      to the PDB.</Notes>                      to the PDB.</Notes>
661                  </Field>                  </Field>
# Line 552  Line 672 
672                  </Field>                  </Field>
673                  <Field name="electrostatic-energy" type="float">                  <Field name="electrostatic-energy" type="float">
674                      <Notes>Docking energy in kcal/mol that results from the movement of                      <Notes>Docking energy in kcal/mol that results from the movement of
675                      electrons (electrostatic force) between the PDB and the ligan.</Notes>                      electrons (electrostatic force) between the PDB and the ligand.</Notes>
676                  </Field>                  </Field>
677              </Fields>              </Fields>
678              <FromIndex>              <FromIndex>
# Line 564  Line 684 
684              </FromIndex>              </FromIndex>
685              <ToIndex>              <ToIndex>
686                  <Notes>This index enables the application to view a ligand's docking results from                  <Notes>This index enables the application to view a ligand's docking results from
687                  the lowest energy (best docking) to highest energy (worst docking). Note that                  the lowest energy (best docking) to highest energy (worst docking).</Notes>
                 since we only keep the best docking results for a PDB, this index is not likely  
                 to provide useful results.</Notes>  
688              </ToIndex>              </ToIndex>
689          </Relationship>          </Relationship>
690          <Relationship name="IsFamilyForFeature" from="Family" to="Feature" arity="MM">          <Relationship name="IsAlsoFoundIn" from="Feature" to="ExternalDatabase" arity="MM">
691              <Notes>This relationship connects a protein family to all of its PEGs and connects              <Notes>This relationship connects a feature to external databases that contain
692              each PEG to all of its protein families.</Notes>              essentially identical features. The name used in the external database is stored
693          </Relationship>              in the relationship as intersection data.</Notes>
694          <Relationship name="ParticipatesInCoupling" from="Feature" to="Coupling" arity="MM">              <Fields>
695              <Notes>This relationship connects a feature to all the functional couplings                  <Field name="alias" type="name-string">
696              in which it participates. A functional coupling is a recognition of the fact                      <Notes>ID of the feature in the specified external database.</Notes>
             that the features are close to each other on a chromosome, and similar  
             features in other genomes also tend to be close.</Notes>  
             <Fields>  
                 <Field name="pos" type="int">  
                     <Notes>Ordinal position of the feature in the coupling. Currently,  
                     this is either "1" or "2".</Notes>  
697                  </Field>                  </Field>
698              </Fields>              </Fields>
699              <ToIndex>              <Indexes>
700                  <Notes>This index enables the application to view the features of                  <Index>
701                  a coupling in the proper order. The order influences the way the                      <Notes>This index allows direct access to features by external ID.</Notes>
                 PCHs are examined.</Notes>  
702                  <IndexFields>                  <IndexFields>
703                      <IndexField name="pos" order="ascending" />                          <IndexField name="alias" order="ascending" />
704                  </IndexFields>                  </IndexFields>
705              </ToIndex>                  </Index>
706                </Indexes>
707            </Relationship>
708            <Relationship name="IsFamilyForFeature" from="Family" to="Feature" arity="MM">
709                <DisplayInfo caption="Belongs To" theme="seed" />
710                <Notes>This relationship connects a protein family to all of its PEGs and connects
711                each PEG to all of its protein families.</Notes>
712          </Relationship>          </Relationship>
713          <Relationship name="IsSynonymGroupFor" from="SynonymGroup" to="Feature" arity="1M">          <Relationship name="IsSynonymGroupFor" from="SynonymGroup" to="Feature" arity="MM">
714              <Notes>This relation connects a synonym group to the features that make it              <Notes>This relation connects a synonym group to the features that make it
715              up.</Notes>              up.</Notes>
716          </Relationship>          </Relationship>
717          <Relationship name="HasFeature" from="Genome" to="Feature" arity="1M">          <Relationship name="HasFeature" from="Genome" to="Feature" arity="1M">
718                <DisplayInfo theme="nmpdr" caption="Has\nFeature" />
719              <Notes>This relationship connects a genome to all of its features. This              <Notes>This relationship connects a genome to all of its features. This
720              relationship is redundant in a sense, because the genome ID is part              relationship is redundant in a sense, because the genome ID is part
721              of the feature ID; however, it makes the creation of certain queries more              of the feature ID; however, it makes the creation of certain queries more
# Line 616  Line 734 
734                  </IndexFields>                  </IndexFields>
735              </FromIndex>              </FromIndex>
736          </Relationship>          </Relationship>
         <Relationship name="IsEvidencedBy" from="Coupling" to="PCH" arity="1M">  
             <Notes>This relationship connects a functional coupling to the physically  
             close homologs (PCHs) which affirm that the coupling is meaningful.</Notes>  
         </Relationship>  
         <Relationship name="UsesAsEvidence" from="PCH" to="Feature" arity="MM">  
             <Notes>This relationship connects a PCH to the features that represent its  
             evidence. Each PCH is connected to a parent coupling that relates two features  
             on a specific genome. The PCH's evidence that the parent coupling is functional  
             is the existence of two physically close features on a different genome that  
             correspond to the features in the coupling. Those features are found on the  
             far side of this relationship.</Notes>  
             <Fields>  
                 <Field name="pos" type="int">  
                     <Notes>Ordinal position of the feature in the coupling that corresponds  
                     to our target feature. There is a one-to-one correspondence between the  
                     features connected to the PCH by this relationship and the features  
                     connected to the PCH's parent coupling. The ordinal position is used  
                     to decode that relationship. Currently, this field is either "1" or  
                     "2".</Notes>  
                 </Field>  
             </Fields>  
             <FromIndex>  
                 <Notes>This index enables the application to view the features of  
                 a PCH in the proper order.</Notes>  
                 <IndexFields>  
                     <IndexField name="pos" order="ascending" />  
                 </IndexFields>  
             </FromIndex>  
         </Relationship>  
737          <Relationship name="HasContig" from="Genome" to="Contig" arity="1M">          <Relationship name="HasContig" from="Genome" to="Contig" arity="1M">
738                <DisplayInfo caption="Is Part Of" theme="nmpdr" />
739              <Notes>This relationship connects a genome to the contigs that contain the actual genetic              <Notes>This relationship connects a genome to the contigs that contain the actual genetic
740              information.</Notes>              information.</Notes>
741          </Relationship>          </Relationship>
# Line 677  Line 767 
767              </FromIndex>              </FromIndex>
768          </Relationship>          </Relationship>
769          <Relationship name="IsTargetOfAnnotation" from="Feature" to="Annotation" arity="1M">          <Relationship name="IsTargetOfAnnotation" from="Feature" to="Annotation" arity="1M">
770                <DisplayInfo caption="Targets" theme="seed" />
771              <Notes>This relationship connects a feature to its annotations.</Notes>              <Notes>This relationship connects a feature to its annotations.</Notes>
772          </Relationship>          </Relationship>
773          <Relationship name="MadeAnnotation" from="SproutUser" to="Annotation" arity="1M">          <Relationship name="MadeAnnotation" from="SproutUser" to="Annotation" arity="1M">
774              <Notes>This relationship connects an annotation to the user who made it.</Notes>              <Notes>This relationship connects an annotation to the user who made it.</Notes>
775          </Relationship>          </Relationship>
776          <Relationship name="ParticipatesIn" from="Genome" to="Subsystem" arity="MM">          <Relationship name="ParticipatesIn" from="Genome" to="Subsystem" arity="MM">
777                <DisplayInfo caption="\nParticipates\nIn" theme="seed" />
778              <Notes>This relationship connects subsystems to the genomes that use              <Notes>This relationship connects subsystems to the genomes that use
779              it. If the subsystem has been curated for the genome, then the subsystem's roles will also be              it. If the subsystem has been curated for the genome, then the subsystem's roles will also be
780              connected to the genome features through the [b]SSCell[/b] object.</Notes>              connected to the genome features through the *SSCell* object.</Notes>
781              <Fields>              <Fields>
782                  <Field name="variant-code" type="key-string">                  <Field name="variant-code" type="key-string">
783                      <Notes>Code indicating the subsystem variant to which this                      <Notes>Code indicating the subsystem variant to which this
784                      genome belongs. Each subsystem can have multiple variants. A variant                      genome belongs. Each subsystem can have multiple variants. A variant
785                      code of [b]-1[/b] indicates that the genome does not have a functional                      code of =-1= indicates that the genome does not have a functional
786                      variant of the subsystem. A variant code of [b]0[/b] indicates that                      variant of the subsystem. A variant code of =0= indicates that
787                      the genome's participation is considered iffy.</Notes>                      the genome's participation is considered iffy.</Notes>
788                  </Field>                  </Field>
789              </Fields>              </Fields>
# Line 705  Line 797 
797              </ToIndex>              </ToIndex>
798          </Relationship>          </Relationship>
799          <Relationship name="OccursInSubsystem" from="Role" to="Subsystem" arity="MM">          <Relationship name="OccursInSubsystem" from="Role" to="Subsystem" arity="MM">
800                <DisplayInfo caption="Uses" theme="seed" />
801              <Notes>This relationship connects roles to the subsystems that implement them. </Notes>              <Notes>This relationship connects roles to the subsystems that implement them. </Notes>
802              <Fields>              <Fields>
803                    <Field name="abbr" type="name-string">
804                        <Notes>Abbreviated name for the role, generally non-unique, but useful
805                        in column headings for HTML tables.</Notes>
806                    </Field>
807                  <Field name="column-number" type="int">                  <Field name="column-number" type="int">
808                      <Notes>Column number for this role in the specified subsystem's                      <Notes>Column number for this role in the specified subsystem's
809                      spreadsheet.</Notes>                      spreadsheet.</Notes>
810                  </Field>                  </Field>
811                    <Field name="auxiliary" type="boolean">
812                        <Notes>If TRUE, then this role is ancillary to the purpose of the subsystem.
813                        If FALSE, it is essential to its metabolic pathway.</Notes>
814                    </Field>
815                    <Field name="hope_reaction_note" type="text">
816                        <Notes>A description of the status of a role in relation to the
817                        reactions it produces as determined by the scenarios. If present,
818                        will indicate if the role has been determined to be auxiliary,
819                        if it has been examined to verify an automatic assignment, and so
820                        forth.</Notes>
821                    </Field>
822                    <Field name="hope_reaction_link" type="text">
823                        <Notes>A description of the mapping between the reactions of
824                        this role and the scenarios used to validate it.</Notes>
825                    </Field>
826              </Fields>              </Fields>
827              <ToIndex>              <ToIndex>
828                  <Notes>This index enables the application to see the subsystem roles                  <Notes>This index enables the application to see the subsystem roles
# Line 722  Line 834 
834              </ToIndex>              </ToIndex>
835          </Relationship>          </Relationship>
836          <Relationship name="IsGenomeOf" from="Genome" to="SSCell" arity="1M">          <Relationship name="IsGenomeOf" from="Genome" to="SSCell" arity="1M">
837                <DisplayInfo caption="Is Row Of" theme="seed" />
838              <Notes>This relationship connects a subsystem's spreadsheet cell to the              <Notes>This relationship connects a subsystem's spreadsheet cell to the
839              genome for the spreadsheet column.</Notes>              genome for the spreadsheet column.</Notes>
840          </Relationship>          </Relationship>
841          <Relationship name="IsRoleOf" from="Role" to="SSCell" arity="1M">          <Relationship name="IsRoleOf" from="Role" to="SSCell" arity="1M">
842                <DisplayInfo caption="Is In\nColumn\nFor" theme="seed" />
843              <Notes>This relationship connects a subsystem's spreadsheet cell to the              <Notes>This relationship connects a subsystem's spreadsheet cell to the
844              role for the spreadsheet row.</Notes>              role for the spreadsheet row.</Notes>
845          </Relationship>          </Relationship>
846          <Relationship name="ContainsFeature" from="SSCell" to="Feature" arity="MM">          <Relationship name="ContainsFeature" from="SSCell" to="Feature" arity="MM">
847                <DisplayInfo caption="Is\nContained\nIn" theme="seed" />
848              <Notes>This relationship connects a subsystem's spreadsheet cell to the              <Notes>This relationship connects a subsystem's spreadsheet cell to the
849              features assigned to it.</Notes>              features assigned to it.</Notes>
850              <Fields>              <Fields>
# Line 740  Line 855 
855              </Fields>              </Fields>
856          </Relationship>          </Relationship>
857          <Relationship name="IsAComponentOf" from="Compound" to="Reaction" arity="MM">          <Relationship name="IsAComponentOf" from="Compound" to="Reaction" arity="MM">
858                <DisplayInfo caption="Involves" theme="web" />
859              <Notes>This relationship connects a reaction to the compounds that participate              <Notes>This relationship connects a reaction to the compounds that participate
860              in it.</Notes>              in it.</Notes>
861              <Fields>              <Fields>
# Line 799  Line 915 
915              </ToIndex>              </ToIndex>
916          </Relationship>          </Relationship>
917          <Relationship name="IsLocatedIn" from="Feature" to="Contig" arity="MM">          <Relationship name="IsLocatedIn" from="Feature" to="Contig" arity="MM">
918                <DisplayInfo caption="Is\nLocation\nOf" theme="nmpdr" />
919              <Notes>This relationship connects a feature to the contig segments that work together              <Notes>This relationship connects a feature to the contig segments that work together
920              to effect it. The segments are numbered sequentially starting from 1. The database is              to effect it. The segments are numbered sequentially starting from 1. The database is
921              required to place an upper limit on the length of each segment. If a segment is longer              required to place an upper limit on the length of each segment. If a segment is longer
922              than the maximum, it can be broken into smaller bits.              than the maximum, it can be broken into smaller bits.  The upper limit enables applications
923              [p]The upper limit enables applications to locate all features that contain a specific              to locate all features that contain a specific residue. For example, if the upper limit
924              residue. For example, if the upper limit is 100 and we are looking for a feature that              is 100 and we are looking for a feature that contains residue 234 of contig *ABC*, we
925              contains residue 234 of contig [b]ABC[/b], we can look for features with a begin point              can look for features with a begin point between 135 and 333. The results can then be
926              between 135 and 333. The results can then be filtered by direction and length of the              filtered by direction and length of the segment.</Notes>
             segment.</Notes>  
927              <Fields>              <Fields>
928                  <Field name="locN" type="int">                  <Field name="locN" type="int">
929                      <Notes>Sequence number of this segment.</Notes>                      <Notes>Sequence number of this segment.</Notes>
# Line 822  Line 938 
938                      is forward and the point after the residue if the direction is backward.</Notes>                      is forward and the point after the residue if the direction is backward.</Notes>
939                  </Field>                  </Field>
940                  <Field name="dir" type="char">                  <Field name="dir" type="char">
941                      <Notes>Direction of the segment: [b]+[/b] if it is forward and                      <Notes>Direction of the segment: =+= if it is forward and
942                      [b]-[/b] if it is backward.</Notes>                      =-= if it is backward.</Notes>
943                  </Field>                  </Field>
944              </Fields>              </Fields>
945              <FromIndex>              <FromIndex>
# Line 856  Line 972 
972              </Fields>              </Fields>
973          </Relationship>          </Relationship>
974          <Relationship name="RoleOccursIn" from="Role" to="Diagram" arity="MM">          <Relationship name="RoleOccursIn" from="Role" to="Diagram" arity="MM">
975                <DisplayInfo caption="Shows" theme="web" />
976              <Notes>This relationship connects a role to the diagrams on which it              <Notes>This relationship connects a role to the diagrams on which it
977              appears. A role frequently identifies an enzyme, and can appear in many              appears. A role frequently identifies an enzyme, and can appear in many
978              diagrams. A diagram generally contains many different roles.</Notes>              diagrams. A diagram generally contains many different roles.</Notes>
979          </Relationship>          </Relationship>
980          <Relationship name="HasSSCell" from="Subsystem" to="SSCell" arity="1M">          <Relationship name="HasSSCell" from="Subsystem" to="SSCell" arity="1M">
981                <DisplayInfo caption="Is Container Of" theme="seed" />
982              <Notes>This relationship connects a subsystem to the spreadsheet cells              <Notes>This relationship connects a subsystem to the spreadsheet cells
983              used to analyze and display it. The cells themselves can be thought of              used to analyze and display it. The cells themselves can be thought of
984              as a grid with Roles on one axis and Genomes on the other. The              as a grid with Roles on one axis and Genomes on the other. The
# Line 872  Line 990 
990              assignment displayed is the most recent one by a user trusted              assignment displayed is the most recent one by a user trusted
991              by the current user. The current user implicitly trusts himself.              by the current user. The current user implicitly trusts himself.
992              If no trusted users are specified in the database, the user              If no trusted users are specified in the database, the user
993              also implicitly trusts the user [b]FIG[/b].</Notes>              also implicitly trusts the user =FIG=.</Notes>
994          </Relationship>          </Relationship>
995          <Relationship name="ConsistsOfRoles" from="RoleSubset" to="Role" arity="MM">          <Relationship name="ConsistsOfRoles" from="RoleSubset" to="Role" arity="MM">
996              <Notes>This relationship connects a role subset to the roles that it covers.              <Notes>This relationship connects a role subset to the roles that it covers.
# Line 900  Line 1018 
1018              subset, so the relationship between genomes and subsystems cannot be              subset, so the relationship between genomes and subsystems cannot be
1019              derived from the relationships going through the subset.</Notes>              derived from the relationships going through the subset.</Notes>
1020          </Relationship>          </Relationship>
         <Relationship name="Catalyzes" from="Role" to="Reaction" arity="MM">  
             <Notes>This relationship connects a role to the reactions it catalyzes.  
             The purpose of a role is to create proteins that trigger certain  
             chemical reactions. A single reaction can be triggered by many roles,  
             and a role can trigger many reactions.</Notes>  
         </Relationship>  
1021          <Relationship name="HasRoleInSubsystem" from="Feature" to="Subsystem" arity="MM">          <Relationship name="HasRoleInSubsystem" from="Feature" to="Subsystem" arity="MM">
1022              <Notes>This relationship connects a feature to the subsystems in which it              <Notes>This relationship connects a feature to the subsystems in which it
1023              participates. This is technically redundant information, but it is used              participates. This is technically redundant information, but it is used
1024              so often that it deserves its own table.</Notes>              so often that it gets its own table for performance reasons.</Notes>
1025              <Fields>              <Fields>
1026                  <Field name="genome" type="name-string">                  <Field name="genome" type="name-string">
1027                      <Notes>ID of the genome containing the feature</Notes>                      <Notes>ID of the genome containing the feature</Notes>
# Line 928  Line 1040 
1040              </ToIndex>              </ToIndex>
1041          </Relationship>          </Relationship>
1042      </Relationships>      </Relationships>
1043        <Shapes>
1044            <Shape type="oval" name="Pins">
1045                <DisplayInfo theme="nmpdr" col="1" row="4.5" fixed="1" />
1046                <Notes>The Pin Server provides information about functional couplings between features.</Notes>
1047            </Shape>
1048            <Shape type="oval" name="Sims">
1049                <DisplayInfo theme="nmpdr" col="1.5" row="5" fixed="1" />
1050                <Notes>The Similarity Server contains a high-performance custom database of similarities between features.</Notes>
1051            </Shape>
1052            <Shape type="oval" name="BBHs">
1053                <DisplayInfo theme="nmpdr" col="2" row="5.5" fixed="1" />
1054                <Notes>For each feature, the BBH Server has that feature's bidirectional best hits in other genomes.</Notes>
1055            </Shape>
1056            <Shape type="arrow" name="WebServices" from="Sims" to="Feature">
1057                <DisplayInfo caption=" " theme="nmpdr" col="2.5" row="4" />
1058            </Shape>
1059        </Shapes>
1060  </Database>  </Database>
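
Note on the IsLocatedIn window search: the Notes for IsLocatedIn describe finding every feature that contains a given residue by exploiting the upper limit on segment length. The sketch below works through that arithmetic for the example in the Notes (limit 100, residue 234, giving begin points between 135 and 333). It is an illustration only, not code from Sprout. The function names and the `beg` and `length` parameters are assumptions, because the corresponding field names are not visible in this diff; only `locN` and `dir` are. It also assumes that a backward segment extends leftward from its begin point.

```python
def begin_window(residue, max_len):
    """Range of begin points a segment must have to possibly contain `residue`.

    A forward segment of at most max_len residues must begin no earlier than
    residue - max_len + 1; a backward segment must begin no later than
    residue + max_len - 1.
    """
    return residue - max_len + 1, residue + max_len - 1


def segment_contains(beg, length, direction, residue):
    """Filter step: does this particular segment really cover `residue`?

    `direction` is "+" for forward and "-" for backward, matching the
    values documented for the `dir` field.
    """
    if direction == "+":
        # Forward segments cover beg .. beg + length - 1.
        return beg <= residue <= beg + length - 1
    # Backward segments (assumed) cover beg - length + 1 .. beg.
    return beg - length + 1 <= residue <= beg


# Candidate begin points for the example in the Notes.
low, high = begin_window(234, 100)
```

The window alone over-selects: a short forward segment beginning at 135 falls inside the window but ends before residue 234, which is why the Notes say the results are then filtered by direction and length.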

Legend:
Removed from v.1.49  
changed lines
  Added in v.1.56
