Diff of /Sprout/SaplingDBD.xml

revision 1.1, Tue Jul 8 08:56:54 2008 UTC revision 1.2, Wed Sep 3 20:57:52 2008 UTC
# Line 1  Line 1 
 <?xml version="1.0" encoding="utf-8" ?>  
1  <Database>  <Database>
2    <Title>Sapling Bioinformatics Database</Title>    <Title>Sapling Bioinformatics Database</Title>
3    <Notes>The Sapling database is a distributable, self-contained copy of the NMPDR data.    <Notes>The Sapling database is a distributable, self-contained copy of the NMPDR data.
# Line 7  Line 6 
6    <Issues>    <Issues>
7      <Issue>Must add the new "image" data type to ERDB.</Issue>      <Issue>Must add the new "image" data type to ERDB.</Issue>
8      <Issue>Must add the new "dna" data type to ERDB.</Issue>      <Issue>Must add the new "dna" data type to ERDB.</Issue>
     <Issue>Diagrammer should be able to read real DBDs.</Issue>  
     <Issue>Diagrammer should allow editing the DBD.</Issue>  
9      <Issue>Must add back the ability to index a secondary relation. Note that      <Issue>Must add back the ability to index a secondary relation. Note that
10              such indexes can only have a single field.</Issue>              such indexes can only have a single field.</Issue>
11      <Issue>We probably need some type tables that describe things like Identifier(source)      <Issue>We probably need some type tables that describe things like Identifier(source)
12              or Family(kind).</Issue>              or Family(kind).</Issue>
     <Issue>I'm operating on the assumption that this database will eventually grow into a  
             successor for Sprout, hence the name "Sapling". If I'm wrong, then it should be  
             renamed "Root".</Issue>  
13      <Issue>The ERDB documentation needs to be updated to include DisplayInfo, Asides,      <Issue>The ERDB documentation needs to be updated to include DisplayInfo, Asides,
14              the "converse" attribute for relationships, and the Shapes section.</Issue>              the "converse" attribute for relationships, and the Shapes section.</Issue>
15        <Issue>Similarities and pairings are not hooked in correctly.</Issue>
16    </Issues>    </Issues>
17    <Entities>    <Entities>
     <Entity name="Scenario" keyType="string">  
       <DisplayInfo theme="web" col="5" row="1"/>  
       <Notes>A scenario is used to verify the validity of subsystem assignments. Each  
             scenario converts input compounds to output compounds using reactions.  
             The scenario may use all of the reactions controlled by a subsystem or only  
             some, and may also incorporate additional reactions.</Notes>  
     </Entity>  
18      <Entity name="Compound" keyType="name-string">      <Entity name="Compound" keyType="name-string">
19        <DisplayInfo theme="web" col="1" row="3"/>        <DisplayInfo theme="web" col="3" row="1"/>
20        <Notes>A compound is a chemical that participates in a reaction.        <Notes>A compound is a chemical that participates in a reaction.
21              All compounds have a unique ID and may also have one or more names. Both              All compounds have a unique ID and may also have one or more names. Both
22              ligands and reaction components are treated as compounds.</Notes>              ligands and reaction components are treated as compounds.</Notes>
# Line 73  Line 61 
61        </Indexes>        </Indexes>
62      </Entity>      </Entity>
63      <Entity name="Diagram" keyType="name-string">      <Entity name="Diagram" keyType="name-string">
64        <DisplayInfo theme="web" col="3" row="3"/>        <DisplayInfo theme="web" col="5" row="3"/>
65        <Notes>A functional diagram describes a network of chemical reactions, often comprising a single        <Notes>A functional diagram describes a network of chemical reactions, often comprising a single
66              subsystem. A diagram is identified by a short name and contains a longer descriptive name.</Notes>              subsystem. A diagram is identified by a short name and contains a longer descriptive name.</Notes>
67        <Fields>        <Fields>
# Line 86  Line 74 
74        </Fields>        </Fields>
75      </Entity>      </Entity>
76      <Entity name="Reaction" keyType="key-string">      <Entity name="Reaction" keyType="key-string">
77        <DisplayInfo theme="web" col="5" row="3"/>        <DisplayInfo theme="web" col="3" row="3"/>
78        <Notes>A reaction is a chemical process that converts one set of compounds (substrate)        <Notes>A reaction is a chemical process that converts one set of compounds (substrate)
79              to another set (products). The reaction ID is generally a small number preceded by a              to another set (products). The reaction ID is generally a small number preceded by a
80              letter.</Notes>              letter.</Notes>
# Line 143  Line 131 
131        </Indexes>        </Indexes>
132      </Entity>      </Entity>
133      <Entity name="Publication" keyType="hash-string">      <Entity name="Publication" keyType="hash-string">
134        <DisplayInfo theme="web" col="1" row="7"/>        <DisplayInfo theme="web" col="1" row="8"/>
135        <Notes>A _publication_ is an article or citation that may be used as evidence for        <Notes>A _publication_ is an article or citation that may be used as evidence for
136              assertions made in the database. The key is a hash code computed from the URL.</Notes>              assertions made in the database. The key is a hash code computed from the URL.</Notes>
137        <Fields>        <Fields>
# Line 163  Line 151 
151          </Index>          </Index>
152        </Indexes>        </Indexes>
153      </Entity>      </Entity>
154      <Entity name="EC" keyType="key-string">      <Entity name="Variant" keyType="name-string">
       <DisplayInfo theme="web" col="3" row="5"/>  
       <Notes>An EC number is a code number associated with one or more particular roles.  
             EC numbers are a useful tool for identifying corresponding roles in different  
             databases.</Notes>  
     </Entity>  
     <Entity name="Role" keyType="string">  
       <DisplayInfo theme="web" col="5" row="5"/>  
       <Notes>A role describes a biological function that may be fulfilled by a feature.  
             One of the main goals of the database is to assign features to roles. Most  
             roles are effected by the construction of proteins. Some, however, deal with  
             functional regulation and message transmission</Notes>  
       <Fields>  
         <Field name="hypothetical" type="boolean">  
           <Notes>TRUE if a role is hypothetical, else FALSE</Notes>  
         </Field>  
       </Fields>  
     </Entity>  
     <Entity name="Variant" keyType="hash-string">  
155        <DisplayInfo theme="seed" col="7" row="5"/>        <DisplayInfo theme="seed" col="7" row="5"/>
156        <Notes>A variant is a functional subset of a subsystem. It indicates the particular        <Notes>A variant is a functional subset of a subsystem. It indicates the particular
157              sequence of roles used to implement a metabolic pathway. Variants are abstract              sequence of roles used to implement a metabolic pathway. Variants are abstract
158              concepts used to classify machines. The key of the variant is the subsystem ID followed              concepts used to classify machines. The key of the variant is the subsystem ID followed
159              by the variant code (usually a numeric string with zero or more decimal points).</Notes>              by the variant code (usually a numeric string with zero or more decimal points).</Notes>
160      </Entity>        <Fields>
161      <Entity name="Structure" keyType="string">          <Field name="role-rule" type="text">
162        <DisplayInfo theme="web" col="1" row="5"/>            <Notes>Boolean expression (encoded as text) that describes the roles in this variant.
163        <Notes>A structure represents a portion of a protein's surface. Structures are used            The roles themselves are represented by their IDs.</Notes>
164              to assist in understanding which reactions a protein catalyzes and why. The key of a          </Field>
165              structure is its type followed by an ID. The current types are PDB and CDD, though        </Fields>
             additional types may be added at a later date.</Notes>  
166      </Entity>      </Entity>
167      <Entity name="ProteinSequence" keyType="hash-string">      <Entity name="ProteinSequence" keyType="hash-string">
168        <DisplayInfo theme="web" col="3" row="7" caption="Protein Sequence"/>        <DisplayInfo theme="web" col="3" row="7" caption="Protein Sequence"/>
# Line 242  Line 211 
211          </Field>          </Field>
212        </Fields>        </Fields>
213      </Entity>      </Entity>
214      <Entity name="Feature" keyType="id-string">      <Entity name="Family" keyType="name-string">
215        <DisplayInfo theme="seed" col="5" row="9"/>        <DisplayInfo theme="seed" col="4" row="11"/>
216        <Notes>A feature (sometimes also called a gene) is a part of a genome that is of special        <Notes>A family is a group of features united by a particular determination algorithm.
217              interest. Features may be spread across multiple DNA sequences (contigs) of a genome, but               The algorithm will frequently-- but not always-- signify a functional role.</Notes>
             never across more than one genome. Each feature in the database has a unique FIG ID.</Notes>  
       <Fields>  
         <Field name="feature-type" type="id-string">  
           <Notes>Code indicating the type of this feature. Among the codes currently  
                     supported are "peg" for a protein encoding gene, "bs" for a  
                     binding site, "opr" for an operon, and so forth.</Notes>  
         </Field>  
         <Field name="link" type="text" relation="FeatureLink">  
           <Notes>Web hyperlink for this feature. A feature can have no hyperlinks or it can have many. The  
                     links are to other websites that have useful information about the gene that the feature represents, and  
                     are coded as raw HTML, using an anchor href tag.</Notes>  
         </Field>  
         <Field name="essential" type="text" relation="FeatureEssential" special="property_search">  
           <Notes>A value indicating the essentiality of the feature, coded as HTML. In most  
                     cases, this will be a word describing whether the essentiality is confirmed (essential)  
                     or potential (potential-essential), hyperlinked to the document from which the  
                     essentiality was curated. If a feature is not essential, this field will have no  
                     values; otherwise, it may have multiple values.</Notes>  
         </Field>  
         <Field name="virulent" type="text" relation="FeatureVirulent" special="property_search">  
           <Notes>A value indicating the virulence of the feature, coded as HTML. In most  
                     cases, this will be a phrase or SA number hyperlinked to the document from which  
                     the virulence information was curated. If the feature is not virulent, this field  
                     will have no values; otherwise, it may have multiple values.</Notes>  
         </Field>  
         <Field name="sequence-length" type="counter">  
           <Notes>Number of base pairs in this feature.</Notes>  
         </Field>  
       </Fields>  
218      </Entity>      </Entity>
219      <Entity name="Machine" keyType="key-string">      <Entity name="MolecularMachine" keyType="key-string">
220        <DisplayInfo theme="seed" col="7" row="7"/>        <DisplayInfo theme="seed" col="7" row="7" caption="Molecular\nMachine"/>
221        <Notes>A machine is a collection of features that implements a metabolic pathway. Machines        <Notes>A molecular machine is a collection of features that implements a metabolic pathway. Machines
222              are the physical instances of variants. Each machine corresponds to a row in a subsystem              are the physical instances of variants. Each machine corresponds to a row in a subsystem
223              spreadsheet. The key is the variant key followed by a colon and the Genome ID.</Notes>              spreadsheet. The key is the variant key followed by a colon and the Genome ID.</Notes>
224        <Fields>        <Fields>
# Line 290  Line 230 
230          </Field>          </Field>
231        </Fields>        </Fields>
232      </Entity>      </Entity>
233      <Entity name="Identifier" keyType="string">      <Entity name="Scenario" keyType="string">
234        <DisplayInfo theme="seed" col="4" row="10"/>        <DisplayInfo theme="web" col="5" row="1"/>
235        <Notes>An identifier is an alternate name for a feature.</Notes>        <Notes>A scenario is a partial instance of a subsystem with a defined set of
236        <Fields>    reactions. Each scenario converts input compounds to output compounds using reactions.
237          <Field name="source" type="key-string">               The scenario may use all of the reactions controlled by a subsystem or only
238            <Notes>Specific type of the identifier, such as its source database or category.               some, and may also incorporate additional reactions.</Notes>
                     The type can usually be decoded to convert the identifier to a URL.</Notes>  
         </Field>  
       </Fields>  
       <Indexes>  
         <Index>  
           <Notes>This index allows all the identifiers of a specified type to be located.</Notes>  
           <IndexFields>  
             <IndexField name="source" order="ascending"/>  
           </IndexFields>  
         </Index>  
       </Indexes>  
     </Entity>  
     <Entity name="Assignment" keyType="hash-string">  
       <DisplayInfo col="5" row="7" theme="seed"/>  
       <Notes>An assignment connects a feature to its putative role. The key of the  
             assignment is the feature ID followed by a timestamp.</Notes>  
     </Entity>  
     <Entity name="EvidenceClass" keyType="name-string">  
       <DisplayInfo col="6" row="9" theme="seed"/>  
       <Notes>An evidence class describes a general type of evidence code. An actual evidence  
             code consists of its class (e.g. "dlit", "ff") and an optional modifier. The modifier  
             is contained in the relationship between the class and the target assignment.</Notes>  
       <Fields>  
         <Field name="format" type="string">  
           <Notes>The format string is an example showing how the modifier portion of the  
                     evidence code is formatted. It may contain HTML markup.</Notes>  
         </Field>  
         <Field name="short-description" type="string">  
           <Notes>The short description is a brief noun phrase explanation of the  
                     evidence class.</Notes>  
         </Field>  
         <Field name="description" type="text">  
           <Notes>The description is a long text description of the evidence class and its  
                     format string.</Notes>  
         </Field>  
       </Fields>  
239      </Entity>      </Entity>
240      <Entity name="Family" keyType="name-string">      <Entity name="Pairing" keyType="name-string">
241        <DisplayInfo theme="seed" col="5" row="11"/>        <DisplayInfo theme="seed" col="5" row="11"/>
242        <Notes>A family is a group of features united by a particular determination algorithm.        <Notes>A pairing indicates that two protein sequences are found close together on one or
243              The algorithm will frequently-- but not always-- signify a functional role.</Notes>               more DNA sequences. Not all possible pairings are stored in the database; only those that
244                 are considered for some reason to be significant for annotation purposes. The key of the pairing is the
245                 concatenation of the protein sequence keys in alphabetical order.</Notes>
246          <Asides>Because the protein sequence key is a hash of the sequence letters, the key of a pairing between two
247                  sequences is computable from the sequences themselves. Theoretically, the pairing
248                  is unordered: (A,B) and (B,A) are the same pairing. It is frequently the case,
249                  however, that we need to refer to the "first" or "second" protein in the pairing.
250                  When this happens, the first one is always the protein with the alphabetically
251                  lesser key. The IsInPair relationship automatically shows the proteins in this
252                  order.</Asides>
253      </Entity>      </Entity>
254      <Entity name="Genome" keyType="name-string">      <Entity name="Genome" keyType="name-string">
255        <DisplayInfo theme="nmpdr" col="7" row="9" caption="Genome Organism"/>        <DisplayInfo theme="nmpdr" col="7" row="9" caption="Genome Organism"/>
256        <Notes>Genome objects are organized in a hierarchy. At the bottom are the true genomes and        <Notes>A genome represents a specific organism with DNA, or a specific meta-genome. All DNA
257              meta-genomes that connect to the rest of the database. Above them are a hierarchy  sequences in the database belong to genomes.</Notes>
             based on taxonomic classification.</Notes>  
258        <Fields>        <Fields>
259          <Field name="full-name" type="name-string">          <Field name="full-name" type="name-string">
260            <Notes>Full name of the genome. This is either the taxonomic classification name            <Notes>Full genus/species/strain name of the genome.</Notes>
                     or a genus/species/strain name.</Notes>  
         </Field>  
         <Field name="level" type="int">  
           <Notes>Taxonomic classification level. A level of 0 indicates that this is  
                     a specific strain with DNA attached. Higher levels indicate progressively  
                     larger classifications. Each level number represents a specific type of  
                     classification. Sub-species is always 1, species is always 2, genus is always  
                     3, and so forth, up to 99 for domain. This means that as you travel up the  
                     taxonomy tree, the ranks will be non-sequential.</Notes>  
261          </Field>          </Field>
262          <Field name="domain" type="name-string">          <Field name="domain" type="name-string">
263            <Notes>Domain for this genome or taxonomic classification. The domain is            <Notes>Domain for this genome or taxonomic classification. The domain is
# Line 396  Line 299 
299          </Index>          </Index>
300          <Index>          <Index>
301            <Notes>This index allows the applications to find all genomes in lexical            <Notes>This index allows the applications to find all genomes in lexical
302                      order by name. Organisms will show up first, alphabetical by species and                   order by name.</Notes>
                     strain name, followed by the various taxonomic classifications grouped by  
                     increasing inclusivity. (In other words,</Notes>  
303            <IndexFields>            <IndexFields>
             <IndexField name="level" order="ascending"/>  
304              <IndexField name="full-name" order="ascending"/>              <IndexField name="full-name" order="ascending"/>
305            </IndexFields>            </IndexFields>
306          </Index>          </Index>
307        </Indexes>        </Indexes>
308      </Entity>      </Entity>
309      <Entity name="Pairing" keyType="name-string">      <Entity name="Feature" keyType="id-string">
310        <DisplayInfo theme="seed" col="3" row="9"/>        <DisplayInfo theme="seed" col="5" row="9"/>
311        <Notes>A pairing indicates that two protein sequences are found close together on one or        <Notes>A feature (sometimes also called a gene) is a part of a genome that is of special
312              more DNA sequences. Not all possible pairings are stored in the database; only those that               interest. Features may be spread across multiple DNA sequences (contigs) of a genome, but
313              are considered for some reason to be significant for annotation purposes. The pairing               never across more than one genome. Each feature in the database has a unique FIG ID.</Notes>
             includes a score that indicates how many of the DNA sequences are significantly  
             dissimilar. A higher score indicates a stronger pairing. The key of the pairing is the  
             concatenation of the protein sequence keys in alphabetical order.</Notes>  
       <Asides>Because the protein sequence key is a hash of the sequence letters, the key of a pairing between two  
             sequences is computable from the sequences themselves. Theoretically, the pairing  
             is unordered: (A,B) and (B,A) are the same pairing. It is frequently the case,  
             however, that we need to refer to the "first" or "second" protein in the pairing.  
             When this happens, the first one is always the protein with the alphabetically  
             lesser key. The IsInPair relationship automatically shows the proteins in this  
             order.</Asides>  
314        <Fields>        <Fields>
315          <Field name="score" type="int">          <Field name="feature-type" type="id-string">
316            <Notes>Coupling score for this pairing. A higher score indicates a stronger            <Notes>Code indicating the type of this feature. Among the codes currently
317                      coupling.</Notes>                   supported are "peg" for a protein encoding gene, "bs" for a
318                     binding site, "opr" for an operon, and so forth.</Notes>
319            </Field>
320            <Field name="link" type="text" relation="FeatureLink">
321              <Notes>Web hyperlink for this feature. A feature can have no hyperlinks or it can have many. The
322                     links are to other websites that have useful information about the gene that the feature represents, and
323                     are coded as raw HTML, using an anchor href tag.</Notes>
324            </Field>
325            <Field name="essential" type="text" relation="FeatureEssential" special="property_search">
326              <Notes>A value indicating the essentiality of the feature, coded as HTML. In most
327                     cases, this will be a word describing whether the essentiality is confirmed (essential)
328                     or potential (potential-essential), hyperlinked to the document from which the
329                     essentiality was curated. If a feature is not essential, this field will have no
330                     values; otherwise, it may have multiple values.</Notes>
331            </Field>
332            <Field name="virulent" type="text" relation="FeatureVirulent" special="property_search">
333              <Notes>A value indicating the virulence of the feature, coded as HTML. In most
334                     cases, this will be a phrase or SA number hyperlinked to the document from which
335                     the virulence information was curated. If the feature is not virulent, this field
336                     will have no values; otherwise, it may have multiple values.</Notes>
337            </Field>
338            <Field name="sequence-length" type="counter">
339              <Notes>Number of base pairs in this feature.</Notes>
340            </Field>
341            <Field name="evidence-code" type="string" relation="FeatureEvidence">
342              <Notes>An evidence code describes the possible evidence that exists
343          for deciding a feature's functional assignment. A feature may have no evidence,
344          a single evidence code, or several.</Notes>
345            </Field>
346            <Field name="function" type="text">
347              <Notes>Functional assignment for this feature. This will often indicate
348    the feature's functional role or roles, and may also have comments.</Notes>
349              <Asides>It will frequently be the case that a feature is assigned to a single
350    role, and it is identical to the function. In some cases, a feature will have
351    multiple roles, and all of them will be listed in the function field. In addition,
352    the function may have comment text at the end.</Asides>
353          </Field>          </Field>
354        </Fields>        </Fields>
355      </Entity>      </Entity>
356      <Entity name="EvidenceSet" keyType="int">      <Entity name="Annotation" keyType="string">
357        <DisplayInfo theme="seed" col="3" row="11" caption="Evidence Set"/>        <DisplayInfo col="3" row="11" theme="seed"/>
358        <Notes>An evidence set indicates evidence for a functional connection between protein        <Notes>An annotation is a comment attached to a feature. Annotations are used to
359              sequence pairs. The protein sequences possessing the connection are the ones that  track the history of a feature's functional assignments and any related issues. The
360              participate in the evidence set's pairings.</Notes>  key is the feature ID followed by a colon and a complemented eight-digit sequence number.</Notes>
361        <Asides>The pairings for a particular evidence set        <Asides>The complemented sequence number causes the annotations to sort with the most recent one
362              will contain protein sequences that are significantly similar. In other words, if  first.</Asides>
             (A,B) and (X,Y) are both pairings in a single evidence set, then (A =~ X) and  
             (B =~ Y) or (A =~ Y) and (B =~ X).</Asides>  
363        <Fields>        <Fields>
364          <Field name="score" type="int">          <Field name="annotator" type="string">
365            <Notes>Score for this evidence set. The score indicates the number of            <Notes>Name of the annotator who made the comment.</Notes>
366                      significantly different genomes represented by the pairings.</Notes>          </Field>
367            <Field name="comment" type="text">
368              <Notes>Text of the annotation.</Notes>
369            </Field>
370            <Field name="annotation-time" type="date">
371              <Notes>Date and time at which the annotation was made.</Notes>
372            </Field>
373          </Fields>
374        </Entity>
375        <Entity name="Role" keyType="hash-string">
376          <DisplayInfo theme="web" col="5" row="5"/>
377          <Notes>A role describes a biological function that may be fulfilled by a feature.
378                 One of the main goals of the database is to assign features to roles. Most
379                 roles are effected by the construction of proteins. Some, however, deal with
380                 functional regulation and message transmission.</Notes>
381          <Asides>A role represents a single gene function. Many roles are in
382    subsystems, but some are not. If a feature has multiple functions, each
383    is represented as a separate role.</Asides>
384          <Fields>
385            <Field name="hypothetical" type="boolean">
386              <Notes>TRUE if a role is hypothetical, else FALSE</Notes>
387            </Field>
388            <Field name="name" type="string">
389              <Notes>English name of this role. The actual role ID is computed from this field.</Notes>
390          </Field>          </Field>
391        </Fields>        </Fields>
392      </Entity>      </Entity>
393        <Entity name="RoleSet" keyType="int">
394          <DisplayInfo theme="web" col="3" row="5" caption="Role Set"/>
395          <Notes>A role set is a group of roles that work together to stimulate a reaction. Most role sets consist of a single
396    role; however, some reactions require the presence of multiple roles to get them started.</Notes>
397          <Asides>A reaction is usually triggered by a single role, but some reactions are triggered
398    by a boolean combination of roles (e.g. =(A and (B or C) and D) or (E and B and F) or G=). The boolean
399    expression can be converted into disjunctive normal form, which is a list of alternative sets
400     (e.g. =(A and B and D) or (A and C and D) or (E and B and F) or G=). Each alternative is then converted
401    into a role set. This allows us to precisely represent the triggering conditions of a reaction in the database.</Asides>
402        </Entity>
403      <Entity name="DnaSequence" keyType="name-string">      <Entity name="DnaSequence" keyType="name-string">
404        <DisplayInfo theme="nmpdr" col="7" row="11" caption="DNA Sequence"/>        <DisplayInfo theme="nmpdr" col="7" row="11" caption="DNA Sequence"/>
405        <Notes>A DNA sequence (sometimes called a "contig") is a contiguous sequence of base pairs        <Notes>A DNA sequence (sometimes called a "contig") is a contiguous sequence of base pairs
# Line 458  Line 414 
414          </Field>          </Field>
415        </Fields>        </Fields>
416      </Entity>      </Entity>
417    </Entities>      <Entity name="TaxonomicGrouping" keyType="string">
418    <Relationships>        <DisplayInfo row="10" col="8" caption="Taxonomic\nGrouping" theme="nmpdr"/>
419      <Relationship name="IsTargetOf" from="Role" to="Assignment" arity="1M" converse="Targets">        <Notes>A taxonomic grouping is a segment of the classification for an organism.
420        <DisplayInfo theme="seed" caption="Is\nTarget\nOf"/>    Taxonomic groupings are organized into a strict hierarchy by the IsClassOf
421        <Notes>This relationship connects an assignment to the target role. A role has  relationship.</Notes>
             many assignments, but an assignment targets exactly one role.</Notes>  
     </Relationship>  
     <Relationship name="IsAnnotatedBy" from="Feature" to="Assignment" arity="1M" converse="Annotates">  
       <DisplayInfo theme="seed" caption="Annotates"/>  
       <Notes>This relationship connects a feature to the assignments that annotate it.  
             A feature may have several assignments, but an assignment annotates exactly one  
             feature.</Notes>  
422        <Fields>        <Fields>
423          <Field name="time-stamp" type="date">          <Field name="level" type="int">
424            <Notes>Time at which the assignment was made.</Notes>            <Notes>Taxonomic classification level. A level of 0 indicates that this is
425          </Field>                      a specific strain with DNA attached. Higher levels indicate progressively
426          <Field name="annotator" type="string">                      larger classifications. Each level number represents a specific type of
427            <Notes>Name of the annotator who made the assignment.</Notes>                      classification. Sub-species is always 1, species is always 2, genus is always
428          </Field>                      3, and so forth, up to 99 for domain. This means that as you travel up the
429          <Field name="active" type="boolean">                      taxonomy tree, the ranks will be non-sequential.</Notes>
           <Notes>TRUE if this assignment is active; FALSE if it has been  
                     superseded.</Notes>  
430          </Field>          </Field>
431        </Fields>        </Fields>
432        <FromIndex>        <Indexes>
433          <Notes>This index presents the assignments in order from the most          <Index>
434                  recent to the least recent, with active assignments first.</Notes>            <Notes>This index allows the applications to find all groupings by level.
435                     Lower (less inclusive) levels will occur first.</Notes>
436          <IndexFields>          <IndexFields>
437            <IndexField name="active" order="descending"/>              <IndexField name="level" order="ascending"/>
           <IndexField name="time-stamp" order="descending"/>  
438          </IndexFields>          </IndexFields>
439        </FromIndex>          </Index>
440      </Relationship>        </Indexes>
441      <Relationship name="IsEvidencedBy" from="Assignment" to="EvidenceClass" arity="MM" converse="IsEvidenceFor">      </Entity>
442        <DisplayInfo theme="seed" caption="Is\nEvidenced\nBy" fixed="1" col="6" row="8" />      <Entity name="Structure" keyType="name-string">
443        <Notes>This relationship contains the evidence for an assignment. An assignment will        <DisplayInfo theme="web" col="2" row="5"/>
444              have one or more evidence codes, and each evidence class will justify an enormous        <Notes>A structure is the geometrical representation of a protein sequence. A single protein sequence may
445              number of assignments. The intersection data contains details about the evidence.</Notes>    have multiple structural representations, either because it is folded in different ways or because there
446        <Fields>    are alternative representation formats. The key field is the representation type (e.g. PDB, SCOPE)
447          <Field name="modifier" type="string">    followed by the ID, with an intervening vertical bar.</Notes>
448            <Notes>A modifier for the evidence class. The modifier is concatenated to the      </Entity>
449                      class to form the complete evidence code. Frequently, the modifier will be the      <Entity name="FcEvidenceSet" keyType="int">
450                      ID of a family, subsystem, or evidence set.</Notes>        <DisplayInfo theme="seed" col="5" row="13" caption="Functional Coupling Evidence Set"/>
451          <Notes>A functional coupling evidence set indicates evidence for a functional connection between protein
452                 sequence pairs. The protein sequences possessing the connection are the ones that
453                 participate in the evidence set's pairings.</Notes>
454          <Asides>The pairings for a particular evidence set
455                  will contain protein sequences that are significantly similar. In other words, if
456                  (A,B) and (X,Y) are both pairings in a single evidence set, then (A =~ X) and
457                  (B =~ Y) or (A =~ Y) and (B =~ X), depending on the value of the "inverted" attribute of
458    the IsDeterminedBy relationship. Essentially, a pairing in its own right is unordered.
459    If (A,B) is a pair, then so is (B,A). However, the evidence set maintains a correspondence
460    between its pairs that _is_ ordered, because the constituent pairs must match. The
461    direction in which a pair matches others in the set is an attribute of the relationship from the pairs
462    to the sets.</Asides>
463          <Fields>
464            <Field name="score" type="int">
465              <Notes>Score for this evidence set. The score indicates the number of
466                     significantly different genomes represented by the pairings.</Notes>
467          </Field>          </Field>
468        </Fields>        </Fields>
469      </Relationship>      </Entity>
470        <Entity name="MachineRole" keyType="name-string">
471          <DisplayInfo row="7" col="5" caption="Machine Role" theme="seed"/>
472          <Notes>A machine role represents a role as it occurs in a molecular machine. The key
473          is the machine key plus the role abbreviation.</Notes>
474          <Asides>The machine role corresponds to a cell on the subsystem spreadsheet. Features
475          in the subsystem are assigned directly to the machine role.</Asides>
476        </Entity>
477        <Entity name="IdentifierSet" keyType="name-string">
478          <DisplayInfo row="9" col="1" theme="seed"/>
479          <Notes>The identifier set is a group of identifiers that mean the same thing, usually either a Feature
480      or a Protein Sequence. The identifiers in a set will frequently belong to different genomic databases.
481      Thus, if a specific protein sequence has one name in the NMPDR and another name in RefSeq, both of
482      the names would be in the same identifier set.</Notes>
483        </Entity>
484        <Entity name="Identifier" keyType="string">
485          <DisplayInfo theme="seed" col="3" row="9"/>
486          <Notes>An identifier is an alternate name for a feature or protein sequence.</Notes>
487          <Asides>Some identifiers name features or protein sequences that do not exist in the database. In this case,
488      the feature or protein sequence is considered _external_; that is, it belongs to another database.</Asides>
489          <Fields>
490            <Field name="source" type="key-string">
491              <Notes>Specific type of the identifier, such as its source database or category.
492                     The type can usually be decoded to convert the identifier to a URL.</Notes>
493            </Field>
494          </Fields>
495          <Indexes>
496            <Index>
497              <Notes>This index allows all the identifiers of a specified type to be located.</Notes>
498              <IndexFields>
499                <IndexField name="source" order="ascending"/>
500              </IndexFields>
501            </Index>
502          </Indexes>
503        </Entity>
504      </Entities>
505      <Relationships>
506      <Relationship name="IsTerminusFor" from="Compound" to="Scenario" arity="MM" converse="HasAsTerminus">      <Relationship name="IsTerminusFor" from="Compound" to="Scenario" arity="MM" converse="HasAsTerminus">
507        <DisplayInfo caption="Has As\nTerminus"/>        <DisplayInfo caption="Has As\nTerminus"/>
508        <Notes>A terminus for a scenario is a compound that acts as its input or output. A compound        <Notes>A terminus for a scenario is a compound that acts as its input or output. A compound
# Line 525  Line 524 
524          </IndexFields>          </IndexFields>
525        </ToIndex>        </ToIndex>
526      </Relationship>      </Relationship>
527      <Relationship name="HasAlias" from="Feature" to="Identifier" arity="MM" converse="IsAliasOf">      <Relationship name="IsRelevantFor" from="Diagram" to="Subsystem" arity="MM" converse="IsRelevantTo">
528        <DisplayInfo theme="seed" fixed="1" col="4" row="9" caption="Has Alias"/>        <DisplayInfo theme="seed" caption="Is\nRelevant\nFor"/>
529        <Notes>An identifier is an alias for multiple features. A feature may have multiple alias        <Notes>This relationship connects each subsystem to the diagrams that are useful in curating
530              identifiers.</Notes>        and understanding the subsystem. A subsystem may overlap many diagrams, but only those considered
531      </Relationship>        crucial are connected via this relationship. The relationship is many-to-many.</Notes>
     <Relationship name="Justifies" from="EvidenceSet" to="Family" arity="MM" converse="IsJustifiedBy">  
       <DisplayInfo theme="seed" caption="Is\nJustified\nBy"/>  
       <Notes>A family may use multiple sets as evidence. In general, an evidence set will  
             justify two families-- one for each side of the pairing.</Notes>  
     </Relationship>  
     <Relationship name="IsDeterminedBy" from="EvidenceSet" to="Pairing" arity="MM" converse="Determines">  
       <DisplayInfo theme="seed" caption="Determines"/>  
       <Notes>An evidence set exists because it has pairings in it, and this relationship  
             connects the evidence set to its constituent pairings. A pairing can belong to  
             multiple evidence sets.</Notes>  
       <Fields>  
         <Field name="inverted" type="boolean">  
           <Notes>A pairing is an unordered pair of protein sequences, but its  
                     similarity to other pairings in an evidence set is ordered. Let (A,B) be  
                     a pairing and (X,Y) be another pairing in the same set. If this flag is  
                     FALSE, then (A =~ X) and (B =~ Y). If this flag is TRUE, then (A =~ Y) and  
                     (B =~ X).</Notes>  
         </Field>  
       </Fields>  
     </Relationship>  
     <Relationship name="IsInPair" from="ProteinSequence" to="Pairing" arity="MM" converse="Contains">  
       <DisplayInfo theme="seed" caption="Is In\nPair"/>  
       <Notes>A pairing contains exactly two protein sequences. A protein sequence can  
             belong to multiple pairings. When going from a protein sequence to its pairings,  
             they are presented in alphabetical order by sequence key.</Notes>  
     </Relationship>  
     <Relationship name="HasMember" from="Family" to="Feature" arity="1M" converse="IsMemberOf">  
       <DisplayInfo theme="seed" caption="Is\nMember\nOf" row="11.5" col="5"/>  
       <Notes>This relationship connects each feature family to its constituent  
             features. A family always has many features, but a single feature can  
             be found in at most one family.</Notes>  
     </Relationship>  
     <Relationship name="IsClassOf" from="Genome" to="Genome" arity="1M" converse="IsClassifiedAs">  
       <DisplayInfo theme="nmpdr" col="8" row="9" fixed="1" caption="Is\nClass\nOf"/>  
       <Notes>The recursive IsClassOf relationship organizes Genomes into a hierarchy  
             based on the standard taxonomy. Only genomes at the bottom of the hierarchy have  
             actual DNA attached.</Notes>  
     </Relationship>  
     <Relationship name="ConsistsOf" from="Variant" to="Role" arity="MM">  
       <DisplayInfo theme="seed" connected="1" caption="Belongs To"/>  
       <Notes>A variant is essentially a sequence of roles. Roles can belong to many  
             variants. Some roles will not belong to any variants.</Notes>  
     </Relationship>  
     <Relationship name="Contains" from="Diagram" to="Compound" arity="MM" converse="IsContainedIn">  
       <DisplayInfo theme="web" caption="Is\nContained\nIn"/>  
       <Notes>This relationship indicates that a compound appears on a particular diagram.  
             The same compound can appear on many diagrams, and a diagram always contains many  
             compounds.</Notes>  
     </Relationship>  
     <Relationship name="Includes" from="Subsystem" to="Role" arity="MM" converse="IsIncludedIn">  
       <DisplayInfo theme="seed" caption="Includes"/>  
       <Notes>A subsystem is defined by its roles. The subsystem's variants contain slightly  
             different sets of roles, but all of the roles in a variant must be connected to the  
             parent subsystem by this relationship.</Notes>  
       <Fields>  
         <Field name="sequence" type="counter">  
           <Notes>Sequence number of the role within the subsystem. When the roles  
                     are formed into a variant, they will generally appear in sequence order.</Notes>  
         </Field>  
       </Fields>  
       <FromIndex>  
         <Notes>This index ensures that the roles of the subsystem are presented in sequence  
                 order.</Notes>  
         <IndexFields>  
           <IndexField name="sequence" order="ascending"/>  
         </IndexFields>  
       </FromIndex>  
532      </Relationship>      </Relationship>
533      <Relationship name="Describes" from="Subsystem" to="Variant" arity="1M" converse="IsDescribedBy">      <Relationship name="Describes" from="Subsystem" to="Variant" arity="1M" converse="IsDescribedBy">
534        <DisplayInfo theme="seed"/>        <DisplayInfo theme="seed"/>
# Line 609  Line 541 
541        <Notes>This relationship connects a diagram to its reactions. A diagram shows multiple        <Notes>This relationship connects a diagram to its reactions. A diagram shows multiple
542              reactions, and a reaction can be on many diagrams.</Notes>              reactions, and a reaction can be on many diagrams.</Notes>
543      </Relationship>      </Relationship>
544      <Relationship name="Performs" theme="web" from="Reaction" to="Role" arity="MM">      <Relationship name="IsOwnerOf" from="Genome" to="Feature" arity="1M" converse="IsOwnedBy">
545        <DisplayInfo theme="web"/>        <DisplayInfo caption="Is\nOwned\nBy" theme="seed"/>
546        <Notes>A reaction performs many roles. A role can be performed by many        <Notes>This relationship connects each feature to its parent genome.</Notes>
             reactions.</Notes>  
547      </Relationship>      </Relationship>
548      <Relationship name="IsImplementedBy" from="Variant" to="Machine" arity="1M" converse="Implements">      <Relationship name="IsImplementedBy" from="Variant" to="MolecularMachine" arity="1M" converse="Implements">
549        <DisplayInfo theme="seed" caption="Is\nImplemented\nBy"/>        <DisplayInfo theme="seed" caption="Is\nImplemented\nBy" row="6" col="7"/>
550        <Notes>This relationship connects a variant to the physical machines that implement        <Notes>This relationship connects a variant to the physical machines that implement
551              it in the genomes. A variant is implemented by many machines, but a machine belongs to              it in the genomes. A variant is implemented by many machines, but a machine belongs to
552              only one variant.</Notes>              only one variant.</Notes>
553      </Relationship>      </Relationship>
554      <Relationship name="Involves" from="Reaction" to="Compound" arity="MM" converse="IsInvolvedIn">      <Relationship name="Uses" theme="seed" from="Genome" to="MolecularMachine" arity="1M" converse="IsUsedBy">
555        <DisplayInfo theme="web" col="3" row="4" fixed="1" caption="Is\nInvolved\nIn"/>        <DisplayInfo theme="seed" caption="Is\nUsed\nBy"/>
556        <Notes>This relationship connects a reaction to the compounds that participate in        <Notes>This relationship connects a genome to the machines that form its
557              it. A reaction involves many compounds, and a compound can be involved in many reactions.              metabolic pathways. A genome can use many machines, but a machine is used by exactly
558              The relationship attributes indicate whether a compound is a product or substrate of the              one genome.</Notes>
559              reaction, as well as its stoichiometry.</Notes>      </Relationship>
560        <Relationship name="Includes" from="Subsystem" to="Role" arity="MM" converse="IsIncludedIn">
561          <DisplayInfo theme="seed" caption="Includes"/>
562          <Notes>A subsystem is defined by its roles. The subsystem's variants contain slightly
563                different sets of roles, but all of the roles in a variant must be connected to the
564                parent subsystem by this relationship. A subsystem always has at least one
565                role, and a role always belongs to at least one subsystem.</Notes>
566        <Fields>        <Fields>
567          <Field name="product" type="boolean">          <Field name="sequence" type="counter">
568            <Notes>TRUE if the compound is a product of the reaction, FALSE if            <Notes>Sequence number of the role within the subsystem. When the roles
569                      it is a substrate. When a reaction is written on paper in                   are formed into a variant, they will generally appear in sequence order.</Notes>
                     chemical notation, the substrates are left of the arrow and the  
                     products are to the right. Sorting on this field will cause  
                     the substrates to appear first, followed by the products. If the  
                     reaction is reversible, then the notion of substrates and products  
                     is not intuitive; however, a value here of FALSE still puts the  
                     compound left of the arrow and a value of TRUE still puts it to the  
                     right.</Notes>  
         </Field>  
         <Field name="stoichiometry" type="key-string">  
           <Notes>Number of molecules of the compound that participate in a  
                     single instance of the reaction. For example, if a reaction  
                     produces two water molecules, the stoichiometry of water for the  
                     reaction would be two. When a reaction is written on paper in  
                     chemical notation, the stoichiometry is the number next to the  
                     chemical formula of the compound.</Notes>  
         </Field>  
         <Field name="main" type="boolean">  
           <Notes>TRUE if this compound is one of the main participants in  
                     the reaction, else FALSE. It is permissible for none of the  
                     compounds in the reaction to be considered main, in which  
                     case this value would be FALSE for all of the relevant  
                     compounds.</Notes>  
         </Field>  
         <Field name="loc" type="key-string">  
           <Notes>An optional character string that indicates the relative  
                     position of this compound in the reaction's chemical formula. The  
                     location affects the way the compounds present as we cross the  
                     relationship from the reaction side. The product/substrate flag  
                     comes first, then the value of this field, then the main flag.  
                     The default value is an empty string; however, the empty string  
                     sorts first, so if this field is used, it should probably be  
                     used for every compound in the reaction.</Notes>  
570          </Field>          </Field>
571          <Field name="discriminator" type="int">          <Field name="abbreviation" type="key-string">
572            <Notes>A unique ID for this record. The discriminator does not            <Notes>Abbreviation for this role in this subsystem. The abbreviations are
573                      provide any useful data, but it prevents identical records from  used in columnar displays, and they also appear on diagrams.</Notes>
                     being collapsed by the SELECT DISTINCT command used by ERDB to  
                     retrieve data.</Notes>  
574          </Field>          </Field>
575        </Fields>        </Fields>
576        <ToIndex>        <FromIndex>
577          <Notes>This index presents the compounds in the reaction in the          <Notes>This index ensures that the roles of the subsystem are presented in sequence
578                  order they should be displayed when writing it in chemical notation.                  order.</Notes>
                 All the substrates appear before all the products, and within that  
                 ordering, the main compounds appear first.</Notes>  
579          <IndexFields>          <IndexFields>
580            <IndexField name="product" order="ascending"/>            <IndexField name="sequence" order="ascending"/>
           <IndexField name="loc" order="ascending"/>  
           <IndexField name="main" order="descending"/>  
581          </IndexFields>          </IndexFields>
582        </ToIndex>        </FromIndex>
     </Relationship>  
     <Relationship name="IsSourceOf" from="Machine" to="Assignment" arity="1M" converse="HasSource">  
       <DisplayInfo theme="seed" caption="Has Source"/>  
       <Notes>This relationship connects a machine to the assignments made in its name.  
             A machine is the source of many assignments, but an assignment belongs to at most  
             one machine.</Notes>  
     </Relationship>  
     <Relationship name="Uses" theme="seed" from="Genome" to="Machine" arity="1M" converse="IsUsedBy">  
       <DisplayInfo theme="seed" caption="Is\nUsed\nBy"/>  
       <Notes>This relationship connects a genome to the machines that form its  
             metabolic pathways. A genome can use many machines, but a machine is used by exactly  
             one genome.</Notes>  
583      </Relationship>      </Relationship>
584      <Relationship name="Catalyzes" from="ProteinSequence" to="Role" arity="MM" converse="IsCatalyzedBy">      <Relationship name="Implements" from="ProteinSequence" to="Role" arity="MM" converse="IsCatalyzedBy">
585        <DisplayInfo theme="web" caption="Is\nCatalyzed\nBy"/>        <DisplayInfo theme="web" caption="Is\nImplemented\nBy"/>
586        <Notes>This relationship connects a protein sequence to the functional roles it        <Notes>This relationship connects a protein sequence to the functional roles it
587              catalyzes in the cell. A protein sequence can catalyze many roles, and a role can              implements in the cell. A protein sequence can implement many roles, and a role can
588              be catalyzed by many protein sequences. Roles that perform regulatory or message              be implemented by many protein sequences. Roles that perform regulatory or message
589              transmission functions do not participate in this relationship.</Notes>              transmission functions do not participate in this relationship.</Notes>
590      </Relationship>      </Relationship>
591      <Relationship name="IsProducedBy" from="ProteinSequence" to="Feature" arity="1M" converse="Produces">      <Relationship name="IsCombinationOf" from="RoleSet" to="Role" arity="MM" converse="IsInCombination">
592        <DisplayInfo caption="Is\nProduced\nBy" theme="seed" row="10" col="1.5"/>        <DisplayInfo theme="web" caption="Is\nCombination\nOf"/>
593        <Notes>This relationship connects a feature to the protein sequence it produces (if any).        <Notes>This relationship combines roles into role sets. Each role set is a combination of roles that can
594              Many features can produce the same protein sequence, but each feature produces at most  trigger a reaction.</Notes>
595              one protein sequence. Many features do not produce a protein sequence at all.</Notes>      </Relationship>
596        <Relationship name="IsTriggeredBy" from="Reaction" to="RoleSet" arity="MM" converse="Triggers">
597          <DisplayInfo theme="web" caption="Is\nTriggered\nBy"/>
598          <Notes>A reaction can be triggered by many role sets. A role set can trigger many reactions.</Notes>
599        </Relationship>
600        <Relationship name="IsClassOf" from="TaxonomicGrouping" to="TaxonomicGrouping" arity="1M" converse="IsClassifiedAs">
601          <DisplayInfo theme="nmpdr" col="8" row="11" fixed="1" caption="Is\nClass\nOf"/>
602          <Notes>The recursive IsClassOf relationship organizes taxonomic groupings into a hierarchy
603                based on the standard organism taxonomy.</Notes>
604        </Relationship>
605        <Relationship name="IsFoundOn" from="Role" to="Diagram" arity="MM" converse="IsLocationOf">
606          <DisplayInfo theme="web" caption="Is\nLocation\nOf"/>
607          <Notes>This relationship connects a role to the diagrams on which it appears. A diagram
608          always contains many roles. A role may appear on multiple diagrams.</Notes>
609      </Relationship>      </Relationship>
610      <Relationship name="IsLocatedIn" from="Feature" to="DnaSequence" arity="MM" converse="IsLocusFor">      <Relationship name="IsLocatedIn" from="Feature" to="DnaSequence" arity="MM" converse="IsLocusFor">
611        <DisplayInfo theme="seed" caption="Is\nLocated\nIn" fixed="1" row="11" col="6" />        <DisplayInfo theme="seed" caption="Is\nLocated\nIn" fixed="1" row="10" col="6"/>
612        <Notes>A feature is a set of DNA sequence fragments. Most features are a single contiguous        <Notes>A feature is a set of DNA sequence fragments. Most features are a single contiguous
613              fragment, so they are located in only one DNA sequence; however, fragments have a maximum              fragment, so they are located in only one DNA sequence; however, fragments have a maximum
614              length, so even a single contiguous feature may participate in this relationship multiple              length, so even a single contiguous feature may participate in this relationship multiple
# Line 748  Line 648 
648          </IndexFields>          </IndexFields>
649        </ToIndex>        </ToIndex>
650      </Relationship>      </Relationship>
651      <Relationship name="IsOwnerOf" from="Genome" to="Feature" arity="1M" converse="IsOwnedBy">      <Relationship name="IsDeterminedBy" from="FcEvidenceSet" to="Pairing" arity="MM" converse="Determines">
652        <DisplayInfo caption="Is\nOwned\nBy" theme="seed" fixed="1" row="10" col="6" />        <DisplayInfo theme="seed" caption="Determines"/>
653        <Notes>This relationship connects each feature to its parent genome.</Notes>        <Notes>A functional coupling evidence set exists because it has pairings in it, and this relationship
654                 connects the evidence set to its constituent pairings. A pairing can belong to
655                 multiple evidence sets.</Notes>
656          <Fields>
657            <Field name="inverted" type="boolean">
658              <Notes>A pairing is an unordered pair of protein sequences, but its
659                     similarity to other pairings in an evidence set is ordered. Let (A,B) be
660                     a pairing and (X,Y) be another pairing in the same set. If this flag is
661                     FALSE, then (A =~ X) and (B =~ Y). If this flag is TRUE, then (A =~ Y) and
662                     (B =~ X).</Notes>
663            </Field>
664          </Fields>
665        </Relationship>
666        <Relationship name="IsFunctionOf" from="Role" to="Feature" arity="MM" converse="Targets">
667          <DisplayInfo theme="seed" fixed="1" row="7" col="4" caption="Is\nFunction\nOf"/>
668          <Notes>This relationship connects a role to the features that facilitate the role.
669    A role can be the function of multiple features, and a single feature may have
670    multiple roles.</Notes>
671      </Relationship>      </Relationship>
672      <Relationship name="IsMadeUpOf" from="Genome" to="DnaSequence" arity="1M" converse="MakesUp">      <Relationship name="IsMadeUpOf" from="Genome" to="DnaSequence" arity="1M" converse="MakesUp">
673        <DisplayInfo theme="nmpdr" caption="Is\nMade Up\nOf"/>        <DisplayInfo theme="nmpdr" caption="Is\nMade Up\nOf"/>
674        <Notes>This relationship connects each genome to the DNA sequences that make it up.</Notes>        <Notes>This relationship connects each genome to the DNA sequences that make it up.</Notes>
675      </Relationship>      </Relationship>
676      <Relationship name="Exposes" from="ProteinSequence" to="Structure" arity="MM" converse="IsExposedBy">      <Relationship name="IsAnnotatedBy" from="Feature" to="Annotation" arity="1M" converse="Annotates">
677        <DisplayInfo theme="web" caption="Is\nExposed\nBy"/>        <DisplayInfo theme="seed" caption="Is\nAnnotated\nBy" fixed="1" col="3" row="10"/>
678        <Notes>This relationship connects a protein sequence to the chemically active structures        <Notes>This relationship connects a feature to its annotations. A feature may have
679              on its surface. A protein sequence exposes many structures, and a particular structure  multiple annotations, but an annotation belongs to only one feature.</Notes>
680              may occur on many proteins.</Notes>      </Relationship>
681        <Relationship name="HasMember" from="Family" to="Feature" arity="1M" converse="IsMemberOf">
682          <DisplayInfo theme="seed" caption="Is\nMember\nOf" row="10" col="4" fixed="1"/>
683          <Notes>This relationship connects each feature family to its constituent
684                 features. A family always has many features, but a single feature can
685                 be found in at most one family.</Notes>
686      </Relationship>      </Relationship>
687      <Relationship name="Attracts" from="Structure" to="Compound" arity="MM" converse="IsAttractedTo">      <Relationship name="Attracts" from="Structure" to="Compound" arity="MM" converse="IsAttractedTo">
688        <DisplayInfo theme="web" caption="Is\nAttracted\nTo"/>        <DisplayInfo theme="web" row="1" col="2" fixed="1" caption="Is\nAttracted\nTo"/>
689        <Notes>This relationship connects a compound to the protein structures that attract it.        <Notes>This relationship connects a compound to the protein structures that attract it.
690              This is an incomplete relationship that exists to service drug targeting queries. Only              This is an incomplete relationship that exists to service drug targeting queries. Only
691              the attractions whose parameters have been determined through modeling or              the attractions whose parameters have been determined through modeling or
# Line 809  Line 731 
731          </IndexFields>          </IndexFields>
732        </ToIndex>        </ToIndex>
733      </Relationship>      </Relationship>
734        <Relationship name="Involves" from="Reaction" to="Compound" arity="MM" converse="IsInvolvedIn">
735          <DisplayInfo theme="web" caption="Is\nInvolved\nIn" fixed="1" row="2" col="2.5"/>
736          <Notes>This relationship connects a reaction to the compounds that participate in
737                it. A reaction involves many compounds, and a compound can be involved in many reactions.
738                The relationship attributes indicate whether a compound is a product or substrate of the
739                reaction, as well as its stoichiometry.</Notes>
740          <Fields>
741            <Field name="product" type="boolean">
742              <Notes>TRUE if the compound is a product of the reaction, FALSE if
743                        it is a substrate. When a reaction is written on paper in
744                        chemical notation, the substrates are left of the arrow and the
745                        products are to the right. Sorting on this field will cause
746                        the substrates to appear first, followed by the products. If the
747                        reaction is reversible, then the notion of substrates and products
748                        is not intuitive; however, a value here of FALSE still puts the
749                        compound left of the arrow and a value of TRUE still puts it to the
750                        right.</Notes>
751            </Field>
752            <Field name="stoichiometry" type="key-string">
753              <Notes>Number of molecules of the compound that participate in a
754                        single instance of the reaction. For example, if a reaction
755                        produces two water molecules, the stoichiometry of water for the
756                        reaction would be two. When a reaction is written on paper in
757                        chemical notation, the stoichiometry is the number next to the
758                        chemical formula of the compound.</Notes>
759            </Field>
760            <Field name="main" type="boolean">
761              <Notes>TRUE if this compound is one of the main participants in
762                        the reaction, else FALSE. It is permissible for none of the
763                        compounds in the reaction to be considered main, in which
764                        case this value would be FALSE for all of the relevant
765                        compounds.</Notes>
766            </Field>
767            <Field name="loc" type="key-string">
768              <Notes>An optional character string that indicates the relative
769                        position of this compound in the reaction's chemical formula. The
770                        location affects the way the compounds present as we cross the
771                        relationship from the reaction side. The product/substrate flag
772                        comes first, then the value of this field, then the main flag.
773                        The default value is an empty string; however, the empty string
774                        sorts first, so if this field is used, it should probably be
775                        used for every compound in the reaction.</Notes>
776            </Field>
777            <Field name="discriminator" type="int">
778              <Notes>A unique ID for this record. The discriminator does not
779                        provide any useful data, but it prevents identical records from
780                        being collapsed by the SELECT DISTINCT command used by ERDB to
781                        retrieve data.</Notes>
782            </Field>
783          </Fields>
784          <ToIndex>
785            <Notes>This index presents the compounds in the reaction in the
786                    order they should be displayed when writing it in chemical notation.
787                    All the substrates appear before all the products; within each group,
788                    compounds sort by location, and then the main compounds appear first.</Notes>
789            <IndexFields>
790              <IndexField name="product" order="ascending"/>
791              <IndexField name="loc" order="ascending"/>
792              <IndexField name="main" order="descending"/>
793            </IndexFields>
794          </ToIndex>
795        </Relationship>
796        <Relationship name="Contains" from="Diagram" to="Compound" arity="MM" converse="IsContainedIn">
797          <DisplayInfo theme="web" fixed="1" caption="Is\nContained\nIn" row="2" col="3.5"/>
798          <Notes>This relationship indicates that a compound appears on a particular diagram.
799                The same compound can appear on many diagrams, and a diagram always contains many
800                compounds.</Notes>
801        </Relationship>
802        <Relationship name="IsContainedIn" from="Feature" to="MachineRole" arity="MM" converse="Contains">
803          <DisplayInfo theme="seed" caption="Is\nContained\nIn" row="8" col="5"/>
804          <Notes>This relationship connects a machine role to the features that occur in it. A feature
805        may occur in many machine roles and a machine role may contain many features. The subsystem
806        annotation process is essentially the maintenance of this relationship.</Notes>
807        </Relationship>
808        <Relationship name="IsRoleOf" from="Role" to="MachineRole" arity="1M" converse="HasRole">
809          <DisplayInfo caption="Is\nRole\nOf" theme="seed"/>
810          <Notes>This relationship connects a role to the machine roles that represent its
811          appearance in a molecular machine. A machine role has exactly one associated role,
812          but a role may be represented by many machine roles.</Notes>
813        </Relationship>
814      <Relationship name="IsTerminusFor" from="Compound" to="Scenario" arity="MM" converse="HasAsTerminus">      <Relationship name="IsTerminusFor" from="Compound" to="Scenario" arity="MM" converse="HasAsTerminus">
815        <DisplayInfo theme="web" caption="Has As\nTerminus"/>        <DisplayInfo theme="web" caption="Is\nTerminus\nFor"/>
816        <Notes>A terminus for a scenario is a compound that acts as its input or output. A        <Notes>A terminus for a scenario is a compound that acts as its input or output. A
817              compound can be the terminus for many scenarios, and a scenario will have many termini.              compound can be the terminus for many scenarios, and a scenario will have many termini.
818              The relationship attributes indicate whether the compound is an input to the scenario or              The relationship attributes indicate whether the compound is an input to the scenario or
819              an output. In some cases, there may be multiple alternative output groups. This is also              an output.</Notes>
             indicated by the attributes.</Notes>  
820        <Fields>        <Fields>
821          <Field name="group-number" type="int">          <Field name="group-number" type="int">
822            <Notes>The group number is 0 for an input compound; otherwise, it is the            <Notes>The group number is 0 for an input compound, 1 for an output compound, and 2 for
823                      number of the output group to which the compound belongs. Output groups                      an auxiliary compound. An auxiliary compound is one that is produced by the
824                      represent alternative outputs for the scenario. A compound in multiple                      scenario, but is not the primary output.</Notes>
                     output groups will appear multiple times in this relationship.</Notes>  
825          </Field>          </Field>
826        </Fields>        </Fields>
827        <ToIndex>        <ToIndex>
# Line 832  Line 832 
832          </IndexFields>          </IndexFields>
833        </ToIndex>        </ToIndex>
834      </Relationship>      </Relationship>
835        <Relationship name="Exposes" from="ProteinSequence" to="Structure" arity="MM" converse="IsExposedBy">
836          <DisplayInfo theme="web" fixed="1" row="7" col="2" caption="Is\nExposed\nBy"/>
837          <Notes>This relationship connects a protein sequence to its structural representations. It is a
838      many-to-many relationship. Note that only some protein sequences have known structural representations.</Notes>
839        </Relationship>
840        <Relationship name="IsSubInstanceOf" from="Subsystem" to="Scenario" arity="1M" converse="Validates">
841          <DisplayInfo theme="seed" caption="Is Part\nInstance\nOf" fixed="1" row="1" col="7"/>
842          <Notes>This relationship connects a scenario to the subsystem it validates. A scenario
843                belongs to exactly one subsystem, but a subsystem may have multiple scenarios.</Notes>
844        </Relationship>
845      <Relationship name="Overlaps" from="Scenario" to="Diagram" arity="MM" converse="IncludesPartOf">      <Relationship name="Overlaps" from="Scenario" to="Diagram" arity="MM" converse="IncludesPartOf">
846        <DisplayInfo theme="web"/>        <DisplayInfo theme="web" fixed="1" row="2" col="5.5"/>
847        <Notes>A Scenario overlaps a diagram when the diagram displays a portion of the reactions        <Notes>A Scenario overlaps a diagram when the diagram displays a portion of the reactions
848              that make up the scenario. A scenario may overlap many diagrams, and a diagram may              that make up the scenario. A scenario may overlap many diagrams, and a diagram may
849              include portions of many scenarios.</Notes>              include portions of many scenarios.</Notes>
850      </Relationship>      </Relationship>
851      <Relationship name="HasParticipant" from="Scenario" to="Reaction" arity="MM" converse="ParticipatesIn">      <Relationship name="HasParticipant" from="Scenario" to="Reaction" arity="MM" converse="ParticipatesIn">
852        <DisplayInfo theme="web" caption="\nParticipates\nIn"/>        <DisplayInfo theme="web" caption="Has\nParticipant" row="2" col="4.5" fixed="1"/>
853        <Notes>A scenario consists of many participant reactions that convert the input compounds        <Notes>A scenario consists of many participant reactions that convert the input compounds
854              to output compounds. A single reaction may participate in many scenarios.</Notes>              to output compounds. A single reaction may participate in many scenarios.</Notes>
855          <Fields>
856            <Field name="type" type="int">
857              <Notes>Indicates the type of participation. If 0, the reaction is in the main pathway of
858          the scenario. If 1, the reaction is necessary to make the model work but is not in the
859          subsystem. If 2, the reaction is part of the subsystem but should not be included in
860          the modelling process.</Notes>
861            </Field>
862          </Fields>
863          <FromIndex>
864            <Notes>This index presents the reactions in the scenario in order from
865    most important to least important.</Notes>
866            <IndexFields>
867              <IndexField name="type" order="ascending"/>
868            </IndexFields>
869          </FromIndex>
870      </Relationship>      </Relationship>
871      <Relationship name="IsValidatedBy" from="Subsystem" to="Scenario" arity="1M" converse="Validates">      <Relationship name="IsInPair" from="Feature" to="Pairing" arity="MM" converse="Contains">
872        <DisplayInfo theme="seed" caption="Is\nValidated\nBy"/>        <DisplayInfo theme="seed" caption="Is In\nPair"/>
873        <Notes>This relationship connects a scenario to the subsystem it validates. A scenario        <Notes>A pairing contains exactly two protein sequences. A protein sequence can
874              validates exactly one subsystem, but a subsystem may have multiple scenarios used for               belong to multiple pairings. When going from a protein sequence to its pairings,
875              validation.</Notes>               they are presented in alphabetical order by sequence key.</Notes>
876      </Relationship>      </Relationship>
877      <Relationship name="Concerns" from="Publication" to="ProteinSequence" arity="MM" converse="IsATopicOf">      <Relationship name="Concerns" from="Publication" to="ProteinSequence" arity="MM" converse="IsATopicOf">
878        <DisplayInfo theme="web"/>        <DisplayInfo theme="web" row="8" col="2" caption="Is A\nTopic\nOf" fixed="1"/>
879        <Notes>This relationship connects a publication to the protein sequences it        <Notes>This relationship connects a publication to the protein sequences it
880              describes.</Notes>              describes.</Notes>
881      </Relationship>      </Relationship>
882      <Relationship name="Identifies" from="EC" to="Role" arity="1M" converse="IsIdentifiedBy">      <Relationship name="IsTaxonomyOf" to="Genome" from="TaxonomicGrouping" arity="1M" converse="IsInTaxa">
883        <DisplayInfo theme="web"/>        <DisplayInfo theme="nmpdr" fixed="1" caption="Is In\nTaxa" row="9" col="8"/>
884        <Notes>This relationship connects an EC number code to its relevant roles. A role will        <Notes>A genome belongs to exactly one taxonomic grouping. A taxonomic grouping
885              only have one EC number, but an EC number can identify multiple roles.</Notes>    contains many genomes. Some taxonomic groupings do not contain any genomes. These
886      in fact contain other taxonomic groups.</Notes>
887        </Relationship>
888        <Relationship name="IsMachineOf" from="MolecularMachine" to="MachineRole" arity="1M" converse="IsRoleOf">
889          <DisplayInfo caption="Is\nMachine\nOf" theme="seed"/>
890          <Notes>This relationship connects a molecular machine to its various machine roles.
891          Each machine has many machine roles, but each machine role belongs to only one machine.</Notes>
892        </Relationship>
893        <Relationship name="IsSequenceFor" from="ProteinSequence" to="Identifier" arity="1M" converse="IsFeatureFor">
894          <DisplayInfo caption="Is\nSequence\nFor" theme="seed"/>
895          <Notes>This relationship connects a peg identifier to the protein sequence it produces (if any).
896                Only peg identifiers participate in this relationship. Identifiers that name RNAs,
897                operons, or other non-protein features do not connect to protein sequences. A single
898                protein sequence will frequently have many identifiers.</Notes>
899        </Relationship>
900        <Relationship name="IncludesIdentifier" from="IdentifierSet" to="Identifier" arity="1M" converse="IsIncludedInSet">
901          <DisplayInfo theme="seed" caption="Includes" row="9.5" col="1.5"/>
902          <Notes>An identifier set contains many identifiers. If the set identifies a feature, then one of the identifiers
903      will be a feature ID. If the set identifies a protein sequence, then one of the identifiers will be the
904      MD5 hash key for the protein sequence.</Notes>
905      </Relationship>      </Relationship>
906    </Relationships>    </Relationships>
907    <Shapes>    <Shapes>
908        <Shape type="diamond" name="ConsistsOf" from="Variant" to="Role">
909          <DisplayInfo theme="neutral" caption="Belongs To" connected="1"/>
910          <Notes>This relationship is not physically implemented in the database. It is
911          implicit in the data for a variant. A variant contains a boolean expression that
912          describes the various combinations of roles it can contain.</Notes>
913        </Shape>
914        <Shape type="diamond" name="IsIdentifiedBy" from="Feature" to="Identifier">
915          <DisplayInfo theme="neutral" caption="Identifies" connected="1"/>
916          <Notes>This relationship is not physically implemented in the database. It is
917          implicit in the data for an identifier. If the identifier is a FIG feature
918          ID, then it identifies that feature, as do all other identifiers in the same
919          identifier set.</Notes>
920        </Shape>
921    </Shapes>    </Shapes>
922  </Database>  </Database>
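
The ToIndex for the reaction/compound relationship near the top of this section orders its records by product (ascending), then loc (ascending), then main (descending). The following is a minimal Python sketch of that ordering, not ERDB code: the Participant class, the display_order function, and the sample compounds are all hypothetical, and it assumes that FALSE sorts before TRUE and that the empty string sorts first, as the field notes describe.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Participant:
    """Hypothetical stand-in for one record of the relationship."""
    compound: str
    product: bool   # False = substrate (left of the arrow), True = product
    loc: str        # optional position string; "" is the default
    main: bool      # True if the compound is a main participant


def display_order(rows):
    # product ascending (substrates first), loc ascending,
    # main descending (main compounds first, hence the negation).
    return sorted(rows, key=lambda r: (r.product, r.loc, not r.main))


rows = [
    Participant("ADP", True, "", False),
    Participant("glucose", False, "", True),
    Participant("glucose-6-phosphate", True, "", True),
    Participant("ATP", False, "", False),
]

ordered = [r.compound for r in display_order(rows)]
print(ordered)
```

Because loc is compared before main, a reaction that sets loc on only some of its compounds would list the compounds with an empty loc first within each group, which is why the field notes suggest using loc for every compound or for none.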
