
View of /Sprout/SproutDBD.xml

Revision 1.53
Tue Apr 29 20:55:25 2008 UTC by parrello
Branch: MAIN
Changes since 1.52: +1 -1 lines
Fixed a typo.

<?xml version="1.0" encoding="utf-8" ?>
<Database>
    <Title>Sprout Genome and Subsystem Database</Title>
    <Notes>The Sprout database contains the genetic data for all complete organisms in the [[SeedEnvironment]].
    Data that is not in Sprout (attributes, similarities, and couplings) is stored on external
    servers available to the Sprout software. The Sprout database is reloaded approximately once
    per month. There is significant redundancy in the Sprout database because it has been
    optimized for searching. In particular, the Feature table contains an extra copy of the
    feature's functional role and a list of possible search terms.</Notes>
    <Entities>
        <Entity name="Genome" keyType="name-string">
            <Notes>A [[Genome]] contains the sequence data for a particular individual organism.</Notes>
            <Fields>
                <Field name="genus" type="name-string">
                    <Notes>Genus of the relevant organism.</Notes>
                </Field>
                <Field name="species" type="name-string">
                    <Notes>Species of the relevant organism.</Notes>
                </Field>
                <Field name="unique-characterization" type="medium-string">
                    <Notes>The unique characterization identifies the particular organism instance from which the
                    genome is taken. It is possible to have in the database more than one genome for a
                    particular species, and every individual organism has variations in its DNA.</Notes>
                </Field>
                <Field name="version" type="name-string">
                    <Notes>Version string for this genome, generally consisting of the genome ID followed
                    by a period and a string of digits.</Notes>
                </Field>
                <Field name="access-code" type="key-string">
                    <Notes>The access code field is deprecated. Its function has been replaced by
                    the account management system developed for the [[RapidAnnotationServer]].</Notes>
                </Field>
                <Field name="complete" type="boolean">
                    <Notes>TRUE if the genome is complete, else FALSE</Notes>
                </Field>
                <Field name="dna-size" type="counter">
                    <Notes>Number of base pairs in the genome.</Notes>
                </Field>
                <Field name="taxonomy" type="text">
                    <Notes>The taxonomy string contains the full [[Wikipedia:taxonomy]] of the organism, with individual elements
                    separated by semicolons (and optional white space), starting with the domain and ending with
                    the disambiguated genus and species (which is the organism's scientific name plus an
                    identifying string).</Notes>
                </Field>
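                <!-- Illustrative example only (the value below is an assumption, not taken
                     from the database). A taxonomy string in the format described above might
                     look like this:
                     Bacteria; Proteobacteria; Gammaproteobacteria; Enterobacteriales; Enterobacteriaceae; Escherichia; Escherichia coli K12
                     The first element is the domain and the last element is the scientific
                     name plus an identifying string. -->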
                <Field name="primary-group" type="name-string">
                    <Notes>The primary NMPDR group for this organism. There is always exactly one NMPDR group
                    per organism (either based on the organism name or the default value =Supporting=). In general,
                    more data is kept on organisms in NMPDR groups than on supporting organisms.</Notes>
                </Field>
                <Field name="contigs" type="int">
                    <Notes>Number of contigs for this organism.</Notes>
                </Field>
                <Field name="pegs" type="int">
                    <Notes>Number of [[protein encoding genes]] for this organism.</Notes>
                </Field>
                <Field name="rnas" type="int">
                    <Notes>Number of RNA features found for this organism.</Notes>
                </Field>
            </Fields>
            <Indexes>
                <Index>
                    <Notes>This index allows the applications to find all genomes associated with
                    a specific access code, so that a complete list of the genomes users can view
                    may be generated.</Notes>
                    <IndexFields>
                        <IndexField name="access-code" order="ascending" />
                        <IndexField name="genus" order="ascending" />
                        <IndexField name="species" order="ascending" />
                        <IndexField name="unique-characterization" order="ascending" />
                    </IndexFields>
                </Index>
                <Index>
                    <Notes>This index allows the applications to find all genomes associated with
                    a specific primary (NMPDR) group.</Notes>
                    <IndexFields>
                        <IndexField name="primary-group" order="ascending" />
                        <IndexField name="genus" order="ascending" />
                        <IndexField name="species" order="ascending" />
                        <IndexField name="unique-characterization" order="ascending" />
                    </IndexFields>
                </Index>
                <Index>
                    <Notes>This index allows the applications to find all genomes for a particular
                    species.</Notes>
                    <IndexFields>
                        <IndexField name="genus" order="ascending" />
                        <IndexField name="species" order="ascending" />
                        <IndexField name="unique-characterization" order="ascending" />
                    </IndexFields>
                </Index>
            </Indexes>
        </Entity>
        <Entity name="CDD" keyType="key-string">
            <Notes>A CDD is a protein domain designator. It represents the shape of a molecular unit
            on a feature's protein. The ID is a six-digit string assigned by the public
            [[http://www.ncbi.nlm.nih.gov/Structure/cdd/cdd.shtml Conserved Domain Database]]. A CDD
            can occur on multiple features and a feature generally has multiple CDDs.</Notes>
        </Entity>
        <Entity name="Source" keyType="medium-string">
            <Notes>A _source_ describes a place from which genome data was taken. This can be an organization
            or a paper citation.</Notes>
            <Fields>
                <Field name="URL" type="string" relation="SourceURL">
                    <Notes>URL of the paper cited or of the organization's web site. This field is optional.</Notes>
                </Field>
                <Field name="description" type="text">
                    <Notes>Description of the source. The description can be a street address or a citation.</Notes>
                </Field>
            </Fields>
        </Entity>
        <Entity name="Contig" keyType="name-string">
            <Notes>A _contig_ is a contiguous run of residues. The contig's ID consists of the
            genome ID followed by a name that identifies which contig this is for the parent genome. As
            is the case with all keys in this database, the individual components are separated by a
            period. A contig can contain over a million residues. For performance reasons, therefore,
            the contig is split into multiple pieces called _sequences_. The sequences
            contain the characters that represent the residues as well as data on the quality of
            the residue identification.</Notes>
        </Entity>
        <Entity name="Sequence" keyType="name-string">
            <Notes>A _sequence_ is a continuous piece of a contig. Contigs are split into
            sequences so that we don't have to have the entire contig in memory when we are
            manipulating it. The key of the sequence is the contig ID followed by the index of
            the begin point.</Notes>
            <Fields>
                <Field name="sequence" type="text">
                    <Notes>String consisting of the residues (base pairs). Each residue is described by a single
                    character in the string.</Notes>
                </Field>
                <Field name="quality-vector" type="text">
                    <Notes>String describing the quality data for each base pair. Individual values will
                    be separated by periods. Each value represents the negative exponent of the probability
                    of error. Thus, for example, a quality of 30 indicates the probability of error is
                    10^-30. A higher quality number indicates a better chance of a correct match. It is
                    possible that the quality data is not known for a sequence. If that is the case, the
                    quality vector will contain the string =unknown=.</Notes>
                </Field>
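                <!-- Illustrative example only (hypothetical values). For a sequence of four
                     residues, a quality vector in the period-separated format described above
                     might read:
                     30.30.25.40
                     When no quality data is available, the field would instead hold:
                     unknown -->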
            </Fields>
        </Entity>
        <Entity name="Feature" keyType="id-string">
            <Notes>A _feature_ (sometimes also called a [[gene]]) is a part of a genome that is of special interest. Features
            may be spread across multiple contigs of a genome, but never across more than
            one genome. Features can be assigned to roles via spreadsheet cells,
            and are the targets of annotation. Each feature in the database has a unique [[FigId]].</Notes>
            <Fields>
                <Field name="feature-type" type="id-string">
                    <Notes>Code indicating the type of this feature. Among the codes currently
                    supported are =peg= for a [[protein encoding gene]], =bs= for a
                    binding site, =opr= for an operon, and so forth.</Notes>
                </Field>
                <Field name="translation" type="text" relation="FeatureTranslation">
                    <Notes>_(optional)_ A translation of this feature's residues into character
                    codes, formed by concatenating the pieces of the feature together. For a
                    [[protein encoding gene]], the translation contains protein characters. For other types
                    it contains DNA characters.</Notes>
                </Field>
                <Field name="upstream-sequence" type="text" relation="FeatureUpstream">
                    <Notes>Upstream sequence for the feature. This includes residues preceding the feature as
                    well as some of the feature's initial residues.</Notes>
                </Field>
                <Field name="assignment" type="text">
                    <Notes>Default functional assignment for this feature.</Notes>
                </Field>
                <Field name="active" type="boolean">
                    <Notes>(This field is deprecated.) TRUE if this feature is still considered valid,
                    FALSE if it has been logically deleted.</Notes>
                </Field>
                <Field name="assignment-maker" type="name-string">
                    <Notes>Name of the user who made the functional assignment.</Notes>
                </Field>
                <Field name="assignment-quality" type="char">
                    <Notes>Quality of the functional assignment. This is usually a space, but may be W (indicating weak) or X
                    (indicating experimental).</Notes>
                </Field>
                <Field name="keywords" type="text" searchable="1">
                    <Notes>This is a list of search keywords for the feature. It includes the
                    functional assignment, subsystem roles, and special properties.</Notes>
                </Field>
                <Field name="link" type="text" relation="FeatureLink">
                    <Notes>Web hyperlink for this feature. A feature can have no hyperlinks or it can have many. The
                    links are to other websites that have useful information about the gene that the feature represents, and
                    are coded as raw HTML, using &lt;a href="_link_"&gt;_text_&lt;/a&gt; notation.</Notes>
                </Field>
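                <!-- Illustrative example only (the URL and link text are hypothetical). A
                     single value of the link field, stored as raw HTML in the notation
                     described above, might look like this:
                     <a href="http://www.example.org/gene/abc123">Example Gene Page</a>
                     A feature with several links would have one such value per link. -->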
                <Field name="conservation" type="float" relation="FeatureConservation">
                    <Notes>_(optional)_ A number between 0 and 1 that indicates the degree to which this feature's DNA is
                    conserved in related genomes. A value of 1 indicates perfect conservation. A value less
                    than 1 is a reflection of the degree to which gap characters interfere in the alignment
                    between the feature and its close relatives.</Notes>
                </Field>
                <Field name="essential" type="text" relation="FeatureEssential" special="property_search">
                    <Notes>A value indicating the essentiality of the feature, coded as HTML. In most
                    cases, this will be a word describing whether the essentiality is confirmed (essential)
                    or potential (potential-essential), hyperlinked to the document from which the
                    essentiality was curated. If a feature is not essential, this field will have no
                    values; otherwise, it may have multiple values.</Notes>
                </Field>
                <Field name="virulent" type="text" relation="FeatureVirulent" special="property_search">
                    <Notes>A value indicating the virulence of the feature, coded as HTML. In most
                    cases, this will be a phrase or SA number hyperlinked to the document from which
                    the virulence information was curated. If the feature is not virulent, this field
                    will have no values; otherwise, it may have multiple values.</Notes>
                </Field>
                <Field name="cello" type="name-string">
                    <Notes>The cello value specifies the expected location of the protein: cytoplasm,
                    cell wall, inner membrane, and so forth.</Notes>
                </Field>
                <Field name="iedb" type="text" relation="FeatureIEDB" special="property_search">
                    <Notes>A value indicating whether or not the feature can be found in the
                    Immune Epitope Database. If the feature has not been matched to that database,
                    this field will have no values. Otherwise, it will have an epitope name and/or
                    sequence, hyperlinked to the database.</Notes>
                </Field>
                <Field name="location-string" type="text">
                    <Notes>Location of the feature, expressed as a comma-delimited list of Sprout location
                    strings. This gives us a fast mechanism for extracting the feature location. Otherwise,
                    we have to painstakingly paste together the [[#IsLocatedIn]] records, which are themselves
                    designed to help look for features in a particular region rather than to find the location
                    of a feature.</Notes>
                </Field>
            </Fields>
            <Indexes>
                <Index>
                    <Notes>This index allows us to locate a feature by its CELLO value.</Notes>
                    <IndexFields>
                        <IndexField name="cello" order="ascending" />
                    </IndexFields>
                </Index>
            </Indexes>
        </Entity>
        <Entity name="FeatureAlias" keyType="medium-string">
            <Notes>Alternative names for features. A feature can have many aliases. In general,
            each alias corresponds to only one feature, but there are many exceptions to this rule.</Notes>
        </Entity>
        <Entity name="SproutUser" keyType="name-string">
            <Notes>A _user_ is a person who can make annotations and view data in the database. The
            user object is keyed on the user's login name.</Notes>
            <Fields>
                <Field name="description" type="string">
                    <Notes>Full name or description of this user.</Notes>
                </Field>
                <Field name="access-code" type="key-string" relation="UserAccess">
                    <Notes>This field is deprecated.</Notes>
                </Field>
            </Fields>
        </Entity>
        <Entity name="SynonymGroup" keyType="id-string">
            <Notes>A _synonym group_ represents a group of features. Features that represent substantially
            identical proteins or DNA sequences are mapped to the same synonym group, and this information is
            used to expand similarities.</Notes>
        </Entity>
        <Entity name="Role" keyType="string">
            <Notes>A _role_ describes a biological function that may be fulfilled by a feature.
            One of the main goals of the database is to record the roles of the various features.</Notes>
        </Entity>
        <Entity name="RoleEC" keyType="string">
            <Notes>EC code for a role.</Notes>
        </Entity>
        <Entity name="Annotation" keyType="name-string">
            <Notes>An _annotation_ contains supplementary information about a feature. The most
            important type of annotation is the assignment of a [[functional role]]; however,
            other types of annotations are also possible.</Notes>
            <Fields>
                <Field name="time" type="date">
                    <Notes>Date and time of the annotation.</Notes>
                </Field>
                <Field name="annotation" type="text">
                    <Notes>Text of the annotation.</Notes>
                </Field>
            </Fields>
            <Indexes>
                <Index>
                    <Notes>This index allows the user to find recent annotations.</Notes>
                    <IndexFields>
                        <IndexField name="time" order="descending" />
                    </IndexFields>
                </Index>
            </Indexes>
        </Entity>
        <Entity name="Reaction" keyType="key-string">
            <Notes>A _reaction_ is a chemical process catalyzed by a protein. The reaction ID
            is generally a small number preceded by a letter.</Notes>
            <Fields>
                <Field name="url" type="string" relation="ReactionURL">
                    <Notes>HTML string containing a link to a web location that describes the
                    reaction. This field is optional.</Notes>
                </Field>
                <Field name="rev" type="boolean">
                    <Notes>TRUE if this reaction is reversible, else FALSE</Notes>
                </Field>
            </Fields>
        </Entity>
        <Entity name="Compound" keyType="name-string">
            <Notes>A _compound_ is a chemical that participates in a reaction.
            All compounds have a unique ID and may also have one or more names.</Notes>
            <Fields>
                <Field name="label" type="string">
                    <Notes>Name used in reaction display strings. This is the same as the name
                    possessing a priority of 1, but it is placed here to speed up the query
                    used to create the display strings.</Notes>
                </Field>
            </Fields>
        </Entity>
        <Entity name="CompoundName" keyType="string">
            <Notes>A _compound name_ is a common name for the chemical represented by a
            compound.</Notes>
        </Entity>
        <Entity name="CompoundCAS" keyType="name-string">
            <Notes>This entity represents the [[http://www.cas.org/ Chemical Abstract Service]] ID for a
            compound. Each Compound has at most one CAS ID.</Notes>
        </Entity>
        <Entity name="Subsystem" keyType="string">
            <Notes>A _subsystem_ is a collection of roles that work together in a cell. Identification of subsystems
            is an important tool for recognizing parallel genetic features in different organisms. See also
            [[Subsystems Approach]] and [[Subsystem]].</Notes>
            <Fields>
                <Field name="curator" type="string">
                    <Notes>Name of the person currently in charge of the subsystem.</Notes>
                </Field>
                <Field name="notes" type="text">
                    <Notes>Descriptive notes about the subsystem.</Notes>
                </Field>
                <Field name="description" type="text">
                    <Notes>Description of the subsystem's function.</Notes>
                </Field>
                <Field name="classification" type="string" relation="SubsystemClass">
                    <Notes>Classification string, colon-delimited. This string organizes the
                    subsystems into a hierarchy.</Notes>
                </Field>
            </Fields>
        </Entity>
        <Entity name="RoleSubset" keyType="string">
            <Notes>A _role subset_ is a named collection of roles in a particular subsystem. The
            subset names are generally very short, non-unique strings. The ID of the parent
            subsystem is prefixed to the subset ID in order to make it unique.</Notes>
        </Entity>
        <Entity name="GenomeSubset" keyType="string">
            <Notes>A _genome subset_ is a named collection of genomes that participate
            in a particular subsystem. The subset names are generally very short, non-unique
            strings. The ID of the parent subsystem is prefixed to the subset ID in order
            to make it unique.</Notes>
        </Entity>
        <Entity name="SSCell" keyType="hash-string">
            <Notes>Part of the process of [[SubsystemsApproach][subsystem annotation]] of [[features]]
            is creating a spreadsheet of genomes and roles to which features are assigned. A _spreadsheet
            cell_ represents one of the positions on the spreadsheet.</Notes>
        </Entity>
        <Entity name="Property" keyType="int">
            <Notes>A _property_ is a type of assertion that could be made about the properties of
            a particular feature. Each property instance is a key/value pair and can be associated
            with many different features. Conversely, a feature can be associated with many key/value
            pairs, even some that notionally contradict each other. For example, there can be evidence
            that a feature is essential to the organism's survival and evidence that it is superfluous.</Notes>
            <Fields>
                <Field name="property-name" type="name-string">
                    <Notes>Name of this property.</Notes>
                </Field>
                <Field name="property-value" type="string">
                    <Notes>Value associated with this property. For each property
                    name, there must be a property record for all of its possible
                    values.</Notes>
                </Field>
            </Fields>
            <Indexes>
                <Index>
                    <Notes>This index enables the application to find all values for a specified property
                    name, or any given name/value pair.</Notes>
                    <IndexFields>
                        <IndexField name="property-name" order="ascending" />
                        <IndexField name="property-value" order="ascending" />
                    </IndexFields>
                </Index>
            </Indexes>
        </Entity>
        <Entity name="Diagram" keyType="name-string">
            <Notes>A functional diagram describes a network of chemical reactions, often comprising a single
            subsystem. A diagram is identified by a short name and contains a longer descriptive name.
            The actual diagram shows which functional roles guide the reactions along with the inputs
            and outputs; the database, however, only indicates which roles belong to a particular
            diagram's map.</Notes>
            <Fields>
                <Field name="name" type="text">
                    <Notes>Descriptive name of this diagram.</Notes>
                </Field>
            </Fields>
        </Entity>
        <Entity name="ExternalAliasOrg" keyType="name-string">
            <Notes>An external alias is a feature name for a functional assignment that is not a
            FIG ID. Functional assignments for external aliases are kept in a separate section of
            the database. This table contains a description of the relevant organism for an
            external alias functional assignment.</Notes>
            <Fields>
                <Field name="org" type="text">
                    <Notes>Descriptive name of the target organism for this external alias.</Notes>
                </Field>
            </Fields>
        </Entity>
        <Entity name="ExternalAliasFunc" keyType="name-string">
            <Notes>An external alias is a feature name for a functional assignment that is not a
            FIG ID. Functional assignments for external aliases are kept in a separate section of
            the database. This table contains the functional role for the external alias functional
            assignment.</Notes>
            <Fields>
                <Field name="func" type="text">
                    <Notes>Functional role for this external alias.</Notes>
                </Field>
            </Fields>
        </Entity>
        <Entity name="Family" keyType="id-string">
            <Notes>A _family_ (also called a [[FigFam]]) is a group of homologous features believed to have
            the same function. Families provide a mechanism for verifying the accuracy of functional assignments
            and are also used in [[Rapid Annotation]] and in determining phylogenetic trees.</Notes>
            <Fields>
                <Field name="function" type="text">
                    <Notes>The functional assignment expected for all PEGs in this family.</Notes>
                </Field>
                <Field name="size" type="int">
                    <Notes>The number of proteins in this family. This may be larger than the
                    number of PEGs included in the family, since the family may also contain external
                    IDs.</Notes>
                </Field>
            </Fields>
        </Entity>
        <Entity name="PDB" keyType="id-string">
            <Notes>A PDB is a protein data bank entry containing information that can be used
            to determine the shape of the protein and the energies required to dock with it.
            The ID is the four-character name used on the [[http://www.rcsb.org PDB web site]].</Notes>
            <Fields>
                <Field name="docking-count" type="int">
                    <Notes>The number of ligands that have been docked against this PDB.</Notes>
                </Field>
            </Fields>
            <Indexes>
                <Index>
                    <IndexFields>
                        <IndexField name="docking-count" order="descending" />
                        <IndexField name="id" order="ascending" />
                    </IndexFields>
                </Index>
            </Indexes>
        </Entity>
        <Entity name="Ligand" keyType="id-string">
            <Notes>A Ligand is a chemical of interest in computing docking energies against a PDB.
            The ID of the ligand is an 8-digit ID number in the [[http://zinc.docking.org ZINC database]].</Notes>
            <Fields>
                <Field name="name" type="long-string">
                    <Notes>Chemical name of this ligand.</Notes>
                </Field>
            </Fields>
        </Entity>
    </Entities>
    <Relationships>
        <Relationship name="IsPresentOnProteinOf" from="CDD" to="Feature" arity="MM">
            <Notes>This relationship connects a feature to its CDD protein domains. The
            match score is included as intersection data.</Notes>
            <Fields>
                <Field name="score" type="float">
                    <Notes>This is the match score between the feature and the CDD. A
                    lower score is a better match.</Notes>
                </Field>
            </Fields>
            <FromIndex>
                <IndexFields>
                    <IndexField name="score" order="ascending" />
                </IndexFields>
            </FromIndex>
        </Relationship>
        <Relationship name="IsIdentifiedByCAS" from="Compound" to="CompoundCAS" arity="MM">
            <Notes>Relates a compound's CAS ID to the compound itself. Every CAS ID is
            associated with a compound, and some are associated with two compounds, but not
            all compounds have CAS IDs.</Notes>
        </Relationship>
        <Relationship name="IsIdentifiedByEC" from="Role" to="RoleEC" arity="MM">
            <Notes>Relates a role to its EC number. Every EC number is associated with a
            role, but not all roles have EC numbers.</Notes>
        </Relationship>
        <Relationship name="IsAliasOf" from="FeatureAlias" to="Feature" arity="MM">
            <Notes>Connects an alias to the feature it represents. Every alias connects
            to at least one feature, and a feature connects to many aliases.</Notes>
        </Relationship>
        <Relationship name="HasCompoundName" from="Compound" to="CompoundName" arity="MM">
            <Notes>Connects a compound to its names. A compound generally has several
            names.</Notes>
            <Fields>
                <Field name="priority" type="int">
                    <Notes>Priority of this name, with 1 being the highest priority, 2
                    the next highest, and so forth.</Notes>
                </Field>
            </Fields>
            <FromIndex>
                <Notes>This index enables the application to view the names of a compound
                in priority order.</Notes>
                <IndexFields>
                    <IndexField name="priority" order="ascending" />
                </IndexFields>
            </FromIndex>
        </Relationship>
        <Relationship name="IsProteinForFeature" from="PDB" to="Feature" arity="MM">
            <Notes>Relates a PDB to features that produce highly similar proteins.</Notes>
            <Fields>
                <Field name="score" type="float">
                    <Notes>Similarity score for the comparison between the feature and
                    the PDB protein. A lower score indicates a better match.</Notes>
                </Field>
                <Field name="start-location" type="int">
                    <Notes>Starting location within the feature of the matching region.</Notes>
                </Field>
                <Field name="end-location" type="int">
                    <Notes>Ending location within the feature of the matching region.</Notes>
                </Field>
            </Fields>
            <ToIndex>
                <Notes>This index enables the application to view the PDBs of a
                feature in order from the closest match to the furthest.</Notes>
                <IndexFields>
                    <IndexField name="score" order="ascending" />
                </IndexFields>
            </ToIndex>
            <FromIndex>
                <Notes>This index enables the application to view the features of
                a PDB in order from the closest match to the furthest.</Notes>
                <IndexFields>
                    <IndexField name="score" order="ascending" />
                </IndexFields>
            </FromIndex>
        </Relationship>
        <Relationship name="DocksWith" from="PDB" to="Ligand" arity="MM">
            <Notes>Indicates that a [[docking result]] exists between a PDB and a ligand. The
            docking result describes the energy required for the ligand to dock with
            the protein described by the PDB. A lower energy indicates the ligand has a
            good chance of disabling the protein. At the current time, only the best
            docking results are kept.</Notes>
            <Fields>
                <Field name="reason" type="id-string">
                    <Notes>Indication of the reason for determining the docking result.
                    A value of =Random= indicates the docking was attempted as a part
                    of a random survey used to determine the docking characteristics of the
                    PDB. A value of =Rich= indicates the docking was attempted because
                    a low-energy docking result was predicted for the ligand with respect
                    to the PDB.</Notes>
                </Field>
                <Field name="tool" type="id-string">
                    <Notes>Name of the tool used to produce the docking result.</Notes>
                </Field>
                <Field name="total-energy" type="float">
                    <Notes>Total energy required for the ligand to dock with the PDB
                    protein, in kcal/mol. A negative value means energy is released.</Notes>
                </Field>
                <Field name="vanderwalls-energy" type="float">
                    <Notes>Docking energy in kcal/mol that results from the geometric fit
                    (Van der Waals force) between the PDB and the ligand.</Notes>
                </Field>
                <Field name="electrostatic-energy" type="float">
                    <Notes>Docking energy in kcal/mol that results from the movement of
                    electrons (electrostatic force) between the PDB and the ligand.</Notes>
                </Field>
            </Fields>
            <FromIndex>
                <Notes>This index enables the application to view a PDB's docking results from
                the lowest energy (best docking) to highest energy (worst docking).</Notes>
                <IndexFields>
                    <IndexField name="total-energy" order="ascending" />
                </IndexFields>
            </FromIndex>
            <ToIndex>
                <Notes>This index enables the application to view a ligand's docking results from
                the lowest energy (best docking) to highest energy (worst docking).</Notes>
                <IndexFields>
                    <IndexField name="total-energy" order="ascending" />
                </IndexFields>
            </ToIndex>
        </Relationship>
        <Relationship name="IsFamilyForFeature" from="Family" to="Feature" arity="MM">
            <Notes>This relationship connects a protein family to all of its PEGs and connects
            each PEG to all of its protein families.</Notes>
        </Relationship>
        <Relationship name="IsSynonymGroupFor" from="SynonymGroup" to="Feature" arity="MM">
            <Notes>This relationship connects a synonym group to the features that make it
            up.</Notes>
        </Relationship>
        <Relationship name="HasFeature" from="Genome" to="Feature" arity="1M">
            <Notes>This relationship connects a genome to all of its features. This
            relationship is redundant in a sense, because the genome ID is part
            of the feature ID; however, it makes the creation of certain queries more
            convenient because you can drag in filtering information for a feature's
            genome.</Notes>
            <Fields>
                <Field name="type" type="key-string">
                    <Notes>Feature type (e.g., peg, rna).</Notes>
                </Field>
            </Fields>
            <FromIndex>
                <Notes>This index enables the application to view the features of a
                Genome sorted by type.</Notes>
                <IndexFields>
                    <IndexField name="type" order="ascending" />
                </IndexFields>
            </FromIndex>
        </Relationship>
        <Relationship name="HasContig" from="Genome" to="Contig" arity="1M">
            <Notes>This relationship connects a genome to the contigs that contain the actual genetic
            information.</Notes>
        </Relationship>
        <Relationship name="ComesFrom" from="Genome" to="Source" arity="MM">
            <Notes>This relationship connects a genome to the sources that mapped it. A genome can
            come from a single source or from a cooperation among multiple sources.</Notes>
        </Relationship>
        <Relationship name="IsMadeUpOf" from="Contig" to="Sequence" arity="1M">
            <Notes>A contig is stored in the database as an ordered set of sequences. By splitting the
            contig into sequences, we get a performance boost from only needing to keep small portions
            of a contig in memory at any one time. This relationship connects the contig to its
            constituent sequences.</Notes>
            <Fields>
                <Field name="len" type="int">
                    <Notes>Length of the sequence.</Notes>
                </Field>
                <Field name="start-position" type="int">
                    <Notes>Index (1-based) of the point in the contig where this
                    sequence starts.</Notes>
                </Field>
            </Fields>
            <FromIndex>
                <Notes>This index enables the application to find all of the sequences in
                a contig in order, and makes it easier to find a particular residue section.</Notes>
                <IndexFields>
                    <IndexField name="start-position" order="ascending" />
                    <IndexField name="len" order="ascending" />
                </IndexFields>
            </FromIndex>
        </Relationship>
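        <!--
        Illustrative sketch only, not part of the schema: one way an application might use
        the start-position and len fields of IsMadeUpOf to find the sequence that holds a
        given residue. The row tuples and the find_sequence helper are hypothetical
        stand-ins for rows read in index order.

```python
from bisect import bisect_right
from typing import List, Optional, Tuple

# Hypothetical stand-in for the IsMadeUpOf rows of one contig:
# (start_position, length, sequence_id), already sorted by start_position.
Row = Tuple[int, int, str]


def find_sequence(rows: List[Row], residue: int) -> Optional[str]:
    """Return the ID of the sequence holding the 1-based residue, or None."""
    starts = [start for start, _, _ in rows]
    # Index of the last sequence that starts at or before the residue.
    i = bisect_right(starts, residue) - 1
    if i < 0:
        return None
    start, length, seq_id = rows[i]
    # The sequence covers positions start through start + length - 1.
    return seq_id if residue < start + length else None
```

        Because the rows arrive sorted by start-position, a binary search is enough to
        locate the section, and only that one sequence needs to be loaded into memory.
        -->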
        <Relationship name="IsTargetOfAnnotation" from="Feature" to="Annotation" arity="1M">
            <Notes>This relationship connects a feature to its annotations.</Notes>
        </Relationship>
        <Relationship name="MadeAnnotation" from="SproutUser" to="Annotation" arity="1M">
            <Notes>This relationship connects an annotation to the user who made it.</Notes>
        </Relationship>
        <Relationship name="ParticipatesIn" from="Genome" to="Subsystem" arity="MM">
            <Notes>This relationship connects a subsystem to the genomes that use
            it. If the subsystem has been curated for the genome, then the subsystem's roles will also be
            connected to the genome features through the *SSCell* object.</Notes>
            <Fields>
                <Field name="variant-code" type="key-string">
                    <Notes>Code indicating the subsystem variant to which this
                    genome belongs. Each subsystem can have multiple variants. A variant
                    code of =-1= indicates that the genome does not have a functional
                    variant of the subsystem. A variant code of =0= indicates that
                    the genome's participation is considered uncertain.</Notes>
                </Field>
            </Fields>
            <ToIndex>
                <Notes>This index enables the application to find all of the genomes using
                a subsystem in order by variant code, which is how we wish to display them
                in the spreadsheets.</Notes>
                <IndexFields>
                    <IndexField name="variant-code" order="ascending" />
                </IndexFields>
            </ToIndex>
        </Relationship>
        <Relationship name="OccursInSubsystem" from="Role" to="Subsystem" arity="MM">
            <Notes>This relationship connects roles to the subsystems that implement them.</Notes>
            <Fields>
                <Field name="abbr" type="name-string">
                    <Notes>Abbreviated name for the role, generally non-unique, but useful
                    in column headings for HTML tables.</Notes>
                </Field>
                <Field name="column-number" type="int">
                    <Notes>Column number for this role in the specified subsystem's
                    spreadsheet.</Notes>
                </Field>
            </Fields>
            <ToIndex>
                <Notes>This index enables the application to see the subsystem roles
                in column order. The ordering of the roles is usually significant,
                so it is important to preserve it.</Notes>
                <IndexFields>
                    <IndexField name="column-number" order="ascending" />
                </IndexFields>
            </ToIndex>
        </Relationship>
        <Relationship name="IsGenomeOf" from="Genome" to="SSCell" arity="1M">
            <Notes>This relationship connects a subsystem's spreadsheet cell to the
            genome for the spreadsheet row.</Notes>
        </Relationship>
        <Relationship name="IsRoleOf" from="Role" to="SSCell" arity="1M">
            <Notes>This relationship connects a subsystem's spreadsheet cell to the
            role for the spreadsheet column.</Notes>
        </Relationship>
        <Relationship name="ContainsFeature" from="SSCell" to="Feature" arity="MM">
            <Notes>This relationship connects a subsystem's spreadsheet cell to the
            features assigned to it.</Notes>
            <Fields>
                <Field name="cluster-number" type="int">
                    <Notes>ID of this feature's cluster. Clusters represent families of
                    related proteins participating in a subsystem.</Notes>
                </Field>
            </Fields>
        </Relationship>
        <Relationship name="IsAComponentOf" from="Compound" to="Reaction" arity="MM">
            <Notes>This relationship connects a reaction to the compounds that participate
            in it.</Notes>
            <Fields>
                <Field name="product" type="boolean">
                    <Notes>TRUE if the compound is a product of the reaction, FALSE if
                    it is a substrate. When a reaction is written on paper in
                    chemical notation, the substrates are left of the arrow and the
                    products are to the right. Sorting on this field will cause
                    the substrates to appear first, followed by the products. If the
                    reaction is reversible, then the notion of substrates and products
                    is not as intuitive; however, a value here of FALSE still puts the
                    compound left of the arrow and a value of TRUE still puts it to the
                    right.</Notes>
                </Field>
                <Field name="stoichiometry" type="key-string">
                    <Notes>Number of molecules of the compound that participate in a
                    single instance of the reaction. For example, if a reaction
                    produces two water molecules, the stoichiometry of water for the
                    reaction would be two. When a reaction is written on paper in
                    chemical notation, the stoichiometry is the number next to the
                    chemical formula of the compound.</Notes>
                </Field>
                <Field name="main" type="boolean">
                    <Notes>TRUE if this compound is one of the main participants in
                    the reaction, else FALSE. It is permissible for none of the
                    compounds in the reaction to be considered main, in which
                    case this value would be FALSE for all of the relevant
                    compounds.</Notes>
                </Field>
                <Field name="loc" type="key-string">
                    <Notes>An optional character string that indicates the relative
                    position of this compound in the reaction's chemical formula. The
                    location affects the order in which the compounds are presented when we
                    cross the relationship from the reaction side. The product/substrate flag
                    comes first, then the value of this field, then the main flag.
                    The default value is an empty string; however, the empty string
                    sorts first, so if this field is used, it should probably be
                    used for every compound in the reaction.</Notes>
                </Field>
                <Field name="discriminator" type="int">
                    <Notes>A unique ID for this record. The discriminator does not
                    provide any useful data, but it prevents identical records from
                    being collapsed by the SELECT DISTINCT command used by ERDB to
                    retrieve data.</Notes>
                </Field>
            </Fields>
            <ToIndex>
                <Notes>This index presents the compounds in the reaction in the
                order they should be displayed when writing it in chemical notation.
                All the substrates appear before all the products. Within each group,
                the compounds are ordered by location, and the main compounds appear
                first among compounds that share a location.</Notes>
                <IndexFields>
                    <IndexField name="product" order="ascending" />
                    <IndexField name="loc" order="ascending" />
                    <IndexField name="main" order="descending" />
                </IndexFields>
            </ToIndex>
        </Relationship>
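        <!--
        Illustrative sketch only, not part of the schema: the ordering defined by the
        ToIndex above (product ascending, loc ascending, main descending) expressed as a
        sort key. The Component record and the render helper are hypothetical, and the
        stoichiometry field is left out to keep the example short.

```python
from typing import List, NamedTuple


class Component(NamedTuple):
    compound: str
    product: bool  # FALSE for a substrate, TRUE for a product
    loc: str       # optional position string; empty string sorts first
    main: bool


def display_order(components: List[Component]) -> List[Component]:
    # product ascending (substrates first), loc ascending,
    # main descending (main compounds first, hence "not c.main").
    return sorted(components, key=lambda c: (c.product, c.loc, not c.main))


def render(components: List[Component]) -> str:
    ordered = display_order(components)
    left = " + ".join(c.compound for c in ordered if not c.product)
    right = " + ".join(c.compound for c in ordered if c.product)
    return f"{left} => {right}"
```

        With every loc left empty, the main flag alone decides the order inside the
        substrate group and inside the product group.
        -->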
        <Relationship name="IsLocatedIn" from="Feature" to="Contig" arity="MM">
            <Notes>This relationship connects a feature to the contig segments that work together
            to effect it. The segments are numbered sequentially starting from 1. The database is
            required to place an upper limit on the length of each segment. If a segment is longer
            than the maximum, it can be broken into smaller segments. The upper limit enables applications
            to locate all features that contain a specific residue. For example, if the upper limit
            is 100 and we are looking for a feature that contains residue 234 of contig *ABC*, we
            can look for features with a begin point between 135 and 333. The results can then be
            filtered by direction and length of the segment.</Notes>
            <Fields>
                <Field name="locN" type="int">
                    <Notes>Sequence number of this segment.</Notes>
                </Field>
                <Field name="beg" type="int">
                    <Notes>Index (1-based) of the first residue in the contig that
                    belongs to the segment.</Notes>
                </Field>
                <Field name="len" type="int">
                    <Notes>Number of residues in the segment. A length of 0 identifies
                    a specific point between residues. This is the point before the residue if the direction
                    is forward and the point after the residue if the direction is backward.</Notes>
                </Field>
                <Field name="dir" type="char">
                    <Notes>Direction of the segment: =+= if it is forward and
                    =-= if it is backward.</Notes>
                </Field>
            </Fields>
            <FromIndex>
                <Notes>This index allows the application to find all the segments of a feature in
                the proper order.</Notes>
                <IndexFields>
                    <IndexField name="locN" order="ascending" />
                </IndexFields>
            </FromIndex>
            <ToIndex>
                <Notes>This index is the one used by applications to find all the feature
                segments that contain a specific residue.</Notes>
                <IndexFields>
                    <IndexField name="beg" order="ascending" />
                </IndexFields>
            </ToIndex>
        </Relationship>
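        <!--
        Illustrative sketch only, not part of the schema: the residue search described in
        the IsLocatedIn notes. It assumes that a backward segment's beg is its
        highest-numbered residue and that the segment extends toward lower positions,
        which is what the 135 to 333 window in the notes implies. The Segment record and
        all function names are hypothetical.

```python
from typing import List, NamedTuple, Tuple


class Segment(NamedTuple):
    feature: str
    beg: int        # 1-based index of the segment's first residue
    length: int     # number of residues; 0 marks a point between residues
    direction: str  # "+" for forward, "-" for backward


def candidate_window(residue: int, max_len: int) -> Tuple[int, int]:
    """Range of beg values that could belong to a segment covering residue."""
    return residue - max_len + 1, residue + max_len - 1


def covers(seg: Segment, residue: int) -> bool:
    if seg.length == 0:
        return False  # a zero-length segment contains no residue
    if seg.direction == "+":
        return seg.beg <= residue <= seg.beg + seg.length - 1
    # Assumption: backward segments run from beg down to beg - length + 1.
    return seg.beg - seg.length + 1 <= residue <= seg.beg


def find_features(segments: List[Segment], residue: int, max_len: int) -> List[str]:
    low, high = candidate_window(residue, max_len)
    # First pass mimics the index range scan on beg; second pass filters
    # by direction and length, as the notes describe.
    hits = [s for s in segments if low <= s.beg <= high and covers(s, residue)]
    return sorted({s.feature for s in hits})
```

        For an upper limit of 100 and residue 234, candidate_window gives the same
        135 to 333 range quoted in the notes.
        -->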
        <Relationship name="HasProperty" from="Feature" to="Property" arity="MM">
            <Notes>This relationship connects a feature to its known property values.
            The relationship contains text data that indicates the paper or organization
            that discovered evidence that the feature possesses the property. So, for
            example, if two papers presented evidence that a feature is essential,
            there would be one instance of this relationship for each.</Notes>
            <Fields>
                <Field name="evidence" type="text">
                    <Notes>URL or citation of the paper or
                    institution that reported evidence of the relevant feature possessing
                    the specified property value.</Notes>
                </Field>
            </Fields>
        </Relationship>
        <Relationship name="RoleOccursIn" from="Role" to="Diagram" arity="MM">
            <Notes>This relationship connects a role to the diagrams on which it
            appears. A role frequently identifies an enzyme, and can appear in many
            diagrams. A diagram generally contains many different roles.</Notes>
        </Relationship>
        <Relationship name="HasSSCell" from="Subsystem" to="SSCell" arity="1M">
            <Notes>This relationship connects a subsystem to the spreadsheet cells
            used to analyze and display it. The cells themselves can be thought of
            as a grid with Roles on one axis and Genomes on the other. The
            various features of the subsystem are then assigned to the cells.</Notes>
        </Relationship>
        <Relationship name="IsTrustedBy" from="SproutUser" to="SproutUser" arity="MM">
            <Notes>This relationship identifies the users trusted by each
            particular user. When viewing functional assignments, the
            assignment displayed is the most recent one by a user trusted
            by the current user. The current user implicitly trusts himself.
            If no trusted users are specified in the database, the user
            also implicitly trusts the user =FIG=.</Notes>
        </Relationship>
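        <!--
        Illustrative sketch only, not part of the schema: how the trust rules in the
        IsTrustedBy notes might select the functional assignment to display. It assumes
        that "no trusted users are specified" means the current user has no explicit
        trust rows. The data shapes and function names are hypothetical.

```python
from typing import Dict, List, Optional, Set, Tuple

# Hypothetical assignment record: (timestamp, user, functional_role).
Assignment = Tuple[int, str, str]


def trusted_users(user: str, trusts: Dict[str, Set[str]]) -> Set[str]:
    explicit = trusts.get(user, set())
    trusted = set(explicit) | {user}  # the user implicitly trusts themselves
    if not explicit:
        trusted.add("FIG")            # fallback when nothing is specified
    return trusted


def displayed_assignment(
    user: str, trusts: Dict[str, Set[str]], assignments: List[Assignment]
) -> Optional[str]:
    trusted = trusted_users(user, trusts)
    candidates = [a for a in assignments if a[1] in trusted]
    if not candidates:
        return None
    # The most recent assignment by a trusted user wins.
    return max(candidates, key=lambda a: a[0])[2]
```

        A user with explicit trust rows does not get the =FIG= fallback in this sketch,
        so assignments by =FIG= are shown to them only if they list =FIG= themselves.
        -->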
        <Relationship name="ConsistsOfRoles" from="RoleSubset" to="Role" arity="MM">
            <Notes>This relationship connects a role subset to the roles that it covers.
            A subset is, essentially, a named group of roles belonging to a specific
            subsystem, and this relationship effects that. Note that while a role
            may belong to many subsystems, a subset belongs to only one subsystem,
            and all roles in the subset must have that subsystem in common.</Notes>
        </Relationship>
        <Relationship name="ConsistsOfGenomes" from="GenomeSubset" to="Genome" arity="MM">
            <Notes>This relationship connects a subset to the genomes that it covers.
            A subset is, essentially, a named group of genomes participating in a specific
            subsystem, and this relationship effects that. Note that while a genome
            may belong to many subsystems, a subset belongs to only one subsystem,
            and all genomes in the subset must have that subsystem in common.</Notes>
        </Relationship>
        <Relationship name="HasRoleSubset" from="Subsystem" to="RoleSubset" arity="1M">
            <Notes>This relationship connects a subsystem to its constituent
            role subsets. Note that some roles in a subsystem may not belong to a
            subset, so the relationship between roles and subsystems cannot be
            derived from the relationships going through the subset.</Notes>
        </Relationship>
        <Relationship name="HasGenomeSubset" from="Subsystem" to="GenomeSubset" arity="1M">
            <Notes>This relationship connects a subsystem to its constituent
            genome subsets. Note that some genomes in a subsystem may not belong to a
            subset, so the relationship between genomes and subsystems cannot be
            derived from the relationships going through the subset.</Notes>
        </Relationship>
        <Relationship name="Catalyzes" from="Role" to="Reaction" arity="MM">
            <Notes>This relationship connects a role to the reactions it catalyzes.
            The purpose of a role is to create proteins that trigger certain
            chemical reactions. A single reaction can be triggered by many roles,
            and a role can trigger many reactions.</Notes>
        </Relationship>
        <Relationship name="HasRoleInSubsystem" from="Feature" to="Subsystem" arity="MM">
            <Notes>This relationship connects a feature to the subsystems in which it
            participates. This is technically redundant information, but it is used
            so often that it gets its own table for performance reasons.</Notes>
            <Fields>
                <Field name="genome" type="name-string">
                    <Notes>ID of the genome containing the feature.</Notes>
                </Field>
                <Field name="type" type="key-string">
                    <Notes>Feature type (e.g., peg, rna).</Notes>
                </Field>
            </Fields>
            <ToIndex>
                <Notes>This index enables the application to view the features of a
                subsystem sorted by genome and feature type.</Notes>
                <IndexFields>
                    <IndexField name="genome" order="ascending" />
                    <IndexField name="type" order="ascending" />
                </IndexFields>
            </ToIndex>
        </Relationship>
    </Relationships>
</Database>
