Diff of /Sprout/SproutDBD.xml

revision 1.9, Sun Aug 14 23:59:03 2005 UTC revision 1.54, Wed May 7 23:16:47 2008 UTC
# Line 1  Line 1 
1  <?xml version="1.0" encoding="utf-8" ?>  <?xml version="1.0" encoding="utf-8" ?>
2  <Database>  <Database>
3      <Title>Sprout Genome and Subsystem Database</Title>      <Title>Sprout Genome and Subsystem Database</Title>
4        <Notes>The Sprout database contains the genetic data for all complete organisms in the [[SeedEnvironment]].
5        The data that is not in Sprout-- attributes, similarities, couplings-- is stored on external
6        servers available to the Sprout software. The Sprout database is reloaded approximately once
7        per month. There is significant redundancy in the Sprout database because it has been
8        optimized for searching. In particular, the Feature table contains an extra copy of the
9        feature's functional role and a list of possible search terms.</Notes>
10      <Entities>      <Entities>
11          <Entity name="Genome" keyType="name-string">          <Entity name="Genome" keyType="name-string">
12              <Notes>A [i]genome[/i] contains the sequence data for a particular individual organism.</Notes>              <Notes>A [[Genome]] contains the sequence data for a particular individual organism.</Notes>
13              <Fields>              <Fields>
14                  <Field name="genus" type="name-string">                  <Field name="genus" type="name-string">
15                      <Notes>Genus of the relevant organism.</Notes>                      <Notes>Genus of the relevant organism.</Notes>
                     <DataGen pass="1">RandParam('streptococcus', 'staphyloccocus', 'felis', 'homo', 'ficticio', 'strangera', 'escherischia', 'carborunda')</DataGen>  
16                  </Field>                  </Field>
17                  <Field name="species" type="name-string">                  <Field name="species" type="name-string">
18                      <Notes>Species of the relevant organism.</Notes>                      <Notes>Species of the relevant organism.</Notes>
                     <DataGen pass="1">StringGen('PKVKVKVKVKV')</DataGen>  
19                  </Field>                  </Field>
20                  <Field name="unique-characterization" type="medium-string">                  <Field name="unique-characterization" type="medium-string">
21                      <Notes>The unique characterization identifies the particular organism instance from which the                      <Notes>The unique characterization identifies the particular organism instance from which the
22                      genome is taken. It is possible to have in the database more than one genome for a                      genome is taken. It is possible to have in the database more than one genome for a
23                      particular species, and every individual organism has variations in its DNA.</Notes>                      particular species, and every individual organism has variations in its DNA.</Notes>
24                      <DataGen>StringGen('PKVKVK999')</DataGen>                  </Field>
25                    <Field name="version" type="name-string">
26                        <Notes>version string for this genome, generally consisting of the genome ID followed
27                        by a period and a string of digits.</Notes>
28                  </Field>                  </Field>
29                  <Field name="access-code" type="key-string">                  <Field name="access-code" type="key-string">
30                      <Notes>The access code determines which users can look at the data relating to this genome.                      <Notes>The access code field is deprecated. Its function has been replaced by
31                      Each user is associated with a set of access codes. In order to view a genome, one of                      the account management system developed for the [[RapidAnnotationServer]].</Notes>
32                      the user's access codes must match this value.</Notes>                  </Field>
33                      <DataGen>RandParam('low','medium','high')</DataGen>                  <Field name="complete" type="boolean">
34                        <Notes>TRUE if the genome is complete, else FALSE</Notes>
35                    </Field>
36                    <Field name="dna-size" type="counter">
37                        <Notes>number of base pairs in the genome</Notes>
38                  </Field>                  </Field>
39                  <Field name="taxonomy" type="text">                  <Field name="taxonomy" type="text">
40                      <Notes>The taxonomy string contains the full taxonomy of the organism, while individual elements                      <Notes>The taxonomy string contains the full [[Wikipedia:taxonomy]] of the organism, while individual elements
41                      separated by semi-colons (and optional white space), starting with the domain and ending with                      separated by semi-colons (and optional white space), starting with the domain and ending with
42                      the disambiguated genus and species (which is the organism's scientific name plus an                      the disambiguated genus and species (which is the organism's scientific name plus an
43                      identifying string).</Notes>                      identifying string).</Notes>
                     <DataGen pass="2">join('; ', (RandParam('bacteria', 'archaea', 'eukaryote', 'virus', 'environmental'),  
                                                   ListGen('PKVKVKVK', 5), $this->{genus}, $this->{species}))</DataGen>  
44                  </Field>                  </Field>
45                  <Field name="group-name" type="name-string" relation="GenomeGroups">                  <Field name="primary-group" type="name-string">
46                      <Notes>The group identifies a special grouping of organisms that would be displayed on a particular                      <Notes>The primary NMPDR group for this organism. There is always exactly one NMPDR group
47                      page or of particular interest to a research group or web site. A single genome can belong to multiple                      per organism (either based on the organism name or the default value =Supporting=). In general,
48                      such groups or none at all.</Notes>                      more data is kept on organisms in NMPDR groups than on supporting organisms.</Notes>
49                  </Field>                  </Field>
50                  <Field name="complete" type="boolean">                  <Field name="contigs" type="int">
51                      <Notes>This field is TRUE if the genome is believed to be complete, else                      <Notes>Number of contigs for this organism.</Notes>
52                      FALSE.</Notes>                  </Field>
53                    <Field name="pegs" type="int">
54                        <Notes>Number of [[protein encoding genes]] for this organism</Notes>
55                    </Field>
56                    <Field name="rnas" type="int">
57                        <Notes>Number of RNA features found for this organism.</Notes>
58                  </Field>                  </Field>
59              </Fields>              </Fields>
60              <Indexes>              <Indexes>
# Line 55  Line 69 
69                          <IndexField name="unique-characterization" order="ascending" />                          <IndexField name="unique-characterization" order="ascending" />
70                      </IndexFields>                      </IndexFields>
71                  </Index>                  </Index>
72                  <Index Unique="false">                  <Index>
73                        <Notes>This index allows the applications to find all genomes associated with
74                        a specific primary (NMPDR) group.</Notes>
75                        <IndexFields>
76                            <IndexField name="primary-group" order="ascending" />
77                            <IndexField name="genus" order="ascending" />
78                            <IndexField name="species" order="ascending" />
79                            <IndexField name="unique-characterization" order="ascending" />
80                        </IndexFields>
81                    </Index>
82                    <Index>
83                      <Notes>This index allows the applications to find all genomes for a particular                      <Notes>This index allows the applications to find all genomes for a particular
84                      species.</Notes>                      species.</Notes>
85                      <IndexFields>                      <IndexFields>
# Line 66  Line 90 
90                  </Index>                  </Index>
91              </Indexes>              </Indexes>
92          </Entity>          </Entity>
93            <Entity name="CDD" keyType="key-string">
94                <Notes>A CDD is a protein domain designator. It represents the shape of a molecular unit
95                on a feature's protein. The ID is six-digit string assigned by the public
96                [[http://www.ncbi.nlm.nih.gov/Structure/cdd/cdd.shtml Conserved Domain Database]]. A CDD
97                can occur on multiple features and a feature generally has multiple CDDs.</Notes>
98            </Entity>
99          <Entity name="Source" keyType="medium-string">          <Entity name="Source" keyType="medium-string">
100              <Notes>A [i]source[/i] describes a place from which genome data was taken. This can be an organization              <Notes>A _source_ describes a place from which genome data was taken. This can be an organization
101              or a paper citation.</Notes>              or a paper citation.</Notes>
102              <Fields>              <Fields>
103                  <Field name="URL" type="string" relation="SourceURL">                  <Field name="URL" type="string" relation="SourceURL">
104                      <Notes>URL the paper cited or of the organization's web site. This field optional.</Notes>                      <Notes>URL the paper cited or of the organization's web site. This field optional.</Notes>
                     <DataGen>"http://www.conservativecat.com/Ferdy/TestTarget.php?Source=" . $this->{id}</DataGen>  
105                  </Field>                  </Field>
106                  <Field name="description" type="text">                  <Field name="description" type="text">
107                      <Notes>Description the source. The description can be a street address or a citation.</Notes>                      <Notes>Description of the source. The description can be a street address or a citation.</Notes>
                     <DataGen>$this->{id} . ': ' . StringGen(IntGen(50,200))</DataGen>  
108                  </Field>                  </Field>
109              </Fields>              </Fields>
110          </Entity>          </Entity>
111          <Entity name="Contig" keyType="name-string">          <Entity name="Contig" keyType="name-string">
112              <Notes>A [i]contig[/i] is a contiguous run of residues. The contig's ID consists of the              <Notes>A _contig_ is a contiguous run of residues. The contig's ID consists of the
113              genome ID followed by a name that identifies which contig this is for the parent genome. As              genome ID followed by a name that identifies which contig this is for the parent genome. As
114              is the case with all keys in this database, the individual components are separated by a              is the case with all keys in this database, the individual components are separated by a
115              period.              period. A contig can contain over a million residues. For performance reasons, therefore,
116              [p]A contig can contain over a million residues. For performance reasons, therefore,              the contig is split into multiple pieces called _sequences_. The sequences
             the contig is split into multiple pieces called [i]sequences[/i]. The sequences  
117              contain the characters that represent the residues as well as data on the quality of              contain the characters that represent the residues as well as data on the quality of
118              the residue identification.</Notes>              the residue identification.</Notes>
119          </Entity>          </Entity>
120          <Entity name="Sequence" keyType="name-string">          <Entity name="Sequence" keyType="name-string">
121              <Notes>A [i]sequence[/i] is a continuous piece of a [i]contig[/i]. Contigs are split into              <Notes>A _sequence_ is a continuous piece of a contig. Contigs are split into
122              sequences so that we don't have to have the entire contig in memory when we are              sequences so that we don't have to have the entire contig in memory when we are
123              manipulating it. The key of the sequence is the contig ID followed by the index of              manipulating it. The key of the sequence is the contig ID followed by the index of
124              the begin point.</Notes>              the begin point.</Notes>
125              <Fields>              <Fields>
126                  <Field name="sequence" type="text">                  <Field name="sequence" type="text">
127                      <Notes>String consisting of the residues. Each residue is described by a single                      <Notes>String consisting of the residues (base pairs). Each residue is described by a single
128                      character in the string.</Notes>                      character in the string.</Notes>
                     <DataGen>RandChars("ACGT", IntGen(100,400))</DataGen>  
129                  </Field>                  </Field>
130                  <Field name="quality-vector" type="text">                  <Field name="quality-vector" type="text">
131                      <Notes>String describing the quality data for each base pair. Individual values will                      <Notes>String describing the quality data for each base pair. Individual values will
132                      be separated by periods. The value represents negative exponent of the probability                      be separated by periods. The value represents negative exponent of the probability
133                      of error. Thus, for example, a quality of 30 indicates the probability of error is                      of error. Thus, for example, a quality of 30 indicates the probability of error is
134                      10^-30. A higher quality number a better chance of a correct match. It is possible                      10^-30. A higher quality number indicates a better chance of a correct match. It is
135                      that the quality data is not known for a sequence. If that is the case, the quality                      possible that the quality data is not known for a sequence. If that is the case, the
136                      vector will contain the [b]unknown[/b].</Notes>                      quality vector will contain the string =unknown=.</Notes>
137                      <DataGen>unknown</DataGen>                  </Field>
138                </Fields>
139            </Entity>
140            <Entity name="Keyword" keyType="name-string">
141                <Notes>A _keyword_ is a word stem that can be used to search the feature table. This entity
142                indicates how many features correspond to each word.</Notes>
143                <Fields>
144                    <Field name="count" type="counter">
145                        <Notes>Number of features that can be found by searching for the specified
146                        keyword.</Notes>
147                  </Field>                  </Field>
148              </Fields>              </Fields>
149          </Entity>          </Entity>
150          <Entity name="Feature" keyType="name-string">          <Entity name="Feature" keyType="id-string">
151              <Notes>A [i]feature[/i] is a part of a genome that is of special interest. Features              <Notes>A _feature_ (sometimes also called a [[gene]]) is a part of a genome that is of special interest. Features
152              may be spread across multiple contigs of a genome, but never across more than              may be spread across multiple contigs of a genome, but never across more than
153              one genome. Features can be assigned to roles via spreadsheet cells,              one genome. Features can be assigned to roles via spreadsheet cells,
154              and are the targets of annotation.</Notes>              and are the targets of annotation. Each feature in the database has a unique [[FigId]].</Notes>
155              <Fields>              <Fields>
156                  <Field name="feature-type" type="string">                  <Field name="feature-type" type="id-string">
157                      <Notes>Code indicating the type of this feature.</Notes>                      <Notes>Code indicating the type of this feature. Among the codes currently
158                      <DataGen>RandParam('peg','rna')</DataGen>                      supported are =peg= for a [[protein encoding gene]], =bs= for a
159                  </Field>                      binding site, =opr= for an operon, and so forth.</Notes>
                 <Field name="alias" type="name-string" relation="FeatureAlias">  
                     <Notes>Alternative name for this feature. A feature can have many aliases.</Notes>  
                     <DataGen testCount="3">StringGen('Pgi|99999', 'Puni|XXXXXX', 'PAAAAAA999')</DataGen>  
160                  </Field>                  </Field>
161                  <Field name="translation" type="text" relation="FeatureTranslation">                  <Field name="translation" type="text" relation="FeatureTranslation">
162                      <Notes>[i](optional)[/i] A translation of this feature's residues into character                      <Notes>_(optional)_ A translation of this feature's residues into character
163                      codes, formed by concatenating the pieces of the feature together. For a                      codes, formed by concatenating the pieces of the feature together. For a
164                      protein encoding group, this is the protein characters. For other types                      [[protein encoding gene]], the translation contains protein characters. For other types
165                      it is the DNA characters.</Notes>                      it contains DNA characters.</Notes>
                     <DataGen testCount="0"></DataGen>  
166                  </Field>                  </Field>
167                  <Field name="upstream-sequence" type="text" relation="FeatureUpstream">                  <Field name="upstream-sequence" type="text" relation="FeatureUpstream">
168                      <Notes>Upstream sequence the feature. This includes residues preceding the feature as well as some of                      <Notes>Upstream sequence for the feature. This includes residues preceding the feature as
169                      the feature's initial residues.</Notes>                      well as some of the feature's initial residues.</Notes>
170                      <DataGen testCount="0"></DataGen>                  </Field>
171                    <Field name="assignment" type="text">
172                        <Notes>Default functional assignment for this feature.</Notes>
173                  </Field>                  </Field>
174                  <Field name="active" type="boolean">                  <Field name="active" type="boolean">
175                      <Notes>TRUE if this feature is still considered valid, if it has been logically deleted.</Notes>                      <Notes>(This field is deprecated.) TRUE if this feature is still considered valid,
176                      <DataGen>1</DataGen>                      FALSE if it has been logically deleted.</Notes>
177                    </Field>
178                    <Field name="assignment-maker" type="name-string">
179                        <Notes>name of the user who made the functional assignment</Notes>
180                    </Field>
181                    <Field name="assignment-quality" type="char">
182                        <Notes>quality of the functional assignment, usually a space, but may be W (indicating weak) or X
183                        (indicating experimental)</Notes>
184                    </Field>
185                    <Field name="keywords" type="text" searchable="1">
186                        <Notes>This is a list of search keywords for the feature. It includes the
187                        functional assignment, subsystem roles, and special properties.</Notes>
188                  </Field>                  </Field>
189                  <Field name="link" type="text" relation="FeatureLink">                  <Field name="link" type="text" relation="FeatureLink">
190                      <Notes>Web hyperlink for this feature. A feature have no hyperlinks or it can have many. The                      <Notes>Web hyperlink for this feature. A feature can have no hyperlinks or it can have many. The
191                      links are to other websites that have useful about the gene that the feature represents, and                      links are to other websites that have useful about the gene that the feature represents, and
192                      are coded as raw HTML, using [b]&lt;a href="[i]link[/i]"&gt;[i]text[/i]&lt;/a&gt;[/b] notation.</Notes>                      are coded as raw HTML, using &lt;a href="_link_"&gt;_text_&lt;/a&gt; notation.</Notes>
193                      <DataGen testCount="3">'http://www.conservativecat.com/Ferdy/TestTarget.php?Source=' . $this->{id} .                  </Field>
194                      "&amp;Number=" . IntGen(1,99)</DataGen>                  <Field name="conservation" type="float" relation="FeatureConservation">
195                        <Notes>_(optional)_ A number between 0 and 1 that indicates the degree to which this feature's DNA is
196                        conserved in related genomes. A value of 1 indicates perfect conservation. A value less
197                        than 1 is a reflection of the degree to which gap characters interfere in the alignment
198                        between the feature and its close relatives.</Notes>
199                    </Field>
200                    <Field name="essential" type="text" relation="FeatureEssential" special="property_search">
201                        <Notes>A value indicating the essentiality of the feature, coded as HTML. In most
202                        cases, this will be a word describing whether the essentiality is confirmed (essential)
203                        or potential (potential-essential), hyperlinked to the document from which the
204                        essentiality was curated. If a feature is not essential, this field will have no
205                        values; otherwise, it may have multiple values.</Notes>
206                    </Field>
207                    <Field name="virulent" type="text" relation="FeatureVirulent" special="property_search">
208                        <Notes>A value indicating the virulence of the feature, coded as HTML. In most
209                        cases, this will be a phrase or SA number hyperlinked to the document from which
210                        the virulence information was curated. If the feature is not virulent, this field
211                        will have no values; otherwise, it may have multiple values.</Notes>
212                    </Field>
213                    <Field name="cello" type="name-string">
214                        <Notes>The cello value specifies the expected location of the protein: cytoplasm,
215                        cell wall, inner membrane, and so forth.</Notes>
216                    </Field>
217                    <Field name="iedb" type="text" relation="FeatureIEDB" special="property_search">
218                        <Notes>A value indicating whether or not the feature can be found in the
219                        Immune Epitope Database. If the feature has not been matched to that database,
220                        this field will have no values. Otherwise, it will have an epitope name and/or
221                        sequence, hyperlinked to the database.</Notes>
222                    </Field>
223                    <Field name="location-string" type="text">
224                        <Notes>Location of the feature, expressed as a comma-delimited list of Sprout location
225                        strings. This gives us a fast mechanism for extracting the feature location. Otherwise,
226                        we have to painstakingly paste together the [[#IsLocatedIn]] records, which are themselves
227                        designed to help look for features in a particular region rather than to find the location
228                        of a feature.</Notes>
229                  </Field>                  </Field>
230              </Fields>              </Fields>
231              <Indexes>              <Indexes>
232                  <Index>                  <Index>
233                      <Notes>This index allows the user to find the feature corresponding to                      <Notes>This index allows us to locate a feature by its CELLO value.</Notes>
                     the specified alias name.</Notes>  
234                      <IndexFields>                      <IndexFields>
235                          <IndexField name="alias" order="ascending" />                          <IndexField name="cello" order="ascending" />
236                      </IndexFields>                      </IndexFields>
237                  </Index>                  </Index>
238              </Indexes>              </Indexes>
239          </Entity>          </Entity>
240          <Entity name="Role" keyType="string">          <Entity name="FeatureAlias" keyType="medium-string">
241              <Notes>A [i]role[/i] describes a biological function that may be fulfilled by a feature.              <Notes>Alternative names for features. A feature can have many aliases. In general,
242              One of the main goals of the database is to record the roles of the various features.</Notes>              each alias corresponds to only one feature, but there are many exceptions to this rule.</Notes>
243            </Entity>
244            <Entity name="SproutUser" keyType="name-string">
245                <Notes>A _user_ is a person who can make annotations and view data in the database. The
246                user object is keyed on the user's login name.</Notes>
247              <Fields>              <Fields>
248                  <Field name="name" type="string" relation="RoleName">                  <Field name="description" type="string">
249                      <Notes>Expanded name of the role. This value is generally only available for roles                      <Notes>Full name or description of this user.</Notes>
250                      that are encoded as EC numbers.</Notes>                  </Field>
251                      <DataGen testCount="1">StringGen(IntGen(20,40)) . "(" . $this->{id} . ")"</DataGen>                  <Field name="access-code" type="key-string" relation="UserAccess">
252                        <Notes>This field is deprecated.</Notes>
253                  </Field>                  </Field>
254              </Fields>              </Fields>
255          </Entity>          </Entity>
256            <Entity name="SynonymGroup" keyType="id-string">
257                <Notes>A _synonym group_ represents a group of features. Features that represent substantially
258                identical proteins or DNA sequences are mapped to the same synonym group, and this information is
259                used to expand similarities.</Notes>
260            </Entity>
261            <Entity name="Role" keyType="string">
262                <Notes>A _role_ describes a biological function that may be fulfilled by a feature.
263                One of the main goals of the database is to record the roles of the various features.</Notes>
264            </Entity>
265            <Entity name="RoleEC" keyType="string">
266                <Notes>EC code for a role.</Notes>
267            </Entity>
268          <Entity name="Annotation" keyType="name-string">          <Entity name="Annotation" keyType="name-string">
269              <Notes>An [i]annotation[/i] contains supplementary information about a feature. Annotations              <Notes>An _annotation_ contains supplementary information about a feature. The most
270              are currently the only objects that may be inserted directly into the database. All other              important type of annotation is the assignment of a [[functional role]]; however,
271              information is loaded from data exported by the SEED.              other types of annotations are also possible.</Notes>
             [p]Each annotation is associated with a target [b]Feature[/b]. The key of the annotation  
             is the target feature ID followed by a timestamp.</Notes>  
272              <Fields>              <Fields>
273                  <Field name="time" type="date">                  <Field name="time" type="date">
274                      <Notes>Date and time of the annotation.</Notes>                      <Notes>Date and time of the annotation.</Notes>
# Line 185  Line 277 
277                      <Notes>Text of the annotation.</Notes>                      <Notes>Text of the annotation.</Notes>
278                  </Field>                  </Field>
279              </Fields>              </Fields>
280                <Indexes>
281                    <Index>
282                        <Notes>This index allows the user to find recent annotations.</Notes>
283                        <IndexFields>
284                            <IndexField name="time" order="descending" />
285                        </IndexFields>
286                    </Index>
287                </Indexes>
288            </Entity>
289            <Entity name="Reaction" keyType="key-string">
290                <Notes>A _reaction_ is a chemical process catalyzed by a protein. The reaction ID
291                is generally a small number preceded by a letter.</Notes>
292                <Fields>
293                    <Field name="url" type="string" relation="ReactionURL">
294                        <Notes>HTML string containing a link to a web location that describes the
295                        reaction. This field is optional.</Notes>
296                    </Field>
297                    <Field name="rev" type="boolean">
298                        <Notes>TRUE if this reaction is reversible, else FALSE</Notes>
299                    </Field>
300                </Fields>
301          </Entity>          </Entity>
302          <Entity name="Subsystem" keyType="string">          <Entity name="Compound" keyType="name-string">
303              <Notes>A [i]subsystem[/i] is a collection of roles that work together in a cell. Identification of subsystems              <Notes>A _compound_ is a chemical that participates in a reaction.
304              is an important tool for recognizing parallel genetic features in different organisms.</Notes>              All compounds have a unique ID and may also have one or more names.</Notes>
305                <Fields>
306                    <Field name="label" type="string">
307                        <Notes>Name used in reaction display strings. This is the same as the name
308                        possessing a priority of 1, but it is placed here to speed up the query
309                        used to create the display strings.</Notes>
310                    </Field>
311                </Fields>
312          </Entity>          </Entity>
313          <Entity name="SSCell" keyType="name-string">          <Entity name="CompoundName" keyType="string">
314              <Notes>Part of the process of locating and assigning features is creating a spreadsheet of              <Notes>A _compound name_ is a common name for the chemical represented by a
315              genomes and roles to which features are assigned. A [i]spreadsheet cell[/i] represents one              compound.</Notes>
             of the positions on the spreadsheet.</Notes>  
316          </Entity>          </Entity>
317          <Entity name="SproutUser" keyType="name-string">          <Entity name="CompoundCAS" keyType="name-string">
318              <Notes>A [i]user[/i] is a person who can make annotations and view data in the database. The              <Notes>This entity represents the [[http://www.cas.org/ Chemical Abstracts Service]] ID for a
319              user object is keyed on the user's login name.</Notes>              compound. Each Compound has at most one CAS ID.</Notes>
320            </Entity>
321            <Entity name="Subsystem" keyType="string">
322                <Notes>A _subsystem_ is a collection of roles that work together in a cell. Identification of subsystems
323                is an important tool for recognizing parallel genetic features in different organisms. See also
324                [[Subsystems Approach]] and [[Subsystem]].</Notes>
325              <Fields>              <Fields>
326                  <Field name="description" type="string">                  <Field name="curator" type="string">
327                      <Notes>Full name or description of this user.</Notes>                      <Notes>Name of the person currently in charge of the subsystem.</Notes>
328                  </Field>                  </Field>
329                  <Field name="access-code" type="key-string" relation="UserAccess">                  <Field name="notes" type="text">
330                      <Notes>Access code possessed by this                      <Notes>Descriptive notes about the subsystem.</Notes>
331                      user. A user can have many access codes; a genome is accessible to the user if its                  </Field>
332                      access code matches any one of the user's access codes.</Notes>                  <Field name="description" type="text">
333                      <DataGen testCount="2">RandParam('low', 'medium', 'high')</DataGen>                      <Notes>Description of the subsystem's function.</Notes>
334                    </Field>
335                    <Field name="classification" type="string" relation="SubsystemClass">
336                        <Notes>Classification string, colon-delimited. This string organizes the
337                        subsystems into a hierarchy.</Notes>
338                  </Field>                  </Field>
339              </Fields>              </Fields>
340          </Entity>          </Entity>
341            <Entity name="RoleSubset" keyType="string">
342                <Notes>A _role subset_ is a named collection of roles in a particular subsystem. The
343                subset names are generally very short, non-unique strings. The ID of the parent
344                subsystem is prefixed to the subset ID in order to make it unique.</Notes>
345            </Entity>
346            <Entity name="GenomeSubset" keyType="string">
347                <Notes>A _genome subset_ is a named collection of genomes that participate
348                in a particular subsystem. The subset names are generally very short, non-unique
349                strings. The ID of the parent subsystem is prefixed to the subset ID in order
350                to make it unique.</Notes>
351            </Entity>
352            <Entity name="SSCell" keyType="hash-string">
353                <Notes>Part of the process of [[SubsystemsApproach][subsystem annotation]] of [[features]]
354                is creating a spreadsheet of genomes and roles to which features are assigned. A _spreadsheet
355                cell_ represents one of the positions on the spreadsheet.</Notes>
356            </Entity>
357          <Entity name="Property" keyType="int">          <Entity name="Property" keyType="int">
358              <Notes>A [i]property[/i] is a type of assertion that could be made about the properties of              <Notes>A _property_ is a type of assertion that could be made about the properties of
359              a particular feature. Each property instance is a key/value pair and can be associated              a particular feature. Each property instance is a key/value pair and can be associated
360              with many different features. Conversely, a feature can be associated with many key/value              with many different features. Conversely, a feature can be associated with many key/value
361              pairs, even some that notionally contradict each other. For example, there can be evidence              pairs, even some that notionally contradict each other. For example, there can be evidence
# Line 238  Line 382 
382              </Indexes>              </Indexes>
383          </Entity>          </Entity>
384          <Entity name="Diagram" keyType="name-string">          <Entity name="Diagram" keyType="name-string">
385              <Notes>A functional diagram describes the chemical reactions, often comprising a single              <Notes>A functional diagram describes a network of chemical reactions, often comprising a single
386              subsystem. A diagram is identified by a short name and contains a longer descriptive name.              subsystem. A diagram is identified by a short name and contains a longer descriptive name.
387              The actual diagram shows which functional roles guide the reactions along with the inputs              The actual diagram shows which functional roles guide the reactions along with the inputs
388              and outputs; the database, however, only indicate which roles belong to a particular              and outputs; the database, however, only indicates which roles belong to a particular
389              map.</Notes>              diagram's map.</Notes>
390              <Fields>              <Fields>
391                  <Field name="name" type="text">                  <Field name="name" type="text">
392                      <Notes>Descriptive name of this diagram.</Notes>                      <Notes>Descriptive name of this diagram.</Notes>
# Line 271  Line 415 
415                      </Field>                      </Field>
416                  </Fields>                  </Fields>
417          </Entity>          </Entity>
418          <Entity name="Coupling" keyType="medium-string">          <Entity name="Family" keyType="id-string">
419              <Notes>A coupling is a relationship between two features. The features are              <Notes>A _family_ (also called a [[FigFam]]) is a group of homologous features believed to have
420              physically close on the contig, and there is evidence that they generally              the same function. Families provide a mechanism for verifying the accuracy of functional assignments
421              belong together. The key of this entity is formed by combining the coupled              and are also used in [[Rapid Annotation]] and in determining phylogenetic trees.</Notes>
422              feature IDs with a space.</Notes>              <Fields>
423              <Fields>                  <Field name="function" type="text">
424                  <Field name="score" type="int">                      <Notes>The functional assignment expected for all PEGs in this family.</Notes>
425                      <Notes>A number based on the set of PCHs (pairs of close homologs). A PCH                  </Field>
426                      indicates that two genes near each other on one genome are very similar to                  <Field name="size" type="int">
427                      genes near each other on another genome. The score only counts PCHs for which                      <Notes>The number of proteins in this family. This may be larger than the
428                      the genomes are very different. (In other words, we have a pairing that persists                      number of PEGs included in the family, since the family may also contain external
429                      between different organisms.) A higher score implies a stronger meaning to the                      IDs.</Notes>
430                      clustering.</Notes>                  </Field>
431                </Fields>
432            </Entity>
433            <Entity name="PDB" keyType="id-string">
434                <Notes>A PDB is a protein data bank entry containing information that can be used
435                to determine the shape of the protein and the energies required to dock with it.
436                The ID is the four-character name used on the [[http://www.rcsb.org PDB web site]].</Notes>
437                <Fields>
438                    <Field name="docking-count" type="int">
439                        <Notes>The number of ligands that have been docked against this PDB.</Notes>
440                  </Field>                  </Field>
441              </Fields>              </Fields>
442                <Indexes>
443                    <Index>
444                        <IndexFields>
445                            <IndexField name="docking-count" order="descending" />
446                            <IndexField name="id" order="ascending" />
447                        </IndexFields>
448                    </Index>
449                </Indexes>
450          </Entity>          </Entity>
451          <Entity name="PCH" keyType="string">          <Entity name="Ligand" keyType="id-string">
452              <Notes>A PCH (physically close homolog) connects a clustering (which is a              <Notes>A Ligand is a chemical of interest in computing docking energies against a PDB.
453              pair of physically close features on a contig) to a second pair of physically              The ID of the ligand is an 8-digit ID number in the [[http://zinc.docking.org ZINC database]].</Notes>
454              close features that are similar to the first. Essentially, the PCH is a              <Fields>
455              relationship between two clusterings in which the first clustering's features                  <Field name="name" type="long-string">
456              are similar to the second clustering's features. The simplest model for                      <Notes>Chemical name of this ligand.</Notes>
             this would be to simply relate clusterings to each other; however, not all  
             physically close pairs qualify as clusterings, so we relate a clustering to  
             a pair of features. The key is the clustering key followed by the IDs  
             of the features in the second pair.</Notes>  
             <Fields>  
                 <Field name="used" type="boolean">  
                     <Notes>TRUE if this PCH is used in scoring the attached clustering,  
                     else FALSE. If a clustering has a PCH for a particular genome and many  
                     similar genomes are present, then a PCH will probably exist for the  
                     similar genomes as well. When this happens, only one of the PCHs will  
                     be scored: the others are considered duplicates of the same evidence.</Notes>  
457                  </Field>                  </Field>
458              </Fields>              </Fields>
459          </Entity>          </Entity>
460      </Entities>      </Entities>
461      <Relationships>      <Relationships>
462          <Relationship name="ParticipatesInCoupling" from="Feature" to="Coupling" arity="MM">          <Relationship name="IsPresentOnProteinOf" from="CDD" to="Feature" arity="MM">
463              <Notes>This relationship connects a feature to all the functional couplings              <Notes>This relationship connects a feature to its CDD protein domains. The
464              in which it participates. A functional coupling is a recognition of the fact              match score is included as intersection data.</Notes>
465              that the features are close to each other on a chromosome, and similar              <Fields>
466              features in other genomes also tend to be close.</Notes>                  <Field name="score" type="float">
467              <Fields>                      <Notes>This is the match score between the feature and the CDD. A
468                  <Field name="pos" type="int">                      lower score is a better match.</Notes>
469                      <Notes>Ordinal position of the feature in the coupling. Currently,                  </Field>
470                      this is either "1" or "2".</Notes>              </Fields>
471                <FromIndex>
472                    <IndexFields>
473                        <IndexField name="score" order="ascending" />
474                    </IndexFields>
475                </FromIndex>
476            </Relationship>
477            <Relationship name="IsIdentifiedByCAS" from="Compound" to="CompoundCAS" arity="MM">
478                <Notes>Relates a compound's CAS ID to the compound itself. Every CAS ID is
479                associated with a compound, and some are associated with two compounds, but not
480                all compounds have CAS IDs.</Notes>
481            </Relationship>
482            <Relationship name="IsIdentifiedByEC" from="Role" to="RoleEC" arity="MM">
483                <Notes>Relates a role to its EC number. Every EC number is associated with a
484                role, but not all roles have EC numbers.</Notes>
485            </Relationship>
486            <Relationship name="IsAliasOf" from="FeatureAlias" to="Feature" arity="MM">
487                <Notes>Connects an alias to the feature it represents. Every alias connects
488                to at least 1 feature, and a feature connects to many aliases.</Notes>
489            </Relationship>
490            <Relationship name="HasCompoundName" from="Compound" to="CompoundName" arity="MM">
491                <Notes>Connects a compound to its names. A compound generally has several
492                names.</Notes>
493                <Fields>
494                    <Field name="priority" type="int">
495                        <Notes>Priority of this name, with 1 being the highest priority, 2
496                        the next highest, and so forth.</Notes>
497                    </Field>
498                </Fields>
499                <FromIndex>
500                    <Notes>This index enables the application to view the names of a compound
501                    in priority order.</Notes>
502                    <IndexFields>
503                        <IndexField name="priority" order="ascending" />
504                    </IndexFields>
505                </FromIndex>
506            </Relationship>
507            <Relationship name="IsProteinForFeature" from="PDB" to="Feature" arity="MM">
508                <Notes>Relates a PDB to features that produce highly similar proteins.</Notes>
509                <Fields>
510                    <Field name="score" type="float">
511                        <Notes>Similarity score for the comparison between the feature and
512                        the PDB protein. A lower score indicates a better match.</Notes>
513                    </Field>
514                    <Field name="start-location" type="int">
515                        <Notes>Starting location within the feature of the matching region.</Notes>
516                    </Field>
517                    <Field name="end-location" type="int">
518                        <Notes>Ending location within the feature of the matching region.</Notes>
519                  </Field>                  </Field>
520              </Fields>              </Fields>
521              <ToIndex>              <ToIndex>
522                    <Notes>This index enables the application to view the PDBs of a
523                    feature in order from the closest match to the furthest.</Notes>
524                    <IndexFields>
525                        <IndexField name="score" order="ascending" />
526                    </IndexFields>
527                </ToIndex>
528                <FromIndex>
529                  <Notes>This index enables the application to view the features of                  <Notes>This index enables the application to view the features of
530                  a coupling in the proper order. The order influences the way the                  a PDB in order from the closest match to the furthest.</Notes>
531                  PCHs are examined.</Notes>                  <IndexFields>
532                        <IndexField name="score" order="ascending" />
533                    </IndexFields>
534                </FromIndex>
535            </Relationship>
536            <Relationship name="DocksWith" from="PDB" to="Ligand" arity="MM">
537                <Notes>Indicates that a [[docking result]] exists between a PDB and a ligand. The
538                docking result describes the energy required for the ligand to dock with
539                the protein described by the PDB. A lower energy indicates the ligand has a
540                good chance of disabling the protein. At the current time, only the best
541                docking results are kept.</Notes>
542                <Fields>
543                    <Field name="reason" type="id-string">
544                        <Notes>Indication of the reason for determining the docking result.
545                        A value of =Random= indicates the docking was attempted as a part
546                        of a random survey used to determine the docking characteristics of the
547                        PDB. A value of =Rich= indicates the docking was attempted because
548                        a low-energy docking result was predicted for the ligand with respect
549                        to the PDB.</Notes>
550                    </Field>
551                    <Field name="tool" type="id-string">
552                        <Notes>Name of the tool used to produce the docking result.</Notes>
553                    </Field>
554                    <Field name="total-energy" type="float">
555                        <Notes>Total energy required for the ligand to dock with the PDB
556                        protein, in kcal/mol. A negative value means energy is released.</Notes>
557                    </Field>
558                    <Field name="vanderwalls-energy" type="float">
559                        <Notes>Docking energy in kcal/mol that results from the geometric fit
560                        (Van der Waals force) between the PDB and the ligand.</Notes>
561                    </Field>
562                    <Field name="electrostatic-energy" type="float">
563                        <Notes>Docking energy in kcal/mol that results from the movement of
564                        electrons (electrostatic force) between the PDB and the ligand.</Notes>
565                    </Field>
566                </Fields>
567                <FromIndex>
568                    <Notes>This index enables the application to view a PDB's docking results from
569                    the lowest energy (best docking) to highest energy (worst docking).</Notes>
570                  <IndexFields>                  <IndexFields>
571                      <IndexField name="pos" order="ascending" />                      <IndexField name="total-energy" order="ascending" />
572                  </IndexFields>                  </IndexFields>
573                </FromIndex>
574                <ToIndex>
575                    <Notes>This index enables the application to view a ligand's docking results from
576                    the lowest energy (best docking) to highest energy (worst docking).</Notes>
577              </ToIndex>              </ToIndex>
578          </Relationship>          </Relationship>
579          <Relationship name="IsEvidencedBy" from="Coupling" to="PCH" arity="1M">          <Relationship name="IsFamilyForFeature" from="Family" to="Feature" arity="MM">
580              <Notes>This relationship connects a functional coupling to the physically              <Notes>This relationship connects a protein family to all of its PEGs and connects
581              close homologs (PCHs) which affirm that the coupling is meaningful.</Notes>              each PEG to all of its protein families.</Notes>
582          </Relationship>          </Relationship>
583          <Relationship name="UsesAsEvidence" from="PCH" to="Feature" arity="MM">          <Relationship name="IsSynonymGroupFor" from="SynonymGroup" to="Feature" arity="MM">
584              <Notes>This relationship connects a PCH to the features that represent its              <Notes>This relation connects a synonym group to the features that make it
585              evidence. Each PCH is connected to a parent coupling that relates two features              up.</Notes>
586              on a specific genome. The PCH's evidence that the parent coupling is functional          </Relationship>
587              is the existence of two physically close features on a different genome that          <Relationship name="HasFeature" from="Genome" to="Feature" arity="1M">
588              correspond to the features in the coupling. Those features are found on the              <Notes>This relationship connects a genome to all of its features. This
589              far side of this relationship.</Notes>              relationship is redundant in a sense, because the genome ID is part
590              <Fields>              of the feature ID; however, it makes the creation of certain queries more
591                  <Field name="pos" type="int">              convenient because you can drag in filtering information for a feature's
592                      <Notes>Ordinal position of the feature in the coupling that corresponds              genome.</Notes>
593                      to our target feature. There is a one-to-one correspondence between the              <Fields>
594                      features connected to the PCH by this relationship and the features                  <Field name="type" type="key-string">
595                      connected to the PCH's parent coupling. The ordinal position is used                      <Notes>Feature type (e.g. peg, rna)</Notes>
                     to decode that relationship. Currently, this field is either "1" or  
                     "2".</Notes>  
596                  </Field>                  </Field>
597              </Fields>              </Fields>
598              <FromIndex>              <FromIndex>
599                  <Notes>This index enables the application to view the features of                  <Notes>This index enables the application to view the features of a
600                  a PCH in the proper order.</Notes>                  Genome sorted by type.</Notes>
601                  <IndexFields>                  <IndexFields>
602                      <IndexField name="pos" order="ascending" />                      <IndexField name="type" order="ascending" />
603                  </IndexFields>                  </IndexFields>
604              </FromIndex>              </FromIndex>
605          </Relationship>          </Relationship>
# Line 398  Line 643 
643          <Relationship name="ParticipatesIn" from="Genome" to="Subsystem" arity="MM">          <Relationship name="ParticipatesIn" from="Genome" to="Subsystem" arity="MM">
644              <Notes>This relationship connects subsystems to the genomes that use              <Notes>This relationship connects subsystems to the genomes that use
645              it. If the subsystem has been curated for the genome, then the subsystem's roles will also be              it. If the subsystem has been curated for the genome, then the subsystem's roles will also be
646              connected to the genome features through the [b]SSCell[/b] object.</Notes>              connected to the genome features through the *SSCell* object.</Notes>
647                <Fields>
648                    <Field name="variant-code" type="key-string">
649                        <Notes>Code indicating the subsystem variant to which this
650                        genome belongs. Each subsystem can have multiple variants. A variant
651                        code of =-1= indicates that the genome does not have a functional
652                        variant of the subsystem. A variant code of =0= indicates that
653                        the genome's participation is considered iffy.</Notes>
654                    </Field>
655                </Fields>
656                <ToIndex>
657                    <Notes>This index enables the application to find all of the genomes using
658                    a subsystem in order by variant code, which is how we wish to display them
659                    in the spreadsheets.</Notes>
660                    <IndexFields>
661                        <IndexField name="variant-code" order="ascending" />
662                    </IndexFields>
663                </ToIndex>
664          </Relationship>          </Relationship>
665          <Relationship name="OccursInSubsystem" from="Role" to="Subsystem" arity="MM">          <Relationship name="OccursInSubsystem" from="Role" to="Subsystem" arity="MM">
666              <Notes>This relationship connects roles to the subsystems that implement them. </Notes>              <Notes>This relationship connects roles to the subsystems that implement them. </Notes>
667                <Fields>
668                    <Field name="abbr" type="name-string">
669                        <Notes>Abbreviated name for the role, generally non-unique, but useful
670                        in column headings for HTML tables.</Notes>
671                    </Field>
672                    <Field name="column-number" type="int">
673                        <Notes>Column number for this role in the specified subsystem's
674                        spreadsheet.</Notes>
675                    </Field>
676                </Fields>
677                <ToIndex>
678                    <Notes>This index enables the application to see the subsystem roles
679                    in column order. The ordering of the roles is usually significant,
680                    so it is important to preserve it.</Notes>
681                    <IndexFields>
682                        <IndexField name="column-number" order="ascending" />
683                    </IndexFields>
684                </ToIndex>
685          </Relationship>          </Relationship>
686          <Relationship name="IsGenomeOf" from="Genome" to="SSCell" arity="1M">          <Relationship name="IsGenomeOf" from="Genome" to="SSCell" arity="1M">
687              <Notes>This relationship connects a subsystem's spreadsheet cell to the              <Notes>This relationship connects a subsystem's spreadsheet cell to the
# Line 414  Line 694 
694          <Relationship name="ContainsFeature" from="SSCell" to="Feature" arity="MM">          <Relationship name="ContainsFeature" from="SSCell" to="Feature" arity="MM">
695              <Notes>This relationship connects a subsystem's spreadsheet cell to the              <Notes>This relationship connects a subsystem's spreadsheet cell to the
696              features assigned to it.</Notes>              features assigned to it.</Notes>
697                <Fields>
698                    <Field name="cluster-number" type="int">
699                        <Notes>ID of this feature's cluster. Clusters represent families of
700                        related proteins participating in a subsystem.</Notes>
701                    </Field>
702                </Fields>
703            </Relationship>
704            <Relationship name="IsAComponentOf" from="Compound" to="Reaction" arity="MM">
705                <Notes>This relationship connects a reaction to the compounds that participate
706                in it.</Notes>
707                <Fields>
708                    <Field name="product" type="boolean">
709                        <Notes>TRUE if the compound is a product of the reaction, FALSE if
710                        it is a substrate. When a reaction is written on paper in
711                        chemical notation, the substrates are left of the arrow and the
712                        products are to the right. Sorting on this field will cause
713                        the substrates to appear first, followed by the products. If the
714                        reaction is reversible, then the notion of substrates and products
715                        is not as intuitive; however, a value here of FALSE still puts the
716                        compound left of the arrow and a value of TRUE still puts it to the
717                        right.</Notes>
718                    </Field>
719                    <Field name="stoichiometry" type="key-string">
720                        <Notes>Number of molecules of the compound that participate in a
721                        single instance of the reaction. For example, if a reaction
722                        produces two water molecules, the stoichiometry of water for the
723                        reaction would be two. When a reaction is written on paper in
724                        chemical notation, the stoichiometry is the number next to the
725                        chemical formula of the compound.</Notes>
726                    </Field>
727                    <Field name="main" type="boolean">
728                        <Notes>TRUE if this compound is one of the main participants in
729                        the reaction, else FALSE. It is permissible for none of the
730                        compounds in the reaction to be considered main, in which
731                        case this value would be FALSE for all of the relevant
732                        compounds.</Notes>
733                    </Field>
734                    <Field name="loc" type="key-string">
735                        <Notes>An optional character string that indicates the relative
736                        position of this compound in the reaction's chemical formula. The
737                        location affects the way the compounds present as we cross the
738                        relationship from the reaction side. The product/substrate flag
739                        comes first, then the value of this field, then the main flag.
740                        The default value is an empty string; however, the empty string
741                        sorts first, so if this field is used, it should probably be
742                        used for every compound in the reaction.</Notes>
743                    </Field>
744                    <Field name="discriminator" type="int">
745                        <Notes>A unique ID for this record. The discriminator does not
746                        provide any useful data, but it prevents identical records from
747                        being collapsed by the SELECT DISTINCT command used by ERDB to
748                        retrieve data.</Notes>
749                    </Field>
750                </Fields>
751                <ToIndex>
752                    <Notes>This index presents the compounds in the reaction in the
753                    order they should be displayed when writing it in chemical notation.
754                    All the substrates appear before all the products, and within that
755                    ordering, the main compounds appear first.</Notes>
756                    <IndexFields>
757                        <IndexField name="product" order="ascending" />
758                        <IndexField name="loc" order="ascending" />
759                        <IndexField name="main" order="descending" />
760                    </IndexFields>
761                </ToIndex>
762          </Relationship>          </Relationship>
763          <Relationship name="IsLocatedIn" from="Feature" to="Contig" arity="MM">          <Relationship name="IsLocatedIn" from="Feature" to="Contig" arity="MM">
764              <Notes>This relationship connects a feature to the contig segments that work together              <Notes>This relationship connects a feature to the contig segments that work together
765              to effect it. The segments are numbered sequentially starting from 1. The database is              to effect it. The segments are numbered sequentially starting from 1. The database is
766              required to place an upper limit on the length of each segment. If a segment is longer              required to place an upper limit on the length of each segment. If a segment is longer
767              than the maximum, it can be broken into smaller bits.              than the maximum, it can be broken into smaller bits.  The upper limit enables applications
768              [p]The upper limit enables applications to locate all features that contain a specific              to locate all features that contain a specific residue. For example, if the upper limit
769              residue. For example, if the upper limit is 100 and we are looking for a feature that              is 100 and we are looking for a feature that contains residue 234 of contig *ABC*, we
770              contains residue 234 of contig [b]ABC[/b], we can look for features with a begin point              can look for features with a begin point between 135 and 333. The results can then be
771              between 135 and 333. The results can then be filtered by direction and length of the              filtered by direction and length of the segment.</Notes>
             segment.</Notes>  
772              <Fields>              <Fields>
773                  <Field name="locN" type="int">                  <Field name="locN" type="int">
774                      <Notes>Sequence number of this segment.</Notes>                      <Notes>Sequence number of this segment.</Notes>
# Line 439  Line 783 
783                      is forward and the point after the residue if the direction is backward.</Notes>                      is forward and the point after the residue if the direction is backward.</Notes>
784                  </Field>                  </Field>
785                  <Field name="dir" type="char">                  <Field name="dir" type="char">
786                      <Notes>Direction of the segment: [b]+[/b] if it is forward and                      <Notes>Direction of the segment: =+= if it is forward and
787                      [b]-[/b] if it is backward.</Notes>                      =-= if it is backward.</Notes>
788                  </Field>                  </Field>
789              </Fields>              </Fields>
790              <FromIndex Unique="false">              <FromIndex>
791                  <Notes>This index allows the application to find all the segments of a feature in                  <Notes>This index allows the application to find all the segments of a feature in
792                  the proper order.</Notes>                  the proper order.</Notes>
793                  <IndexFields>                  <IndexFields>
# Line 458  Line 802 
802                  </IndexFields>                  </IndexFields>
803              </ToIndex>              </ToIndex>
804          </Relationship>          </Relationship>
         <Relationship name="IsBidirectionalBestHitOf" from="Feature" to="Feature" arity="MM">  
             <Notes>This relationship is one of two that relate features to each other. It  
             connects features that are very similar but on separate genomes. A  
             bidirectional best hit relationship exists between two features [b]A[/b]  
             and [b]B[/b] if [b]A[/b] is the best match for [b]B[/b] on [b]A[/b]'s genome  
             and [b]B[/b] is the best match for [b]A[/b] on [b]B[/b]'s genome. </Notes>  
             <Fields>  
                 <Field name="genome" type="name-string">  
                     <Notes>ID of the genome containing the target (to) feature.</Notes>  
                 </Field>  
                 <Field name="sc" type="float">  
                     <Notes>score for this relationship</Notes>  
                 </Field>  
             </Fields>  
             <FromIndex>  
                 <Notes>This index allows the application to find a feature's best hit for  
                 a specific target genome.</Notes>  
                 <IndexFields>  
                     <IndexField name="genome" order="ascending" />  
                 </IndexFields>  
             </FromIndex>  
         </Relationship>  
805          <Relationship name="HasProperty" from="Feature" to="Property" arity="MM">          <Relationship name="HasProperty" from="Feature" to="Property" arity="MM">
806              <Notes>This relationship connects a feature to its known property values.              <Notes>This relationship connects a feature to its known property values.
807              The relationship contains text data that indicates the paper or organization              The relationship contains text data that indicates the paper or organization
# Line 511  Line 833 
833              assignment displayed is the most recent one by a user trusted              assignment displayed is the most recent one by a user trusted
834              by the current user. The current user implicitly trusts himself.              by the current user. The current user implicitly trusts himself.
835              If no trusted users are specified in the database, the user              If no trusted users are specified in the database, the user
836              also implicitly trusts the user [b]FIG[/b].</Notes>              also implicitly trusts the user =FIG=.</Notes>
837            </Relationship>
838            <Relationship name="ConsistsOfRoles" from="RoleSubset" to="Role" arity="MM">
839                <Notes>This relationship connects a role subset to the roles that it covers.
840                A subset is, essentially, a named group of roles belonging to a specific
841                subsystem, and this relationship effects that. Note that while a role
842                may belong to many subsystems, a subset belongs to only one subsystem,
843                and all roles in the subset must have that subsystem in common.</Notes>
844            </Relationship>
845            <Relationship name="ConsistsOfGenomes" from="GenomeSubset" to="Genome" arity="MM">
846                <Notes>This relationship connects a subset to the genomes that it covers.
847                A subset is, essentially, a named group of genomes participating in a specific
848                subsystem, and this relationship effects that. Note that while a genome
849                may belong to many subsystems, a subset belongs to only one subsystem,
850                and all genomes in the subset must have that subsystem in common.</Notes>
851            </Relationship>
852            <Relationship name="HasRoleSubset" from="Subsystem" to="RoleSubset" arity="1M">
853                <Notes>This relationship connects a subsystem to its constituent
854                role subsets. Note that some roles in a subsystem may not belong to a
855                subset, so the relationship between roles and subsystems cannot be
856                derived from the relationships going through the subset.</Notes>
857            </Relationship>
858            <Relationship name="HasGenomeSubset" from="Subsystem" to="GenomeSubset" arity="1M">
859                <Notes>This relationship connects a subsystem to its constituent
860                genome subsets. Note that some genomes in a subsystem may not belong to a
861                subset, so the relationship between genomes and subsystems cannot be
862                derived from the relationships going through the subset.</Notes>
863            </Relationship>
864            <Relationship name="Catalyzes" from="Role" to="Reaction" arity="MM">
865                <Notes>This relationship connects a role to the reactions it catalyzes.
866                The purpose of a role is to create proteins that trigger certain
867                chemical reactions. A single reaction can be triggered by many roles,
868                and a role can trigger many reactions.</Notes>
869            </Relationship>
870            <Relationship name="HasRoleInSubsystem" from="Feature" to="Subsystem" arity="MM">
871                <Notes>This relationship connects a feature to the subsystems in which it
872                participates. This is technically redundant information, but it is used
873                so often that it gets its own table for performance reasons.</Notes>
874                <Fields>
875                    <Field name="genome" type="name-string">
876                        <Notes>ID of the genome containing the feature</Notes>
877                    </Field>
878                    <Field name="type" type="key-string">
879                        <Notes>Feature type (e.g. peg, rna)</Notes>
880                    </Field>
881                </Fields>
882                <ToIndex>
883                    <Notes>This index enables the application to view the features of a
884                    subsystem sorted by genome and feature type.</Notes>
885                    <IndexFields>
886                        <IndexField name="genome" order="ascending" />
887                        <IndexField name="type" order="ascending" />
888                    </IndexFields>
889                </ToIndex>
890          </Relationship>          </Relationship>
891      </Relationships>      </Relationships>
892  </Database>  </Database>
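
The `ToIndex` ordering described above for the reaction's compounds (`product` ascending, `loc` ascending, `main` descending) can be illustrated with a small Python sketch. Only the three field names and their sort directions come from the DBD; the `Participant` class, the compound labels, and the sample values are hypothetical, and the sketch sorts in memory rather than through ERDB.

```python
from typing import NamedTuple


class Participant(NamedTuple):
    """Hypothetical stand-in for one reaction-to-compound record."""
    compound: str   # illustrative label only, not a real compound ID
    product: bool   # False = substrate, True = product
    loc: str        # optional position string; "" is the default
    main: bool      # True if this is a main participant


def display_order(records):
    """Sort the way the index does: product asc, loc asc, main desc."""
    # False sorts before True, so substrates precede products; negating
    # `main` places main compounds ahead of non-main ones within a group.
    return sorted(records, key=lambda r: (r.product, r.loc, not r.main))


records = [
    Participant("P1", True, "", False),
    Participant("S2", False, "", False),
    Participant("P2", True, "", True),
    Participant("S1", False, "", True),
]

ordered = [r.compound for r in display_order(records)]
print(ordered)  # ['S1', 'S2', 'P2', 'P1']
```

Because the empty string sorts before any non-empty `loc`, a compound left at the default would precede every compound with an explicit location in the same substrate or product group, which is why the Notes suggest setting `loc` for all compounds or for none.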

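The residue lookup described in the `IsLocatedIn` Notes can be sketched in Python as follows. The limit of 100 and residue 234 are the values from the Notes example, not the database's real limit. The `Segment` class, the `length` field name, and the sample segments are hypothetical. The sketch also assumes that a backward segment begins at its highest residue and extends toward lower positions, and that all segments lie on the same contig.

```python
from typing import NamedTuple

MAX_SEGMENT_LENGTH = 100  # value from the Notes example, not the real limit


class Segment(NamedTuple):
    """Hypothetical in-memory copy of one IsLocatedIn record."""
    feature: str
    begin: int
    dir: str      # "+" forward, "-" backward
    length: int   # assumed field name for the segment length


def begin_window(residue, max_len=MAX_SEGMENT_LENGTH):
    """Range of begin points that could belong to a covering segment."""
    return residue - (max_len - 1), residue + (max_len - 1)


def covers(seg, residue):
    """True if the segment actually contains the residue."""
    if seg.dir == "+":
        return seg.begin <= residue < seg.begin + seg.length
    # Assumption: a backward segment runs from `begin` downward.
    return seg.begin - seg.length < residue <= seg.begin


def features_containing(segments, residue, max_len=MAX_SEGMENT_LENGTH):
    """Coarse filter on begin point, then exact filter on dir and length."""
    low, high = begin_window(residue, max_len)
    candidates = [s for s in segments if low <= s.begin <= high]
    return sorted({s.feature for s in candidates if covers(s, residue)})


segments = [
    Segment("f1", 200, "+", 50),   # covers 200..249
    Segment("f2", 150, "+", 50),   # covers 150..199, inside window only
    Segment("f3", 300, "-", 80),   # covers 221..300
    Segment("f4", 100, "+", 100),  # begin point outside the window
]

print(begin_window(234))                    # (135, 333)
print(features_containing(segments, 234))   # ['f1', 'f3']
```

The window is symmetric because a forward segment containing the residue can start as early as 135, while a backward segment containing it can start as late as 333. The second pass on direction and length removes candidates such as `f2` that fall inside the window without covering the residue.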
Legend:
Removed from v.1.9  
changed lines
  Added in v.1.54
