Diff of /Sprout/SproutDBD.xml

revision 1.7, Thu Jun 16 19:10:04 2005 UTC revision 1.27, Sun Jun 25 07:34:46 2006 UTC
# Line 25  Line 25 
25                          the user's access codes must match this value.</Notes>                          the user's access codes must match this value.</Notes>
26                                          <DataGen>RandParam('low','medium','high')</DataGen>                                          <DataGen>RandParam('low','medium','high')</DataGen>
27                  </Field>                  </Field>
28                    <Field name="complete" type="boolean">
29                        <Notes>TRUE if the genome is complete, else FALSE</Notes>
30                    </Field>
31                                  <Field name="taxonomy" type="text">                                  <Field name="taxonomy" type="text">
32                                          <Notes>The taxonomy string contains the full taxonomy of the organism, while individual elements                                          <Notes>The taxonomy string contains the full taxonomy of the organism, while individual elements
33                                          separated by semi-colons (and optional white space), starting with the domain and ending with                                          separated by semi-colons (and optional white space), starting with the domain and ending with
# Line 98  Line 101 
101                                          <DataGen>RandChars("ACGT", IntGen(100,400))</DataGen>                                          <DataGen>RandChars("ACGT", IntGen(100,400))</DataGen>
102                                  </Field>                                  </Field>
103                  <Field name="quality-vector" type="text">                  <Field name="quality-vector" type="text">
104                                          <Notes>String describing the quality data for each . Individual values will                      <Notes>String describing the quality data for each base pair. Individual values will
105                                          be separated by periods. The value represents negative exponent of the probability                                          be separated by periods. The value represents negative exponent of the probability
106                                          of error. Thus, for example, a quality of 30 indicates the probability of error is                                          of error. Thus, for example, a quality of 30 indicates the probability of error is
107                                          10^-30. A higher quality number indicates a better chance of a correct match. It is possible                                          10^-30. A higher quality number indicates a better chance of a correct match. It is possible
108                                          that the quality data is known for a sequence. If that is the case, the quality                      that the quality data is not known for a sequence. If that is the case, the quality
109                                          vector will contain the [b]unknown[/b].</Notes>                                          vector will contain the [b]unknown[/b].</Notes>
110                                          <DataGen>unknown</DataGen>                                          <DataGen>unknown</DataGen>
111                                  </Field>                                  </Field>
112              </Fields>              </Fields>
113          </Entity>          </Entity>
114          <Entity name="Feature" keyType="name-string">          <Entity name="Feature" keyType="id-string">
115              <Notes>A [i]feature[/i] is a part of a genome that is of special interest. Features              <Notes>A [i]feature[/i] is a part of a genome that is of special interest. Features
116              may be spread across multiple contigs of a genome, but never across more than              may be spread across multiple contigs of a genome, but never across more than
117              one genome. Features can be assigned to roles via spreadsheet cells,              one genome. Features can be assigned to roles via spreadsheet cells,
# Line 118  Line 121 
121                                          <Notes>Code indicating the type of this feature.</Notes>                                          <Notes>Code indicating the type of this feature.</Notes>
122                                          <DataGen>RandParam('peg','rna')</DataGen>                                          <DataGen>RandParam('peg','rna')</DataGen>
123                                  </Field>                                  </Field>
124                  <Field name="alias" type="name-string" relation="FeatureAlias">                  <Field name="alias" type="medium-string" relation="FeatureAlias">
125                                          <Notes>Alternative name for this feature. feature can have many aliases.</Notes>                      <Notes>Alternative name for this feature. A feature can have many aliases.</Notes>
126                                          <DataGen testCount="3">StringGen('Pgi|99999', 'Puni|XXXXXX', 'PAAAAAA999')</DataGen>                                          <DataGen testCount="3">StringGen('Pgi|99999', 'Puni|XXXXXX', 'PAAAAAA999')</DataGen>
127                                  </Field>                                  </Field>
128                  <Field name="translation" type="text" relation="FeatureTranslation">                  <Field name="translation" type="text" relation="FeatureTranslation">
129                                          <Notes>[i](optional)[/i] A of this feature's residues into character codes, formed by concatenating                      <Notes>[i](optional)[/i] A translation of this feature's residues into character
130                          the pieces of the feature together.</Notes>                      codes, formed by concatenating the pieces of the feature together. For a
131                        protein encoding group, this is the protein characters. For other types
132                        it is the DNA characters.</Notes>
133                                          <DataGen testCount="0"></DataGen>                                          <DataGen testCount="0"></DataGen>
134                                  </Field>                                  </Field>
135                  <Field name="upstream-sequence" type="text" relation="FeatureUpstream">                  <Field name="upstream-sequence" type="text" relation="FeatureUpstream">
# Line 133  Line 138 
138                                          <DataGen testCount="0"></DataGen>                                          <DataGen testCount="0"></DataGen>
139                                  </Field>                                  </Field>
140                  <Field name="active" type="boolean">                  <Field name="active" type="boolean">
141                                          <Notes>TRUE if this feature is still considered valid, if it has been logically deleted.</Notes>                      <Notes>TRUE if this feature is still considered valid, FALSE if it has been logically deleted.</Notes>
142                                          <DataGen>1</DataGen>                                          <DataGen>1</DataGen>
143                                  </Field>                                  </Field>
144                                  <Field name="link" type="text" relation="FeatureLink">                                  <Field name="link" type="text" relation="FeatureLink">
# Line 144  Line 149 
149                                          "&amp;Number=" . IntGen(1,99)</DataGen>                                          "&amp;Number=" . IntGen(1,99)</DataGen>
150                                  </Field>                                  </Field>
151              </Fields>              </Fields>
152                <Indexes>
153                    <Index>
154                        <Notes>This index allows the user to find the feature corresponding to
155                        the specified alias name.</Notes>
156                        <IndexFields>
157                            <IndexField name="alias" order="ascending" />
158                        </IndexFields>
159                    </Index>
160                </Indexes>
161            </Entity>
162            <Entity name="SynonymGroup" keyType="id-string">
163                <Notes>A [i]synonym group[/i] represents a group of features. Substantially identical features
164                are mapped to the same synonym group, and this information is used to expand similarities.</Notes>
165          </Entity>          </Entity>
166          <Entity name="Role" keyType="string">          <Entity name="Role" keyType="string">
167              <Notes>A [i]role[/i] describes a biological function that may be fulfilled by a feature.              <Notes>A [i]role[/i] describes a biological function that may be fulfilled by a feature.
168              One of the main goals of the database is to record the roles of the various features.</Notes>              One of the main goals of the database is to record the roles of the various features.</Notes>
169                          <Fields>                          <Fields>
170                                  <Field name="name" type="string" relation="RoleName">                  <Field name="EC" type="string" relation="RoleEC">
171                                          <Notes>Expanded name of the role. This value is generally only available for roles                      <Notes>EC code for this role.</Notes>
                                         that are encoded as EC numbers.</Notes>  
172                                          <DataGen testCount="1">StringGen(IntGen(20,40)) . "(" . $this->{id} . ")"</DataGen>                                          <DataGen testCount="1">StringGen(IntGen(20,40)) . "(" . $this->{id} . ")"</DataGen>
173                                  </Field>                                  </Field>
174                    <Field name="abbr" type="name-string">
175                        <Notes>Abbreviated name for the role, generally non-unique, but useful
176                        in column headings for HTML tables.</Notes>
177                    </Field>
178                          </Fields>                          </Fields>
179                <Indexes>
180                    <Index>
181                        <Notes>This index allows the user to find the role corresponding to
182                        an EC number.</Notes>
183                        <IndexFields>
184                            <IndexField name="EC" order="ascending" />
185                        </IndexFields>
186                    </Index>
187                </Indexes>
188          </Entity>          </Entity>
189          <Entity name="Annotation" keyType="name-string">          <Entity name="Annotation" keyType="name-string">
190              <Notes>An [i]annotation[/i] contains supplementary information about a feature. Annotations              <Notes>An [i]annotation[/i] contains supplementary information about a feature. Annotations
191                          are currently the only objects that may be inserted directly into the database. All other                          are currently the only objects that may be inserted directly into the database. All other
192                          information is loaded from data exported by the SEED.              information is loaded from data exported by the SEED.</Notes>
                         [p]Each annotation is associated with a target [b]Feature[/b]. The key of the annotation  
                         is the target feature ID followed by a timestamp.</Notes>  
193              <Fields>              <Fields>
194                  <Field name="time" type="date">                  <Field name="time" type="date">
195                                          <Notes>Date and time of the annotation.</Notes>                                          <Notes>Date and time of the annotation.</Notes>
# Line 170  Line 198 
198                                          <Notes>Text of the annotation.</Notes>                                          <Notes>Text of the annotation.</Notes>
199                                  </Field>                                  </Field>
200              </Fields>              </Fields>
201                <Indexes>
202                    <Index>
203                        <Notes>This index allows the user to find recent annotations.</Notes>
204                        <IndexFields>
205                            <IndexField name="time" order="descending" />
206                        </IndexFields>
207                    </Index>
208                </Indexes>
209            </Entity>
210            <Entity name="Reaction" keyType="key-string">
211                <Notes>A [i]reaction[/i] is a chemical process catalyzed by a protein. The reaction ID
212                is generally a small number preceded by a letter.</Notes>
213                <Fields>
214                    <Field name="url" type="string" relation="ReactionURL">
215                        <Notes>HTML string containing a link to a web location that describes the
216                        reaction. This field is optional.</Notes>
217                    </Field>
218                    <Field name="rev" type="boolean">
219                        <Notes>TRUE if this reaction is reversible, else FALSE</Notes>
220                    </Field>
221                </Fields>
222            </Entity>
223            <Entity name="Compound" keyType="name-string">
224                <Notes>A [i]compound[/i] is a chemical that participates in a reaction.
225                All compounds have a unique ID and may also have one or more names.</Notes>
226                <Fields>
227                    <Field name="name-priority" type="int" relation="CompoundName">
228                        <Notes>Priority of a compound name. The name with the lowest
229                        priority is the main name of this compound.</Notes>
230                    </Field>
231                    <Field name="name" type="name-string" relation="CompoundName">
232                        <Notes>Descriptive name for the compound. A compound may
233                        have several names.</Notes>
234                    </Field>
235                    <Field name="cas-id" type="name-string" relation="CompoundCAS">
236                        <Notes>Chemical Abstract Service ID for this compound (optional).</Notes>
237                    </Field>
238                    <Field name="label" type="name-string">
239                        <Notes>Name used in reaction display strings.
240                        It is the same as the name possessing a priority of 1, but it is placed
241                        here to speed up the query used to create the display strings.</Notes>
242                    </Field>
243                </Fields>
244                <Indexes>
245                    <Index>
246                        <Notes>This index allows the user to find the compound corresponding to
247                        the specified name.</Notes>
248                        <IndexFields>
249                            <IndexField name="name" order="ascending" />
250                        </IndexFields>
251                    </Index>
252                    <Index>
253                        <Notes>This index allows the user to find the compound corresponding to
254                        the specified CAS ID.</Notes>
255                        <IndexFields>
256                            <IndexField name="cas-id" order="ascending" />
257                        </IndexFields>
258                    </Index>
259                    <Index>
260                        <Notes>This index allows the user to access the compound names in
261                        priority order.</Notes>
262                        <IndexFields>
263                            <IndexField name="id" order="ascending" />
264                            <IndexField name="name-priority" order="ascending" />
265                        </IndexFields>
266                    </Index>
267                </Indexes>
268          </Entity>          </Entity>
269          <Entity name="Subsystem" keyType="string">          <Entity name="Subsystem" keyType="string">
270              <Notes>A [i]subsystem[/i] is a collection of roles that work together in a cell. Identification of subsystems              <Notes>A [i]subsystem[/i] is a collection of roles that work together in a cell. Identification of subsystems
271              is an important tool for recognizing parallel genetic features in different organisms.</Notes>              is an important tool for recognizing parallel genetic features in different organisms.</Notes>
272                <Fields>
273                    <Field name="curator" type="string">
274                        <Notes>Name of the person currently in charge of the subsystem.</Notes>
275                    </Field>
276                    <Field name="notes" type="text">
277                        <Notes>Descriptive notes about the subsystem.</Notes>
278                    </Field>
279                </Fields>
280            </Entity>
281            <Entity name="RoleSubset" keyType="string">
282                <Notes>A [i]role subset[/i] is a named collection of roles in a particular subsystem. The
283                subset names are generally very short, non-unique strings. The ID of the parent
284                subsystem is prefixed to the subset ID in order to make it unique.</Notes>
285          </Entity>          </Entity>
286          <Entity name="SSCell" keyType="name-string">          <Entity name="GenomeSubset" keyType="string">
287                <Notes>A [i]genome subset[/i] is a named collection of genomes that participate
288                in a particular subsystem. The subset names are generally very short, non-unique
289                strings. The ID of the parent subsystem is prefixed to the subset ID in order
290                to make it unique.</Notes>
291            </Entity>
292            <Entity name="SSCell" keyType="hash-string">
293              <Notes>Part of the process of locating and assigning features is creating a spreadsheet of              <Notes>Part of the process of locating and assigning features is creating a spreadsheet of
294              genomes and roles to which features are assigned. A [i]spreadsheet cell[/i] represents one              genomes and roles to which features are assigned. A [i]spreadsheet cell[/i] represents one
295              of the positions on the spreadsheet.</Notes>              of the positions on the spreadsheet.</Notes>
# Line 256  Line 370 
370                                          </Field>                                          </Field>
371                                  </Fields>                                  </Fields>
372                  </Entity>                  </Entity>
373          <Entity name="Coupling" keyType="medium-string">          <Entity name="Coupling" keyType="hash-string">
374              <Notes>A coupling is a relationship between two features. The features are              <Notes>A coupling is a relationship between two features. The features are
375              physically close on the contig, and there is evidence that they generally              physically close on the contig, and there is evidence that they generally
376              belong together. The key of this entity is formed by combining the coupled              belong together. The key of this entity is formed by combining the coupled
# Line 272  Line 386 
386                  </Field>                  </Field>
387              </Fields>              </Fields>
388          </Entity>          </Entity>
389          <Entity name="PCH" keyType="string">          <Entity name="PCH" keyType="hash-string">
390              <Notes>A PCH (physically close homolog) connects a clustering (which is a              <Notes>A PCH (physically close homolog) connects a clustering (which is a
391              pair of physically close features on a contig) to a second pair of physically              pair of physically close features on a contig) to a second pair of physically
392              close features that are similar to the first. Essentially, the PCH is a              close features that are similar to the first. Essentially, the PCH is a
# Line 314  Line 428 
428                  </IndexFields>                  </IndexFields>
429              </ToIndex>              </ToIndex>
430          </Relationship>          </Relationship>
431            <Relationship name="IsSynonymGroupFor" from="SynonymGroup" to="Feature" arity="1M">
432                <Notes>This relation connects a synonym group to the features that make it
433                up.</Notes>
434            </Relationship>
435            <Relationship name="HasFeature" from="Genome" to="Feature" arity="1M">
436                <Notes>This relationship connects a genome to all of its features. This
437                relationship is redundant in a sense, because the genome ID is part
438                of the feature ID; however, it makes the creation of certain queries more
439                convenient because you can drag in filtering information for a feature's
440                genome.</Notes>
441                <Fields>
442                    <Field name="type" type="key-string">
443                        <Notes>Feature type (e.g. peg, rna)</Notes>
444                    </Field>
445                </Fields>
446                <ToIndex>
447                    <Notes>This index enables the application to view the features of a
448                    Genome sorted by type.</Notes>
449                    <IndexFields>
450                        <IndexField name="type" order="ascending" />
451                    </IndexFields>
452                </ToIndex>
453            </Relationship>
454          <Relationship name="IsEvidencedBy" from="Coupling" to="PCH" arity="1M">          <Relationship name="IsEvidencedBy" from="Coupling" to="PCH" arity="1M">
455              <Notes>This relationship connects a functional coupling to the physically              <Notes>This relationship connects a functional coupling to the physically
456              close homologs (PCHs) which affirm that the coupling is meaningful.</Notes>              close homologs (PCHs) which affirm that the coupling is meaningful.</Notes>
# Line 384  Line 521 
521              <Notes>This relationship connects subsystems to the genomes that use              <Notes>This relationship connects subsystems to the genomes that use
522              it. If the subsystem has been curated for the genome, then the subsystem's roles will also be              it. If the subsystem has been curated for the genome, then the subsystem's roles will also be
523              connected to the genome features through the [b]SSCell[/b] object.</Notes>              connected to the genome features through the [b]SSCell[/b] object.</Notes>
524                <Fields>
525                    <Field name="variant-code" type="key-string">
526                        <Notes>Code indicating the subsystem variant to which this
527                        genome belongs. Each subsystem can have multiple variants. A variant
528                        code of [b]-1[/b] indicates that the genome does not have a functional
529                        variant of the subsystem. A variant code of [b]0[/b] indicates that
530                        the genome's participation is considered iffy.</Notes>
531                    </Field>
532                </Fields>
533                <ToIndex>
534                    <Notes>This index enables the application to find all of the genomes using
535                    a subsystem in order by variant code, which is how we wish to display them
536                    in the spreadsheets.</Notes>
537                    <IndexFields>
538                        <IndexField name="variant-code" order="ascending" />
539                    </IndexFields>
540                </ToIndex>
541          </Relationship>          </Relationship>
542          <Relationship name="OccursInSubsystem" from="Role" to="Subsystem" arity="MM">          <Relationship name="OccursInSubsystem" from="Role" to="Subsystem" arity="MM">
543              <Notes>This relationship connects roles to the subsystems that implement them. </Notes>              <Notes>This relationship connects roles to the subsystems that implement them. </Notes>
544                <Fields>
545                    <Field name="column-number" type="int">
546                        <Notes>Column number for this role in the specified subsystem's
547                        spreadsheet.</Notes>
548                    </Field>
549                </Fields>
550                <ToIndex>
551                    <Notes>This index enables the application to see the subsystem roles
552                    in column order. The ordering of the roles is usually significant,
553                    so it is important to preserve it.</Notes>
554                    <IndexFields>
555                        <IndexField name="column-number" order="ascending" />
556                    </IndexFields>
557                </ToIndex>
558          </Relationship>          </Relationship>
559          <Relationship name="IsGenomeOf" from="Genome" to="SSCell" arity="1M">          <Relationship name="IsGenomeOf" from="Genome" to="SSCell" arity="1M">
560              <Notes>This relationship connects a subsystem's spreadsheet cell to the              <Notes>This relationship connects a subsystem's spreadsheet cell to the
# Line 399  Line 567 
567          <Relationship name="ContainsFeature" from="SSCell" to="Feature" arity="MM">          <Relationship name="ContainsFeature" from="SSCell" to="Feature" arity="MM">
568              <Notes>This relationship connects a subsystem's spreadsheet cell to the              <Notes>This relationship connects a subsystem's spreadsheet cell to the
569              features assigned to it.</Notes>              features assigned to it.</Notes>
570                <Fields>
571                    <Field name="cluster-number" type="int">
572                        <Notes>ID of this feature's cluster. Clusters represent families of
573                        related proteins participating in a subsystem.</Notes>
574                    </Field>
575                </Fields>
576            </Relationship>
577            <Relationship name="IsAComponentOf" from="Compound" to="Reaction" arity="MM">
578                <Notes>This relationship connects a reaction to the compounds that participate
579                in it.</Notes>
580                <Fields>
581                    <Field name="product" type="boolean">
582                        <Notes>TRUE if the compound is a product of the reaction, FALSE if
583                        it is a substrate. When a reaction is written on paper in
584                        chemical notation, the substrates are left of the arrow and the
585                        products are to the right. Sorting on this field will cause
586                        the substrates to appear first, followed by the products. If the
587                        reaction is reversible, then the notion of substrates and products
588                        is not as intuitive; however, a value here of FALSE still puts the
589                        compound left of the arrow and a value of TRUE still puts it to the
590                        right.</Notes>
591                    </Field>
592                    <Field name="stoichiometry" type="key-string">
593                        <Notes>Number of molecules of the compound that participate in a
594                        single instance of the reaction. For example, if a reaction
595                        produces two water molecules, the stoichiometry of water for the
596                        reaction would be two. When a reaction is written on paper in
597                        chemical notation, the stoichiometry is the number next to the
598                        chemical formula of the compound.</Notes>
599                    </Field>
600                    <Field name="main" type="boolean">
601                        <Notes>TRUE if this compound is one of the main participants in
602                        the reaction, else FALSE. It is permissible for none of the
603                        compounds in the reaction to be considered main, in which
604                        case this value would be FALSE for all of the relevant
605                        compounds.</Notes>
606                    </Field>
607                    <Field name="loc" type="key-string">
608                        <Notes>An optional character string that indicates the relative
609                        position of this compound in the reaction's chemical formula. The
610                        location affects the way the compounds present as we cross the
611                        relationship from the reaction side. The product/substrate flag
612                        comes first, then the value of this field, then the main flag.
613                        The default value is an empty string; however, the empty string
614                        sorts first, so if this field is used, it should probably be
615                        used for every compound in the reaction.</Notes>
616                    </Field>
617                    <Field name="discriminator" type="int">
618                        <Notes>A unique ID for this record. The discriminator does not
619                        provide any useful data, but it prevents identical records from
620                        being collapsed by the SELECT DISTINCT command used by ERDB to
621                        retrieve data.</Notes>
622                    </Field>
623                </Fields>
624                <ToIndex>
625                    <Notes>This index presents the compounds in the reaction in the
626                    order they should be displayed when writing it in chemical notation.
627                    All the substrates appear before all the products, and within that
628                    ordering, the main compounds appear first.</Notes>
629                    <IndexFields>
630                        <IndexField name="product" order="ascending" />
631                        <IndexField name="loc" order="ascending" />
632                        <IndexField name="main" order="descending" />
633                    </IndexFields>
634                </ToIndex>
635          </Relationship>          </Relationship>
636          <Relationship name="IsLocatedIn" from="Feature" to="Contig" arity="MM">          <Relationship name="IsLocatedIn" from="Feature" to="Contig" arity="MM">
637              <Notes>This relationship connects a feature to the contig segments that work together              <Notes>This relationship connects a feature to the contig segments that work together
# Line 498  Line 731 
731                          If no trusted users are specified in the database, the user                          If no trusted users are specified in the database, the user
732                          also implicitly trusts the user [b]FIG[/b].</Notes>                          also implicitly trusts the user [b]FIG[/b].</Notes>
733                  </Relationship>                  </Relationship>
734            <Relationship name="ConsistsOfRoles" from="RoleSubset" to="Role" arity="MM">
735                <Notes>This relationship connects a role subset to the roles that it covers.
736                A subset is, essentially, a named group of roles belonging to a specific
737                subsystem, and this relationship effects that. Note that while a role
738                may belong to many subsystems, a subset belongs to only one subsystem,
739                and all roles in the subset must have that subsystem in common.</Notes>
740            </Relationship>
741            <Relationship name="ConsistsOfGenomes" from="GenomeSubset" to="Genome" arity="MM">
742                <Notes>This relationship connects a subset to the genomes that it covers.
743                A subset is, essentially, a named group of genomes participating in a specific
744                subsystem, and this relationship effects that. Note that while a genome
745                may belong to many subsystems, a subset belongs to only one subsystem,
746                and all genomes in the subset must have that subsystem in common.</Notes>
747            </Relationship>
748            <Relationship name="HasRoleSubset" from="Subsystem" to="RoleSubset" arity="1M">
749                <Notes>This relationship connects a subsystem to its constituent
750                role subsets. Note that some roles in a subsystem may not belong to a
751                subset, so the relationship between roles and subsystems cannot be
752                derived from the relationships going through the subset.</Notes>
753            </Relationship>
754            <Relationship name="HasGenomeSubset" from="Subsystem" to="GenomeSubset" arity="1M">
755                <Notes>This relationship connects a subsystem to its constituent
756                genome subsets. Note that some genomes in a subsystem may not belong to a
757                subset, so the relationship between genomes and subsystems cannot be
758                derived from the relationships going through the subset.</Notes>
759            </Relationship>
760            <Relationship name="Catalyzes" from="Role" to="Reaction" arity="MM">
761                <Notes>This relationship connects a role to the reactions it catalyzes.
762                The purpose of a role is to create proteins that trigger certain
763                chemical reactions. A single reaction can be triggered by many roles,
764                and a role can trigger many reactions.</Notes>
765            </Relationship>
766      </Relationships>      </Relationships>
767  </Database>  </Database>
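
The display ordering described for the IsAComponentOf index (product ascending, then loc ascending, then main descending) can be sketched in Python. This is only an illustration of the sort order stated in the Notes, not ERDB or Sprout code: the `Participant` class and the `display_order` and `reaction_string` helpers are hypothetical names, and the sample records are made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Participant:
    """One hypothetical IsAComponentOf record; field names mirror the DBD."""
    compound: str
    product: bool        # False = substrate (left of arrow), True = product
    loc: str             # optional position code; the empty string sorts first
    main: bool           # True if this is a main participant
    stoichiometry: str   # key-string in the DBD, so kept as text here


def display_order(records):
    """Sort by product ascending, loc ascending, main descending."""
    # False sorts before True, so "not r.main" puts main compounds first.
    return sorted(records, key=lambda r: (r.product, r.loc, not r.main))


def reaction_string(records, reversible=False):
    """Build a simple display string from the ordered participants."""
    ordered = display_order(records)

    def side(is_product):
        terms = []
        for r in ordered:
            if r.product == is_product:
                prefix = "" if r.stoichiometry == "1" else r.stoichiometry + " "
                terms.append(prefix + r.compound)
        return " + ".join(terms)

    arrow = "<=>" if reversible else "=>"
    return f"{side(False)} {arrow} {side(True)}"


# Made-up sample data, deliberately listed out of display order.
sample = [
    Participant("H2O", product=True, loc="", main=True, stoichiometry="2"),
    Participant("O2", product=False, loc="", main=False, stoichiometry="1"),
    Participant("H2", product=False, loc="", main=True, stoichiometry="2"),
]
print(reaction_string(sample))
```

Because the empty `loc` sorts ahead of any non-empty value, mixing empty and non-empty location codes within one reaction would pull the unlabeled compounds to the front of their side, which is why the Notes suggest using the field for every compound if it is used at all.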
