Diff of /Sprout/SproutDBD.xml

revision 1.51, Tue Feb 5 05:46:03 2008 UTC revision 1.52, Sun Mar 23 16:32:34 2008 UTC
# Line 1  Line 1 
1  <?xml version="1.0" encoding="utf-8" ?>  <?xml version="1.0" encoding="utf-8" ?>
2  <Database>  <Database>
3      <Title>Sprout Genome and Subsystem Database</Title>      <Title>Sprout Genome and Subsystem Database</Title>
4      <Notes>The Sprout database contains the genetic data for all complete organisms in the SEED.      <Notes>The Sprout database contains the genetic data for all complete organisms in the [[SeedEnvironment]].
5      The data that is not in Sprout-- attributes, similarities, couplings-- is stored on external      The data that is not in Sprout-- attributes, similarities, couplings-- is stored on external
6      servers available to the Sprout software. The Sprout database is reloaded approximately once      servers available to the Sprout software. The Sprout database is reloaded approximately once
7      per month. There is significant redundancy in the Sprout database because it has been      per month. There is significant redundancy in the Sprout database because it has been
# Line 9  Line 9 
9      feature's functional role and a list of possible search terms.</Notes>      feature's functional role and a list of possible search terms.</Notes>
10      <Entities>      <Entities>
11          <Entity name="Genome" keyType="name-string">          <Entity name="Genome" keyType="name-string">
12              <Notes>A [i]genome[/i] contains the sequence data for a particular individual organism.</Notes>              <Notes>A [[Genome]] contains the sequence data for a particular individual organism.</Notes>
13              <Fields>              <Fields>
14                  <Field name="genus" type="name-string">                  <Field name="genus" type="name-string">
15                      <Notes>Genus of the relevant organism.</Notes>                      <Notes>Genus of the relevant organism.</Notes>
# Line 27  Line 27 
27                      by a period and a string of digits.</Notes>                      by a period and a string of digits.</Notes>
28                  </Field>                  </Field>
29                  <Field name="access-code" type="key-string">                  <Field name="access-code" type="key-string">
30                      <Notes>The access code determines which users can look at the data relating to this genome.                      <Notes>The access code field is deprecated. Its function has been replaced by
31                      Each user is associated with a set of access codes. In order to view a genome, one of                      the account management system developed for the [[RapidAnnotationServer]].</Notes>
                     the user's access codes must match this value.</Notes>  
32                  </Field>                  </Field>
33                  <Field name="complete" type="boolean">                  <Field name="complete" type="boolean">
34                      <Notes>TRUE if the genome is complete, else FALSE</Notes>                      <Notes>TRUE if the genome is complete, else FALSE</Notes>
# Line 38  Line 37 
37                      <Notes>number of base pairs in the genome</Notes>                      <Notes>number of base pairs in the genome</Notes>
38                  </Field>                  </Field>
39                  <Field name="taxonomy" type="text">                  <Field name="taxonomy" type="text">
40                      <Notes>The taxonomy string contains the full taxonomy of the organism, while individual elements                      <Notes>The taxonomy string contains the full [[Wikipedia:taxonomy]] of the organism, while individual elements
41                      separated by semi-colons (and optional white space), starting with the domain and ending with                      separated by semi-colons (and optional white space), starting with the domain and ending with
42                      the disambiguated genus and species (which is the organism's scientific name plus an                      the disambiguated genus and species (which is the organism's scientific name plus an
43                      identifying string).</Notes>                      identifying string).</Notes>
44                  </Field>                  </Field>
45                  <Field name="primary-group" type="name-string">                  <Field name="primary-group" type="name-string">
46                      <Notes>The primary NMPDR group for this organism. There is always exactly one NMPDR group                      <Notes>The primary NMPDR group for this organism. There is always exactly one NMPDR group
47                      (either based on the organism name or the default value "Supporting"), whereas there can be                      per organism (either based on the organism name or the default value =Supporting=). In general,
48                      multiple named groups or even none.</Notes>                      more data is kept on organisms in NMPDR groups than on supporting organisms.</Notes>
49                    </Field>
50                    <Field name="contigs" type="int">
51                        <Notes>Number of contigs for this organism.</Notes>
52                    </Field>
53                    <Field name="pegs" type="int">
54                        <Notes>Number of [[protein encoding genes]] for this organism</Notes>
55                    </Field>
56                    <Field name="rnas" type="int">
57                        <Notes>Number of RNA features found for this organism.</Notes>
58                  </Field>                  </Field>
59              </Fields>              </Fields>
60              <Indexes>              <Indexes>
# Line 84  Line 92 
92          </Entity>          </Entity>
93          <Entity name="CDD" keyType="key-string">          <Entity name="CDD" keyType="key-string">
94              <Notes>A CDD is a protein domain designator. It represents the shape of a molecular unit              <Notes>A CDD is a protein domain designator. It represents the shape of a molecular unit
95              on a feature's protein. The ID is six-digit string assigned by the public Conserved Domain              on a feature's protein. The ID is six-digit string assigned by the public
96              Database. A CDD can occur on multiple features and a feature generally has multiple CDDs.</Notes>              [[http://www.ncbi.nlm.nih.gov/Structure/cdd/cdd.shtml Conserved Domain Database]]. A CDD
97                can occur on multiple features and a feature generally has multiple CDDs.</Notes>
98          </Entity>          </Entity>
99          <Entity name="Source" keyType="medium-string">          <Entity name="Source" keyType="medium-string">
100              <Notes>A [i]source[/i] describes a place from which genome data was taken. This can be an organization              <Notes>A _source_ describes a place from which genome data was taken. This can be an organization
101              or a paper citation.</Notes>              or a paper citation.</Notes>
102              <Fields>              <Fields>
103                  <Field name="URL" type="string" relation="SourceURL">                  <Field name="URL" type="string" relation="SourceURL">
104                      <Notes>URL the paper cited or of the organization's web site. This field optional.</Notes>                      <Notes>URL the paper cited or of the organization's web site. This field optional.</Notes>
105                  </Field>                  </Field>
106                  <Field name="description" type="text">                  <Field name="description" type="text">
107                      <Notes>Description the source. The description can be a street address or a citation.</Notes>                      <Notes>Description of the source. The description can be a street address or a citation.</Notes>
108                  </Field>                  </Field>
109              </Fields>              </Fields>
110          </Entity>          </Entity>
111          <Entity name="Contig" keyType="name-string">          <Entity name="Contig" keyType="name-string">
112              <Notes>A [i]contig[/i] is a contiguous run of residues. The contig's ID consists of the              <Notes>A _contig_ is a contiguous run of residues. The contig's ID consists of the
113              genome ID followed by a name that identifies which contig this is for the parent genome. As              genome ID followed by a name that identifies which contig this is for the parent genome. As
114              is the case with all keys in this database, the individual components are separated by a              is the case with all keys in this database, the individual components are separated by a
115              period.              period. A contig can contain over a million residues. For performance reasons, therefore,
116              [p]A contig can contain over a million residues. For performance reasons, therefore,              the contig is split into multiple pieces called _sequences_. The sequences
             the contig is split into multiple pieces called [i]sequences[/i]. The sequences  
117              contain the characters that represent the residues as well as data on the quality of              contain the characters that represent the residues as well as data on the quality of
118              the residue identification.</Notes>              the residue identification.</Notes>
119          </Entity>          </Entity>
120          <Entity name="Sequence" keyType="name-string">          <Entity name="Sequence" keyType="name-string">
121              <Notes>A [i]sequence[/i] is a continuous piece of a [i]contig[/i]. Contigs are split into              <Notes>A _sequence_ is a continuous piece of a contig. Contigs are split into
122              sequences so that we don't have to have the entire contig in memory when we are              sequences so that we don't have to have the entire contig in memory when we are
123              manipulating it. The key of the sequence is the contig ID followed by the index of              manipulating it. The key of the sequence is the contig ID followed by the index of
124              the begin point.</Notes>              the begin point.</Notes>
125              <Fields>              <Fields>
126                  <Field name="sequence" type="text">                  <Field name="sequence" type="text">
127                      <Notes>String consisting of the residues. Each residue is described by a single                      <Notes>String consisting of the residues (base pairs). Each residue is described by a single
128                      character in the string.</Notes>                      character in the string.</Notes>
129                  </Field>                  </Field>
130                  <Field name="quality-vector" type="text">                  <Field name="quality-vector" type="text">
131                      <Notes>String describing the quality data for each base pair. Individual values will                      <Notes>String describing the quality data for each base pair. Individual values will
132                      be separated by periods. The value represents negative exponent of the probability                      be separated by periods. The value represents negative exponent of the probability
133                      of error. Thus, for example, a quality of 30 indicates the probability of error is                      of error. Thus, for example, a quality of 30 indicates the probability of error is
134                      10^-30. A higher quality number a better chance of a correct match. It is possible                      10^-30. A higher quality number indicates a better chance of a correct match. It is
135                      that the quality data is not known for a sequence. If that is the case, the quality                      possible that the quality data is not known for a sequence. If that is the case, the
136                      vector will contain the [b]unknown[/b].</Notes>                      quality vector will contain the string =unknown=.</Notes>
137                  </Field>                  </Field>
138              </Fields>              </Fields>
139          </Entity>          </Entity>
140          <Entity name="Feature" keyType="id-string">          <Entity name="Feature" keyType="id-string">
141              <Notes>A [i]feature[/i] is a part of a genome that is of special interest. Features              <Notes>A _feature_ (sometimes also called a [[gene]]) is a part of a genome that is of special interest. Features
142              may be spread across multiple contigs of a genome, but never across more than              may be spread across multiple contigs of a genome, but never across more than
143              one genome. Features can be assigned to roles via spreadsheet cells,              one genome. Features can be assigned to roles via spreadsheet cells,
144              and are the targets of annotation.</Notes>              and are the targets of annotation. Each feature in the database has a unique [[FigId]].</Notes>
145              <Fields>              <Fields>
146                  <Field name="feature-type" type="id-string">                  <Field name="feature-type" type="id-string">
147                      <Notes>Code indicating the type of this feature.</Notes>                      <Notes>Code indicating the type of this feature. Among the codes currently
148                        supported are =peg= for a [[protein encoding gene]], =bs= for a
149                        binding site, =opr= for an operon, and so forth.</Notes>
150                  </Field>                  </Field>
151                  <Field name="translation" type="text" relation="FeatureTranslation">                  <Field name="translation" type="text" relation="FeatureTranslation">
152                      <Notes>[i](optional)[/i] A translation of this feature's residues into character                      <Notes>_(optional)_ A translation of this feature's residues into character
153                      codes, formed by concatenating the pieces of the feature together. For a                      codes, formed by concatenating the pieces of the feature together. For a
154                      protein encoding group, this is the protein characters. For other types                      [[protein encoding gene]], the translation contains protein characters. For other types
155                      it is the DNA characters.</Notes>                      it contains DNA characters.</Notes>
156                  </Field>                  </Field>
157                  <Field name="upstream-sequence" type="text" relation="FeatureUpstream">                  <Field name="upstream-sequence" type="text" relation="FeatureUpstream">
158                      <Notes>Upstream sequence the feature. This includes residues preceding the feature as well as some of                      <Notes>Upstream sequence for the feature. This includes residues preceding the feature as
159                      the feature's initial residues.</Notes>                      well as some of the feature's initial residues.</Notes>
160                  </Field>                  </Field>
161                  <Field name="assignment" type="text">                  <Field name="assignment" type="text">
162                      <Notes>Default functional assignment for this feature.</Notes>                      <Notes>Default functional assignment for this feature.</Notes>
163                  </Field>                  </Field>
164                  <Field name="active" type="boolean">                  <Field name="active" type="boolean">
165                      <Notes>TRUE if this feature is still considered valid, FALSE if it has been logically deleted.</Notes>                      <Notes>(This field is deprecated.) TRUE if this feature is still considered valid,
166                        FALSE if it has been logically deleted.</Notes>
167                  </Field>                  </Field>
168                  <Field name="assignment-maker" type="name-string">                  <Field name="assignment-maker" type="name-string">
169                      <Notes>name of the user who made the functional assignment</Notes>                      <Notes>name of the user who made the functional assignment</Notes>
# Line 166  Line 177 
177                      functional assignment, subsystem roles, and special properties.</Notes>                      functional assignment, subsystem roles, and special properties.</Notes>
178                  </Field>                  </Field>
179                  <Field name="link" type="text" relation="FeatureLink">                  <Field name="link" type="text" relation="FeatureLink">
180                      <Notes>Web hyperlink for this feature. A feature have no hyperlinks or it can have many. The                      <Notes>Web hyperlink for this feature. A feature can have no hyperlinks or it can have many. The
181                      links are to other websites that have useful about the gene that the feature represents, and                      links are to other websites that have useful about the gene that the feature represents, and
182                      are coded as raw HTML, using [b]&lt;a href="[i]link[/i]"&gt;[i]text[/i]&lt;/a&gt;[/b] notation.</Notes>                      are coded as raw HTML, using &lt;a href="_link_"&gt;_text_&lt;/a&gt; notation.</Notes>
183                  </Field>                  </Field>
184                  <Field name="conservation" type="float" relation="FeatureConservation">                  <Field name="conservation" type="float" relation="FeatureConservation">
185                      <Notes>A number between 0 and 1 that indicates the degree to which this feature's DNA is                      <Notes>_(optional)_ A number between 0 and 1 that indicates the degree to which this feature's DNA is
186                      conserved in related genomes. A value of 1 indicates perfect conservation. A value less                      conserved in related genomes. A value of 1 indicates perfect conservation. A value less
187                      than 1 is a reflection of the degree to which gap characters interfere in the alignment                      than 1 is a reflection of the degree to which gap characters interfere in the alignment
188                      between the feature and its close relatives.</Notes>                      between the feature and its close relatives.</Notes>
# Line 202  Line 213 
213                  <Field name="location-string" type="text">                  <Field name="location-string" type="text">
214                      <Notes>Location of the feature, expressed as a comma-delimited list of Sprout location                      <Notes>Location of the feature, expressed as a comma-delimited list of Sprout location
215                      strings. This gives us a fast mechanism for extracting the feature location. Otherwise,                      strings. This gives us a fast mechanism for extracting the feature location. Otherwise,
216                      we have to painstakingly paste together the IsLocatedIn records, which are themselves                      we have to painstakingly paste together the [[#IsLocatedIn]] records, which are themselves
217                      designed to help look for genes in a particular region rather than to find the location                      designed to help look for features in a particular region rather than to find the location
218                      of a gene.</Notes>                      of a feature.</Notes>
219                  </Field>                  </Field>
220              </Fields>              </Fields>
221              <Indexes>              <Indexes>
# Line 218  Line 229 
229          </Entity>          </Entity>
230          <Entity name="FeatureAlias" keyType="medium-string">          <Entity name="FeatureAlias" keyType="medium-string">
231              <Notes>Alternative names for features. A feature can have many aliases. In general,              <Notes>Alternative names for features. A feature can have many aliases. In general,
232              each alias corresponds to only one feature, but there are exceptionsis is not strictly enforced.</Notes>              each alias corresponds to only one feature, but there are many exceptions to this rule.</Notes>
233            </Entity>
234            <Entity name="SproutUser" keyType="name-string">
235                <Notes>A _user_ is a person who can make annotations and view data in the database. The
236                user object is keyed on the user's login name.</Notes>
237                <Fields>
238                    <Field name="description" type="string">
239                        <Notes>Full name or description of this user.</Notes>
240                    </Field>
241                    <Field name="access-code" type="key-string" relation="UserAccess">
242                        <Notes>This field is deprecated.</Notes>
243                    </Field>
244                </Fields>
245          </Entity>          </Entity>
246          <Entity name="SynonymGroup" keyType="id-string">          <Entity name="SynonymGroup" keyType="id-string">
247              <Notes>A [i]synonym group[/i] represents a group of features. Substantially identical features              <Notes>A _synonym group_ represents a group of features. Features that represent substantially
248              are mapped to the same synonym group, and this information is used to expand similarities.</Notes>              identical proteins or DNA sequences are mapped to the same synonym group, and this information is
249                used to expand similarities.</Notes>
250          </Entity>          </Entity>
251          <Entity name="Role" keyType="string">          <Entity name="Role" keyType="string">
252              <Notes>A [i]role[/i] describes a biological function that may be fulfilled by a feature.              <Notes>A _role_ describes a biological function that may be fulfilled by a feature.
253              One of the main goals of the database is to record the roles of the various features.</Notes>              One of the main goals of the database is to record the roles of the various features.</Notes>
254          </Entity>          </Entity>
255          <Entity name="RoleEC" keyType="string">          <Entity name="RoleEC" keyType="string">
256              <Notes>EC code for a role.</Notes>              <Notes>EC code for a role.</Notes>
257          </Entity>          </Entity>
258          <Entity name="Annotation" keyType="name-string">          <Entity name="Annotation" keyType="name-string">
259              <Notes>An [i]annotation[/i] contains supplementary information about a feature. Annotations              <Notes>An _annotation_ contains supplementary information about a feature. The most
260              are currently the only objects that may be inserted directly into the database. All other              important type of annotation is the assignment of a [[functional role]]; however,
261              information is loaded from data exported by the SEED.</Notes>              other types of annotations are also possible.</Notes>
262              <Fields>              <Fields>
263                  <Field name="time" type="date">                  <Field name="time" type="date">
264                      <Notes>Date and time of the annotation.</Notes>                      <Notes>Date and time of the annotation.</Notes>
# Line 253  Line 277 
277              </Indexes>              </Indexes>
278          </Entity>          </Entity>
279          <Entity name="Reaction" keyType="key-string">          <Entity name="Reaction" keyType="key-string">
280              <Notes>A [i]reaction[/i] is a chemical process catalyzed by a protein. The reaction ID              <Notes>A _reaction_ is a chemical process catalyzed by a protein. The reaction ID
281              is generally a small number preceded by a letter.</Notes>              is generally a small number preceded by a letter.</Notes>
282              <Fields>              <Fields>
283                  <Field name="url" type="string" relation="ReactionURL">                  <Field name="url" type="string" relation="ReactionURL">
# Line 266  Line 290 
290              </Fields>              </Fields>
291          </Entity>          </Entity>
292          <Entity name="Compound" keyType="name-string">          <Entity name="Compound" keyType="name-string">
293              <Notes>A [i]compound[/i] is a chemical that participates in a reaction.              <Notes>A _compound_ is a chemical that participates in a reaction.
294              All compounds have a unique ID and may also have one or more names.</Notes>              All compounds have a unique ID and may also have one or more names.</Notes>
295              <Fields>              <Fields>
296                  <Field name="label" type="string">                  <Field name="label" type="string">
297                      <Notes>Name used in reaction display strings.                      <Notes>Name used in reaction display strings. This is the same as the name
298                      It is the same as the name possessing a priority of 1, but it is placed                      possessing a priority of 1, but it is placed here to speed up the query
299                      here to speed up the query used to create the display strings.</Notes>                      used to create the display strings.</Notes>
300                  </Field>                  </Field>
301              </Fields>              </Fields>
302          </Entity>          </Entity>
303          <Entity name="CompoundName" keyType="string">          <Entity name="CompoundName" keyType="string">
304              <Notes>A [i]compound name[/i] is a common name for the chemical represented by a              <Notes>A _compound name_ is a common name for the chemical represented by a
305              compound.</Notes>              compound.</Notes>
306          </Entity>          </Entity>
307          <Entity name="CompoundCAS" keyType="name-string">          <Entity name="CompoundCAS" keyType="name-string">
308              <Notes>This entity represents the Chemical Abstract Service ID for a compound. Each              <Notes>This entity represents the [[http://www.cas.org/ Chemical Abstract Service]] ID for a
309              Compound has at most one CAS ID.</Notes>              compound. Each Compound has at most one CAS ID.</Notes>
310          </Entity>          </Entity>
311          <Entity name="Subsystem" keyType="string">          <Entity name="Subsystem" keyType="string">
312              <Notes>A [i]subsystem[/i] is a collection of roles that work together in a cell. Identification of subsystems              <Notes>A _subsystem_ is a collection of roles that work together in a cell. Identification of subsystems
313              is an important tool for recognizing parallel genetic features in different organisms.</Notes>              is an important tool for recognizing parallel genetic features in different organisms. See also
314                [[Subsystem Approach]] and [[Subsystem]].</Notes>
315              <Fields>              <Fields>
316                  <Field name="curator" type="string">                  <Field name="curator" type="string">
317                      <Notes>Name of the person currently in charge of the subsystem.</Notes>                      <Notes>Name of the person currently in charge of the subsystem.</Notes>
# Line 294  Line 319 
319                  <Field name="notes" type="text">                  <Field name="notes" type="text">
320                      <Notes>Descriptive notes about the subsystem.</Notes>                      <Notes>Descriptive notes about the subsystem.</Notes>
321                  </Field>                  </Field>
322                    <Field name="description" type="text">
323                        <Notes>Description of the subsystem's function.</Notes>
324                    </Field>
325                  <Field name="classification" type="string" relation="SubsystemClass">                  <Field name="classification" type="string" relation="SubsystemClass">
326                      <Notes>Classification string, colon-delimited. This string organizes the                      <Notes>Classification string, colon-delimited. This string organizes the
327                      subsystems into a hierarchy.</Notes>                      subsystems into a hierarchy.</Notes>
# Line 301  Line 329 
329              </Fields>              </Fields>
330          </Entity>          </Entity>
331          <Entity name="RoleSubset" keyType="string">          <Entity name="RoleSubset" keyType="string">
332              <Notes>A [i]role subset[/i] is a named collection of roles in a particular subsystem. The              <Notes>A _role subset_ is a named collection of roles in a particular subsystem. The
333              subset names are generally very short, non-unique strings. The ID of the parent              subset names are generally very short, non-unique strings. The ID of the parent
334              subsystem is prefixed to the subset ID in order to make it unique.</Notes>              subsystem is prefixed to the subset ID in order to make it unique.</Notes>
335          </Entity>          </Entity>
336          <Entity name="GenomeSubset" keyType="string">          <Entity name="GenomeSubset" keyType="string">
337              <Notes>A [i]genome subset[/i] is a named collection of genomes that participate              <Notes>A _genome subset_ is a named collection of genomes that participate
338              in a particular subsystem. The subset names are generally very short, non-unique              in a particular subsystem. The subset names are generally very short, non-unique
339              strings. The ID of the parent subsystem is prefixed to the subset ID in order              strings. The ID of the parent subsystem is prefixed to the subset ID in order
340              to make it unique.</Notes>              to make it unique.</Notes>
341          </Entity>          </Entity>
342          <Entity name="SSCell" keyType="hash-string">          <Entity name="SSCell" keyType="hash-string">
343              <Notes>Part of the process of locating and assigning features is creating a spreadsheet of              <Notes>Part of the process of [[SubsystemsApproach][subsystem annotation]] of [[features]]
344              genomes and roles to which features are assigned. A [i]spreadsheet cell[/i] represents one              is creating a spreadsheet of genomes and roles to which features are assigned. A _spreadsheet
345              of the positions on the spreadsheet.</Notes>              cell_ represents one of the positions on the spreadsheet.</Notes>
         </Entity>  
         <Entity name="SproutUser" keyType="name-string">  
             <Notes>A [i]user[/i] is a person who can make annotations and view data in the database. The  
             user object is keyed on the user's login name.</Notes>  
             <Fields>  
                 <Field name="description" type="string">  
                     <Notes>Full name or description of this user.</Notes>  
                 </Field>  
                 <Field name="access-code" type="key-string" relation="UserAccess">  
                     <Notes>Access code possessed by this  
                     user. A user can have many access codes; a genome is accessible to the user if its  
                     access code matches any one of the user's access codes.</Notes>  
                 </Field>  
             </Fields>  
346          </Entity>          </Entity>
347          <Entity name="Property" keyType="int">          <Entity name="Property" keyType="int">
348              <Notes>A [i]property[/i] is a type of assertion that could be made about the properties of              <Notes>A _property_ is a type of assertion that could be made about the properties of
349              a particular feature. Each property instance is a key/value pair and can be associated              a particular feature. Each property instance is a key/value pair and can be associated
350              with many different features. Conversely, a feature can be associated with many key/value              with many different features. Conversely, a feature can be associated with many key/value
351              pairs, even some that notionally contradict each other. For example, there can be evidence              pairs, even some that notionally contradict each other. For example, there can be evidence
# Line 358  Line 372 
372              </Indexes>              </Indexes>
373          </Entity>          </Entity>
374          <Entity name="Diagram" keyType="name-string">          <Entity name="Diagram" keyType="name-string">
375              <Notes>A functional diagram describes the chemical reactions, often comprising a single              <Notes>A functional diagram describes a network of chemical reactions, often comprising a single
376              subsystem. A diagram is identified by a short name and contains a longer descriptive name.              subsystem. A diagram is identified by a short name and contains a longer descriptive name.
377              The actual diagram shows which functional roles guide the reactions along with the inputs              The actual diagram shows which functional roles guide the reactions along with the inputs
378              and outputs; the database, however, only indicate which roles belong to a particular              and outputs; the database, however, only indicates which roles belong to a particular
379              map.</Notes>              diagram's map.</Notes>
380              <Fields>              <Fields>
381                  <Field name="name" type="text">                  <Field name="name" type="text">
382                      <Notes>Descriptive name of this diagram.</Notes>                      <Notes>Descriptive name of this diagram.</Notes>
# Line 392  Line 406 
406                  </Fields>                  </Fields>
407          </Entity>          </Entity>
408          <Entity name="Family" keyType="id-string">          <Entity name="Family" keyType="id-string">
409              <Notes>A family is a group of homologous PEGs believed to have the same function. Protein              <Notes>A _family_ (also called a [[FigFam]]) is a group of homologous features believed to have
410              families provide a mechanism for verifying the accuracy of functional assignments              the same function. Families provide a mechanism for verifying the accuracy of functional assignments
411              and are also used in determining phylogenetic trees.</Notes>              and are also used in [[Rapid Annotation]] and in determining phylogenetic trees.</Notes>
412              <Fields>              <Fields>
413                  <Field name="function" type="text">                  <Field name="function" type="text">
414                      <Notes>The functional assignment expected for all PEGs in this family.</Notes>                      <Notes>The functional assignment expected for all PEGs in this family.</Notes>
# Line 407  Line 421 
421              </Fields>              </Fields>
422          </Entity>          </Entity>
423          <Entity name="PDB" keyType="id-string">          <Entity name="PDB" keyType="id-string">
424              <Notes>A PDB is a protein database containing information that can be used to determine              <Notes>A PDB is a protein data bank entry containing information that can be used
425              the shape of the protein and the energies required to dock with it. The ID is the              to determine the shape of the protein and the energies required to dock with it.
426              four-character name used on the PDB web site.</Notes>              The ID is the four-character name used on the [[http://www.rcsb.org PDB web site]].</Notes>
427              <Fields>              <Fields>
428                  <Field name="docking-count" type="int">                  <Field name="docking-count" type="int">
429                      <Notes>The number of ligands that have been docked against this PDB.</Notes>                      <Notes>The number of ligands that have been docked against this PDB.</Notes>
# Line 426  Line 440 
440          </Entity>          </Entity>
441          <Entity name="Ligand" keyType="id-string">          <Entity name="Ligand" keyType="id-string">
442              <Notes>A Ligand is a chemical of interest in computing docking energies against a PDB.              <Notes>A Ligand is a chemical of interest in computing docking energies against a PDB.
443              The ID of the ligand is an 8-digit ZINC ID number.</Notes>              The ID of the ligand is an 8-digit ID number in the [[http://zinc.docking.org ZINC database]].</Notes>
444              <Fields>              <Fields>
445                  <Field name="name" type="long-string">                  <Field name="name" type="long-string">
446                      <Notes>Chemical name of this ligand.</Notes>                      <Notes>Chemical name of this ligand.</Notes>
# Line 510  Line 524 
524              </FromIndex>              </FromIndex>
525          </Relationship>          </Relationship>
526          <Relationship name="DocksWith" from="PDB" to="Ligand" arity="MM">          <Relationship name="DocksWith" from="PDB" to="Ligand" arity="MM">
527              <Notes>Indicates that a docking result exists between a PDB and a ligand. The              <Notes>Indicates that a [[docking result]] exists between a PDB and a ligand. The
528              docking result describes the energy required for the ligand to dock with              docking result describes the energy required for the ligand to dock with
529              the protein described by the PDB. A lower energy indicates the ligand has a              the protein described by the PDB. A lower energy indicates the ligand has a
530              good chance of disabling the protein. At the current time, only the best              good chance of disabling the protein. At the current time, only the best
# Line 518  Line 532 
532              <Fields>              <Fields>
533                  <Field name="reason" type="id-string">                  <Field name="reason" type="id-string">
534                      <Notes>Indication of the reason for determining the docking result.                      <Notes>Indication of the reason for determining the docking result.
535                      A value of [b]Random[/b] indicates the docking was attempted as a part                      A value of =Random= indicates the docking was attempted as a part
536                      of a random survey used to determine the docking characteristics of the                      of a random survey used to determine the docking characteristics of the
537                      PDB. A value of [b]Rich[/b] indicates the docking was attempted because                      PDB. A value of =Rich= indicates the docking was attempted because
538                      a low-energy docking result was predicted for the ligand with respect                      a low-energy docking result was predicted for the ligand with respect
539                      to the PDB.</Notes>                      to the PDB.</Notes>
540                  </Field>                  </Field>
# Line 537  Line 551 
551                  </Field>                  </Field>
552                  <Field name="electrostatic-energy" type="float">                  <Field name="electrostatic-energy" type="float">
553                      <Notes>Docking energy in kcal/mol that results from the movement of                      <Notes>Docking energy in kcal/mol that results from the movement of
554                      electrons (electrostatic force) between the PDB and the ligan.</Notes>                      electrons (electrostatic force) between the PDB and the ligand.</Notes>
555                  </Field>                  </Field>
556              </Fields>              </Fields>
557              <FromIndex>              <FromIndex>
# Line 549  Line 563 
563              </FromIndex>              </FromIndex>
564              <ToIndex>              <ToIndex>
565                  <Notes>This index enables the application to view a ligand's docking results from                  <Notes>This index enables the application to view a ligand's docking results from
566                  the lowest energy (best docking) to highest energy (worst docking). Note that                  the lowest energy (best docking) to highest energy (worst docking).</Notes>
                 since we only keep the best docking results for a PDB, this index is not likely  
                 to provide useful results.</Notes>  
567              </ToIndex>              </ToIndex>
568          </Relationship>          </Relationship>
569          <Relationship name="IsFamilyForFeature" from="Family" to="Feature" arity="MM">          <Relationship name="IsFamilyForFeature" from="Family" to="Feature" arity="MM">
# Line 621  Line 633 
633          <Relationship name="ParticipatesIn" from="Genome" to="Subsystem" arity="MM">          <Relationship name="ParticipatesIn" from="Genome" to="Subsystem" arity="MM">
634              <Notes>This relationship connects subsystems to the genomes that use              <Notes>This relationship connects subsystems to the genomes that use
635              it. If the subsystem has been curated for the genome, then the subsystem's roles will also be              it. If the subsystem has been curated for the genome, then the subsystem's roles will also be
636              connected to the genome features through the [b]SSCell[/b] object.</Notes>              connected to the genome features through the *SSCell* object.</Notes>
637              <Fields>              <Fields>
638                  <Field name="variant-code" type="key-string">                  <Field name="variant-code" type="key-string">
639                      <Notes>Code indicating the subsystem variant to which this                      <Notes>Code indicating the subsystem variant to which this
640                      genome belongs. Each subsystem can have multiple variants. A variant                      genome belongs. Each subsystem can have multiple variants. A variant
641                      code of [b]-1[/b] indicates that the genome does not have a functional                      code of =-1= indicates that the genome does not have a functional
642                      variant of the subsystem. A variant code of [b]0[/b] indicates that                      variant of the subsystem. A variant code of =0= indicates that
643                      the genome's participation is considered iffy.</Notes>                      the genome's participation is considered iffy.</Notes>
644                  </Field>                  </Field>
645              </Fields>              </Fields>
# Line 742  Line 754 
754              <Notes>This relationship connects a feature to the contig segments that work together              <Notes>This relationship connects a feature to the contig segments that work together
755              to effect it. The segments are numbered sequentially starting from 1. The database is              to effect it. The segments are numbered sequentially starting from 1. The database is
756              required to place an upper limit on the length of each segment. If a segment is longer              required to place an upper limit on the length of each segment. If a segment is longer
757              than the maximum, it can be broken into smaller bits.              than the maximum, it can be broken into smaller bits.  The upper limit enables applications
758              [p]The upper limit enables applications to locate all features that contain a specific              to locate all features that contain a specific residue. For example, if the upper limit
759              residue. For example, if the upper limit is 100 and we are looking for a feature that              is 100 and we are looking for a feature that contains residue 234 of contig *ABC*, we
760              contains residue 234 of contig [b]ABC[/b], we can look for features with a begin point              can look for features with a begin point between 135 and 333. The results can then be
761              between 135 and 333. The results can then be filtered by direction and length of the              filtered by direction and length of the segment.</Notes>
             segment.</Notes>  
762              <Fields>              <Fields>
763                  <Field name="locN" type="int">                  <Field name="locN" type="int">
764                      <Notes>Sequence number of this segment.</Notes>                      <Notes>Sequence number of this segment.</Notes>
# Line 762  Line 773 
773                      is forward and the point after the residue if the direction is backward.</Notes>                      is forward and the point after the residue if the direction is backward.</Notes>
774                  </Field>                  </Field>
775                  <Field name="dir" type="char">                  <Field name="dir" type="char">
776                      <Notes>Direction of the segment: [b]+[/b] if it is forward and                      <Notes>Direction of the segment: =+= if it is forward and
777                      [b]-[/b] if it is backward.</Notes>                      =-= if it is backward.</Notes>
778                  </Field>                  </Field>
779              </Fields>              </Fields>
780              <FromIndex>              <FromIndex>
# Line 812  Line 823 
823              assignment displayed is the most recent one by a user trusted              assignment displayed is the most recent one by a user trusted
824              by the current user. The current user implicitly trusts himself.              by the current user. The current user implicitly trusts himself.
825              If no trusted users are specified in the database, the user              If no trusted users are specified in the database, the user
826              also implicitly trusts the user [b]FIG[/b].</Notes>              also implicitly trusts the user =FIG=.</Notes>
827          </Relationship>          </Relationship>
828          <Relationship name="ConsistsOfRoles" from="RoleSubset" to="Role" arity="MM">          <Relationship name="ConsistsOfRoles" from="RoleSubset" to="Role" arity="MM">
829              <Notes>This relationship connects a role subset to the roles that it covers.              <Notes>This relationship connects a role subset to the roles that it covers.
# Line 849  Line 860 
860          <Relationship name="HasRoleInSubsystem" from="Feature" to="Subsystem" arity="MM">          <Relationship name="HasRoleInSubsystem" from="Feature" to="Subsystem" arity="MM">
861              <Notes>This relationship connects a feature to the subsystems in which it              <Notes>This relationship connects a feature to the subsystems in which it
862              participates. This is technically redundant information, but it is used              participates. This is technically redundant information, but it is used
863              so often that it deserves its own table.</Notes>              so often that it gets its own table for performance reasons.</Notes>
864              <Fields>              <Fields>
865                  <Field name="genome" type="name-string">                  <Field name="genome" type="name-string">
866                      <Notes>ID of the genome containing the feature</Notes>                      <Notes>ID of the genome containing the feature</Notes>
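
The note in the hunk at line 754 explains how the upper limit on segment length lets an application find every feature containing a given residue: with a limit of 100 and residue 234, only segments whose begin point lies between 135 and 333 need to be examined, and those are then filtered by direction and length. A minimal Python sketch of that lookup follows. It is not the Sprout implementation. The `Segment` class, its field names (`direction` stands in for the `dir` field), and the helper functions are hypothetical, and it assumes that the begin point is the leftmost residue of a forward segment and the rightmost residue of a backward segment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """Hypothetical in-memory stand-in for one feature-to-contig segment row."""
    feature_id: str
    contig: str
    begin: int       # assumed: leftmost residue if '+', rightmost residue if '-'
    direction: str   # '+' for forward, '-' for backward (the `dir` field)
    length: int      # never larger than the database's upper limit


def begin_window(residue, max_len):
    """Range of begin points that could belong to a segment covering `residue`."""
    return residue - max_len + 1, residue + max_len - 1


def covers(seg, residue):
    """Exact check that filters a candidate by direction and length."""
    if seg.direction == "+":
        return seg.begin <= residue <= seg.begin + seg.length - 1
    return seg.begin - seg.length + 1 <= residue <= seg.begin


def features_containing(segments, contig, residue, max_len):
    """Narrow by begin point first, then keep only segments that really cover the residue."""
    low, high = begin_window(residue, max_len)
    candidates = [s for s in segments
                  if s.contig == contig and low <= s.begin <= high]
    return sorted({s.feature_id for s in candidates if covers(s, residue)})


# Illustrative data only; the feature IDs are made up.
segments = [
    Segment("f1", "ABC", 135, "+", 100),  # covers 135..234
    Segment("f2", "ABC", 333, "-", 100),  # covers 234..333
    Segment("f3", "ABC", 200, "+", 10),   # in the window, but covers only 200..209
    Segment("f4", "XYZ", 200, "+", 100),  # different contig
]
window = begin_window(234, 100)
hits = features_containing(segments, "ABC", 234, 100)
```

In a real database the window test would be an indexed range query on the begin point, which is why the schema requires the upper limit: without it, every segment on the contig would have to be checked.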

