Diff of /Sprout/SproutDBD.xml

revision 1.41, Fri Oct 13 21:46:45 2006 UTC revision 1.50, Mon Jul 16 19:59:33 2007 UTC
# Line 7  Line 7 
7              <Fields>              <Fields>
8                  <Field name="genus" type="name-string">                  <Field name="genus" type="name-string">
9                      <Notes>Genus of the relevant organism.</Notes>                      <Notes>Genus of the relevant organism.</Notes>
                     <DataGen pass="1">RandParam('streptococcus', 'staphyloccocus', 'felis', 'homo', 'ficticio', 'strangera', 'escherischia', 'carborunda')</DataGen>  
10                  </Field>                  </Field>
11                  <Field name="species" type="name-string">                  <Field name="species" type="name-string">
12                      <Notes>Species of the relevant organism.</Notes>                      <Notes>Species of the relevant organism.</Notes>
                     <DataGen pass="1">StringGen('PKVKVKVKVKV')</DataGen>  
13                  </Field>                  </Field>
14                  <Field name="unique-characterization" type="medium-string">                  <Field name="unique-characterization" type="medium-string">
15                      <Notes>The unique characterization identifies the particular organism instance from which the                      <Notes>The unique characterization identifies the particular organism instance from which the
16                      genome is taken. It is possible to have in the database more than one genome for a                      genome is taken. It is possible to have in the database more than one genome for a
17                      particular species, and every individual organism has variations in its DNA.</Notes>                      particular species, and every individual organism has variations in its DNA.</Notes>
18                      <DataGen>StringGen('PKVKVK999')</DataGen>                  </Field>
19                    <Field name="version" type="name-string">
20                        <Notes>version string for this genome, generally consisting of the genome ID followed
21                        by a period and a string of digits.</Notes>
22                  </Field>                  </Field>
23                  <Field name="access-code" type="key-string">                  <Field name="access-code" type="key-string">
24                      <Notes>The access code determines which users can look at the data relating to this genome.                      <Notes>The access code determines which users can look at the data relating to this genome.
25                      Each user is associated with a set of access codes. In order to view a genome, one of                      Each user is associated with a set of access codes. In order to view a genome, one of
26                      the user's access codes must match this value.</Notes>                      the user's access codes must match this value.</Notes>
                     <DataGen>RandParam('low','medium','high')</DataGen>  
27                  </Field>                  </Field>
28                  <Field name="complete" type="boolean">                  <Field name="complete" type="boolean">
29                      <Notes>TRUE if the genome is complete, else FALSE</Notes>                      <Notes>TRUE if the genome is complete, else FALSE</Notes>
30                  </Field>                  </Field>
31                    <Field name="dna-size" type="counter">
32                        <Notes>number of base pairs in the genome</Notes>
33                    </Field>
34                  <Field name="taxonomy" type="text">                  <Field name="taxonomy" type="text">
35                      <Notes>The taxonomy string contains the full taxonomy of the organism, with individual elements                      <Notes>The taxonomy string contains the full taxonomy of the organism, with individual elements
36                      separated by semi-colons (and optional white space), starting with the domain and ending with                      separated by semi-colons (and optional white space), starting with the domain and ending with
37                      the disambiguated genus and species (which is the organism's scientific name plus an                      the disambiguated genus and species (which is the organism's scientific name plus an
38                      identifying string).</Notes>                      identifying string).</Notes>
                     <DataGen pass="2">join('; ', (RandParam('bacteria', 'archaea', 'eukaryote', 'virus', 'environmental'),  
                                                   ListGen('PKVKVKVK', 5), $this->{genus}, $this->{species}))</DataGen>  
39                  </Field>                  </Field>
40                  <Field name="primary-group" type="name-string">                  <Field name="primary-group" type="name-string">
41                      <Notes>The primary NMPDR group for this organism. There is always exactly one NMPDR group                      <Notes>The primary NMPDR group for this organism. There is always exactly one NMPDR group
42                      (either based on the organism name or the default value "Supporting"), whereas there can be                      (either based on the organism name or the default value "Supporting"), whereas there can be
43                      multiple named groups or even none.</Notes>                      multiple named groups or even none.</Notes>
44                  </Field>                  </Field>
                 <Field name="group-name" type="name-string" relation="GenomeGroups">  
                     <Notes>The group identifies a special grouping of organisms that would be displayed on a particular  
                     page or of particular interest to a research group or web site. A single genome can belong to multiple  
                     such groups or none at all.</Notes>  
                 </Field>  
45              </Fields>              </Fields>
46              <Indexes>              <Indexes>
47                  <Index Unique="false">                  <Index>
48                      <Notes>This index allows the applications to find all genomes associated with                      <Notes>This index allows the applications to find all genomes associated with
49                      a specific access code, so that a complete list of the genomes users can view                      a specific access code, so that a complete list of the genomes users can view
50                      may be generated.</Notes>                      may be generated.</Notes>
# Line 59  Line 55 
55                          <IndexField name="unique-characterization" order="ascending" />                          <IndexField name="unique-characterization" order="ascending" />
56                      </IndexFields>                      </IndexFields>
57                  </Index>                  </Index>
58                  <Index Unique="false">                  <Index>
59                      <Notes>This index allows the applications to find all genomes associated with                      <Notes>This index allows the applications to find all genomes associated with
60                      a specific primary (NMPDR) group.</Notes>                      a specific primary (NMPDR) group.</Notes>
61                      <IndexFields>                      <IndexFields>
# Line 69  Line 65 
65                          <IndexField name="unique-characterization" order="ascending" />                          <IndexField name="unique-characterization" order="ascending" />
66                      </IndexFields>                      </IndexFields>
67                  </Index>                  </Index>
68                  <Index Unique="false">                  <Index>
69                      <Notes>This index allows the applications to find all genomes for a particular                      <Notes>This index allows the applications to find all genomes for a particular
70                      species.</Notes>                      species.</Notes>
71                      <IndexFields>                      <IndexFields>
# Line 80  Line 76 
76                  </Index>                  </Index>
77              </Indexes>              </Indexes>
78          </Entity>          </Entity>
79            <Entity name="CDD" keyType="key-string">
80                <Notes>A CDD is a protein domain designator. It represents the shape of a molecular unit
81                on a feature's protein. The ID is a six-digit string assigned by the public Conserved Domain
82                Database. A CDD can occur on multiple features and a feature generally has multiple CDDs.</Notes>
83            </Entity>
84          <Entity name="Source" keyType="medium-string">          <Entity name="Source" keyType="medium-string">
85              <Notes>A [i]source[/i] describes a place from which genome data was taken. This can be an organization              <Notes>A [i]source[/i] describes a place from which genome data was taken. This can be an organization
86              or a paper citation.</Notes>              or a paper citation.</Notes>
87              <Fields>              <Fields>
88                  <Field name="URL" type="string" relation="SourceURL">                  <Field name="URL" type="string" relation="SourceURL">
89                      <Notes>URL of the paper cited or of the organization's web site. This field is optional.</Notes>                      <Notes>URL of the paper cited or of the organization's web site. This field is optional.</Notes>
                     <DataGen>"http://www.conservativecat.com/Ferdy/TestTarget.php?Source=" . $this->{id}</DataGen>  
90                  </Field>                  </Field>
91                  <Field name="description" type="text">                  <Field name="description" type="text">
92                      <Notes>Description of the source. The description can be a street address or a citation.</Notes>                      <Notes>Description of the source. The description can be a street address or a citation.</Notes>
                     <DataGen>$this->{id} . ': ' . StringGen(IntGen(50,200))</DataGen>  
93                  </Field>                  </Field>
94              </Fields>              </Fields>
95          </Entity>          </Entity>
# Line 113  Line 112 
112                  <Field name="sequence" type="text">                  <Field name="sequence" type="text">
113                      <Notes>String consisting of the residues. Each residue is described by a single                      <Notes>String consisting of the residues. Each residue is described by a single
114                      character in the string.</Notes>                      character in the string.</Notes>
                     <DataGen>RandChars("ACGT", IntGen(100,400))</DataGen>  
115                  </Field>                  </Field>
116                  <Field name="quality-vector" type="text">                  <Field name="quality-vector" type="text">
117                      <Notes>String describing the quality data for each base pair. Individual values will                      <Notes>String describing the quality data for each base pair. Individual values will
# Line 122  Line 120 
120                      10^-30. A higher quality number indicates a better chance of a correct match. It is possible                      10^-30. A higher quality number indicates a better chance of a correct match. It is possible
121                      that the quality data is not known for a sequence. If that is the case, the quality                      that the quality data is not known for a sequence. If that is the case, the quality
122                      vector will contain the string [b]unknown[/b].</Notes>                      vector will contain the string [b]unknown[/b].</Notes>
                     <DataGen>unknown</DataGen>  
123                  </Field>                  </Field>
124              </Fields>              </Fields>
125          </Entity>          </Entity>
# Line 132  Line 129 
129              one genome. Features can be assigned to roles via spreadsheet cells,              one genome. Features can be assigned to roles via spreadsheet cells,
130              and are the targets of annotation.</Notes>              and are the targets of annotation.</Notes>
131              <Fields>              <Fields>
132                  <Field name="feature-type" type="string">                  <Field name="feature-type" type="id-string">
133                      <Notes>Code indicating the type of this feature.</Notes>                      <Notes>Code indicating the type of this feature.</Notes>
                     <DataGen>RandParam('peg','rna')</DataGen>  
                 </Field>  
                 <Field name="alias" type="medium-string" relation="FeatureAlias">  
                     <Notes>Alternative name for this feature. A feature can have many aliases.</Notes>  
                     <DataGen testCount="3">StringGen('Pgi|99999', 'Puni|XXXXXX', 'PAAAAAA999')</DataGen>  
134                  </Field>                  </Field>
135                  <Field name="translation" type="text" relation="FeatureTranslation">                  <Field name="translation" type="text" relation="FeatureTranslation">
136                      <Notes>[i](optional)[/i] A translation of this feature's residues into character                      <Notes>[i](optional)[/i] A translation of this feature's residues into character
137                      codes, formed by concatenating the pieces of the feature together. For a                      codes, formed by concatenating the pieces of the feature together. For a
138                      protein encoding group, this is the protein characters. For other types                      protein encoding group, this is the protein characters. For other types
139                      it is the DNA characters.</Notes>                      it is the DNA characters.</Notes>
                     <DataGen testCount="0"></DataGen>  
140                  </Field>                  </Field>
141                  <Field name="upstream-sequence" type="text" relation="FeatureUpstream">                  <Field name="upstream-sequence" type="text" relation="FeatureUpstream">
142                      <Notes>Upstream sequence of the feature. This includes residues preceding the feature as well as some of                      <Notes>Upstream sequence of the feature. This includes residues preceding the feature as well as some of
143                      the feature's initial residues.</Notes>                      the feature's initial residues.</Notes>
144                      <DataGen testCount="0"></DataGen>                  </Field>
145                    <Field name="assignment" type="text">
146                        <Notes>Default functional assignment for this feature.</Notes>
147                  </Field>                  </Field>
148                  <Field name="active" type="boolean">                  <Field name="active" type="boolean">
149                      <Notes>TRUE if this feature is still considered valid, FALSE if it has been logically deleted.</Notes>                      <Notes>TRUE if this feature is still considered valid, FALSE if it has been logically deleted.</Notes>
150                      <DataGen>1</DataGen>                  </Field>
151                    <Field name="assignment-maker" type="name-string">
152                        <Notes>name of the user who made the functional assignment</Notes>
153                    </Field>
154                    <Field name="assignment-quality" type="char">
155                        <Notes>quality of the functional assignment, usually a space, but may be W (indicating weak) or X
156                        (indicating experimental)</Notes>
157                  </Field>                  </Field>
158                  <Field name="keywords" type="text" searchable="1">                  <Field name="keywords" type="text" searchable="1">
159                      <Notes>This is a list of search keywords for the feature. It includes the                      <Notes>This is a list of search keywords for the feature. It includes the
# Line 164  Line 163 
163                      <Notes>Web hyperlink for this feature. A feature can have no hyperlinks or it can have many. The                      <Notes>Web hyperlink for this feature. A feature can have no hyperlinks or it can have many. The
164                      links are to other websites that have useful information about the gene that the feature represents, and                      links are to other websites that have useful information about the gene that the feature represents, and
165                      are coded as raw HTML, using [b]&lt;a href="[i]link[/i]"&gt;[i]text[/i]&lt;/a&gt;[/b] notation.</Notes>                      are coded as raw HTML, using [b]&lt;a href="[i]link[/i]"&gt;[i]text[/i]&lt;/a&gt;[/b] notation.</Notes>
                     <DataGen testCount="3">'http://www.conservativecat.com/Ferdy/TestTarget.php?Source=' . $this->{id} .  
                     "&amp;Number=" . IntGen(1,99)</DataGen>  
166                  </Field>                  </Field>
167                  <Field name="conservation" type="float" relation="FeatureConservation">                  <Field name="conservation" type="float" relation="FeatureConservation">
168                      <Notes>A number between 0 and 1 that indicates the degree to which this feature's DNA is                      <Notes>A number between 0 and 1 that indicates the degree to which this feature's DNA is
169                      conserved in related genomes. A value of 1 indicates perfect conservation. A value less                      conserved in related genomes. A value of 1 indicates perfect conservation. A value less
170                      than 1 is a reflect of the degree to which gap characters interfere in the alignment                      than 1 is a reflection of the degree to which gap characters interfere in the alignment
171                      between the feature and its close relatives.</Notes>                      between the feature and its close relatives.</Notes>
172                  </Field>                  </Field>
173                    <Field name="essential" type="text" relation="FeatureEssential" special="property_search">
174                        <Notes>A value indicating the essentiality of the feature, coded as HTML. In most
175                        cases, this will be a word describing whether the essentiality is confirmed (essential)
176                        or potential (potential-essential), hyperlinked to the document from which the
177                        essentiality was curated. If a feature is not essential, this field will have no
178                        values; otherwise, it may have multiple values.</Notes>
179                    </Field>
180                    <Field name="virulent" type="text" relation="FeatureVirulent" special="property_search">
181                        <Notes>A value indicating the virulence of the feature, coded as HTML. In most
182                        cases, this will be a phrase or SA number hyperlinked to the document from which
183                        the virulence information was curated. If the feature is not virulent, this field
184                        will have no values; otherwise, it may have multiple values.</Notes>
185                    </Field>
186                    <Field name="cello" type="name-string">
187                        <Notes>The cello value specifies the expected location of the protein: cytoplasm,
188                        cell wall, inner membrane, and so forth.</Notes>
189                    </Field>
190                    <Field name="iedb" type="text" relation="FeatureIEDB" special="property_search">
191                        <Notes>A value indicating whether or not the feature can be found in the
192                        Immune Epitope Database. If the feature has not been matched to that database,
193                        this field will have no values. Otherwise, it will have an epitope name and/or
194                        sequence, hyperlinked to the database.</Notes>
195                    </Field>
196                    <Field name="location-string" type="text">
197                        <Notes>Location of the feature, expressed as a comma-delimited list of Sprout location
198                        strings. This gives us a fast mechanism for extracting the feature location. Otherwise,
199                        we have to painstakingly paste together the IsLocatedIn records, which are themselves
200                        designed to help look for genes in a particular region rather than to find the location
201                        of a gene.</Notes>
202                    </Field>
203              </Fields>              </Fields>
204              <Indexes>              <Indexes>
205                  <Index>                  <Index>
206                      <Notes>This index allows the user to find the feature corresponding to                      <Notes>This index allows us to locate a feature by its CELLO value.</Notes>
                     the specified alias name.</Notes>  
207                      <IndexFields>                      <IndexFields>
208                          <IndexField name="alias" order="ascending" />                          <IndexField name="cello" order="ascending" />
209                      </IndexFields>                      </IndexFields>
210                  </Index>                  </Index>
211              </Indexes>              </Indexes>
212          </Entity>          </Entity>
213            <Entity name="FeatureAlias" keyType="medium-string">
214                <Notes>Alternative names for features. A feature can have many aliases. In general,
215                each alias corresponds to only one feature, but there are exceptions; this is not strictly enforced.</Notes>
216            </Entity>
217          <Entity name="SynonymGroup" keyType="id-string">          <Entity name="SynonymGroup" keyType="id-string">
218              <Notes>A [i]synonym group[/i] represents a group of features. Substantially identical features              <Notes>A [i]synonym group[/i] represents a group of features. Substantially identical features
219              are mapped to the same synonym group, and this information is used to expand similarities.</Notes>              are mapped to the same synonym group, and this information is used to expand similarities.</Notes>
# Line 191  Line 221 
221          <Entity name="Role" keyType="string">          <Entity name="Role" keyType="string">
222              <Notes>A [i]role[/i] describes a biological function that may be fulfilled by a feature.              <Notes>A [i]role[/i] describes a biological function that may be fulfilled by a feature.
223              One of the main goals of the database is to record the roles of the various features.</Notes>              One of the main goals of the database is to record the roles of the various features.</Notes>
224              <Fields>          </Entity>
225                  <Field name="EC" type="string" relation="RoleEC">          <Entity name="RoleEC" keyType="string">
226                      <Notes>EC code for this role.</Notes>              <Notes>EC code for a role.</Notes>
                     <DataGen testCount="1">StringGen(IntGen(20,40)) . "(" . $this->{id} . ")"</DataGen>  
                 </Field>  
                 <Field name="abbr" type="name-string">  
                     <Notes>Abbreviated name for the role, generally non-unique, but useful  
                     in column headings for HTML tables.</Notes>  
                 </Field>  
             </Fields>  
             <Indexes>  
                 <Index>  
                     <Notes>This index allows the user to find the role corresponding to  
                     an EC number.</Notes>  
                     <IndexFields>  
                         <IndexField name="EC" order="ascending" />  
                     </IndexFields>  
                 </Index>  
             </Indexes>  
227          </Entity>          </Entity>
228          <Entity name="Annotation" keyType="name-string">          <Entity name="Annotation" keyType="name-string">
229              <Notes>An [i]annotation[/i] contains supplementary information about a feature. Annotations              <Notes>An [i]annotation[/i] contains supplementary information about a feature. Annotations
# Line 249  Line 263 
263              <Notes>A [i]compound[/i] is a chemical that participates in a reaction.              <Notes>A [i]compound[/i] is a chemical that participates in a reaction.
264              All compounds have a unique ID and may also have one or more names.</Notes>              All compounds have a unique ID and may also have one or more names.</Notes>
265              <Fields>              <Fields>
266                  <Field name="name-priority" type="int" relation="CompoundName">                  <Field name="label" type="string">
                     <Notes>Priority of a compound name. The name with the lowest  
                     priority is the main name of this compound.</Notes>  
                 </Field>  
                 <Field name="name" type="name-string" relation="CompoundName">  
                     <Notes>Descriptive name for the compound. A compound may  
                     have several names.</Notes>  
                 </Field>  
                 <Field name="cas-id" type="name-string" relation="CompoundCAS">  
                     <Notes>Chemical Abstract Service ID for this compound (optional).</Notes>  
                 </Field>  
                 <Field name="label" type="name-string">  
267                      <Notes>Name used in reaction display strings.                      <Notes>Name used in reaction display strings.
268                      It is the same as the name possessing a priority of 1, but it is placed                      It is the same as the name possessing a priority of 1, but it is placed
269                      here to speed up the query used to create the display strings.</Notes>                      here to speed up the query used to create the display strings.</Notes>
270                  </Field>                  </Field>
271              </Fields>              </Fields>
272              <Indexes>          </Entity>
273                  <Index>          <Entity name="CompoundName" keyType="string">
274                      <Notes>This index allows the user to find the compound corresponding to              <Notes>A [i]compound name[/i] is a common name for the chemical represented by a
275                      the specified name.</Notes>              compound.</Notes>
276                      <IndexFields>          </Entity>
277                          <IndexField name="name" order="ascending" />          <Entity name="CompoundCAS" keyType="name-string">
278                      </IndexFields>              <Notes>This entity represents the Chemical Abstract Service ID for a compound. Each
279                  </Index>              Compound has at most one CAS ID.</Notes>
                 <Index>  
                     <Notes>This index allows the user to find the compound corresponding to  
                     the specified CAS ID.</Notes>  
                     <IndexFields>  
                         <IndexField name="cas-id" order="ascending" />  
                     </IndexFields>  
                 </Index>  
                 <Index>  
                     <Notes>This index allows the user to access the compound names in  
                     priority order.</Notes>  
                     <IndexFields>  
                         <IndexField name="id" order="ascending" />  
                         <IndexField name="name-priority" order="ascending" />  
                     </IndexFields>  
                 </Index>  
             </Indexes>  
280          </Entity>          </Entity>
281          <Entity name="Subsystem" keyType="string">          <Entity name="Subsystem" keyType="string">
282              <Notes>A [i]subsystem[/i] is a collection of roles that work together in a cell. Identification of subsystems              <Notes>A [i]subsystem[/i] is a collection of roles that work together in a cell. Identification of subsystems
# Line 302  Line 289 
289                      <Notes>Descriptive notes about the subsystem.</Notes>                      <Notes>Descriptive notes about the subsystem.</Notes>
290                  </Field>                  </Field>
291                  <Field name="classification" type="string" relation="SubsystemClass">                  <Field name="classification" type="string" relation="SubsystemClass">
292                      <Notes>General classification data about the subsystem.</Notes>                      <Notes>Classification string, colon-delimited. This string organizes the
293                        subsystems into a hierarchy.</Notes>
294                  </Field>                  </Field>
295              </Fields>              </Fields>
296          </Entity>          </Entity>
# Line 333  Line 321 
321                      <Notes>Access code possessed by this                      <Notes>Access code possessed by this
322                      user. A user can have many access codes; a genome is accessible to the user if its                      user. A user can have many access codes; a genome is accessible to the user if its
323                      access code matches any one of the user's access codes.</Notes>                      access code matches any one of the user's access codes.</Notes>
                     <DataGen testCount="2">RandParam('low', 'medium', 'high')</DataGen>  
324                  </Field>                  </Field>
325              </Fields>              </Fields>
326          </Entity>          </Entity>
# Line 398  Line 385 
385                      </Field>                      </Field>
386                  </Fields>                  </Fields>
387          </Entity>          </Entity>
         <Entity name="Coupling" keyType="id-string">  
             <Notes>A coupling is a relationship between two features. The features are  
             physically close on the contig, and there is evidence that they generally  
             belong together. The key of this entity is formed by combining the coupled  
             feature IDs with a space.</Notes>  
             <Fields>  
                 <Field name="score" type="int">  
                     <Notes>A number based on the set of PCHs (pairs of close homologs). A PCH  
                     indicates that two genes near each other on one genome are very similar to  
                     genes near each other on another genome. The score only counts PCHs for which  
                     the genomes are very different. (In other words, we have a pairing that persists  
                     between different organisms.) A higher score implies a stronger meaning to the  
                     clustering.</Notes>  
                 </Field>  
             </Fields>  
         </Entity>  
         <Entity name="PCH" keyType="counter">  
             <Notes>A PCH (physically close homolog) connects a clustering (which is a  
             pair of physically close features on a contig) to a second pair of physically  
             close features that are similar to the first. Essentially, the PCH is a  
             relationship between two clusterings in which the first clustering's features  
             are similar to the second clustering's features. The simplest model for  
             this would be to simply relate clusterings to each other; however, not all  
             physically close pairs qualify as clusterings, so we relate a clustering to  
             a pair of features. The key a unique ID number.</Notes>  
             <Fields>  
                 <Field name="used" type="boolean">  
                     <Notes>TRUE if this PCH is used in scoring the attached clustering,  
                     else FALSE. If a clustering has a PCH for a particular genome and many  
                     similar genomes are present, then a PCH will probably exist for the  
                     similar genomes as well. When this happens, only one of the PCHs will  
                     be scored: the others are considered duplicates of the same evidence.</Notes>  
                 </Field>  
             </Fields>  
         </Entity>  
388          <Entity name="Family" keyType="id-string">          <Entity name="Family" keyType="id-string">
389              <Notes>A family is a group of homologous PEGs believed to have the same function. Protein              <Notes>A family is a group of homologous PEGs believed to have the same function. Protein
390              families provide a mechanism for verifying the accuracy of functional assignments              families provide a mechanism for verifying the accuracy of functional assignments
# Line 448  Line 400 
400                  </Field>                  </Field>
401              </Fields>              </Fields>
402          </Entity>          </Entity>
403          <Entity name="DrugProject" keyType="name-string">          <Entity name="PDB" keyType="id-string">
404              <Notes>A drug project is a coherent sent of drug target data that came through the              <Notes>A PDB is a protein database containing information that can be used to determine
405              pipeline. In other words, data is put into the database one drug project at a time.              the shape of the protein and the energies required to dock with it. The ID is the
406              This makes it easier to manage the incoming data and to track where a particular              four-character name used on the PDB web site.</Notes>
407              piece of data originated.</Notes>              <Fields>
408          </Entity>                  <Field name="docking-count" type="int">
409          <Entity name="DrugTopic" keyType="int">                      <Notes>The number of ligands that have been docked against this PDB.</Notes>
             <Notes>A drug topic organizes the data in a project relating to a single organism  
             group's features as they apply to a specific category of activity. Categories include  
             features essential to the organism's survival, those that are targets or inhibitors  
             of anti-biotics, and those associated with virulence. Thus, a drug topic consists  
             of data from a single project for features that make good drug targets for the same  
             reason. Drug topics have an artificial, internally-generated key.</Notes>  
             <Fields>  
                 <Field name="identifier" type="name-string">  
                     <Notes>The topic identifier, consisting usually of a generalized organism name  
                     (e.g. Staphylococcus) and the last name of the project's author. More than  
                     one topic may have the same identifier, which is why this isn't the key.</Notes>  
                 </Field>  
                 <Field name="function" type="name-string">  
                     <Notes>A short phrase describing the topic.</Notes>  
                 </Field>  
                 <Field name="URL" type="string">  
                     <Notes>A URL for the paper from which the topic was gathered.</Notes>  
                 </Field>  
                 <Field name="category" type="key-string">  
                     <Notes>The code for this topic's activity category.</Notes>  
410                  </Field>                  </Field>
411              </Fields>              </Fields>
412              <Indexes>              <Indexes>
413                  <Index>                  <Index>
                     <Notes>This index enables the application to find all topics with a specified  
                     category, ordered by function.</Notes>  
414                      <IndexFields>                      <IndexFields>
415                          <IndexField name="category" order="ascending" />                          <IndexField name="docking-count" order="descending" />
416                          <IndexField name="function" order="ascending" />                          <IndexField name="id" order="ascending" />
                     </IndexFields>  
                 </Index>  
                 <Index>  
                     <Notes>This index enables the application to find all topics with a specified  
                     identifier, ordered by category.</Notes>  
                     <IndexFields>  
                         <IndexField name="identifier" order="ascending" />  
                         <IndexField name="category" order="ascending" />  
417                      </IndexFields>                      </IndexFields>
418                  </Index>                  </Index>
419              </Indexes>              </Indexes>
420          </Entity>          </Entity>
421          <Entity name="PDB" keyType="key-string">          <Entity name="Ligand" keyType="id-string">
422              <Notes>A PDB is a database of protein structure and related information of use              <Notes>A Ligand is a chemical of interest in computing docking energies against a PDB.
423              in drug targeting. The purpose of drug targeting is to analyze the ability              The ID of the ligand is an 8-digit ZINC ID number.</Notes>
             of drug molecules, or ligands, to bond to proteins. A PDB for a protein already  
             attached to a ligand is called a bound PDB. A PDB for the protein by itself is  
             called a free PDB. The key of the PDB is its code name on the Protein Data  
             Bank web site.</Notes>  
424              <Fields>              <Fields>
425                  <Field name="type" type="id-string">                  <Field name="name" type="long-string">
426                      <Notes>The type of PDB: "bound" or "free".</Notes>                      <Notes>Chemical name of this ligand.</Notes>
                 </Field>  
                 <Field name="title" type="string">  
                     <Notes>The descriptive title of this PDB.</Notes>  
427                  </Field>                  </Field>
428              </Fields>              </Fields>
429          </Entity>          </Entity>
         <Entity name="Ligand" keyType="string">  
             <Notes>A ligand is a molecule that can bind to a PDB. The CLIBE analysis  
             for a PDB is an attribute of the relationship between a PDB and a ligand.</Notes>  
         </Entity>  
430      </Entities>      </Entities>
431      <Relationships>      <Relationships>
432          <Relationship name="BindsWith" from="PDB" to="Ligand" arity="MM">          <Relationship name="IsPresentOnProteinOf" from="CDD" to="Feature" arity="MM">
433              <Notes>This relationship describes the energy required for a ligand to bind              <Notes>This relationship connects a feature to its CDD protein domains. The
434              to the protein described by a PDB. The total energy required to bind              match score is included as intersection data.</Notes>
             the ligand to the protein is described in this relationship by four  
             quantities. A negative value is energy released; a positive value is  
             energy required.</Notes>  
435              <Fields>              <Fields>
436                  <Field name="URL" type="string">                  <Field name="score" type="float">
437                      <Notes>URL for viewing the CLIBE data for this binding relationship.</Notes>                      <Notes>This is the match score between the feature and the CDD. A
438                  </Field>                      lower score is a better match.</Notes>
                 <Field name="vanderwaals-energy" type="float">  
                     <Notes>kCal/mol of energy due to Van der Waals force.</Notes>  
                 </Field>  
                 <Field name="hbond-energy" type="float">  
                     <Notes>kCal/mol of energy due to hydrogen bonding.</Notes>  
                 </Field>  
                 <Field name="ionic-energy" type="float">  
                     <Notes>kCal/mol of energy due to ionic bonding.</Notes>  
                 </Field>  
                 <Field name="solvation-energy" type="float">  
                     <Notes>kCal/mol of energy due to attraction to the solvent in which  
                     the ligand is immersed.</Notes>  
439                  </Field>                  </Field>
440              </Fields>              </Fields>
441                <FromIndex>
442                    <IndexFields>
443                        <IndexField name="score" order="ascending" />
444                    </IndexFields>
445                </FromIndex>
446          </Relationship>          </Relationship>
447          <Relationship name="ContainsAnalysisOf" from="DrugTopic" to="PDB" arity="1M">          <Relationship name="IsIdentifiedByCAS" from="Compound" to="CompoundCAS" arity="MM">
448              <Notes>This relationship describes the analysis of a free PDB as produced from a              <Notes>Relates a compound's CAS ID to the compound itself. Every CAS ID is
449              particular topic.</Notes>              associated with a compound, and some are associated with two compounds, but not
450              <Fields>              all compounds have CAS IDs.</Notes>
451                  <Field name="pass-asp-info" type="int">          </Relationship>
452                      <Notes>The number of Active Site Points at which ligands can bind to          <Relationship name="IsIdentifiedByEC" from="Role" to="RoleEC" arity="MM">
453                      the protein.</Notes>              <Notes>Relates a role to its EC number. Every EC number is associated with a
454                  </Field>              role, but not all roles have EC numbers.</Notes>
455                  <Field name="ramsol-file" type="string">          </Relationship>
456                      <Notes>The URL of a file that can be downloaded by the user and          <Relationship name="IsAliasOf" from="FeatureAlias" to="Feature" arity="MM">
457                      passed to the Ramsol program for viewing the protein.</Notes>              <Notes>Connects an alias to the feature it represents. Every alias connects
458                  </Field>              to at least 1 feature, and a feature connects to many aliases.</Notes>
459                  <Field name="pass-weight" type="float">          </Relationship>
460                      <Notes>A score for the largest pocket into which a ligand can bind. A          <Relationship name="HasCompoundName" from="Compound" to="CompoundName" arity="MM">
461                      higher score makes for a better target.</Notes>              <Notes>Connects a compound to its names. A compound generally has several
462                      higher score makes for a better target.</Notes>              names.</Notes>
463                  <Field name="pass-file" type="string">              <Fields>
464                      <Notes>The URL for a GIF file that shows the active sites on the protein.</Notes>                  <Field name="priority" type="int">
465                        <Notes>Priority of this name, with 1 being the highest priority, 2
466                        the next highest, and so forth.</Notes>
467                  </Field>                  </Field>
468              </Fields>              </Fields>
469                <FromIndex>
470                    <Notes>This index enables the application to view the names of a compound
471                    in priority order.</Notes>
472                    <IndexFields>
473                        <IndexField name="priority" order="ascending" />
474                    </IndexFields>
475                </FromIndex>
476          </Relationship>          </Relationship>
477          <Relationship name="IsBoundIn" from="PDB" to="PDB" arity="1M">          <Relationship name="IsProteinForFeature" from="PDB" to="Feature" arity="MM">
478              <Note>This relationship connects a free PDB to its bound counterparts.</Note>              <Notes>Relates a PDB to features that produce highly similar proteins.</Notes>
         </Relationship>  
         <Relationship name="DescribesProteinForFeature" from="PDB" to="Feature" arity="MM">  
             <Notes>This relationship connects a feature to a protein database (PDB) that  
             is relevant for determining drugs that target the feature.</Notes>  
479              <Fields>              <Fields>
480                  <Field name="score" type="float">                  <Field name="score" type="float">
481                      <Notes>The BLAST score for the feature as it relates to the PDB's                      <Notes>Similarity score for the comparison between the feature and
482                      protein, expressed as a small positive number. Generally only a                      the PDB protein. A lower score indicates a better match.</Notes>
483                      very low BLAST score (1e-15 or less) indicates a good match.</Notes>                  </Field>
484                  </Field>                  <Field name="start-location" type="int">
485                  <Field name="distance" type="float">                      <Notes>Starting location within the feature of the matching region.</Notes>
486                      <Notes>A distance value indicating how far the PDB's protein is                  </Field>
487                      from the feature's protein. A distance of 0 indicates a perfect                  <Field name="end-location" type="int">
488                      match.</Notes>                      <Notes>Ending location within the feature of the matching region.</Notes>
489                  </Field>                  </Field>
490              </Fields>              </Fields>
491              <FromIndex>              <ToIndex>
492                  <Notes>This index yields the Features for a PDB in order from best                  <Notes>This index enables the application to view the PDBs of a
493                  score to worst.</Notes>                  feature in order from the closest match to the furthest.</Notes>
494                  <IndexFields>                  <IndexFields>
495                      <IndexField name="score" order="ascending" />                      <IndexField name="score" order="ascending" />
496                  </IndexFields>                  </IndexFields>
497              </FromIndex>              </ToIndex>
498              <ToIndex>              <FromIndex>
499                  <Notes>This index yields the Features for a PDB in order from best                  <Notes>This index enables the application to view the features of
500                  score to worst.</Notes>                  a PDB in order from the closest match to the furthest.</Notes>
501                  <IndexFields>                  <IndexFields>
502                      <IndexField name="score" order="ascending" />                      <IndexField name="score" order="ascending" />
503                  </IndexFields>                  </IndexFields>
504              </ToIndex>              </FromIndex>
         </Relationship>  
         <Relationship name="ContainsTopic" from="DrugProject" to="DrugTopic" arity="1M">  
             <Notes>This relationship connects a drug target project to all of its  
             topics.</Notes>  
         </Relationship>  
         <Relationship name="IsFamilyForFeature" from="Family" to="Feature" arity="MM">  
             <Notes>This relationship connects a protein family to all of its PEGs and connects  
             each PEG to all of its protein families.</Notes>  
505          </Relationship>          </Relationship>
506          <Relationship name="ParticipatesInCoupling" from="Feature" to="Coupling" arity="MM">          <Relationship name="DocksWith" from="PDB" to="Ligand" arity="MM">
507              <Notes>This relationship connects a feature to all the functional couplings              <Notes>Indicates that a docking result exists between a PDB and a ligand. The
508              in which it participates. A functional coupling is a recognition of the fact              docking result describes the energy required for the ligand to dock with
509              that the features are close to each other on a chromosome, and similar              the protein described by the PDB. A lower energy indicates the ligand has a
510              features in other genomes also tend to be close.</Notes>              good chance of disabling the protein. At the current time, only the best
511              <Fields>              docking results are kept.</Notes>
512                  <Field name="pos" type="int">              <Fields>
513                      <Notes>Ordinal position of the feature in the coupling. Currently,                  <Field name="reason" type="id-string">
514                      this is either "1" or "2".</Notes>                      <Notes>Indication of the reason for determining the docking result.
515                        A value of [b]Random[/b] indicates the docking was attempted as a part
516                        of a random survey used to determine the docking characteristics of the
517                        PDB. A value of [b]Rich[/b] indicates the docking was attempted because
518                        a low-energy docking result was predicted for the ligand with respect
519                        to the PDB.</Notes>
520                    </Field>
521                    <Field name="tool" type="id-string">
522                        <Notes>Name of the tool used to produce the docking result.</Notes>
523                    </Field>
524                    <Field name="total-energy" type="float">
525                        <Notes>Total energy required for the ligand to dock with the PDB
526                        protein, in kcal/mol. A negative value means energy is released.</Notes>
527                    </Field>
528                    <Field name="vanderwalls-energy" type="float">
529                        <Notes>Docking energy in kcal/mol that results from the geometric fit
530                        (Van der Waals force) between the PDB and the ligand.</Notes>
531                    </Field>
532                    <Field name="electrostatic-energy" type="float">
533                        <Notes>Docking energy in kcal/mol that results from the movement of
534                        electrons (electrostatic force) between the PDB and the ligand.</Notes>
535                  </Field>                  </Field>
536              </Fields>              </Fields>
537              <ToIndex>              <FromIndex>
538                  <Notes>This index enables the application to view the features of                  <Notes>This index enables the application to view a PDB's docking results from
539                  a coupling in the proper order. The order influences the way the                  the lowest energy (best docking) to highest energy (worst docking).</Notes>
                 PCHs are examined.</Notes>  
540                  <IndexFields>                  <IndexFields>
541                      <IndexField name="pos" order="ascending" />                      <IndexField name="total-energy" order="ascending" />
542                  </IndexFields>                  </IndexFields>
543                </FromIndex>
544                <ToIndex>
545                    <Notes>This index enables the application to view a ligand's docking results from
546                    the lowest energy (best docking) to highest energy (worst docking). Note that
547                    since we only keep the best docking results for a PDB, this index is not likely
548                    to provide useful results.</Notes>
549              </ToIndex>              </ToIndex>
550          </Relationship>          </Relationship>
551          <Relationship name="IsSynonymGroupFor" from="SynonymGroup" to="Feature" arity="1M">          <Relationship name="IsFamilyForFeature" from="Family" to="Feature" arity="MM">
552                <Notes>This relationship connects a protein family to all of its PEGs and connects
553                each PEG to all of its protein families.</Notes>
554            </Relationship>
555            <Relationship name="IsSynonymGroupFor" from="SynonymGroup" to="Feature" arity="MM">
556              <Notes>This relation connects a synonym group to the features that make it              <Notes>This relation connects a synonym group to the features that make it
557              up.</Notes>              up.</Notes>
558          </Relationship>          </Relationship>
# Line 648  Line 575 
575                  </IndexFields>                  </IndexFields>
576              </FromIndex>              </FromIndex>
577          </Relationship>          </Relationship>
         <Relationship name="IsEvidencedBy" from="Coupling" to="PCH" arity="1M">  
             <Notes>This relationship connects a functional coupling to the physically  
             close homologs (PCHs) which affirm that the coupling is meaningful.</Notes>  
         </Relationship>  
         <Relationship name="UsesAsEvidence" from="PCH" to="Feature" arity="MM">  
             <Notes>This relationship connects a PCH to the features that represent its  
             evidence. Each PCH is connected to a parent coupling that relates two features  
             on a specific genome. The PCH's evidence that the parent coupling is functional  
             is the existence of two physically close features on a different genome that  
             correspond to the features in the coupling. Those features are found on the  
             far side of this relationship.</Notes>  
             <Fields>  
                 <Field name="pos" type="int">  
                     <Notes>Ordinal position of the feature in the coupling that corresponds  
                     to our target feature. There is a one-to-one correspondence between the  
                     features connected to the PCH by this relationship and the features  
                     connected to the PCH's parent coupling. The ordinal position is used  
                     to decode that relationship. Currently, this field is either "1" or  
                     "2".</Notes>  
                 </Field>  
             </Fields>  
             <FromIndex>  
                 <Notes>This index enables the application to view the features of  
                 a PCH in the proper order.</Notes>  
                 <IndexFields>  
                     <IndexField name="pos" order="ascending" />  
                 </IndexFields>  
             </FromIndex>  
         </Relationship>  
578          <Relationship name="HasContig" from="Genome" to="Contig" arity="1M">          <Relationship name="HasContig" from="Genome" to="Contig" arity="1M">
579              <Notes>This relationship connects a genome to the contigs that contain the actual genetic              <Notes>This relationship connects a genome to the contigs that contain the actual genetic
580              information.</Notes>              information.</Notes>
# Line 739  Line 637 
637          <Relationship name="OccursInSubsystem" from="Role" to="Subsystem" arity="MM">          <Relationship name="OccursInSubsystem" from="Role" to="Subsystem" arity="MM">
638              <Notes>This relationship connects roles to the subsystems that implement them. </Notes>              <Notes>This relationship connects roles to the subsystems that implement them. </Notes>
639              <Fields>              <Fields>
640                    <Field name="abbr" type="name-string">
641                        <Notes>Abbreviated name for the role, generally non-unique, but useful
642                        in column headings for HTML tables.</Notes>
643                    </Field>
644                  <Field name="column-number" type="int">                  <Field name="column-number" type="int">
645                      <Notes>Column number for this role in the specified subsystem's                      <Notes>Column number for this role in the specified subsystem's
646                      spreadsheet.</Notes>                      spreadsheet.</Notes>
# Line 858  Line 760 
760                      [b]-[/b] if it is backward.</Notes>                      [b]-[/b] if it is backward.</Notes>
761                  </Field>                  </Field>
762              </Fields>              </Fields>
763              <FromIndex Unique="false">              <FromIndex>
764                  <Notes>This index allows the application to find all the segments of a feature in                  <Notes>This index allows the application to find all the segments of a feature in
765                  the proper order.</Notes>                  the proper order.</Notes>
766                  <IndexFields>                  <IndexFields>

Legend:
Removed from v.1.41  
changed lines
  Added in v.1.50
