Annotation of /Sprout/SaplingDBD.xml

Revision 1.1

1 : parrello 1.1 <?xml version="1.0" encoding="utf-8" ?>
2 :     <Database>
3 :     <Title>Sapling Bioinformatics Database</Title>
4 :     <Notes>The Sapling database is a distributable, self-contained copy of the NMPDR data.
5 :     Unlike Sprout, which is optimized for searching, Sapling is designed to be structurally
6 :     simple without sacrificing the ability to find information quickly.</Notes>
7 :     <Issues>
8 :     <Issue>Must add the new "image" data type to ERDB.</Issue>
9 :     <Issue>Must add the new "dna" data type to ERDB.</Issue>
10 :     <Issue>Diagrammer should be able to read real DBDs.</Issue>
11 :     <Issue>Diagrammer should allow editing the DBD.</Issue>
12 :     <Issue>Must add back the ability to index a secondary relation. Note that
13 :     such indexes can only have a single field.</Issue>
14 :     <Issue>We probably need some type tables that describe things like Identifier(source)
15 :     or Family(kind).</Issue>
16 :     <Issue>I'm operating on the assumption that this database will eventually grow into a
17 :     successor for Sprout, hence the name "Sapling". If I'm wrong, then it should be
18 :     renamed "Root".</Issue>
19 :     <Issue>The ERDB documentation needs to be updated to include DisplayInfo, Asides,
20 :     the "converse" attribute for relationships, and the Shapes section.</Issue>
21 :     </Issues>
22 :     <Entities>
23 :     <Entity name="Scenario" keyType="string">
24 :     <DisplayInfo theme="web" col="5" row="1"/>
25 :     <Notes>A scenario is used to verify the validity of subsystem assignments. Each
26 :     scenario converts input compounds to output compounds using reactions.
27 :     The scenario may use all of the reactions controlled by a subsystem or only
28 :     some, and may also incorporate additional reactions.</Notes>
29 :     </Entity>
30 :     <Entity name="Compound" keyType="name-string">
31 :     <DisplayInfo theme="web" col="1" row="3"/>
32 :     <Notes>A compound is a chemical that participates in a reaction.
33 :     All compounds have a unique ID and may also have one or more names. Both
34 :     ligands and reaction components are treated as compounds.</Notes>
35 :     <Fields>
36 :     <Field name="label" type="string">
37 :     <Notes>Primary name of the compound. This is the name used in reaction
38 :     display strings.</Notes>
39 :     </Field>
40 :     <Field name="name" type="string" relation="CompoundName">
41 :     <Notes>Alternate name for the compound. A compound may have many
42 :     alternate names. The primary name should also be one of the
43 :     alternate names.</Notes>
44 :     </Field>
45 :     <Field name="cas-id" type="string" relation="CompoundCAS">
46 :     <Notes>The Chemical Abstract Service ID for the compound. A
47 :     compound may have at most one CAS ID.</Notes>
48 :     </Field>
49 :     <Field name="zinc-id" type="string" relation="CompoundZinc">
50 :     <Notes>The ZINC database ID for the compound. A compound may
51 :     have at most one ZINC ID.</Notes>
52 :     </Field>
53 :     </Fields>
54 :     <Indexes>
55 :     <Index>
56 :     <Notes>This index allows searching for compounds by name.</Notes>
57 :     <IndexFields>
58 :     <IndexField name="name" order="ascending"/>
59 :     </IndexFields>
60 :     </Index>
61 :     <Index>
62 :     <Notes>This index allows searching for compounds by CAS ID.</Notes>
63 :     <IndexFields>
64 :     <IndexField name="cas-id" order="ascending"/>
65 :     </IndexFields>
66 :     </Index>
67 :     <Index>
68 :     <Notes>This index allows searching for compounds by ZINC ID.</Notes>
69 :     <IndexFields>
70 :     <IndexField name="zinc-id" order="ascending"/>
71 :     </IndexFields>
72 :     </Index>
73 :     </Indexes>
74 :     </Entity>
75 :     <Entity name="Diagram" keyType="name-string">
76 :     <DisplayInfo theme="web" col="3" row="3"/>
77 :     <Notes>A functional diagram describes a network of chemical reactions, often comprising a single
78 :     subsystem. A diagram is identified by a short name and contains a longer descriptive name.</Notes>
79 :     <Fields>
80 :     <Field name="name" type="text">
81 :     <Notes>Descriptive name of this diagram.</Notes>
82 :     </Field>
83 :     <Field name="content" type="image" relation="DiagramContent">
84 :     <Notes>The content of the diagram, in PNG format encoded as base 64 MIME.</Notes>
85 :     </Field>
86 :     </Fields>
87 :     </Entity>
88 :     <Entity name="Reaction" keyType="key-string">
89 :     <DisplayInfo theme="web" col="5" row="3"/>
90 :     <Notes>A reaction is a chemical process that converts one set of compounds (substrates)
91 :     to another set (products). The reaction ID is generally a small number preceded by a
92 :     letter.</Notes>
93 :     <Fields>
94 :     <Field name="url" type="string" relation="ReactionURL">
95 :     <Notes>HTML string containing a link to a web location that describes the
96 :     reaction. This field is optional.</Notes>
97 :     </Field>
98 :     <Field name="rev" type="boolean">
99 :     <Notes>TRUE if this reaction is reversible, else FALSE</Notes>
100 :     </Field>
101 :     </Fields>
102 :     </Entity>
103 :     <Entity name="Subsystem" keyType="id-string">
104 :     <DisplayInfo theme="seed" col="7" row="3"/>
105 :     <Notes>A subsystem is a collection of roles that work together in a cell. Identification of subsystems
106 :     is an important tool for recognizing parallel genetic features in different organisms. The key
107 :     is an alphanumeric code string.</Notes>
108 :     <Fields>
109 :     <Field name="name" type="string">
110 :     <Notes>Displayable name of this subsystem.</Notes>
111 :     </Field>
112 :     <Field name="version" type="int">
113 :     <Notes>Version number for the subsystem. This value is incremented each time the subsystem
114 :     is backed up.</Notes>
115 :     </Field>
116 :     <Field name="curator" type="string">
117 :     <Notes>Name of the person currently in charge of the subsystem.</Notes>
118 :     </Field>
119 :     <Field name="notes" type="text">
120 :     <Notes>Descriptive notes about the subsystem.</Notes>
121 :     </Field>
122 :     <Field name="description" type="text">
123 :     <Notes>Description of the subsystem's function in the cell.</Notes>
124 :     </Field>
125 :     <Field name="classification" type="string">
126 :     <Notes>Classification string, colon-delimited. This string organizes the
127 :     subsystems into a hierarchy.</Notes>
128 :     </Field>
129 :     </Fields>
130 :     <Indexes>
131 :     <Index>
132 :     <Notes>This index is used to get the subsystems in hierarchical order.</Notes>
133 :     <IndexFields>
134 :     <IndexField name="classification" order="ascending"/>
135 :     </IndexFields>
136 :     </Index>
137 :     <Index>
138 :     <Notes>This index is used to get the subsystem by name.</Notes>
139 :     <IndexFields>
140 :     <IndexField name="name" order="ascending"/>
141 :     </IndexFields>
142 :     </Index>
143 :     </Indexes>
144 :     </Entity>
145 :     <Entity name="Publication" keyType="hash-string">
146 :     <DisplayInfo theme="web" col="1" row="7"/>
147 :     <Notes>A _publication_ is an article or citation that may be used as evidence for
148 :     assertions made in the database. The key is a hash code computed from the URL.</Notes>
149 :     <Fields>
150 :     <Field name="url" type="string">
151 :     <Notes>URL of the article or of its citation.</Notes>
152 :     </Field>
153 :     <Field name="citation" type="text">
154 :     <Notes>Citation string for the article.</Notes>
155 :     </Field>
156 :     </Fields>
157 :     <Indexes>
158 :     <Index>
159 :     <Notes>This index allows searching for the article by the author names and title.</Notes>
160 :     <IndexFields>
161 :     <IndexField name="citation" order="ascending"/>
162 :     </IndexFields>
163 :     </Index>
164 :     </Indexes>
165 :     </Entity>
166 :     <Entity name="EC" keyType="key-string">
167 :     <DisplayInfo theme="web" col="3" row="5"/>
168 :     <Notes>An EC number is a code number associated with one or more particular roles.
169 :     EC numbers are a useful tool for identifying corresponding roles in different
170 :     databases.</Notes>
171 :     </Entity>
172 :     <Entity name="Role" keyType="string">
173 :     <DisplayInfo theme="web" col="5" row="5"/>
174 :     <Notes>A role describes a biological function that may be fulfilled by a feature.
175 :     One of the main goals of the database is to assign features to roles. Most
176 :     roles are effected by the construction of proteins. Some, however, deal with
177 :     functional regulation and message transmission.</Notes>
178 :     <Fields>
179 :     <Field name="hypothetical" type="boolean">
180 :     <Notes>TRUE if a role is hypothetical, else FALSE</Notes>
181 :     </Field>
182 :     </Fields>
183 :     </Entity>
184 :     <Entity name="Variant" keyType="hash-string">
185 :     <DisplayInfo theme="seed" col="7" row="5"/>
186 :     <Notes>A variant is a functional subset of a subsystem. It indicates the particular
187 :     sequence of roles used to implement a metabolic pathway. Variants are abstract
188 :     concepts used to classify machines. The key of the variant is the subsystem ID followed
189 :     by the variant code (usually a numeric string with zero or more decimal points).</Notes>
190 :     </Entity>
191 :     <Entity name="Structure" keyType="string">
192 :     <DisplayInfo theme="web" col="1" row="5"/>
193 :     <Notes>A structure represents a portion of a protein's surface. Structures are used
194 :     to assist in understanding which reactions a protein catalyzes and why. The key of a
195 :     structure is its type followed by an ID. The current types are PDB and CDD, though
196 :     additional types may be added at a later date.</Notes>
197 :     </Entity>
198 :     <Entity name="ProteinSequence" keyType="hash-string">
199 :     <DisplayInfo theme="web" col="3" row="7" caption="Protein Sequence"/>
200 :     <Notes>A protein sequence is a specific sequence of amino acids. Unlike a DNA sequence, a
201 :     protein sequence does not belong to a genome. Identical proteins generated by different
202 :     genomes are generally stored as a single ProteinSequence instance. The key is a
203 :     hash of the protein letter sequence.</Notes>
204 :     <Fields>
205 :     <Field name="sequence" type="dna">
206 :     <Notes>The sequence contains the letters corresponding to the protein's
207 :     amino acids.</Notes>
208 :     </Field>
209 :     <Field name="iedb" type="text" relation="ProteinSequenceIEDB" special="property_search">
210 :     <Notes>A value indicating whether or not the feature can be found in the
211 :     Immune Epitope Database. If the feature has not been matched to that database,
212 :     this field will have no values. Otherwise, it will have an epitope name and/or
213 :     sequence, hyperlinked to the database.</Notes>
214 :     </Field>
215 :     <Field name="signal-peptide" type="name-string">
216 :     <Notes>The signal peptide location for this feature. This is expressed as start and end
217 :     numbers with a hyphen for the relevant amino acids. So, "1-22" would indicate a signal
218 :     peptide at the beginning of the feature's protein and extending through 22 amino acid
219 :     positions. An empty string means no signal peptide is present.</Notes>
220 :     </Field>
221 :     <Field name="transmembrane-map" type="text">
222 :     <Notes>A map indicating which sections of a protein will be embedded in a membrane.
223 :     This is expressed as a comma-separated list of start and end numbers with hyphens
224 :     for the relevant amino acids. So, "10-12, 40-60" would indicate that there are two
225 :     sections of the protein that become embedded in a membrane: the 10th through 12th
226 :     amino acids, and the 40th through the 60th. An empty string means no
227 :     transmembrane regions are known.</Notes>
228 :     </Field>
229 :     <Field name="similar-to-human" type="boolean">
230 :     <Notes>TRUE if this feature generates a protein that is similar to one found in humans,
231 :     else FALSE</Notes>
232 :     </Field>
233 :     <Field name="isoelectric-point" type="float">
234 :     <Notes>pH in the surrounding medium at which the charge on a protein is neutral.
235 :     If the pH of the medium is lower than this value, the protein will have a net
236 :     positive charge. If the pH of the medium is higher, then the protein will have a
237 :     net negative charge.</Notes>
238 :     </Field>
239 :     <Field name="molecular-weight" type="float">
240 :     <Notes>Molecular weight of this feature's protein, in daltons. A weight of 0
241 :     indicates that no protein is created.</Notes>
242 :     </Field>
243 :     </Fields>
244 :     </Entity>
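<!--
Illustration only, not part of the schema. The signal-peptide and
transmembrane-map fields above both encode amino acid ranges as
"start-end" text. A minimal Python sketch of how a client might decode them
follows; the function name parse_ranges is hypothetical, and the sketch assumes
well-formed input with positive integer positions.

```python
def parse_ranges(text):
    """Decode a range list such as "10-12, 40-60" into (start, end) tuples.

    An empty string means no region is present and yields an empty list.
    """
    ranges = []
    for piece in text.split(","):
        piece = piece.strip()
        if not piece:
            continue  # skip empty input or stray separators
        start, end = piece.split("-")
        ranges.append((int(start), int(end)))
    return ranges
```

The same helper covers a single signal-peptide value such as "1-22", which
decodes to one tuple.
-->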
245 :     <Entity name="Feature" keyType="id-string">
246 :     <DisplayInfo theme="seed" col="5" row="9"/>
247 :     <Notes>A feature (sometimes also called a gene) is a part of a genome that is of special
248 :     interest. Features may be spread across multiple DNA sequences (contigs) of a genome, but
249 :     never across more than one genome. Each feature in the database has a unique FIG ID.</Notes>
250 :     <Fields>
251 :     <Field name="feature-type" type="id-string">
252 :     <Notes>Code indicating the type of this feature. Among the codes currently
253 :     supported are "peg" for a protein encoding gene, "bs" for a
254 :     binding site, "opr" for an operon, and so forth.</Notes>
255 :     </Field>
256 :     <Field name="link" type="text" relation="FeatureLink">
257 :     <Notes>Web hyperlink for this feature. A feature can have no hyperlinks or it can have many. The
258 :     links are to other websites that have useful information about the gene that the feature represents, and
259 :     are coded as raw HTML, using an anchor href tag.</Notes>
260 :     </Field>
261 :     <Field name="essential" type="text" relation="FeatureEssential" special="property_search">
262 :     <Notes>A value indicating the essentiality of the feature, coded as HTML. In most
263 :     cases, this will be a word describing whether the essentiality is confirmed (essential)
264 :     or potential (potential-essential), hyperlinked to the document from which the
265 :     essentiality was curated. If a feature is not essential, this field will have no
266 :     values; otherwise, it may have multiple values.</Notes>
267 :     </Field>
268 :     <Field name="virulent" type="text" relation="FeatureVirulent" special="property_search">
269 :     <Notes>A value indicating the virulence of the feature, coded as HTML. In most
270 :     cases, this will be a phrase or SA number hyperlinked to the document from which
271 :     the virulence information was curated. If the feature is not virulent, this field
272 :     will have no values; otherwise, it may have multiple values.</Notes>
273 :     </Field>
274 :     <Field name="sequence-length" type="counter">
275 :     <Notes>Number of base pairs in this feature.</Notes>
276 :     </Field>
277 :     </Fields>
278 :     </Entity>
279 :     <Entity name="Machine" keyType="key-string">
280 :     <DisplayInfo theme="seed" col="7" row="7"/>
281 :     <Notes>A machine is a collection of features that implements a metabolic pathway. Machines
282 :     are the physical instances of variants. Each machine corresponds to a row in a subsystem
283 :     spreadsheet. The key is the variant key followed by a colon and the Genome ID.</Notes>
284 :     <Fields>
285 :     <Field name="type" type="key-string">
286 :     <Notes>The machine type indicates how it relates to the parent variant. A type
287 :     of "vacant" means that the machine does not appear to actually exist in the
288 :     organism. A type of "incomplete" means that the machine appears to be missing
289 :     many reactions. In all other cases, the type is "normal".</Notes>
290 :     </Field>
291 :     </Fields>
292 :     </Entity>
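<!--
Illustration only, not part of the schema. The Machine notes state that the key
is the variant key followed by a colon and the genome ID. A minimal Python
sketch of building and splitting such a key follows. The function names and the
sample values are hypothetical, and splitting on the last colon assumes that
genome IDs never contain a colon, which the source does not state.

```python
def machine_key(variant_key, genome_id):
    """Build a machine key: variant key, a colon, then the genome ID."""
    return f"{variant_key}:{genome_id}"


def split_machine_key(key):
    """Recover (variant_key, genome_id), splitting on the LAST colon.

    Assumption: the genome ID itself contains no colon.
    """
    variant_key, genome_id = key.rsplit(":", 1)
    return variant_key, genome_id
```

Splitting on the last colon keeps the sketch usable even if a variant key
happens to contain a colon of its own.
-->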
293 :     <Entity name="Identifier" keyType="string">
294 :     <DisplayInfo theme="seed" col="4" row="10"/>
295 :     <Notes>An identifier is an alternate name for a feature.</Notes>
296 :     <Fields>
297 :     <Field name="source" type="key-string">
298 :     <Notes>Specific type of the identifier, such as its source database or category.
299 :     The type can usually be decoded to convert the identifier to a URL.</Notes>
300 :     </Field>
301 :     </Fields>
302 :     <Indexes>
303 :     <Index>
304 :     <Notes>This index allows all the identifiers of a specified type to be located.</Notes>
305 :     <IndexFields>
306 :     <IndexField name="source" order="ascending"/>
307 :     </IndexFields>
308 :     </Index>
309 :     </Indexes>
310 :     </Entity>
311 :     <Entity name="Assignment" keyType="hash-string">
312 :     <DisplayInfo col="5" row="7" theme="seed"/>
313 :     <Notes>An assignment connects a feature to its putative role. The key of the
314 :     assignment is the feature ID followed by a timestamp.</Notes>
315 :     </Entity>
316 :     <Entity name="EvidenceClass" keyType="name-string">
317 :     <DisplayInfo col="6" row="9" theme="seed"/>
318 :     <Notes>An evidence class describes a general type of evidence code. An actual evidence
319 :     code consists of its class (e.g. "dlit", "ff") and an optional modifier. The modifier
320 :     is contained in the relationship between the class and the target assignment.</Notes>
321 :     <Fields>
322 :     <Field name="format" type="string">
323 :     <Notes>The format string is an example showing how the modifier portion of the
324 :     evidence code is formatted. It may contain HTML markup.</Notes>
325 :     </Field>
326 :     <Field name="short-description" type="string">
327 :     <Notes>The short description is a brief noun phrase explanation of the
328 :     evidence class.</Notes>
329 :     </Field>
330 :     <Field name="description" type="text">
331 :     <Notes>The description is a long text description of the evidence class and its
332 :     format string.</Notes>
333 :     </Field>
334 :     </Fields>
335 :     </Entity>
336 :     <Entity name="Family" keyType="name-string">
337 :     <DisplayInfo theme="seed" col="5" row="11"/>
338 :     <Notes>A family is a group of features united by a particular determination algorithm.
339 :     The algorithm will frequently, but not always, signify a functional role.</Notes>
340 :     </Entity>
341 :     <Entity name="Genome" keyType="name-string">
342 :     <DisplayInfo theme="nmpdr" col="7" row="9" caption="Genome Organism"/>
343 :     <Notes>Genome objects are organized in a hierarchy. At the bottom are the true genomes and
344 :     meta-genomes that connect to the rest of the database. Above them is a hierarchy
345 :     based on taxonomic classification.</Notes>
346 :     <Fields>
347 :     <Field name="full-name" type="name-string">
348 :     <Notes>Full name of the genome. This is either the taxonomic classification name
349 :     or a genus/species/strain name.</Notes>
350 :     </Field>
351 :     <Field name="level" type="int">
352 :     <Notes>Taxonomic classification level. A level of 0 indicates that this is
353 :     a specific strain with DNA attached. Higher levels indicate progressively
354 :     larger classifications. Each level number represents a specific type of
355 :     classification. Sub-species is always 1, species is always 2, genus is always
356 :     3, and so forth, up to 99 for domain. This means that as you travel up the
357 :     taxonomy tree, the ranks will be non-sequential.</Notes>
358 :     </Field>
359 :     <Field name="domain" type="name-string">
360 :     <Notes>Domain for this genome or taxonomic classification. The domain is
361 :     the highest level of the taxonomy tree.</Notes>
362 :     </Field>
363 :     <Field name="version" type="name-string">
364 :     <Notes>Version string for this genome, generally consisting of the genome ID followed
365 :     by a period and a string of digits.</Notes>
366 :     </Field>
367 :     <Field name="complete" type="boolean">
368 :     <Notes>TRUE if the genome is complete, else FALSE</Notes>
369 :     </Field>
370 :     <Field name="dna-size" type="counter">
371 :     <Notes>Number of base pairs in the genome.</Notes>
372 :     </Field>
373 :     <Field name="primary-group" type="name-string">
374 :     <Notes>The primary NMPDR group for this organism. There is always exactly one NMPDR
375 :     group per organism. An empty string indicates the organism is supporting. In general,
376 :     more data is kept on organisms in NMPDR groups than on supporting organisms.</Notes>
377 :     </Field>
378 :     <Field name="contigs" type="int">
379 :     <Notes>Number of contigs for this organism.</Notes>
380 :     </Field>
381 :     <Field name="pegs" type="int">
382 :     <Notes>Number of protein encoding genes for this organism</Notes>
383 :     </Field>
384 :     <Field name="rnas" type="int">
385 :     <Notes>Number of RNA features found for this organism.</Notes>
386 :     </Field>
387 :     </Fields>
388 :     <Indexes>
389 :     <Index>
390 :     <Notes>This index allows the applications to find all genomes associated with
391 :     a specific primary (NMPDR) group.</Notes>
392 :     <IndexFields>
393 :     <IndexField name="primary-group" order="ascending"/>
394 :     <IndexField name="full-name" order="ascending"/>
395 :     </IndexFields>
396 :     </Index>
397 :     <Index>
398 :     <Notes>This index allows the applications to find all genomes in lexical
399 :     order by name. Organisms will show up first, alphabetical by species and
400 :     strain name, followed by the various taxonomic classifications grouped by
401 :     increasing inclusivity.</Notes>
402 :     <IndexFields>
403 :     <IndexField name="level" order="ascending"/>
404 :     <IndexField name="full-name" order="ascending"/>
405 :     </IndexFields>
406 :     </Index>
407 :     </Indexes>
408 :     </Entity>
409 :     <Entity name="Pairing" keyType="name-string">
410 :     <DisplayInfo theme="seed" col="3" row="9"/>
411 :     <Notes>A pairing indicates that two protein sequences are found close together on one or
412 :     more DNA sequences. Not all possible pairings are stored in the database; only those that
413 :     are considered for some reason to be significant for annotation purposes. The pairing
414 :     includes a score that indicates how many of the DNA sequences are significantly
415 :     dissimilar. A higher score indicates a stronger pairing. The key of the pairing is the
416 :     concatenation of the protein sequence keys in alphabetical order.</Notes>
417 :     <Asides>Because the protein sequence key is a hash of the sequence letters, the key of a pairing between two
418 :     sequences is computable from the sequences themselves. Theoretically, the pairing
419 :     is unordered: (A,B) and (B,A) are the same pairing. It is frequently the case,
420 :     however, that we need to refer to the "first" or "second" protein in the pairing.
421 :     When this happens, the first one is always the protein with the alphabetically
422 :     lesser key. The IsInPair relationship automatically shows the proteins in this
423 :     order.</Asides>
424 :     <Fields>
425 :     <Field name="score" type="int">
426 :     <Notes>Coupling score for this pairing. A higher score indicates a stronger
427 :     coupling.</Notes>
428 :     </Field>
429 :     </Fields>
430 :     </Entity>
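<!--
Illustration only, not part of the schema. The Pairing notes say the key is the
concatenation of the two protein sequence keys in alphabetical order, and that
the "first" protein is the one with the alphabetically lesser key. A minimal
Python sketch follows. The function name pairing_key is hypothetical, and the
sketch assumes plain string comparison and concatenation with no separator,
since the source names neither a collation nor a delimiter.

```python
def pairing_key(key_a, key_b):
    """Return (first, second, key) for an unordered pair of sequence keys.

    The lesser key (by plain string comparison, an assumption) comes first,
    so (A, B) and (B, A) produce the same result.
    """
    first, second = sorted([key_a, key_b])
    return first, second, first + second
```

Because the result does not depend on argument order, a loader can compute
the key for either orientation of a pair and reach the same Pairing instance.
-->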
431 :     <Entity name="EvidenceSet" keyType="int">
432 :     <DisplayInfo theme="seed" col="3" row="11" caption="Evidence Set"/>
433 :     <Notes>An evidence set indicates evidence for a functional connection between protein
434 :     sequence pairs. The protein sequences possessing the connection are the ones that
435 :     participate in the evidence set's pairings.</Notes>
436 :     <Asides>The pairings for a particular evidence set
437 :     will contain protein sequences that are significantly similar. In other words, if
438 :     (A,B) and (X,Y) are both pairings in a single evidence set, then (A =~ X) and
439 :     (B =~ Y) or (A =~ Y) and (B =~ X).</Asides>
440 :     <Fields>
441 :     <Field name="score" type="int">
442 :     <Notes>Score for this evidence set. The score indicates the number of
443 :     significantly different genomes represented by the pairings.</Notes>
444 :     </Field>
445 :     </Fields>
446 :     </Entity>
447 :     <Entity name="DnaSequence" keyType="name-string">
448 :     <DisplayInfo theme="nmpdr" col="7" row="11" caption="DNA Sequence"/>
449 :     <Notes>A DNA sequence (sometimes called a "contig") is a contiguous sequence of base pairs
450 :     belonging to a single genome. The key of the DNA sequence is the genome ID followed by
451 :     the contig ID.</Notes>
452 :     <Fields>
453 :     <Field name="length" type="counter">
454 :     <Notes>Number of base pairs in the DNA sequence.</Notes>
455 :     </Field>
456 :     <Field name="bases" type="text" relation="DnaSequenceBases">
457 :     <Notes>A string of letters representing the nucleotides of the sequence.</Notes>
458 :     </Field>
459 :     </Fields>
460 :     </Entity>
461 :     </Entities>
462 :     <Relationships>
463 :     <Relationship name="IsTargetOf" from="Role" to="Assignment" arity="1M" converse="Targets">
464 :     <DisplayInfo theme="seed" caption="Is\nTarget\nOf"/>
465 :     <Notes>This relationship connects an assignment to the target role. A role has
466 :     many assignments, but an assignment targets exactly one role.</Notes>
467 :     </Relationship>
468 :     <Relationship name="IsAnnotatedBy" from="Feature" to="Assignment" arity="1M" converse="Annotates">
469 :     <DisplayInfo theme="seed" caption="Annotates"/>
470 :     <Notes>This relationship connects a feature to the assignments that annotate it.
471 :     A feature may have several assignments, but an assignment annotates exactly one
472 :     feature.</Notes>
473 :     <Fields>
474 :     <Field name="time-stamp" type="date">
475 :     <Notes>Time at which the assignment was made.</Notes>
476 :     </Field>
477 :     <Field name="annotator" type="string">
478 :     <Notes>Name of the annotator who made the assignment.</Notes>
479 :     </Field>
480 :     <Field name="active" type="boolean">
481 :     <Notes>TRUE if this assignment is active; FALSE if it has been
482 :     superseded.</Notes>
483 :     </Field>
484 :     </Fields>
485 :     <FromIndex>
486 :     <Notes>This index presents the assignments in order from the most
487 :     recent to the least recent, with active assignments first.</Notes>
488 :     <IndexFields>
489 :     <IndexField name="active" order="descending"/>
490 :     <IndexField name="time-stamp" order="descending"/>
491 :     </IndexFields>
492 :     </FromIndex>
493 :     </Relationship>
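<!--
Illustration only, not part of the schema. The FromIndex above orders a
feature's assignments by "active" descending and then "time-stamp" descending.
A minimal Python sketch of the same ordering applied to in-memory rows follows.
The function name, the dictionary layout, and the integer time stamps are
hypothetical stand-ins for whatever the database actually returns.

```python
def order_assignments(rows):
    """Sort rows so active assignments come first, newest first within each group."""
    # With reverse=True, True sorts before False and larger time stamps come earlier.
    return sorted(rows, key=lambda r: (r["active"], r["time-stamp"]), reverse=True)
```

Under this ordering the first row is the newest active assignment whenever
any active assignment exists.
-->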
494 :     <Relationship name="IsEvidencedBy" from="Assignment" to="EvidenceClass" arity="MM" converse="IsEvidenceFor">
495 :     <DisplayInfo theme="seed" caption="Is\nEvidenced\nBy" fixed="1" col="6" row="8" />
496 :     <Notes>This relationship contains the evidence for an assignment. An assignment will
497 :     have one or more evidence codes, and each evidence class will justify an enormous
498 :     number of assignments. The intersection data contains details about the evidence.</Notes>
499 :     <Fields>
500 :     <Field name="modifier" type="string">
501 :     <Notes>A modifier for the evidence class. The modifier is concatenated to the
502 :     class to form the complete evidence code. Frequently, the modifier will be the
503 :     ID of a family, subsystem, or evidence set.</Notes>
504 :     </Field>
505 :     </Fields>
506 :     </Relationship>
507 :     <Relationship name="IsTerminusFor" from="Compound" to="Scenario" arity="MM" converse="HasAsTerminus">
508 :     <DisplayInfo caption="Has As\nTerminus"/>
509 :     <Notes>A terminus for a scenario is a compound that acts as its input or output. A compound
510 :     can be the terminus for many scenarios, and a scenario will have many termini. The relationship
511 :     attributes indicate whether the compound is an input to the scenario or an output. In some
512 :     cases, there may be multiple alternative output groups. This is also indicated by the
513 :     attributes.</Notes>
514 :     <Fields>
515 :     <Field name="group-number" type="int">
516 :     <Notes>If zero, then the compound is an input. Otherwise, this is the index number
517 :     of the output group. Each output group represents an alternative set of output
518 :     compounds.</Notes>
519 :     </Field>
520 :     </Fields>
521 :     <ToIndex>
522 :     <Notes>This index allows the application to view a scenario's compounds by group.</Notes>
523 :     <IndexFields>
524 :     <IndexField name="group-number" order="ascending"/>
525 :     </IndexFields>
526 :     </ToIndex>
527 :     </Relationship>
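<!--
Illustration only, not part of the schema. The group-number field marks a
compound as a scenario input when it is zero and otherwise gives the index of
an alternative output group. A minimal Python sketch of separating the termini
follows. The function name and the (compound, group number) tuple layout are
hypothetical.

```python
from collections import defaultdict


def split_termini(rows):
    """Split (compound_id, group_number) rows into inputs and output groups.

    Group 0 holds the inputs; every other group number is one alternative
    set of output compounds.
    """
    inputs = []
    outputs = defaultdict(list)
    for compound_id, group_number in rows:
        if group_number == 0:
            inputs.append(compound_id)
        else:
            outputs[group_number].append(compound_id)
    return inputs, dict(outputs)
```

Each entry in the returned dictionary is one complete alternative output set
for the scenario.
-->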
528 :     <Relationship name="HasAlias" from="Feature" to="Identifier" arity="MM" converse="IsAliasOf">
529 :     <DisplayInfo theme="seed" fixed="1" col="4" row="9" caption="Has Alias"/>
530 :     <Notes>An identifier may be an alias for multiple features. A feature may have multiple alias
531 :     identifiers.</Notes>
532 :     </Relationship>
533 :     <Relationship name="Justifies" from="EvidenceSet" to="Family" arity="MM" converse="IsJustifiedBy">
534 :     <DisplayInfo theme="seed" caption="Is\nJustified\nBy"/>
535 :     <Notes>A family may use multiple sets as evidence. In general, an evidence set will
536 :     justify two families, one for each side of the pairing.</Notes>
537 :     </Relationship>
538 :     <Relationship name="IsDeterminedBy" from="EvidenceSet" to="Pairing" arity="MM" converse="Determines">
539 :     <DisplayInfo theme="seed" caption="Determines"/>
540 :     <Notes>An evidence set exists because it has pairings in it, and this relationship
541 :     connects the evidence set to its constituent pairings. A pairing can belong to
542 :     multiple evidence sets.</Notes>
543 :     <Fields>
544 :     <Field name="inverted" type="boolean">
545 :     <Notes>A pairing is an unordered pair of protein sequences, but its
546 :     similarity to other pairings in an evidence set is ordered. Let (A,B) be
547 :     a pairing and (X,Y) be another pairing in the same set. If this flag is
548 :     FALSE, then (A =~ X) and (B =~ Y). If this flag is TRUE, then (A =~ Y) and
549 :     (B =~ X).</Notes>
550 :     </Field>
551 :     </Fields>
552 :     </Relationship>
553 :     <Relationship name="IsInPair" from="ProteinSequence" to="Pairing" arity="MM" converse="Contains">
554 :     <DisplayInfo theme="seed" caption="Is In\nPair"/>
555 :     <Notes>A pairing contains exactly two protein sequences. A protein sequence can
556 :     belong to multiple pairings. When going from a protein sequence to its pairings,
557 :     they are presented in alphabetical order by sequence key.</Notes>
558 :     </Relationship>
559 :     <Relationship name="HasMember" from="Family" to="Feature" arity="1M" converse="IsMemberOf">
560 :     <DisplayInfo theme="seed" caption="Is\nMember\nOf" row="11.5" col="5"/>
561 :     <Notes>This relationship connects each feature family to its constituent
562 :     features. A family always has many features, but a single feature can
563 :     be found in at most one family.</Notes>
564 :     </Relationship>
565 :     <Relationship name="IsClassOf" from="Genome" to="Genome" arity="1M" converse="IsClassifiedAs">
566 :     <DisplayInfo theme="nmpdr" col="8" row="9" fixed="1" caption="Is\nClass\nOf"/>
567 :     <Notes>The recursive IsClassOf relationship organizes Genomes into a hierarchy
568 :     based on the standard taxonomy. Only genomes at the bottom of the hierarchy have
569 :     actual DNA attached.</Notes>
570 :     </Relationship>
571 :     <Relationship name="ConsistsOf" from="Variant" to="Role" arity="MM">
572 :     <DisplayInfo theme="seed" connected="1" caption="Belongs To"/>
573 :     <Notes>A variant is essentially a sequence of roles. Roles can belong to many
574 :     variants. Some roles will not belong to any variants.</Notes>
575 :     </Relationship>
576 :     <Relationship name="Contains" from="Diagram" to="Compound" arity="MM" converse="IsContainedIn">
577 :     <DisplayInfo theme="web" caption="Is\nContained\nIn"/>
578 :     <Notes>This relationship indicates that a compound appears on a particular diagram.
579 :     The same compound can appear on many diagrams, and a diagram always contains many
580 :     compounds.</Notes>
581 :     </Relationship>
582 :     <Relationship name="Includes" from="Subsystem" to="Role" arity="MM" converse="IsIncludedIn">
583 :     <DisplayInfo theme="seed" caption="Includes"/>
584 :     <Notes>A subsystem is defined by its roles. The subsystem's variants contain slightly
585 :     different sets of roles, but all of the roles in a variant must be connected to the
586 :     parent subsystem by this relationship.</Notes>
587 :     <Fields>
588 :     <Field name="sequence" type="counter">
589 :     <Notes>Sequence number of the role within the subsystem. When the roles
590 :     are formed into a variant, they will generally appear in sequence order.</Notes>
591 :     </Field>
592 :     </Fields>
593 :     <FromIndex>
594 :     <Notes>This index ensures that the roles of the subsystem are presented in sequence
595 :     order.</Notes>
596 :     <IndexFields>
597 :     <IndexField name="sequence" order="ascending"/>
598 :     </IndexFields>
599 :     </FromIndex>
600 :     </Relationship>
601 :     <Relationship name="Describes" from="Subsystem" to="Variant" arity="1M" converse="IsDescribedBy">
602 :     <DisplayInfo theme="seed"/>
603 :     <Notes>This relationship connects a subsystem to the individual variants used
604 :     to implement it. Each variant contains a slightly different subset of the
605 :     roles in the parent subsystem.</Notes>
606 :     </Relationship>
607 :     <Relationship name="Shows" from="Diagram" to="Reaction" arity="MM" converse="IsShowedOn">
608 :     <DisplayInfo theme="web"/>
609 :     <Notes>This relationship connects a diagram to its reactions. A diagram shows multiple
610 :     reactions, and a reaction can be on many diagrams.</Notes>
611 :     </Relationship>
612 :     <Relationship name="Performs" theme="web" from="Reaction" to="Role" arity="MM">
613 :     <DisplayInfo theme="web"/>
614 :     <Notes>A reaction performs many roles. A role can be performed by many
615 :     reactions.</Notes>
616 :     </Relationship>
617 :     <Relationship name="IsImplementedBy" from="Variant" to="Machine" arity="1M" converse="Implements">
618 :     <DisplayInfo theme="seed" caption="Is\nImplemented\nBy"/>
619 :     <Notes>This relationship connects a variant to the physical machines that implement
620 :     it in the genomes. A variant is implemented by many machines, but a machine belongs to
621 :     only one variant.</Notes>
622 :     </Relationship>
623 :     <Relationship name="Involves" from="Reaction" to="Compound" arity="MM" converse="IsInvolvedIn">
624 :     <DisplayInfo theme="web" col="3" row="4" fixed="1" caption="Is\nInvolved\nIn"/>
625 :     <Notes>This relationship connects a reaction to the compounds that participate in
626 :     it. A reaction involves many compounds, and a compound can be involved in many reactions.
627 :     The relationship attributes indicate whether a compound is a product or substrate of the
628 :     reaction, as well as its stoichiometry.</Notes>
629 :     <Fields>
630 :     <Field name="product" type="boolean">
631 :     <Notes>TRUE if the compound is a product of the reaction, FALSE if
632 :     it is a substrate. When a reaction is written on paper in
633 :     chemical notation, the substrates are left of the arrow and the
634 :     products are to the right. Sorting on this field will cause
635 :     the substrates to appear first, followed by the products. If the
636 :     reaction is reversible, then the notion of substrates and products
637 :     is not intuitive; however, a value here of FALSE still puts the
638 :     compound left of the arrow and a value of TRUE still puts it to the
639 :     right.</Notes>
640 :     </Field>
641 :     <Field name="stoichiometry" type="key-string">
642 :     <Notes>Number of molecules of the compound that participate in a
643 :     single instance of the reaction. For example, if a reaction
644 :     produces two water molecules, the stoichiometry of water for the
645 :     reaction would be two. When a reaction is written on paper in
646 :     chemical notation, the stoichiometry is the number next to the
647 :     chemical formula of the compound.</Notes>
648 :     </Field>
649 :     <Field name="main" type="boolean">
650 :     <Notes>TRUE if this compound is one of the main participants in
651 :     the reaction, else FALSE. It is permissible for none of the
652 :     compounds in the reaction to be considered main, in which
653 :     case this value would be FALSE for all of the relevant
654 :     compounds.</Notes>
655 :     </Field>
656 :     <Field name="loc" type="key-string">
657 :     <Notes>An optional character string that indicates the relative
658 :     position of this compound in the reaction's chemical formula. The
659 :     location affects the way the compounds are presented as we cross the
660 :     relationship from the reaction side. The product/substrate flag
661 :     comes first, then the value of this field, then the main flag.
662 :     The default value is an empty string; however, the empty string
663 :     sorts first, so if this field is used, it should probably be
664 :     used for every compound in the reaction.</Notes>
665 :     </Field>
666 :     <Field name="discriminator" type="int">
667 :     <Notes>A unique ID for this record. The discriminator does not
668 :     provide any useful data, but it prevents identical records from
669 :     being collapsed by the SELECT DISTINCT command used by ERDB to
670 :     retrieve data.</Notes>
671 :     </Field>
672 :     </Fields>
673 :     <ToIndex>
674 :     <Notes>This index presents the compounds in the reaction in the
675 :     order they should be displayed when writing it in chemical notation.
676 :     All the substrates appear before all the products; within each group, compounds
677 :     are ordered by location, with main compounds first at each location.</Notes>
678 :     <IndexFields>
679 :     <IndexField name="product" order="ascending"/>
680 :     <IndexField name="loc" order="ascending"/>
681 :     <IndexField name="main" order="descending"/>
682 :     </IndexFields>
683 :     </ToIndex>
684 :     </Relationship>
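    <!-- Illustrative sketch only, not part of the schema. Assuming the reaction
         2 H2 + O2 => 2 H2O with no "loc" values set, the Involves index above
         (product ascending, loc ascending, main descending) would present the
         compounds in the order listed below. The Row element, its attribute
         names, and the choice of main compounds are hypothetical and exist only
         for this example.
         <Row compound="H2"  product="FALSE" loc="" main="TRUE"  stoichiometry="2"/>
         <Row compound="O2"  product="FALSE" loc="" main="FALSE" stoichiometry="1"/>
         <Row compound="H2O" product="TRUE"  loc="" main="TRUE"  stoichiometry="2"/>
         Both substrates sort ahead of the product. Between the two substrates,
         which share an empty loc, the main compound (H2) comes first. -->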
685 :     <Relationship name="IsSourceOf" from="Machine" to="Assignment" arity="1M" converse="HasSource">
686 :     <DisplayInfo theme="seed" caption="Has Source"/>
687 :     <Notes>This relationship connects a machine to the assignments made in its name.
688 :     A machine is the source of many assignments, but an assignment belongs to at most
689 :     one machine.</Notes>
690 :     </Relationship>
691 :     <Relationship name="Uses" theme="seed" from="Genome" to="Machine" arity="1M" converse="IsUsedBy">
692 :     <DisplayInfo theme="seed" caption="Is\nUsed\nBy"/>
693 :     <Notes>This relationship connects a genome to the machines that form its
694 :     metabolic pathways. A genome can use many machines, but a machine is used by exactly
695 :     one genome.</Notes>
696 :     </Relationship>
697 :     <Relationship name="Catalyzes" from="ProteinSequence" to="Role" arity="MM" converse="IsCatalyzedBy">
698 :     <DisplayInfo theme="web" caption="Is\nCatalyzed\nBy"/>
699 :     <Notes>This relationship connects a protein sequence to the functional roles it
700 :     catalyzes in the cell. A protein sequence can catalyze many roles, and a role can
701 :     be catalyzed by many protein sequences. Roles that perform regulatory or message
702 :     transmission functions do not participate in this relationship.</Notes>
703 :     </Relationship>
704 :     <Relationship name="IsProducedBy" from="ProteinSequence" to="Feature" arity="1M" converse="Produces">
705 :     <DisplayInfo caption="Is\nProduced\nBy" theme="seed" row="10" col="1.5"/>
706 :     <Notes>This relationship connects a feature to the protein sequence it produces (if any).
707 :     Many features can produce the same protein sequence, but each feature produces at most
708 :     one protein sequence. Many features do not produce a protein sequence at all.</Notes>
709 :     </Relationship>
710 :     <Relationship name="IsLocatedIn" from="Feature" to="DnaSequence" arity="MM" converse="IsLocusFor">
711 :     <DisplayInfo theme="seed" caption="Is\nLocated\nIn" fixed="1" row="11" col="6" />
712 :     <Notes>A feature is a set of DNA sequence fragments. Most features are a single contiguous
713 :     fragment, so they are located in only one DNA sequence; however, fragments have a maximum
714 :     length, so even a single contiguous feature may participate in this relationship multiple
715 :     times. A few features belong to multiple DNA sequences. In that case, however, all the
716 :     DNA sequences belong to the same genome. A DNA sequence itself will frequently have
717 :     thousands of features connected to it.</Notes>
718 :     <Fields>
719 :     <Field name="locN" type="int">
720 :     <Notes>Sequence number of this segment.</Notes>
721 :     </Field>
722 :     <Field name="beg" type="int">
723 :     <Notes>Index (1-based) of the first residue in the contig that
724 :     belongs to the segment.</Notes>
725 :     </Field>
726 :     <Field name="len" type="int">
727 :     <Notes>Number of residues in the segment. A length of 0 identifies
728 :     a specific point between residues. This is the point before the residue if the direction
729 :     is forward and the point after the residue if the direction is backward.</Notes>
730 :     </Field>
731 :     <Field name="dir" type="char">
732 :     <Notes>Direction of the segment: "+" if it is forward and
733 :     "-" if it is backward.</Notes>
734 :     </Field>
735 :     </Fields>
736 :     <FromIndex>
737 :     <Notes>This index allows the application to find all the segments of a feature in
738 :     the proper order.</Notes>
739 :     <IndexFields>
740 :     <IndexField name="locN" order="ascending"/>
741 :     </IndexFields>
742 :     </FromIndex>
743 :     <ToIndex>
744 :     <Notes>This index is the one used by applications to find all the feature
745 :     segments that contain a specific residue.</Notes>
746 :     <IndexFields>
747 :     <IndexField name="beg" order="ascending"/>
748 :     </IndexFields>
749 :     </ToIndex>
750 :     </Relationship>
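    <!-- Illustrative sketch only, not part of the schema. A hypothetical feature
         made of two forward segments on one DNA sequence might be stored as the
         two IsLocatedIn records below, which the from-index would return in locN
         order. The Row element and all values are invented for this example.
         <Row locN="1" beg="1001" len="300" dir="+"/>
         <Row locN="2" beg="1301" len="150" dir="+"/>
         Under the 1-based convention described above, the first segment covers
         residues 1001 through 1300 and the second covers 1301 through 1450. -->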
751 :     <Relationship name="IsOwnerOf" from="Genome" to="Feature" arity="1M" converse="IsOwnedBy">
752 :     <DisplayInfo caption="Is\nOwned\nBy" theme="seed" fixed="1" row="10" col="6" />
753 :     <Notes>This relationship connects each feature to its parent genome.</Notes>
754 :     </Relationship>
755 :     <Relationship name="IsMadeUpOf" from="Genome" to="DnaSequence" arity="1M" converse="MakesUp">
756 :     <DisplayInfo theme="nmpdr" caption="Is\nMade Up\nOf"/>
757 :     <Notes>This relationship connects each genome to the DNA sequences that make it up.</Notes>
758 :     </Relationship>
759 :     <Relationship name="Exposes" from="ProteinSequence" to="Structure" arity="MM" converse="IsExposedBy">
760 :     <DisplayInfo theme="web" caption="Is\nExposed\nBy"/>
761 :     <Notes>This relationship connects a protein sequence to the chemically active structures
762 :     on its surface. A protein sequence exposes many structures, and a particular structure
763 :     may occur on many proteins.</Notes>
764 :     </Relationship>
765 :     <Relationship name="Attracts" from="Structure" to="Compound" arity="MM" converse="IsAttractedTo">
766 :     <DisplayInfo theme="web" caption="Is\nAttracted\nTo"/>
767 :     <Notes>This relationship connects a compound to the protein structures that attract it.
768 :     This is an incomplete relationship that exists to service drug targeting queries. Only
769 :     the attractions whose parameters have been determined through modeling or
770 :     experimentation are included. The goal is to determine the docking energy between
771 :     the compound and the protein structure.</Notes>
772 :     <Fields>
773 :     <Field name="reason" type="id-string">
774 :     <Notes>Indication of the reason for determining the docking energy.
775 :     A value of "Random" indicates the docking was attempted as a part
776 :     of a random survey used to determine the docking characteristics of a
777 :     protein structure. A value of "Rich" indicates the docking was attempted
778 :     because a low-energy docking result was predicted for the compound.</Notes>
779 :     </Field>
780 :     <Field name="tool" type="id-string">
781 :     <Notes>Name of the tool used to compute the docking energy.</Notes>
782 :     </Field>
783 :     <Field name="total-energy" type="float">
784 :     <Notes>Total energy required for the compound to dock with the structure,
785 :     in kcal/mol. A negative value means energy is released.</Notes>
786 :     </Field>
787 :     <Field name="vanderwalls-energy" type="float">
788 :     <Notes>Docking energy in kcal/mol that results from the geometric fit
789 :     (Van der Waals force) between the structure and the compound.</Notes>
790 :     </Field>
791 :     <Field name="electrostatic-energy" type="float">
792 :     <Notes>Docking energy in kcal/mol that results from the movement of
793 :     electrons (electrostatic force) between the structure and the
794 :     compound.</Notes>
795 :     </Field>
796 :     </Fields>
797 :     <FromIndex>
798 :     <Notes>This index enables the application to view a structure's docking results from
799 :     the lowest energy (best docking) to highest energy (worst docking).</Notes>
800 :     <IndexFields>
801 :     <IndexField name="total-energy" order="ascending"/>
802 :     </IndexFields>
803 :     </FromIndex>
804 :     <ToIndex>
805 :     <Notes>This index enables the application to view a compound's docking results from
806 :     the lowest energy (best docking) to highest energy (worst docking).</Notes>
807 :     <IndexFields>
808 :     <IndexField name="total-energy" order="ascending"/>
809 :     </IndexFields>
810 :     </ToIndex>
811 :     </Relationship>
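    <!-- Illustrative sketch only, not part of the schema. Because both Attracts
         indexes sort on total-energy in ascending order, the most negative value
         (the most energy released) is presented first. The Row element, the
         compound names, and the energies are invented for this example.
         <Row compound="CompoundX" total-energy="-8.5"/>
         <Row compound="CompoundY" total-energy="-3.2"/>
         <Row compound="CompoundZ" total-energy="1.0"/>
         CompoundX is the best docking of the three and CompoundZ the worst. -->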
812 :     <Relationship name="IsTerminusFor" from="Compound" to="Scenario" arity="MM" converse="HasAsTerminus">
813 :     <DisplayInfo theme="web" caption="Has As\nTerminus"/>
814 :     <Notes>A terminus for a scenario is a compound that acts as its input or output. A
815 :     compound can be the terminus for many scenarios, and a scenario will have many termini.
816 :     The relationship attributes indicate whether the compound is an input to the scenario or
817 :     an output. In some cases, there may be multiple alternative output groups. This is also
818 :     indicated by the attributes.</Notes>
819 :     <Fields>
820 :     <Field name="group-number" type="int">
821 :     <Notes>The group number is 0 for an input compound; otherwise, it is the
822 :     number of the output group to which the compound belongs. Output groups
823 :     represent alternative outputs for the scenario. A compound in multiple
824 :     output groups will appear multiple times in this relationship.</Notes>
825 :     </Field>
826 :     </Fields>
827 :     <ToIndex>
828 :     <Notes>This index presents the terminal compounds for a scenario in group
829 :     order.</Notes>
830 :     <IndexFields>
831 :     <IndexField name="group-number" order="ascending"/>
832 :     </IndexFields>
833 :     </ToIndex>
834 :     </Relationship>
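    <!-- Illustrative sketch only, not part of the schema. For a hypothetical
         scenario with one input and two alternative output groups, the
         IsTerminusFor records would carry group numbers like the ones below.
         The Row element and the compound names are invented for this example.
         <Row compound="CompoundA" group-number="0"/>   input
         <Row compound="CompoundB" group-number="1"/>   first output group
         <Row compound="CompoundC" group-number="1"/>   first output group
         <Row compound="CompoundB" group-number="2"/>   second output group
         CompoundB belongs to both output groups, so it appears twice. -->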
835 :     <Relationship name="Overlaps" from="Scenario" to="Diagram" arity="MM" converse="IncludesPartOf">
836 :     <DisplayInfo theme="web"/>
837 :     <Notes>A scenario overlaps a diagram when the diagram displays a portion of the reactions
838 :     that make up the scenario. A scenario may overlap many diagrams, and a diagram may
839 :     include portions of many scenarios.</Notes>
840 :     </Relationship>
841 :     <Relationship name="HasParticipant" from="Scenario" to="Reaction" arity="MM" converse="ParticipatesIn">
842 :     <DisplayInfo theme="web" caption="\nParticipates\nIn"/>
843 :     <Notes>A scenario consists of many participant reactions that convert the input compounds
844 :     to output compounds. A single reaction may participate in many scenarios.</Notes>
845 :     </Relationship>
846 :     <Relationship name="IsValidatedBy" from="Subsystem" to="Scenario" arity="1M" converse="Validates">
847 :     <DisplayInfo theme="seed" caption="Is\nValidated\nBy"/>
848 :     <Notes>This relationship connects a scenario to the subsystem it validates. A scenario
849 :     validates exactly one subsystem, but a subsystem may have multiple scenarios used for
850 :     validation.</Notes>
851 :     </Relationship>
852 :     <Relationship name="Concerns" from="Publication" to="ProteinSequence" arity="MM" converse="IsATopicOf">
853 :     <DisplayInfo theme="web"/>
854 :     <Notes>This relationship connects a publication to the protein sequences it
855 :     describes.</Notes>
856 :     </Relationship>
857 :     <Relationship name="Identifies" from="EC" to="Role" arity="1M" converse="IsIdentifiedBy">
858 :     <DisplayInfo theme="web"/>
859 :     <Notes>This relationship connects an EC number code to its relevant roles. A role will
860 :     only have one EC number, but an EC number can identify multiple roles.</Notes>
861 :     </Relationship>
862 :     </Relationships>
863 :     <Shapes>
864 :     </Shapes>
865 :     </Database>
