Changed the notes on subsystem classifications to reflect the fact that the hierarchy levels are colon-delimited instead of space-delimited.
<?xml version="1.0" encoding="utf-8" ?> <Database> <Title>Sprout Genome and Subsystem Database</Title> <Entities> <Entity name="Genome" keyType="name-string"> <Notes>A [i]genome[/i] contains the sequence data for a particular individual organism.</Notes> <Fields> <Field name="genus" type="name-string"> <Notes>Genus of the relevant organism.</Notes> <DataGen pass="1">RandParam('streptococcus', 'staphyloccocus', 'felis', 'homo', 'ficticio', 'strangera', 'escherischia', 'carborunda')</DataGen> </Field> <Field name="species" type="name-string"> <Notes>Species of the relevant organism.</Notes> <DataGen pass="1">StringGen('PKVKVKVKVKV')</DataGen> </Field> <Field name="unique-characterization" type="medium-string"> <Notes>The unique characterization identifies the particular organism instance from which the genome is taken. It is possible to have in the database more than one genome for a particular species, and every individual organism has variations in its DNA.</Notes> <DataGen>StringGen('PKVKVK999')</DataGen> </Field> <Field name="access-code" type="key-string"> <Notes>The access code determines which users can look at the data relating to this genome. Each user is associated with a set of access codes. 
In order to view a genome, one of the user's access codes must match this value.</Notes> <DataGen>RandParam('low','medium','high')</DataGen> </Field> <Field name="complete" type="boolean"> <Notes>TRUE if the genome is complete, else FALSE</Notes> </Field> <Field name="taxonomy" type="text"> <Notes>The taxonomy string contains the full taxonomy of the organism, with individual elements separated by semicolons (and optional white space), starting with the domain and ending with the disambiguated genus and species (which is the organism's scientific name plus an identifying string).</Notes> <DataGen pass="2">join('; ', (RandParam('bacteria', 'archaea', 'eukaryote', 'virus', 'environmental'), ListGen('PKVKVKVK', 5), $this->{genus}, $this->{species}))</DataGen> </Field> <Field name="primary-group" type="name-string"> <Notes>The primary NMPDR group for this organism. There is always exactly one NMPDR group (either based on the organism name or the default value "Supporting"), whereas there can be multiple named groups or even none.</Notes> </Field> <Field name="group-name" type="name-string" relation="GenomeGroups"> <Notes>The group identifies a special grouping of organisms that would be displayed on a particular page or of particular interest to a research group or web site. 
A single genome can belong to multiple such groups or none at all.</Notes> </Field> </Fields> <Indexes> <Index Unique="false"> <Notes>This index allows the applications to find all genomes associated with a specific access code, so that a complete list of the genomes users can view may be generated.</Notes> <IndexFields> <IndexField name="access-code" order="ascending" /> <IndexField name="genus" order="ascending" /> <IndexField name="species" order="ascending" /> <IndexField name="unique-characterization" order="ascending" /> </IndexFields> </Index> <Index Unique="false"> <Notes>This index allows the applications to find all genomes associated with a specific primary (NMPDR) group.</Notes> <IndexFields> <IndexField name="primary-group" order="ascending" /> <IndexField name="genus" order="ascending" /> <IndexField name="species" order="ascending" /> <IndexField name="unique-characterization" order="ascending" /> </IndexFields> </Index> <Index Unique="false"> <Notes>This index allows the applications to find all genomes for a particular species.</Notes> <IndexFields> <IndexField name="genus" order="ascending" /> <IndexField name="species" order="ascending" /> <IndexField name="unique-characterization" order="ascending" /> </IndexFields> </Index> </Indexes> </Entity> <Entity name="Source" keyType="medium-string"> <Notes>A [i]source[/i] describes a place from which genome data was taken. This can be an organization or a paper citation.</Notes> <Fields> <Field name="URL" type="string" relation="SourceURL"> <Notes>URL of the paper cited or of the organization's web site. This field is optional.</Notes> <DataGen>"http://www.conservativecat.com/Ferdy/TestTarget.php?Source=" . $this->{id}</DataGen> </Field> <Field name="description" type="text"> <Notes>Description of the source. The description can be a street address or a citation.</Notes> <DataGen>$this->{id} . ': ' . 
StringGen(IntGen(50,200))</DataGen> </Field> </Fields> </Entity> <Entity name="Contig" keyType="name-string"> <Notes>A [i]contig[/i] is a contiguous run of residues. The contig's ID consists of the genome ID followed by a name that identifies which contig this is for the parent genome. As is the case with all keys in this database, the individual components are separated by a period. [p]A contig can contain over a million residues. For performance reasons, therefore, the contig is split into multiple pieces called [i]sequences[/i]. The sequences contain the characters that represent the residues as well as data on the quality of the residue identification.</Notes> </Entity> <Entity name="Sequence" keyType="name-string"> <Notes>A [i]sequence[/i] is a continuous piece of a [i]contig[/i]. Contigs are split into sequences so that we don't have to have the entire contig in memory when we are manipulating it. The key of the sequence is the contig ID followed by the index of the begin point.</Notes> <Fields> <Field name="sequence" type="text"> <Notes>String consisting of the residues. Each residue is described by a single character in the string.</Notes> <DataGen>RandChars("ACGT", IntGen(100,400))</DataGen> </Field> <Field name="quality-vector" type="text"> <Notes>String describing the quality data for each base pair. Individual values will be separated by periods. The value represents the negative exponent of the probability of error. Thus, for example, a quality of 30 indicates the probability of error is 10^-30. A higher quality number indicates a better chance of a correct match. It is possible that the quality data is not known for a sequence. If that is the case, the quality vector will contain the string [b]unknown[/b].</Notes> <DataGen>unknown</DataGen> </Field> </Fields> </Entity> <Entity name="Feature" keyType="id-string"> <Notes>A [i]feature[/i] is a part of a genome that is of special interest. 
Features may be spread across multiple contigs of a genome, but never across more than one genome. Features can be assigned to roles via spreadsheet cells, and are the targets of annotation.</Notes> <Fields> <Field name="feature-type" type="string"> <Notes>Code indicating the type of this feature.</Notes> <DataGen>RandParam('peg','rna')</DataGen> </Field> <Field name="alias" type="medium-string" relation="FeatureAlias"> <Notes>Alternative name for this feature. A feature can have many aliases.</Notes> <DataGen testCount="3">StringGen('Pgi|99999', 'Puni|XXXXXX', 'PAAAAAA999')</DataGen> </Field> <Field name="translation" type="text" relation="FeatureTranslation"> <Notes>[i](optional)[/i] A translation of this feature's residues into character codes, formed by concatenating the pieces of the feature together. For a protein encoding group, this is the protein characters. For other types it is the DNA characters.</Notes> <DataGen testCount="0"></DataGen> </Field> <Field name="upstream-sequence" type="text" relation="FeatureUpstream"> <Notes>Upstream sequence of the feature. This includes residues preceding the feature as well as some of the feature's initial residues.</Notes> <DataGen testCount="0"></DataGen> </Field> <Field name="assignment" type="text"> <Notes>Default functional assignment for this feature.</Notes> </Field> <Field name="active" type="boolean"> <Notes>TRUE if this feature is still considered valid, FALSE if it has been logically deleted.</Notes> <DataGen>1</DataGen> </Field> <Field name="keywords" type="text" searchable="1"> <Notes>This is a list of search keywords for the feature. It includes the functional assignment, subsystem roles, and special properties.</Notes> </Field> <Field name="link" type="text" relation="FeatureLink"> <Notes>Web hyperlink for this feature. A feature can have no hyperlinks or it can have many. 
The links are to other websites that have useful information about the gene that the feature represents, and are coded as raw HTML, using [b]<a href="[i]link[/i]">[i]text[/i]</a>[/b] notation.</Notes> <DataGen testCount="3">'http://www.conservativecat.com/Ferdy/TestTarget.php?Source=' . $this->{id} . "&Number=" . IntGen(1,99)</DataGen> </Field> <Field name="conservation" type="float" relation="FeatureConservation"> <Notes>A number between 0 and 1 that indicates the degree to which this feature's DNA is conserved in related genomes. A value of 1 indicates perfect conservation. A value less than 1 is a reflection of the degree to which gap characters interfere in the alignment between the feature and its close relatives.</Notes> </Field> </Fields> <Indexes> <Index> <Notes>This index allows the user to find the feature corresponding to the specified alias name.</Notes> <IndexFields> <IndexField name="alias" order="ascending" /> </IndexFields> </Index> </Indexes> </Entity> <Entity name="SynonymGroup" keyType="id-string"> <Notes>A [i]synonym group[/i] represents a group of features. Substantially identical features are mapped to the same synonym group, and this information is used to expand similarities.</Notes> </Entity> <Entity name="Role" keyType="string"> <Notes>A [i]role[/i] describes a biological function that may be fulfilled by a feature. One of the main goals of the database is to record the roles of the various features.</Notes> <Fields> <Field name="EC" type="string" relation="RoleEC"> <Notes>EC code for this role.</Notes> <DataGen testCount="1">StringGen(IntGen(20,40)) . "(" . $this->{id} . 
")"</DataGen> </Field> <Field name="abbr" type="name-string"> <Notes>Abbreviated name for the role, generally non-unique, but useful in column headings for HTML tables.</Notes> </Field> </Fields> <Indexes> <Index> <Notes>This index allows the user to find the role corresponding to an EC number.</Notes> <IndexFields> <IndexField name="EC" order="ascending" /> </IndexFields> </Index> </Indexes> </Entity> <Entity name="Annotation" keyType="name-string"> <Notes>An [i]annotation[/i] contains supplementary information about a feature. Annotations are currently the only objects that may be inserted directly into the database. All other information is loaded from data exported by the SEED.</Notes> <Fields> <Field name="time" type="date"> <Notes>Date and time of the annotation.</Notes> </Field> <Field name="annotation" type="text"> <Notes>Text of the annotation.</Notes> </Field> </Fields> <Indexes> <Index> <Notes>This index allows the user to find recent annotations.</Notes> <IndexFields> <IndexField name="time" order="descending" /> </IndexFields> </Index> </Indexes> </Entity> <Entity name="Reaction" keyType="key-string"> <Notes>A [i]reaction[/i] is a chemical process catalyzed by a protein. The reaction ID is generally a small number preceded by a letter.</Notes> <Fields> <Field name="url" type="string" relation="ReactionURL"> <Notes>HTML string containing a link to a web location that describes the reaction. This field is optional.</Notes> </Field> <Field name="rev" type="boolean"> <Notes>TRUE if this reaction is reversible, else FALSE</Notes> </Field> </Fields> </Entity> <Entity name="Compound" keyType="name-string"> <Notes>A [i]compound[/i] is a chemical that participates in a reaction. All compounds have a unique ID and may also have one or more names.</Notes> <Fields> <Field name="name-priority" type="int" relation="CompoundName"> <Notes>Priority of a compound name. 
The name with the lowest priority is the main name of this compound.</Notes> </Field> <Field name="name" type="name-string" relation="CompoundName"> <Notes>Descriptive name for the compound. A compound may have several names.</Notes> </Field> <Field name="cas-id" type="name-string" relation="CompoundCAS"> <Notes>Chemical Abstract Service ID for this compound (optional).</Notes> </Field> <Field name="label" type="name-string"> <Notes>Name used in reaction display strings. It is the same as the name possessing a priority of 1, but it is placed here to speed up the query used to create the display strings.</Notes> </Field> </Fields> <Indexes> <Index> <Notes>This index allows the user to find the compound corresponding to the specified name.</Notes> <IndexFields> <IndexField name="name" order="ascending" /> </IndexFields> </Index> <Index> <Notes>This index allows the user to find the compound corresponding to the specified CAS ID.</Notes> <IndexFields> <IndexField name="cas-id" order="ascending" /> </IndexFields> </Index> <Index> <Notes>This index allows the user to access the compound names in priority order.</Notes> <IndexFields> <IndexField name="id" order="ascending" /> <IndexField name="name-priority" order="ascending" /> </IndexFields> </Index> </Indexes> </Entity> <Entity name="Subsystem" keyType="string"> <Notes>A [i]subsystem[/i] is a collection of roles that work together in a cell. Identification of subsystems is an important tool for recognizing parallel genetic features in different organisms.</Notes> <Fields> <Field name="curator" type="string"> <Notes>Name of the person currently in charge of the subsystem.</Notes> </Field> <Field name="notes" type="text"> <Notes>Descriptive notes about the subsystem.</Notes> </Field> <Field name="classification" type="string" relation="SubsystemClass"> <Notes>Classification string, colon-delimited. 
This string organizes the subsystems into a hierarchy.</Notes> </Field> </Fields> </Entity> <Entity name="RoleSubset" keyType="string"> <Notes>A [i]role subset[/i] is a named collection of roles in a particular subsystem. The subset names are generally very short, non-unique strings. The ID of the parent subsystem is prefixed to the subset ID in order to make it unique.</Notes> </Entity> <Entity name="GenomeSubset" keyType="string"> <Notes>A [i]genome subset[/i] is a named collection of genomes that participate in a particular subsystem. The subset names are generally very short, non-unique strings. The ID of the parent subsystem is prefixed to the subset ID in order to make it unique.</Notes> </Entity> <Entity name="SSCell" keyType="hash-string"> <Notes>Part of the process of locating and assigning features is creating a spreadsheet of genomes and roles to which features are assigned. A [i]spreadsheet cell[/i] represents one of the positions on the spreadsheet.</Notes> </Entity> <Entity name="SproutUser" keyType="name-string"> <Notes>A [i]user[/i] is a person who can make annotations and view data in the database. The user object is keyed on the user's login name.</Notes> <Fields> <Field name="description" type="string"> <Notes>Full name or description of this user.</Notes> </Field> <Field name="access-code" type="key-string" relation="UserAccess"> <Notes>Access code possessed by this user. A user can have many access codes; a genome is accessible to the user if its access code matches any one of the user's access codes.</Notes> <DataGen testCount="2">RandParam('low', 'medium', 'high')</DataGen> </Field> </Fields> </Entity> <Entity name="Property" keyType="int"> <Notes>A [i]property[/i] is a type of assertion that could be made about the properties of a particular feature. Each property instance is a key/value pair and can be associated with many different features. 
Conversely, a feature can be associated with many key/value pairs, even some that notionally contradict each other. For example, there can be evidence that a feature is essential to the organism's survival and evidence that it is superfluous.</Notes> <Fields> <Field name="property-name" type="name-string"> <Notes>Name of this property.</Notes> </Field> <Field name="property-value" type="string"> <Notes>Value associated with this property. For each property name, there must be a property record for all of its possible values.</Notes> </Field> </Fields> <Indexes> <Index> <Notes>This index enables the application to find all values for a specified property name, or any given name/value pair.</Notes> <IndexFields> <IndexField name="property-name" order="ascending" /> <IndexField name="property-value" order="ascending" /> </IndexFields> </Index> </Indexes> </Entity> <Entity name="Diagram" keyType="name-string"> <Notes>A functional diagram describes the chemical reactions, often comprising a single subsystem. A diagram is identified by a short name and contains a longer descriptive name. The actual diagram shows which functional roles guide the reactions along with the inputs and outputs; the database, however, only indicates which roles belong to a particular map.</Notes> <Fields> <Field name="name" type="text"> <Notes>Descriptive name of this diagram.</Notes> </Field> </Fields> </Entity> <Entity name="ExternalAliasOrg" keyType="name-string"> <Notes>An external alias is a feature name for a functional assignment that is not a FIG ID. Functional assignments for external aliases are kept in a separate section of the database. 
This table contains a description of the relevant organism for an external alias functional assignment.</Notes> <Fields> <Field name="org" type="text"> <Notes>Descriptive name of the target organism for this external alias.</Notes> </Field> </Fields> </Entity> <Entity name="ExternalAliasFunc" keyType="name-string"> <Notes>An external alias is a feature name for a functional assignment that is not a FIG ID. Functional assignments for external aliases are kept in a separate section of the database. This table contains the functional role for the external alias functional assignment.</Notes> <Fields> <Field name="func" type="text"> <Notes>Functional role for this external alias.</Notes> </Field> </Fields> </Entity> <Entity name="Coupling" keyType="id-string"> <Notes>A coupling is a relationship between two features. The features are physically close on the contig, and there is evidence that they generally belong together. The key of this entity is formed by combining the coupled feature IDs with a space.</Notes> <Fields> <Field name="score" type="int"> <Notes>A number based on the set of PCHs (pairs of close homologs). A PCH indicates that two genes near each other on one genome are very similar to genes near each other on another genome. The score only counts PCHs for which the genomes are very different. (In other words, we have a pairing that persists between different organisms.) A higher score implies a stronger meaning to the clustering.</Notes> </Field> </Fields> </Entity> <Entity name="PCH" keyType="counter"> <Notes>A PCH (physically close homolog) connects a clustering (which is a pair of physically close features on a contig) to a second pair of physically close features that are similar to the first. Essentially, the PCH is a relationship between two clusterings in which the first clustering's features are similar to the second clustering's features. 
The simplest model for this would be to simply relate clusterings to each other; however, not all physically close pairs qualify as clusterings, so we relate a clustering to a pair of features. The key is a unique ID number.</Notes> <Fields> <Field name="used" type="boolean"> <Notes>TRUE if this PCH is used in scoring the attached clustering, else FALSE. If a clustering has a PCH for a particular genome and many similar genomes are present, then a PCH will probably exist for the similar genomes as well. When this happens, only one of the PCHs will be scored: the others are considered duplicates of the same evidence.</Notes> </Field> </Fields> </Entity> <Entity name="Family" keyType="id-string"> <Notes>A family is a group of homologous PEGs believed to have the same function. Protein families provide a mechanism for verifying the accuracy of functional assignments and are also used in determining phylogenetic trees.</Notes> <Fields> <Field name="function" type="text"> <Notes>The functional assignment expected for all PEGs in this family.</Notes> </Field> <Field name="size" type="int"> <Notes>The number of proteins in this family. This may be larger than the number of PEGs included in the family, since the family may also contain external IDs.</Notes> </Field> </Fields> </Entity> <Entity name="DrugProject" keyType="name-string"> <Notes>A drug project is a coherent set of drug target data that came through the pipeline. In other words, data is put into the database one drug project at a time. This makes it easier to manage the incoming data and to track where a particular piece of data originated.</Notes> </Entity> <Entity name="DrugTopic" keyType="int"> <Notes>A drug topic organizes the data in a project relating to a single organism group's features as they apply to a specific category of activity. Categories include features essential to the organism's survival, those that are targets or inhibitors of antibiotics, and those associated with virulence. 
Thus, a drug topic consists of data from a single project for features that make good drug targets for the same reason. Drug topics have an artificial, internally-generated key.</Notes> <Fields> <Field name="identifier" type="name-string"> <Notes>The topic identifier, consisting usually of a generalized organism name (e.g. Staphylococcus) and the last name of the project's author. More than one topic may have the same identifier, which is why this isn't the key.</Notes> </Field> <Field name="tag" type="name-string"> <Notes>A short phrase describing the topic.</Notes> </Field> <Field name="URL" type="string"> <Notes>A URL for the paper from which the topic was gathered.</Notes> </Field> <Field name="category" type="key-string"> <Notes>The code for this topic's activity category.</Notes> </Field> </Fields> <Indexes> <Index> <Notes>This index enables the application to find all topics with a specified category, ordered by tag.</Notes> <IndexFields> <IndexField name="category" order="ascending" /> <IndexField name="tag" order="ascending" /> </IndexFields> </Index> <Index> <Notes>This index enables the application to find all topics with a specified identifier, ordered by category.</Notes> <IndexFields> <IndexField name="identifier" order="ascending" /> <IndexField name="category" order="ascending" /> </IndexFields> </Index> </Indexes> </Entity> <Entity name="PDB" keyType="key-string"> <Notes>A PDB is a database of protein structure and related information of use in drug targeting. The purpose of drug targeting is to analyze the ability of drug molecules, or ligands, to bond to proteins. A PDB for a protein already attached to a ligand is called a bound PDB. A PDB for the protein by itself is called a free PDB. 
The key of the PDB is its code name on the Protein Data Bank web site.</Notes> <Fields> <Field name="type" type="id-string"> <Notes>The type of PDB: "bound" or "free".</Notes> </Field> <Field name="title" type="string"> <Notes>The descriptive title of this PDB.</Notes> </Field> </Fields> </Entity> <Entity name="Ligand" keyType="string"> <Notes>A ligand is a molecule that can bind to a PDB. The CLIBE analysis for a PDB is an attribute of the relationship between a PDB and a ligand.</Notes> </Entity> </Entities> <Relationships> <Relationship name="BindsWith" from="PDB" to="Ligand" arity="MM"> <Notes>This relationship describes the energy required for a ligand to bind to the protein described by a PDB. The total energy required to bind the ligand to the protein is described in this relationship by four quantities. A negative value is energy released; a positive value is energy required.</Notes> <Fields> <Field name="URL" type="string"> <Notes>URL for viewing the CLIBE data for this binding relationship.</Notes> </Field> <Field name="vanderwaals-energy" type="float"> <Notes>kCal/mol of energy due to Van der Waals force.</Notes> </Field> <Field name="hbond-energy" type="float"> <Notes>kCal/mol of energy due to hydrogen bonding.</Notes> </Field> <Field name="ionic-energy" type="float"> <Notes>kCal/mol of energy due to ionic bonding.</Notes> </Field> <Field name="solvation-energy" type="float"> <Notes>kCal/mol of energy due to attraction to the solvent in which the ligand is immersed.</Notes> </Field> </Fields> </Relationship> <Relationship name="ContainsAnalysisOf" from="DrugTopic" to="PDB" arity="1M"> <Notes>This relationship describes the analysis of a free PDB as produced from a particular topic.</Notes> <Fields> <Field name="pass-asp-info" type="int"> <Notes>The number of Active Site Points at which ligands can bind to the protein.</Notes> </Field> <Field name="ramsol-file" type="string"> <Notes>The URL of a file that can be downloaded by the user and passed to the 
Ramsol program for viewing the protein.</Notes> </Field> <Field name="pass-weight" type="float"> <Notes>A score for the largest pocket into which a ligand can bind. A higher score makes for a better target.</Notes> </Field> <Field name="pass-file" type="string"> <Notes>The URL for a GIF file that shows the active sites on the protein.</Notes> </Field> </Fields> </Relationship> <Relationship name="IsBoundIn" from="PDB" to="PDB" arity="1M"> <Note>This relationship connects a free PDB to its bound counterparts.</Note> </Relationship> <Relationship name="DescribesProteinForFeature" from="PDB" to="Feature" arity="MM"> <Notes>This relationship connects a feature to a protein database (PDB) that is relevant for determining drugs that target the feature.</Notes> <Fields> <Field name="score" type="float"> <Notes>The BLAST score for the feature as it relates to the PDB's protein, expressed as a small positive number. Generally only a very low BLAST score (1e-15 or less) indicates a good match.</Notes> </Field> <Field name="distance" type="float"> <Notes>A distance value indicating how far the PDB's protein is from the feature's protein. 
A distance of 0 indicates a perfect match.</Notes> </Field> </Fields> <FromIndex> <Notes>This index yields the Features for a PDB in order from best score to worst.</Notes> <IndexFields> <IndexField name="score" order="ascending" /> </IndexFields> </FromIndex> <ToIndex> <Notes>This index yields the PDBs for a Feature in order from best score to worst.</Notes> <IndexFields> <IndexField name="score" order="ascending" /> </IndexFields> </ToIndex> </Relationship> <Relationship name="ContainsTopic" from="DrugProject" to="DrugTopic" arity="1M"> <Notes>This relationship connects a drug target project to all of its topics.</Notes> </Relationship> <Relationship name="IsFamilyForFeature" from="Family" to="Feature" arity="MM"> <Notes>This relationship connects a protein family to all of its PEGs and connects each PEG to all of its protein families.</Notes> </Relationship> <Relationship name="ParticipatesInCoupling" from="Feature" to="Coupling" arity="MM"> <Notes>This relationship connects a feature to all the functional couplings in which it participates. A functional coupling is a recognition of the fact that the features are close to each other on a chromosome, and similar features in other genomes also tend to be close.</Notes> <Fields> <Field name="pos" type="int"> <Notes>Ordinal position of the feature in the coupling. Currently, this is either "1" or "2".</Notes> </Field> </Fields> <ToIndex> <Notes>This index enables the application to view the features of a coupling in the proper order. The order influences the way the PCHs are examined.</Notes> <IndexFields> <IndexField name="pos" order="ascending" /> </IndexFields> </ToIndex> </Relationship> <Relationship name="IsSynonymGroupFor" from="SynonymGroup" to="Feature" arity="1M"> <Notes>This relation connects a synonym group to the features that make it up.</Notes> </Relationship> <Relationship name="HasFeature" from="Genome" to="Feature" arity="1M"> <Notes>This relationship connects a genome to all of its features. 
This relationship is redundant in a sense, because the genome ID is part of the feature ID; however, it makes the creation of certain queries more convenient because you can pull in filtering information for a feature's genome.</Notes> <Fields> <Field name="type" type="key-string"> <Notes>Feature type (e.g. peg, rna)</Notes> </Field> </Fields> <FromIndex> <Notes>This index enables the application to view the features of a Genome sorted by type.</Notes> <IndexFields> <IndexField name="type" order="ascending" /> </IndexFields> </FromIndex> </Relationship> <Relationship name="IsEvidencedBy" from="Coupling" to="PCH" arity="1M"> <Notes>This relationship connects a functional coupling to the physically close homologs (PCHs) which affirm that the coupling is meaningful.</Notes> </Relationship> <Relationship name="UsesAsEvidence" from="PCH" to="Feature" arity="MM"> <Notes>This relationship connects a PCH to the features that represent its evidence. Each PCH is connected to a parent coupling that relates two features on a specific genome. The PCH's evidence that the parent coupling is functional is the existence of two physically close features on a different genome that correspond to the features in the coupling. Those features are found on the far side of this relationship.</Notes> <Fields> <Field name="pos" type="int"> <Notes>Ordinal position of the feature in the coupling that corresponds to our target feature. There is a one-to-one correspondence between the features connected to the PCH by this relationship and the features connected to the PCH's parent coupling. The ordinal position is used to decode that relationship. 
Currently, this field is either "1" or "2".</Notes> </Field> </Fields> <FromIndex> <Notes>This index enables the application to view the features of a PCH in the proper order.</Notes> <IndexFields> <IndexField name="pos" order="ascending" /> </IndexFields> </FromIndex> </Relationship> <Relationship name="HasContig" from="Genome" to="Contig" arity="1M"> <Notes>This relationship connects a genome to the contigs that contain the actual genetic information.</Notes> </Relationship> <Relationship name="ComesFrom" from="Genome" to="Source" arity="MM"> <Notes>This relationship connects a genome to the sources that mapped it. A genome can come from a single source or from a cooperation among multiple sources.</Notes> </Relationship> <Relationship name="IsMadeUpOf" from="Contig" to="Sequence" arity="1M"> <Notes>A contig is stored in the database as an ordered set of sequences. By splitting the contig into sequences, we get a performance boost from only needing to keep small portions of a contig in memory at any one time. 
This relationship connects the contig to its constituent sequences.</Notes> <Fields> <Field name="len" type="int"> <Notes>Length of the sequence.</Notes> </Field> <Field name="start-position" type="int"> <Notes>Index (1-based) of the point in the contig where this sequence starts.</Notes> </Field> </Fields> <FromIndex> <Notes>This index enables the application to find all of the sequences in a contig in order, and makes it easier to find a particular residue section.</Notes> <IndexFields> <IndexField name="start-position" order="ascending" /> <IndexField name="len" order="ascending" /> </IndexFields> </FromIndex> </Relationship> <Relationship name="IsTargetOfAnnotation" from="Feature" to="Annotation" arity="1M"> <Notes>This relationship connects a feature to its annotations.</Notes> </Relationship> <Relationship name="MadeAnnotation" from="SproutUser" to="Annotation" arity="1M"> <Notes>This relationship connects an annotation to the user who made it.</Notes> </Relationship> <Relationship name="ParticipatesIn" from="Genome" to="Subsystem" arity="MM"> <Notes>This relationship connects a subsystem to the genomes that use it. If the subsystem has been curated for the genome, then the subsystem's roles will also be connected to the genome features through the [b]SSCell[/b] object.</Notes> <Fields> <Field name="variant-code" type="key-string"> <Notes>Code indicating the subsystem variant to which this genome belongs. Each subsystem can have multiple variants. A variant code of [b]-1[/b] indicates that the genome does not have a functional variant of the subsystem. 
A variant code of [b]0[/b] indicates that the genome's participation is considered iffy.</Notes> </Field> </Fields> <ToIndex> <Notes>This index enables the application to find all of the genomes using a subsystem in order by variant code, which is how we wish to display them in the spreadsheets.</Notes> <IndexFields> <IndexField name="variant-code" order="ascending" /> </IndexFields> </ToIndex> </Relationship> <Relationship name="OccursInSubsystem" from="Role" to="Subsystem" arity="MM"> <Notes>This relationship connects roles to the subsystems that implement them. </Notes> <Fields> <Field name="column-number" type="int"> <Notes>Column number for this role in the specified subsystem's spreadsheet.</Notes> </Field> </Fields> <ToIndex> <Notes>This index enables the application to see the subsystem roles in column order. The ordering of the roles is usually significant, so it is important to preserve it.</Notes> <IndexFields> <IndexField name="column-number" order="ascending" /> </IndexFields> </ToIndex> </Relationship> <Relationship name="IsGenomeOf" from="Genome" to="SSCell" arity="1M"> <Notes>This relationship connects a subsystem's spreadsheet cell to the genome for the spreadsheet column.</Notes> </Relationship> <Relationship name="IsRoleOf" from="Role" to="SSCell" arity="1M"> <Notes>This relationship connects a subsystem's spreadsheet cell to the role for the spreadsheet row.</Notes> </Relationship> <Relationship name="ContainsFeature" from="SSCell" to="Feature" arity="MM"> <Notes>This relationship connects a subsystem's spreadsheet cell to the features assigned to it.</Notes> <Fields> <Field name="cluster-number" type="int"> <Notes>ID of this feature's cluster. 
Clusters represent families of related proteins participating in a subsystem.</Notes> </Field> </Fields> </Relationship> <Relationship name="IsAComponentOf" from="Compound" to="Reaction" arity="MM"> <Notes>This relationship connects a reaction to the compounds that participate in it.</Notes> <Fields> <Field name="product" type="boolean"> <Notes>TRUE if the compound is a product of the reaction, FALSE if it is a substrate. When a reaction is written on paper in chemical notation, the substrates are left of the arrow and the products are to the right. Sorting on this field will cause the substrates to appear first, followed by the products. If the reaction is reversible, then the notion of substrates and products is not as intuitive; however, a value here of FALSE still puts the compound left of the arrow and a value of TRUE still puts it to the right.</Notes> </Field> <Field name="stoichiometry" type="key-string"> <Notes>Number of molecules of the compound that participate in a single instance of the reaction. For example, if a reaction produces two water molecules, the stoichiometry of water for the reaction would be two. When a reaction is written on paper in chemical notation, the stoichiometry is the number next to the chemical formula of the compound.</Notes> </Field> <Field name="main" type="boolean"> <Notes>TRUE if this compound is one of the main participants in the reaction, else FALSE. It is permissible for none of the compounds in the reaction to be considered main, in which case this value would be FALSE for all of the relevant compounds.</Notes> </Field> <Field name="loc" type="key-string"> <Notes>An optional character string that indicates the relative position of this compound in the reaction's chemical formula. The location affects the way the compounds present as we cross the relationship from the reaction side. The product/substrate flag comes first, then the value of this field, then the main flag. 
The default value is an empty string; however, the empty string sorts first, so if this field is used, it should probably be used for every compound in the reaction.</Notes> </Field> <Field name="discriminator" type="int"> <Notes>A unique ID for this record. The discriminator does not provide any useful data, but it prevents identical records from being collapsed by the SELECT DISTINCT command used by ERDB to retrieve data.</Notes> </Field> </Fields> <ToIndex> <Notes>This index presents the compounds in the reaction in the order they should be displayed when writing it in chemical notation. All the substrates appear before all the products, and within that ordering, the main compounds appear first.</Notes> <IndexFields> <IndexField name="product" order="ascending" /> <IndexField name="loc" order="ascending" /> <IndexField name="main" order="descending" /> </IndexFields> </ToIndex> </Relationship> <Relationship name="IsLocatedIn" from="Feature" to="Contig" arity="MM"> <Notes>This relationship connects a feature to the contig segments that work together to effect it. The segments are numbered sequentially starting from 1. The database is required to place an upper limit on the length of each segment. If a segment is longer than the maximum, it can be broken into smaller bits. [p]The upper limit enables applications to locate all features that contain a specific residue. For example, if the upper limit is 100 and we are looking for a feature that contains residue 234 of contig [b]ABC[/b], we can look for features with a begin point between 135 and 333. The results can then be filtered by direction and length of the segment.</Notes> <Fields> <Field name="locN" type="int"> <Notes>Sequence number of this segment.</Notes> </Field> <Field name="beg" type="int"> <Notes>Index (1-based) of the first residue in the contig that belongs to the segment.</Notes> </Field> <Field name="len" type="int"> <Notes>Number of residues in the segment. 
A length of 0 identifies a specific point between residues. This is the point before the residue if the direction is forward and the point after the residue if the direction is backward.</Notes> </Field> <Field name="dir" type="char"> <Notes>Direction of the segment: [b]+[/b] if it is forward and [b]-[/b] if it is backward.</Notes> </Field> </Fields> <FromIndex Unique="false"> <Notes>This index allows the application to find all the segments of a feature in the proper order.</Notes> <IndexFields> <IndexField name="locN" order="ascending" /> </IndexFields> </FromIndex> <ToIndex> <Notes>This index is the one used by applications to find all the feature segments that contain a specific residue.</Notes> <IndexFields> <IndexField name="beg" order="ascending" /> </IndexFields> </ToIndex> </Relationship> <Relationship name="HasProperty" from="Feature" to="Property" arity="MM"> <Notes>This relationship connects a feature to its known property values. The relationship contains text data that indicates the paper or organization that discovered evidence that the feature possesses the property. So, for example, if two papers presented evidence that a feature is essential, there would be an instance of this relationship for both.</Notes> <Fields> <Field name="evidence" type="text"> <Notes>URL or citation of the paper or institution that reported evidence of the relevant feature possessing the specified property value.</Notes> </Field> </Fields> </Relationship> <Relationship name="RoleOccursIn" from="Role" to="Diagram" arity="MM"> <Notes>This relationship connects a role to the diagrams on which it appears. A role frequently identifies an enzyme, and can appear in many diagrams. A diagram generally contains many different roles.</Notes> </Relationship> <Relationship name="HasSSCell" from="Subsystem" to="SSCell" arity="1M"> <Notes>This relationship connects a subsystem to the spreadsheet cells used to analyze and display it. 
The cells themselves can be thought of as a grid with Roles on one axis and Genomes on the other. The various features of the subsystem are then assigned to the cells.</Notes> </Relationship> <Relationship name="IsTrustedBy" from="SproutUser" to="SproutUser" arity="MM"> <Notes>This relationship identifies the users trusted by each particular user. When viewing functional assignments, the assignment displayed is the most recent one by a user trusted by the current user. The current user implicitly trusts himself. If no trusted users are specified in the database, the user also implicitly trusts the user [b]FIG[/b].</Notes> </Relationship> <Relationship name="ConsistsOfRoles" from="RoleSubset" to="Role" arity="MM"> <Notes>This relationship connects a role subset to the roles that it covers. A subset is, essentially, a named group of roles belonging to a specific subsystem, and this relationship effects that. Note that while a role may belong to many subsystems, a subset belongs to only one subsystem, and all roles in the subset must have that subsystem in common.</Notes> </Relationship> <Relationship name="ConsistsOfGenomes" from="GenomeSubset" to="Genome" arity="MM"> <Notes>This relationship connects a subset to the genomes that it covers. A subset is, essentially, a named group of genomes participating in a specific subsystem, and this relationship effects that. Note that while a genome may belong to many subsystems, a subset belongs to only one subsystem, and all genomes in the subset must have that subsystem in common.</Notes> </Relationship> <Relationship name="HasRoleSubset" from="Subsystem" to="RoleSubset" arity="1M"> <Notes>This relationship connects a subsystem to its constituent role subsets. 
Note that some roles in a subsystem may not belong to a subset, so the relationship between roles and subsystems cannot be derived from the relationships going through the subset.</Notes> </Relationship> <Relationship name="HasGenomeSubset" from="Subsystem" to="GenomeSubset" arity="1M"> <Notes>This relationship connects a subsystem to its constituent genome subsets. Note that some genomes in a subsystem may not belong to a subset, so the relationship between genomes and subsystems cannot be derived from the relationships going through the subset.</Notes> </Relationship> <Relationship name="Catalyzes" from="Role" to="Reaction" arity="MM"> <Notes>This relationship connects a role to the reactions it catalyzes. The purpose of a role is to create proteins that trigger certain chemical reactions. A single reaction can be triggered by many roles, and a role can trigger many reactions.</Notes> </Relationship> <Relationship name="HasRoleInSubsystem" from="Feature" to="Subsystem" arity="MM"> <Notes>This relationship connects a feature to the subsystems in which it participates. This is technically redundant information, but it is used so often that it deserves its own table.</Notes> <Fields> <Field name="genome" type="name-string"> <Notes>ID of the genome containing the feature</Notes> </Field> <Field name="type" type="key-string"> <Notes>Feature type (e.g. peg, rna)</Notes> </Field> </Fields> <ToIndex> <Notes>This index enables the application to view the features of a subsystem sorted by genome and feature type.</Notes> <IndexFields> <IndexField name="genome" order="ascending" /> <IndexField name="type" order="ascending" /> </IndexFields> </ToIndex> </Relationship> </Relationships> </Database>
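The residue lookup described in the notes for the IsLocatedIn relationship can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not code from the Sprout application: the names `Segment`, `begin_window`, and `contains` are hypothetical, and the sketch assumes that a backward segment extends from `beg` toward lower contig positions, which is consistent with the 135 to 333 window given in the notes.

```python
from typing import NamedTuple, Tuple


class Segment(NamedTuple):
    """Hypothetical stand-in for one IsLocatedIn record."""
    beg: int        # 1-based index of the segment's first residue
    length: int     # number of residues (the "len" field)
    direction: str  # "+" for forward, "-" for backward (the "dir" field)


def begin_window(residue: int, max_len: int) -> Tuple[int, int]:
    """Range of begin points that could belong to a segment containing residue.

    A forward segment of at most max_len residues must begin no earlier than
    residue - (max_len - 1); a backward one no later than residue + (max_len - 1).
    """
    return residue - (max_len - 1), residue + (max_len - 1)


def contains(seg: Segment, residue: int) -> bool:
    """Filter step: does the segment really cover the residue?"""
    if seg.length <= 0:
        # A zero-length segment marks a point between residues.
        return False
    if seg.direction == "+":
        return seg.beg <= residue <= seg.beg + seg.length - 1
    # Assumption: backward segments run from beg down to beg - length + 1.
    return seg.beg - seg.length + 1 <= residue <= seg.beg


print(begin_window(234, 100))
```

With an upper limit of 100 and residue 234, `begin_window` yields the same 135 to 333 range quoted in the notes; candidates found by the `beg` index in that range would then be passed through `contains` to filter by direction and length.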