Sample ERDB Database
This XML file defines a small genetic database that demonstrates the features of
the ERDB database engine.
A Genome contains the sequence data for a particular individual organism.
Scientific name of this genome, usually consisting of the genus,
species, and unique characterization.
Version string for this genome, generally consisting of the genome ID followed
by a period and a string of digits.
Number of [[protein encoding genes]] for this organism.
Number of RNA features found for this organism.
Indication of this organism's behavior relating to environmental oxygen.
Y/N/? flag indicating whether or not this organism is pathogenic.
This index allows applications to search for genomes by scientific
name.
A contig is a contiguous run of base pairs. The contig's ID consists of the
genome ID followed by a name that identifies which contig this is for the parent
genome.
String consisting of the base pairs.
A feature (sometimes also called a "gene") is a part of a genome that
is of special interest. Features may be spread across multiple contigs of a
genome, but never across more than one genome. Features can be assigned to
roles via spreadsheet cells, and are the targets of annotation. Each feature
in the database has a unique FIG ID.
Code indicating the type of this feature. Among the codes currently
supported are "peg" for a protein encoding gene, "bs" for a
binding site, "opr" for an operon, and so forth.
(optional) A translation of this feature's residues into
protein character codes, formed by concatenating the pieces
of the feature together. Only protein encoding genes have
translations.
Default functional assignment for this feature.
Name of the user who made the functional assignment.
Quality of the functional assignment, usually a space, but may be
W (indicating weak) or X (indicating experimental).
This relationship connects a genome to all of its features. This
relationship is redundant in a sense, because the genome ID is part
of the feature ID; however, it makes certain queries more convenient
to write, because filtering information for a feature's genome can be
pulled in directly.
Feature type (e.g., peg, rna).
This index enables the application to view the features of a
Genome sorted by type.
This relationship connects a genome to the contigs that contain the actual genetic
information.
This relationship connects a feature to the contig segments that work together
to effect it. The segments are numbered sequentially starting from 1. The database is
required to place an upper limit on the length of each segment. If a segment is longer
than the maximum, it can be broken into smaller segments. The upper limit enables applications
to locate all features that contain a specific residue. For example, if the upper limit
is 100 and we are looking for a feature that contains residue 234 of contig *ABC*, we
can look for features with a begin point between 135 and 333: a forward segment of at most
100 residues must begin at 135 or later to reach residue 234, and a backward segment must
begin at 333 or earlier. The results can then be filtered by direction and length of the segment.
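The begin-point window in the example above can be sketched as a small calculation. This is an illustrative sketch only: the function name `begin_window` and its parameters are hypothetical and are not part of ERDB. It assumes that a backward segment extends from its begin point toward lower residue numbers, which is what the 135 to 333 range in the example implies.

```python
from typing import Tuple


def begin_window(residue: int, max_len: int) -> Tuple[int, int]:
    """Return the inclusive range of begin points that could cover `residue`.

    Hypothetical helper, not an ERDB function. A forward segment of at most
    `max_len` residues must begin no earlier than residue - max_len + 1; a
    backward segment (assumed to run toward lower residue numbers) must begin
    no later than residue + max_len - 1.
    """
    lowest = max(1, residue - max_len + 1)  # residue indexes are 1-based
    highest = residue + max_len - 1
    return lowest, highest


# The example from the text: upper limit 100, residue 234.
low, high = begin_window(234, 100)
print(low, high)
```

Every segment whose begin point falls in this window is only a candidate; each one still has to be checked against its own direction and length.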
Sequence number of this segment.
Index (1-based) of the first residue in the contig that
belongs to the segment.
Number of residues in the segment. A length of 0 identifies
a specific point between residues. This is the point before the residue if the direction
is forward and the point after the residue if the direction is backward.
Direction of the segment: + if it is forward and
- if it is backward.
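The begin, length and direction fields together determine which residues a segment covers. The sketch below shows one way an application might test whether a segment contains a given residue. The `Segment` class and both function names are hypothetical. It assumes that a backward segment starts at its begin point and runs toward lower residue numbers, and it treats a zero-length segment as covering no residue, since it marks a point between residues.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Segment:
    """Hypothetical in-memory view of one feature segment."""
    begin: int       # 1-based index of the first residue in the segment
    length: int      # number of residues; 0 marks a point between residues
    direction: str   # "+" for forward, "-" for backward


def covered_range(seg: Segment) -> Optional[Tuple[int, int]]:
    """Return the inclusive (low, high) residue range, or None if empty."""
    if seg.length <= 0:
        return None  # a zero-length segment covers no residue
    if seg.direction == "+":
        return seg.begin, seg.begin + seg.length - 1
    if seg.direction == "-":
        # Assumption: backward segments extend toward lower residue numbers.
        return seg.begin - seg.length + 1, seg.begin
    raise ValueError(f"unknown direction: {seg.direction!r}")


def contains_residue(seg: Segment, residue: int) -> bool:
    """True if `residue` lies inside the segment's covered range."""
    span = covered_range(seg)
    return span is not None and span[0] <= residue <= span[1]
```

For example, under these assumptions `contains_residue(Segment(333, 100, "-"), 234)` is true, because the backward segment covers residues 234 through 333.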
This index allows the application to find all the segments of a feature in
the proper order.
This index is the one used by applications to find all the feature
segments that contain a specific residue.