View of /Sprout/SimBlocksDBD.xml

Revision 1.2
Thu Jun 9 19:06:55 2005 UTC by parrello
Branch: MAIN
Changes since 1.1: +58 -28 lines

<?xml version="1.0" encoding="UTF-8"?>
<Database>
    <Title>Similarity Block Database</Title>
    <Entities>
        <Entity name="Genome" keyType="name-string">
            <Notes>A [i]genome[/i] contains the sequence data for a particular
			individual organism.</Notes>
			<Fields>
				<Field name="description" type="string">
					<Notes>Brief description of this genome.</Notes>
				</Field>
			</Fields>
		</Entity>
        <Entity name="Contig" keyType="key-string">
            <Notes>A [i]contig[/i] is a contiguous run of nucleotides. The contig's
			ID consists of the genome ID followed by a name that identifies
			which contig this is for the parent genome. The individual components
			are separated by a colon.</Notes>
        </Entity>
        <Entity name="GroupBlock" keyType="int">
            <Notes>A [i]group block[/i] is a set of similar genome regions.
			A group block can represent a gene or an inter-genic region.
			The result is that every position in a contig belongs to at least
			one block, and some positions belong to several.</Notes>
            <Fields>
                <Field name="len" type="int">
					<Notes>Number of nucleotides in the regions belonging to
					this block. This may include insertion markers ([b]-[/b]).</Notes>
				</Field>
				<Field name="pattern" type="text">
					<Notes>A representation of the nucleotides in the group,
					with question marks substituted for positions that are
					not identical for all group members.</Notes>
				</Field>
				<Field name="variance" type="float">
					<Notes>The proportion of nucleotides that vary between
					regions in this group. For example, a value of 0 means all
					regions are identical at every position. A value of
					0.5 means all regions are identical at exactly half of
					the positions. For a block length of 100, a value
					of 0.03 means all regions are identical at every position
					but 3. The variance does not indicate the degree
					of dissimilarity, just how much of each region needs to be
					examined for SNPs.</Notes>
				</Field>
				<Field name="snip-count" type="int">
					<Notes>The number of positions at which the nucleotides
					vary between regions in this group. The variance value
					is this number divided by the block length.</Notes>
				</Field>
				<Field name="description" type="string">
					<Notes>Descriptive name of this block. This will be
					the gene name for gene blocks, and a generated
					string for inter-genic blocks.</Notes>
				</Field>
            </Fields>
        </Entity>
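        <!-- Worked example for the GroupBlock fields above, using made-up
             values that are not taken from the database. Suppose a block
             holds three aligned regions: ACGT, ACGA and ACCT. Positions 1
             and 2 are identical in all regions; positions 3 and 4 are not.
             Under the field definitions this would give:
                 len        = 4
                 pattern    = AC??
                 snip-count = 2
                 variance   = 2 / 4 = 0.5 -->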
		<Entity name="Region" keyType="name-string">
			<Notes>A [i]region[/i] describes a location in a contig, and
			essentially bridges the gap between blocks and contigs. Each
			instance of this object corresponds to a single segment on
			a contig. The key is the region's sprout-style location
			string.</Notes>
            <Fields>
				<Field name="contigID" type="key-string">
					<Notes>Name of the contig containing this region.</Notes>
				</Field>
                <Field name="position" type="int">
					<Notes>Index (1-based) of the region's leftmost nucleotide
					in the contig.</Notes>
				</Field>
                <Field name="direction" type="char">
					<Notes>[b]+[/b] for a forward region, [b]-[/b] for a reverse
					region.</Notes>
				</Field>
				<Field name="content" type="text">
					<Notes>Nucleotides (upper case) at the positions of variance
					in this region. For a forward region, this is the exact
					content of each position of variance in the region.
					For a reverse region, it is the complement in
					reverse order.</Notes>
				</Field>
				<Field name="len" type="int">
					<Notes>Length of this region. This may be slightly smaller
					than the block length.</Notes>
				</Field>
				<Field name="peg" type="name-string">
					<Notes>PEG identifier for this block if it is a gene block,
					or a string generated from the nearby PEGs if it is an
					inter-genic block.</Notes>
				</Field>
            </Fields>
		</Entity>
    </Entities>
    <Relationships>
        <Relationship name="ContainsRegion" from="Contig" to="Region" arity="1M">
            <Notes>This relationship connects contigs to the regions on
			them.</Notes>
			<Fields>
                <Field name="position" type="int">
					<Notes>Index (1-based) of the region's leftmost nucleotide
					in the contig.</Notes>
				</Field>
				<Field name="len" type="int">
					<Notes>Length of this region. This may be slightly smaller
					than the block length.</Notes>
				</Field>
			</Fields>
            <ToIndex>
                <Notes>This index enables the application to find all of the
				regions in a contig in the order they are present in the
				contig.</Notes>
                <IndexFields>
                    <IndexField name="position" order="ascending" />
                    <IndexField name="len" order="descending" />
                </IndexFields>
            </ToIndex>
        </Relationship>
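        <!-- Illustration of the ToIndex ordering above, with hypothetical
             (position, len) pairs for regions on one contig. Sorting by
             position ascending, then len descending, would return:
                 (100, 50)
                 (100, 30)
                 (250, 40)
             Regions that start at the same position come back longest
             first. -->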
		<Relationship name="IncludesRegion" from="GroupBlock" to="Region" arity="1M">
			<Notes>This relationship connects a block to the regions it covers. Note
			that since the ID of the region is its Sprout-style location string,
			often it is not necessary to cross to the [b]Region[/b] table when
			accessing this relationship.</Notes>
		</Relationship>
        <Relationship name="HasInstanceOf" from="Genome" to="GroupBlock" arity="MM">
            <Notes>This relationship connects a genome to the groups represented
			in its contigs. It provides a fast way to get an ordered list of
			groups for a genome. The group lists for genomes can then be
			merged to determine the common groups of a set of genomes.</Notes>
        </Relationship>
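        <!-- Illustration of the merge described above, with hypothetical
             block IDs and assuming each list is ordered by block ID. If
             the group list for genome A is 3, 7, 12, 20 and the list for
             genome B is 7, 9, 12, 31, a single pass over both sorted
             lists yields the common groups 7 and 12. -->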
		<Relationship name="ConsistsOf" from="Genome" to="Contig" arity="1M">
			<Notes>This relationship connects a genome to its contigs.</Notes>
		</Relationship>
    </Relationships>
</Database>
