View of /Sprout/SimBlocksDBD.xml

Revision 1.1
Wed May 4 03:24:43 2005 UTC (15 years, 1 month ago) by parrello
Branch: MAIN
*** empty log message ***

<?xml version="1.0" encoding="UTF-8"?>
    <Title>Similarity Block Database</Title>
        <Entity name="Genome" keyType="name-string">
            <Notes>A [i]genome[/i] contains the sequence data for a particular
			individual organism.</Notes>
				<Field name="description" type="string">
					<Notes>Brief description of this genome.</Notes>
        <Entity name="Contig" keyType="name-string">
            <Notes>A [i]contig[/i] is a contiguous run of nucleotides. The contig's
			ID consists of the genome ID followed by a name that identifies
			which contig this is for the parent genome. The individual components
			are separated by a colon.</Notes>
				<Field name="len" type="int">
					<Notes>Number of nucleotides in this contig.</Notes>
        <Entity name="GroupBlock" keyType="name-string">
            <Notes>A [i]group block[/i] is a set of similar genome regions. All the
			regions are the same length, although they may go in different
			directions. The ID of the group will be a single letter and a set
			of digits. The initial letter is [b]K[/b] for a group generated by
			similarities and [b]S[/b] for a singleton group describing a
			region with no similarities. The result is that every position
			in a contig belongs to at least one group, though some will
			belong to several.</Notes>
                <Field name="len" type="int">
					<Notes>Number of nucleotides in the regions belonging to
					this group.</Notes>
				<Field name="pattern" type="text">
					<Notes>A representation of the nucleotides in the group,
					with question marks substituted for positions that are
					not identical for all group members.</Notes>
				<Field name="variance" type="float">
					<Notes>The proportion of nucleotides that vary between
					regions in this group. For example, a value of 0 means all
					regions are identical at every position. A value of
					0.5 means all regions are identical at exactly half of
					the positions. For regions of length 100, a value of 0.03
					means all regions are identical at all but 3 of the
					positions. The variance does not indicate the degree
					of dissimilarity, just how much of each region needs to be
					examined for SNPs.</Notes>
        <Relationship name="ContainsRegionIn" from="GroupBlock" to="Contig" arity="MM">
            <Notes>This relationship connects contigs to the group blocks represented on
			them. Each instance in this relationship represents a region on a
                <Field name="position" type="int">
					<Notes>Index (1-based) of the region's leftmost nucleotide
					in the contig.</Notes>
                <Field name="direction" type="char">
					<Notes>[b]+[/b] for a forward region, [b]-[/b] for a reverse
				<Field name="content" type="text">
					<Notes>Upper-case nucleotide sequence found at the positions
					of variance in this region. For a forward region, this is
					the exact content of each position of variance in the
					region. For a reverse region, it is the complement in
					reverse order.</Notes>
				<Field name="len" type="int">
					<Notes>Length of this region. The length is redundant, but
					we place it here anyway so that we can use it to sort
					the regions.</Notes>
                <Notes>This index enables the application to find all of the
				regions in a contig in the order they are present in the
                    <IndexField name="position" order="ascending" />
                    <IndexField name="len" order="descending" />
        <Relationship name="HasInstanceOf" from="Genome" to="GroupBlock" arity="MM">
            <Notes>This relationship connects a genome to the groups represented
			in its contigs. It provides a fast way to get an ordered list of
			groups for a genome. The group lists for genomes can then be
			merged to determine the common groups of a set of genomes.</Notes>
		<Relationship name="ConsistsOf" from="Genome" to="Contig" arity="1M">
			<Notes>This relationship connects a genome to its contigs.</Notes>
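
The notes on the GroupBlock fields "pattern" and "variance" and on the ContainsRegionIn field "content" describe values that can be derived from the aligned regions of a group. The following Python sketch shows one way to compute them. It is an illustration only, not the code that populates this database: the function names are hypothetical, and it assumes every region has already been oriented in the group's forward direction (a reverse region would first be passed through reverse_complement).

```python
# Illustrative sketch only; all names here are hypothetical and do not
# come from the Sprout code base.

# Translation table for complementing upper-case DNA.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the complement of seq in reverse order."""
    return seq.translate(_COMPLEMENT)[::-1]


def group_pattern(regions):
    """Build the group pattern: the shared nucleotide where every region
    agrees, and a question mark where at least one region differs."""
    if not regions:
        raise ValueError("a group needs at least one region")
    length = len(regions[0])
    if any(len(region) != length for region in regions):
        raise ValueError("all regions in a group must be the same length")
    return "".join(
        column[0] if len(set(column)) == 1 else "?"
        for column in zip(*regions)
    )


def variance(pattern):
    """Proportion of positions that vary between regions in the group."""
    return pattern.count("?") / len(pattern) if pattern else 0.0


def region_content(region, pattern):
    """Nucleotides of one region at the group's positions of variance."""
    return "".join(
        nucleotide
        for nucleotide, mark in zip(region, pattern)
        if mark == "?"
    )


if __name__ == "__main__":
    regions = ["ACGTAC", "ACCTAC", "ACGTAA"]
    pattern = group_pattern(regions)
    print(pattern)
    print(variance(pattern))
    print([region_content(region, pattern) for region in regions])
```

With three regions of length 6 that differ at two positions, the pattern carries two question marks and the variance is 2/6. As the "variance" notes point out, that number only says how many positions need to be examined, not how dissimilar the regions are at those positions.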
