Revision 1.2
Fri Apr 29 20:48:34 2005 UTC by olson
Branch: MAIN
CVS Tags: myrast_33, HEAD
Changes since 1.1: +1 -0 lines
Add some validation to uploads.
Tweak the browser some more.

<title>Clearinghouse type definition: GENOME</title>
<h1>Clearinghouse type definition: GENOME</h1>

An object of type GENOME is a <a href="http://song.sourceforge.net/gff3-jan04.shtml">GFF3</a>
document with a prescribed set of clearinghouse-specific annotations that
maintain information about the genome being described. See <a href="#4">(4)</a> below for details.

We define the following workflow for preparing a genome for uploading to the clearinghouse.

<ol>

<li> <p>Input data: taxon ID, data containing contigs and features.
</li>

<li> <p>Register taxon ID T with the clearinghouse to get a genome ID of
   the form T.V.
</li>

<li> <p>For each feature type in the data, register the feature type and
   count of features of that type with the clearinghouse.  This
   registration returns the starting index to be used for naming the
   features. For newly-registered genomes, the starting index will be
   1. Common feature types include "peg" and "rna".
</li>

<a name="4"/><li> <p>We now have the information required to construct a well-formed GFF
   file to describe the genome. It must include the following
   attributes:
<pre>
#seed   genome_id	T.V
#seed	name		Genus Species
#seed   taxon_id	T
#seed   taxonomy 	Bacteria; .... ; Genus; Species
#seed   project		My Project
#seed   genome_md5	abcdabcdabcdabcdabcdabcdabcdabcd
</pre>

   Each feature must include a Dbxref that declares the registered
   feature ID for that feature. For instance:

<pre>
Dbxref=fig|T.V.peg.3
</pre>
</li>

<li> <p>The GFF may now be uploaded. The upload process validates the
   given GFF for syntax errors, missing required attributes,
   missing feature Dbxrefs, and inconsistencies with the registered
   genome and feature IDs. The SEED program
   validate_genome_gff provides the basis for this validation.
</li>

</ol>
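
<p>The naming and header conventions in the workflow above can be sketched in a few lines of Python.
This is only an illustration under stated assumptions, not the clearinghouse's own code:
the function names are hypothetical, the registration calls are not shown (the genome ID
and starting index are passed in as if registration had already returned them), and the
#seed fields are assumed to be tab-separated, which the definition above does not state.
The last function is a simple local pre-check for missing Dbxrefs; it is not
validate_genome_gff and covers only one of the checks the upload performs.</p>

```python
def feature_ids(genome_id, feature_type, start_index, count):
    """Return feature IDs of the form fig|T.V.type.N.

    Numbering runs consecutively from start_index, the value the
    clearinghouse returns when the feature type is registered.
    """
    return [
        f"fig|{genome_id}.{feature_type}.{n}"
        for n in range(start_index, start_index + count)
    ]


def seed_header(genome_id, name, taxon_id, taxonomy, project, genome_md5):
    """Build the six required #seed attribute lines.

    Assumption: fields are separated by single tabs.
    genome_md5 is taken as given; how it is computed is not covered here.
    """
    fields = [
        ("genome_id", genome_id),
        ("name", name),
        ("taxon_id", taxon_id),
        ("taxonomy", taxonomy),
        ("project", project),
        ("genome_md5", genome_md5),
    ]
    return [f"#seed\t{key}\t{value}" for key, value in fields]


def features_missing_dbxref(gff_lines):
    """Return 1-based line numbers of feature lines lacking a fig| Dbxref.

    GFF3 feature lines have nine tab-separated columns; the ninth holds
    semicolon-separated attributes, and a Dbxref value may be a
    comma-separated list.
    """
    missing = []
    for number, line in enumerate(gff_lines, start=1):
        text = line.rstrip("\n")
        if not text.strip() or text.startswith("#"):
            continue  # skip blank lines, directives, and #seed lines
        columns = text.split("\t")
        attributes = columns[8] if len(columns) >= 9 else ""
        found = False
        for attribute in attributes.split(";"):
            key, _, value = attribute.strip().partition("=")
            if key == "Dbxref" and any(
                v.strip().startswith("fig|") for v in value.split(",")
            ):
                found = True
        if not found:
            missing.append(number)
    return missing
```

<p>For a newly registered genome with the placeholder ID 12345.1 and three features of type "peg",
<code>feature_ids("12345.1", "peg", 1, 3)</code> yields the IDs to place in each feature's Dbxref attribute,
starting from index 1 as described in step 3.</p>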
