
Diff of /FigTutorial/SEED_administration_issues.html


revision 1.12, Thu Aug 5 23:26:06 2004 UTC revision 1.13, Wed Aug 18 22:34:01 2004 UTC
# Line 39  Line 39 
39      Computing "Pins" and "Clusters"
40  </A>
41    
42    <li><A HREF="#auto_annotation">
43        Automatic Annotation of Genomes
44    </A>
45    
46  </ul>
47    
48    
# Line 828  Line 832 
832          compute_pins_and_clusters 562.4
833  </pre>
834  would compute and add entries for all of the <i>pegs</i> in genome 562.4.
835    
836    <h2 id="auto_annotation">
837       Automatic Annotation of Genomes
838    </h2>
839    The SEED provides a simple but limited capability for automated assignment
840    of protein-encoding gene function based on similarity.
841    Candidate functions are assigned scores based on the combined strengths
842    of all BLASTP similarities to genes carrying that particular assignment,
843    weighted by the provenance and assignment-confidence for each similar gene.
844    The final automated function assignment is then determined from the
845    list of candidate functions and their associated scores.
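The scoring idea described above can be illustrated with a minimal sketch. It is not the SEED's actual implementation: the weight table, the default weight, the tuple layout, and the names `PROVENANCE_WEIGHT`, `score_candidates`, and `pick_function` are all assumptions made for illustration, since the text does not give the real weights or data format.

```python
from collections import defaultdict

# Hypothetical provenance weights; the real SEED weighting is not given in the text.
PROVENANCE_WEIGHT = {"curated": 1.0, "automated": 0.5}


def score_candidates(similarities):
    """Sum weighted similarity strengths for each candidate function.

    Each similarity is a tuple (function, strength, provenance, confidence),
    where strength stands in for a BLASTP similarity score and confidence
    is an assumed value between 0 and 1.
    """
    scores = defaultdict(float)
    for function, strength, provenance, confidence in similarities:
        # Unknown provenance gets an assumed low default weight.
        weight = PROVENANCE_WEIGHT.get(provenance, 0.25) * confidence
        scores[function] += strength * weight
    return dict(scores)


def pick_function(scores):
    """Return the highest-scoring candidate function, or None if there are none."""
    if not scores:
        return None
    return max(scores, key=scores.get)


# Made-up similarities for a single PEG.
sims = [
    ("DNA polymerase I", 200.0, "automated", 1.0),
    ("Exonuclease", 120.0, "curated", 1.0),
    ("Exonuclease", 110.0, "curated", 0.5),
]
scores = score_candidates(sims)
best = pick_function(scores)
```

In this made-up example the single strongest hit carries "DNA polymerase I", but the combined, weighted evidence favors "Exonuclease", which is the kind of difference the combined score is meant to capture.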
846    
847    Automated assignment is a four-step process:
848    <ol>
849    <li> Create a list of PEGs to be automatically assigned.
850    If one wishes to make assignments to an entire organism or set of organisms
851    that are already installed in the SEED, the simplest method for creating
852    this list is to type the following command:
853    <pre>
854        pegs Genome1 Genome2 Genome3 ... > ~/Tmp/peg.list
855    </pre>
856    
857    <p>
858    <li> Next, create a list of candidate function-assignments using the following
859    command:
860    <pre>
861       auto_assign < ~/Tmp/peg.list > ~/Tmp/candidate.funcs
862    </pre>
863    (NOTE: The `auto_assign` command has some additional optional parameters;
864    for example, if one knows that all the PEGs in 'peg.list' are from
865    prokaryotic organisms, one can make use of this additional information
866    by invoking `auto_assign` as follows:
867    <pre>
868       auto_assign prokaryote < ~/Tmp/peg.list > ~/Tmp/candidate.funcs
869    </pre>
870    Also, if one wishes to use an alternate file of similarity data named 'simfile'
871    instead of the precomputed similarities stored in the SEED, one can instead type:
872    <pre>
873       auto_assign sims=simfile < ~/Tmp/peg.list > ~/Tmp/candidate.funcs
874    </pre>
875    Finally, `auto_assign` can read a set of alternate parameters from a file,
876    but we recommend that you stick with the default settings, and not exploit this
877    last feature unless you are a qualified SEED wizard.)
878    <p>
879    
880    <li> Next, create a SEED format assigned-functions file as follows:
881    <pre>
882        make_calls < ~/Tmp/candidate.funcs > ~/Tmp/assigned_functions
883    </pre>
884    Alternatively, if you wish to suppress the class of "non-informative" function assignments
885    such as "Hypothetical protein," "Unclassified protein," "predicted gene," etc.,
886    you may do so using the '-no_hypos' flag:
887    <pre>
888        make_calls -no_hypos < ~/Tmp/candidate.funcs > ~/Tmp/assigned_functions
889    </pre>
890    
891    <li> Finally, install the automated assignments in the SEED using the command:
892    <pre>
893        fig auto_assignF ~/Tmp/assigned_functions
894    </pre>
895    
896    </ol>
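The four steps above can be assembled into command lines with a small helper. This is a sketch only: the function `build_pipeline` and its parameters are hypothetical, it merely builds strings and runs nothing, and it uses only the commands, flags, and file names shown in the steps above.

```python
import shlex


def build_pipeline(genomes, workdir="~/Tmp", prokaryote=False, no_hypos=False):
    """Return the four shell command lines as strings; nothing is executed."""
    if not genomes:
        raise ValueError("at least one genome ID is required")

    # File names follow the tutorial text.
    peg_list = f"{workdir}/peg.list"
    candidates = f"{workdir}/candidate.funcs"
    assigned = f"{workdir}/assigned_functions"

    # Optional arguments taken from the tutorial text.
    assign = "auto_assign prokaryote" if prokaryote else "auto_assign"
    calls = "make_calls -no_hypos" if no_hypos else "make_calls"

    genome_args = " ".join(shlex.quote(g) for g in genomes)
    return [
        f"pegs {genome_args} > {peg_list}",
        f"{assign} < {peg_list} > {candidates}",
        f"{calls} < {candidates} > {assigned}",
        f"fig auto_assignF {assigned}",
    ]


commands = build_pipeline(["562.4"], prokaryote=True, no_hypos=True)
for line in commands:
    print(line)
```

Printing the commands rather than executing them makes it possible to review each step before anything is installed into the SEED.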
897    
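To make the effect of suppressing "non-informative" assignments concrete, here is a rough sketch of that kind of filter. It does not reproduce how `make_calls` decides what is non-informative: the phrase list contains only the three examples named in the text, the substring matching rule is an assumption, and the PEG identifiers and the names `NON_INFORMATIVE`, `is_non_informative`, and `drop_non_informative` are illustrative.

```python
# Only the three example phrases from the text; the real list may be longer.
NON_INFORMATIVE = ("hypothetical protein", "unclassified protein", "predicted gene")


def is_non_informative(function):
    """Return True if the function text contains a non-informative phrase."""
    text = function.strip().lower()
    return any(phrase in text for phrase in NON_INFORMATIVE)


def drop_non_informative(assignments):
    """Keep only the PEG-to-function assignments that carry real information."""
    return {
        peg: function
        for peg, function in assignments.items()
        if not is_non_informative(function)
    }


# Illustrative PEG identifiers and functions.
calls = {
    "fig|562.4.peg.1": "Hypothetical protein",
    "fig|562.4.peg.2": "DNA polymerase I",
}
kept = drop_non_informative(calls)
```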
898    It should be noted once again that the SEED's automated assignment algorithm
899    is quite simple and crude, being only slightly better than simply assigning
900    the function of the highest-scoring BLASTP hit; however, it at least provides
901    a "quick and dirty" starting point for making an initial assessment of a genome,
902    which may then be cleaned up and refined by skilled genome annotators.
903    
904    
905    
906    
907    
908    

