Diff of /Sprout/LoadSproutTables.pl

revision 1.6, Sun Sep 11 17:34:05 2005 UTC | revision 1.12, Mon Nov 7 20:29:46 2005 UTC
# Line 2  Line 2 
2    
3  =head1 Load Sprout Tables  =head1 Load Sprout Tables
4    
5  Create the load files for a group of Sprout tables. The parameters are the names of  =head2 Introduction
6  the table groups whose data is to be created. The legal table group names are given below.  
7    This script creates the load files for Sprout tables and optionally loads them.
8    The parameters are the names of the table groups whose data is to be created.
9    The legal table group names are given below.
10    
11  =over 4  =over 4
12    
# Line 24  Line 27 
27  =item Subsystem  =item Subsystem
28    
29  Loads B<Subsystem>, B<Role>, B<SSCell>, B<ContainsFeature>, B<IsGenomeOf>,  Loads B<Subsystem>, B<Role>, B<SSCell>, B<ContainsFeature>, B<IsGenomeOf>,
30  B<IsRoleOf>, B<OccursInSubsystem>, B<ParticipatesIn>, B<HasSSCell>.  B<IsRoleOf>, B<OccursInSubsystem>, B<ParticipatesIn>, B<HasSSCell>,
31    B<Catalyzes>, B<ConsistsOfRoles>, B<RoleSubset>, B<HasRoleSubset>,
32    B<ConsistsOfGenomes>, B<GenomeSubset>, B<HasGenomeSubset>
33    
34  =item Annotation  =item Annotation
35    
# Line 55  Line 60 
60    
61  Loads B<ExternalAliasOrg>, B<ExternalAliasFunc>.  Loads B<ExternalAliasOrg>, B<ExternalAliasFunc>.
62    
63    =item Reaction
64    
65    Loads B<ReactionURL>, B<Compound>, B<CompoundName>,
66    B<CompoundCAS>, B<IsAComponentOf>, B<Reaction>.
67    
68  =item *  =item *
69    
70  Loads all of the above tables.  Loads all of the above tables.
71    
72  =back  =back
73    
74  There are two command-line options, given below. Note that in the command line, spaces  The command-line options are given below.
 inside parameters should be represented by C<\b>.  
75    
76  =over 4  =over 4
77    
# Line 82  Line 91 
91    
92  Desired tracing level. The default is 3.  Desired tracing level. The default is 3.
93    
94    =item limitedFeatures
95    
96    Only generate the B<Feature> and B<IsLocatedIn> tables when processing the feature group.
97    
98    =item dbLoad
99    
100    If TRUE, the database tables will be loaded automatically from the load files created.
101    
102  =back  =back
103    
104    =head2 Usage
105    
106    To load all the Sprout tables and then validate the result, you need to issue three
107    commands.
108    
109        LoadSproutTables -dbLoad "*"
110        TestSproutLoad
111        index_sprout
112    
113    All three commands send output to the console. In addition, C<LoadSproutTables> and
114    C<TestSproutLoad> write tracing information to C<trace.log> in the FIG temporary
115    directory (B<$FIG_Config::Tmp>). At the bottom of the log file will be a complete
116    list of errors. If errors occur in C<LoadSproutTables>, then the data must be corrected
117    and the offending table group reloaded. So, for example, if there are errors in the
118    load of the B<MadeAnnotation> and B<Compound> tables, you would need to run
119    
120        LoadSproutTables -dbLoad Annotation Reaction
121    
122    because B<MadeAnnotation> is in the C<Annotation> group, and B<Compound> is in the
123    C<Reaction> group. You can omit the C<dbLoad> option to create the load files without
124    loading the database, and you can add a C<trace> option to change the trace level.
125    The command below creates the Genome-related load files with a trace level of 3 and
126    does not load them into the Sprout database.
127    
128        LoadSproutTables -trace=3 Genome
129    
130    C<LoadSproutTables> takes a long time to run, so setting the trace level to 3 helps
131    to give you an idea of the progress.
132    
133    Once the Sprout database is loaded, B<TestSproutLoad> can be used to verify the load
134    against the FIG data. Again, the end of the C<trace.log> file will contain a summary
135    of the errors found. Like C<LoadSproutTables>, C<TestSproutLoad> is a time-consuming
136    script, so you may want to set the trace level to 3 to see visible progress.
137    
138        TestSproutLoad -trace=3
139    
140    Unlike C<LoadSproutTables>, in C<TestSproutLoad>, the individual errors found are
141    mixed in with the trace messages. They are all, however, marked with a trace type
142    of B<Problem>, as shown in the fragment below.
143    
144        11/02/2005 19:15:16 <main>: Processing feature fig|100226.1.peg.7742.
145        11/02/2005 19:15:17 <main>: Processing feature fig|100226.1.peg.7741.
146        11/02/2005 19:15:17 <Problem>: assignment "Short-chain dehydrodenase ...
147        11/02/2005 19:15:17 <Problem>: assignment "putative oxidoreductase." ...
148        11/02/2005 19:15:17 <Problem>: Incorrect assignment for fig|100226.1.peg.7741...
149        11/02/2005 19:15:17 <Problem>: Incorrect number of annotations found in ...
150        11/02/2005 19:15:17 <main>: Processing feature fig|100226.1.peg.7740.
151        11/02/2005 19:15:18 <main>: Processing feature fig|100226.1.peg.7739.
152    
153    The test may reveal that some tables need to be reloaded, or that a software
154    problem has crept into the Sprout.
155    
156    Once all the tables have the correct data, C<index_sprout> can be run to create the
157    Glimpse indexes.
158    
159  =cut  =cut
160    
161  use strict;  use strict;
# Line 96  Line 168 
168  use File::Path;  use File::Path;
169  use SproutLoad;  use SproutLoad;
170  use Stats;  use Stats;
171    use SFXlate;
172    
173  # Get the command-line parameters and options.  # Get the command-line parameters and options.
174  my ($options, @parameters) = Tracer::ParseCommand({ geneFile => "", subsysFile => "",  my ($options, @parameters) = Tracer::ParseCommand({ geneFile => "", subsysFile => "",
175                                                      trace => 3 },                                                      trace => 3, limitedFeatures => 0,
176                                                                 @ARGV);                                                      dbLoad => 0 }, @ARGV);
177  # Set up tracing.  # Set up tracing.
178  TSetup("$options->{trace} SproutLoad ERDBLoad ERDB Stats Tracer Load", "+>$FIG_Config::temp/trace.log");  TSetup("$options->{trace} SproutLoad ERDBLoad ERDB Stats Tracer Load", "+>$FIG_Config::temp/trace.log");
179  # Create the sprout loader object.  # Create the sprout loader object. Note that the Sprout object does not
180    # open the database unless the "dbLoad" option is turned on.
181  my $fig = FIG->new();  my $fig = FIG->new();
182  my $sprout = Sprout->new($FIG_Config::sproutDB, { noDBOpen => 1 });  my $sprout = SFXlate->new_sprout_only(undef, undef, undef, ! $options->{dbLoad});
183  my $spl = SproutLoad->new($sprout, $fig, $options->{geneFile}, $options->{subsysFile});  my $spl = SproutLoad->new($sprout, $fig, $options->{geneFile}, $options->{subsysFile}, $options);
184  # Process the parameters.  # Process the parameters.
185  for my $group (@parameters) {  for my $group (@parameters) {
186      Trace("Processing load group $group.") if T(2);      Trace("Processing load group $group.") if T(2);
# Line 144  Line 218 
218      if ($group eq 'External' || $group eq '*') {      if ($group eq 'External' || $group eq '*') {
219          $spl->LoadExternalData();          $spl->LoadExternalData();
220      }      }
221        if ($group eq 'Reaction' || $group eq '*') {
222            $spl->LoadReactionData();
223        }
224    
225  }  }
226  Trace("Load complete.") if T(2);  Trace("Load complete.") if T(2);
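
Note on the group dispatch: the loop in this revision tests each group name with its own `if ($group eq '...' || $group eq '*')` block, and the change adds one more block for `Reaction`. The same behaviour could be expressed with a table of group names and method names. The sketch below is only an illustration under stated assumptions: `StubLoader`, `@GROUPS`, and `run_groups` are hypothetical names that do not appear in the script, the stub stands in for the real `SproutLoad` object so the example runs without a FIG installation, and only the two groups visible in this diff are listed.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the SproutLoad object. It only records which
# loader methods were called, so the sketch runs without a FIG install.
package StubLoader;
sub new              { return bless { called => [] }, shift }
sub LoadExternalData { push @{ $_[0]{called} }, 'External' }
sub LoadReactionData { push @{ $_[0]{called} }, 'Reaction' }

package main;

# Ordered [group name, method name] pairs. Only the two groups visible
# in this diff are listed; the real script handles more.
my @GROUPS = (
    [ External => 'LoadExternalData' ],
    [ Reaction => 'LoadReactionData' ],
);

# Call every loader whose group matches, treating '*' as "all groups".
sub run_groups {
    my ($loader, @requested) = @_;
    for my $group (@requested) {
        for my $entry (@GROUPS) {
            my ($name, $method) = @$entry;
            next unless $group eq $name || $group eq '*';
            $loader->$method();
        }
    }
    return;
}

my $spl = StubLoader->new();
run_groups($spl, '*');
print join(',', @{ $spl->{called} }), "\n";
```

Keeping the pairs in an array rather than a hash preserves the load order, which matters if later groups depend on files written by earlier ones.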

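
Note on reading the `TestSproutLoad` trace: the new Usage text says individual errors are mixed in with the trace messages and marked with a trace type of `Problem`. A small filter can pull those lines out of `trace.log`. The sketch below is a minimal example that assumes the `MM/DD/YYYY HH:MM:SS <type>: message` layout shown in the POD fragment; `problem_lines` is a hypothetical helper, not part of the Sprout code, and the sample lines are copied from that fragment.

```perl
use strict;
use warnings;

# Return only the lines whose trace type is <Problem>. The pattern assumes
# the timestamp layout shown in the POD fragment and tolerates leading
# whitespace, since the fragment is indented there.
sub problem_lines {
    my (@lines) = @_;
    return grep { m!^\s*\d{2}/\d{2}/\d{4} \d{2}:\d{2}:\d{2} <Problem>: ! } @lines;
}

# Sample lines taken from the fragment in the Usage section.
my @sample = (
    '11/02/2005 19:15:17 <main>: Processing feature fig|100226.1.peg.7741.',
    '11/02/2005 19:15:17 <Problem>: Incorrect assignment for fig|100226.1.peg.7741...',
    '11/02/2005 19:15:17 <main>: Processing feature fig|100226.1.peg.7740.',
);

my @problems = problem_lines(@sample);
print "$_\n" for @problems;
```

To use it on a real log, read the file into a list of lines and pass them to `problem_lines`; the location of `trace.log` depends on the local FIG configuration.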