
Diff of /Sprout/LoadSproutTables.pl


revision 1.1, Wed Jul 27 20:05:24 2005 UTC revision 1.12, Mon Nov 7 20:29:46 2005 UTC
# Line 2  Line 2 
2    
3  =head1 Load Sprout Tables  =head1 Load Sprout Tables
4    
5  Load a group of Sprout tables from the command line. The parameters are the names of  =head2 Introduction
6  the table groups to load. The legal table group names are given below.  
7    This script creates the load files for Sprout tables and optionally loads them.
8    The parameters are the names of the table groups whose data is to be created.
9    The legal table group names are given below.
10    
11  =over 4  =over 4
12    
# Line 19  Line 22 
22  =item Feature  =item Feature
23    
24  Loads B<Feature>, B<FeatureAlias>, B<FeatureTranslation>, B<FeatureUpstream>,  Loads B<Feature>, B<FeatureAlias>, B<FeatureTranslation>, B<FeatureUpstream>,
25  B<IsLocatedIn>, B<IsBidirectionalBestHitOf>, B<FeatureLink>.  B<IsLocatedIn>, B<FeatureLink>.
26    
27  =item Subsystem  =item Subsystem
28    
29  Loads B<Subsystem>, B<Role>, B<SSCell>, B<Diagram>, B<ContainsFeature>, B<IsGenomeOf>,  Loads B<Subsystem>, B<Role>, B<SSCell>, B<ContainsFeature>, B<IsGenomeOf>,
30  B<IsRoleOf>, B<OccursInSubsystem>, B<ParticipatesIn>, B<HasSSCell>.  B<IsRoleOf>, B<OccursInSubsystem>, B<ParticipatesIn>, B<HasSSCell>,
31    B<Catalyzes>, B<ConsistsOfRoles>, B<RoleSubset>, B<HasRoleSubset>,
32    B<ConsistsOfGenomes>, B<GenomeSubset>, B<HasGenomeSubset>
33    
34    =item Annotation
35    
36    Loads B<SproutUser>, B<UserAccess>, B<Annotation>, B<IsTargetOfAnnotation>,
37    B<MadeAnnotation>.
38    
39    =item Diagram
40    
41    Loads B<Diagram>, B<RoleOccursIn>.
42    
43    =item Property
44    
45    Loads B<Property>, B<HasProperty>.
46    
47    =item BBH
48    
49    Loads B<IsBidirectionalBestHitOf>.
50    
51    =item Group
52    
53    Loads B<GenomeGroups>.
54    
55    =item Source
56    
57    Loads B<Source>, B<ComesFrom>, B<SourceURL>.
58    
59    =item External
60    
61    Loads B<ExternalAliasOrg>, B<ExternalAliasFunc>.
62    
63    =item Reaction
64    
65    Loads B<ReactionURL>, B<Compound>, B<CompoundName>,
66    B<CompoundCAS>, B<IsAComponentOf>, B<Reaction>.
67    
68    =item *
69    
70    Loads all of the above tables.
71    
72  =back  =back
73    
74  There are two command-line options, given below. Note that in the command line, spaces  The command-line options are given below.
 inside parameters should be represented by C<\b>.  
75    
76  =over 4  =over 4
77    
# Line 49  Line 91 
91    
92  Desired tracing level. The default is 3.  Desired tracing level. The default is 3.
93    
94    =item limitedFeatures
95    
96    Only generate the B<Feature> and B<IsLocatedIn> tables when processing the feature group.
97    
98    =item dbLoad
99    
100    If TRUE, the database tables will be loaded automatically from the load files created.
101    
102  =back  =back
103    
104    =head2 Usage
105    
106    To load all the Sprout tables and then validate the result, you need to issue three
107    commands.
108    
109        LoadSproutTables -dbLoad "*"
110        TestSproutLoad
111        index_sprout
112    
113    All three commands send output to the console. In addition, C<LoadSproutTables> and
114    C<TestSproutLoad> write tracing information to C<trace.log> in the FIG temporary
115    directory (B<$FIG_Config::Tmp>). At the bottom of the log file will be a complete
116    list of errors. If errors occur in C<LoadSproutTables>, then the data must be corrected
117    and the offending table group reloaded. So, for example, if there are errors in the
118    load of the B<MadeAnnotation> and B<Compound> tables, you would need to run
119    
120        LoadSproutTables -dbLoad Annotation Reaction
121    
122    because B<MadeAnnotation> is in the C<Annotation> group, and B<Compound> is in the
123    C<Reaction> group. You can omit the C<dbLoad> option to create the load files without
124    loading the database, and you can add a C<trace> option to change the trace level.
125    The command below creates the Genome-related load files with a trace level of 3 and
126    does not load them into the Sprout database.
127    
128        LoadSproutTables -trace=3 Genome
129    
130    C<LoadSproutTables> takes a long time to run, so setting the trace level to 3 helps
131    to give you an idea of the progress.
132    
133    Once the Sprout database is loaded, B<TestSproutLoad> can be used to verify the load
134    against the FIG data. Again, the end of the C<trace.log> file will contain a summary
135    of the errors found. Like C<LoadSproutTables>, C<TestSproutLoad> is a time-consuming
136    script, so you may want to set the trace level to 3 to see visible progress.
137    
138        TestSproutLoad -trace=3
139    
140    Unlike C<LoadSproutTables>, in C<TestSproutLoad>, the individual errors found are
141    mixed in with the trace messages. They are all, however, marked with a trace type
142    of B<Problem>, as shown in the fragment below.
143    
144        11/02/2005 19:15:16 <main>: Processing feature fig|100226.1.peg.7742.
145        11/02/2005 19:15:17 <main>: Processing feature fig|100226.1.peg.7741.
146        11/02/2005 19:15:17 <Problem>: assignment "Short-chain dehydrodenase ...
147        11/02/2005 19:15:17 <Problem>: assignment "putative oxidoreductase." ...
148        11/02/2005 19:15:17 <Problem>: Incorrect assignment for fig|100226.1.peg.7741...
149        11/02/2005 19:15:17 <Problem>: Incorrect number of annotations found in ...
150        11/02/2005 19:15:17 <main>: Processing feature fig|100226.1.peg.7740.
151        11/02/2005 19:15:18 <main>: Processing feature fig|100226.1.peg.7739.
152    
153    The test may reveal that some tables need to be reloaded, or that a software
154    problem has crept into the Sprout.
155    
156    Once all the tables have the correct data, C<index_sprout> can be run to create the
157    Glimpse indexes.
158    
159  =cut  =cut
160    
161  use strict;  use strict;
162  use Tracer;  use Tracer;
163  use DocUtils;  use DocUtils;
 use TestUtils;  
164  use Cwd;  use Cwd;
165  use FIG;  use FIG;
166  use SFXlate;  use SFXlate;
# Line 64  Line 168 
168  use File::Path;  use File::Path;
169  use SproutLoad;  use SproutLoad;
170  use Stats;  use Stats;
171    use SFXlate;
172    
173  # Get the command-line parameters and options.  # Get the command-line parameters and options.
174  my ($options, @parameters) = Tracer::ParseCommand({ geneFile => "", subsysFile => "",  my ($options, @parameters) = Tracer::ParseCommand({ geneFile => "", subsysFile => "",
175                                                      trace => 3 },                                                      trace => 3, limitedFeatures => 0,
176                                                                 @ARGV);                                                      dbLoad => 0 }, @ARGV);
177  # Set up tracing.  # Set up tracing.
178  TSetup("$options->{trace} SproutLoad ERDBLoad ERDB Tracer Load", "+>$FIG_Config::temp/trace.log");  TSetup("$options->{trace} SproutLoad ERDBLoad ERDB Stats Tracer Load", "+>$FIG_Config::temp/trace.log");
179  # Create the sprout loader object.  # Create the sprout loader object. Note that the Sprout object does not
180    # open the database unless the "dbLoad" option is turned on.
181  my $fig = FIG->new();  my $fig = FIG->new();
182  my $sprout = SFXlate->new_sprout_only();  my $sprout = SFXlate->new_sprout_only(undef, undef, undef, ! $options->{dbLoad});
183  my $spl = SproutLoad->new($sprout, $fig, $options->{geneFile},  my $spl = SproutLoad->new($sprout, $fig, $options->{geneFile}, $options->{subsysFile}, $options);
                           $options->{subsysFile});  
184  # Process the parameters.  # Process the parameters.
185  for my $group (@parameters) {  for my $group (@parameters) {
186      Trace("Processing load group $group.") if T(2);      Trace("Processing load group $group.") if T(2);
187      my $stats;      my $stats;
188      if ($group eq 'Genome') {      if ($group eq 'Genome' || $group eq '*') {
189          $spl->LoadGenomeData();          $spl->LoadGenomeData();
190      } elsif ($group eq 'Feature') {      }
191        if ($group eq 'Feature' || $group eq '*') {
192          $spl->LoadFeatureData();          $spl->LoadFeatureData();
193      } elsif ($group eq 'Coupling') {      }
194        if ($group eq 'Coupling' || $group eq '*') {
195          $spl->LoadCouplingData();          $spl->LoadCouplingData();
196      } elsif ($group eq 'Subsystem') {      }
197        if ($group eq 'Subsystem' || $group eq '*') {
198          $spl->LoadSubsystemData();          $spl->LoadSubsystemData();
199      } elsif ($group eq 'Property') {      }
200        if ($group eq 'Property' || $group eq '*') {
201          $spl->LoadPropertyData();          $spl->LoadPropertyData();
     } else {  
         Confess("Invalid group name $group.");  
202      }      }
203        if ($group eq 'Diagram' || $group eq '*') {
204            $spl->LoadDiagramData();
205        }
206        if ($group eq 'Annotation' || $group eq '*') {
207            $spl->LoadAnnotationData();
208        }
209        if ($group eq 'BBH' || $group eq '*') {
210            $spl->LoadBBHData();
211        }
212        if ($group eq 'Group' || $group eq '*') {
213            $spl->LoadGroupData();
214        }
215        if ($group eq 'Source' || $group eq '*') {
216            $spl->LoadSourceData();
217        }
218        if ($group eq 'External' || $group eq '*') {
219            $spl->LoadExternalData();
220        }
221        if ($group eq 'Reaction' || $group eq '*') {
222            $spl->LoadReactionData();
223        }
224    
225  }  }
226  Trace("Load complete.") if T(2);  Trace("Load complete.") if T(2);
227    
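Note on the group loop: revision 1.12 replaces the `elsif` chain with independent `if` tests so that `*` can trigger every loader, and in doing so it drops the `Confess("Invalid group name $group.")` branch. As far as this diff shows, a misspelled group name is now silently ignored. The sketch below is one possible table-driven way to keep both behaviors. It is not code from either revision: `@GROUPS` and `methods_for_group` are hypothetical names introduced only for illustration, and plain `die` stands in for `Tracer::Confess` so the example runs without the FIG libraries.

```perl
use strict;
use warnings;

# Group names in the order the v1.12 script tests them, paired with the
# SproutLoad method each one calls (both taken from the diff above).
my @GROUPS = (
    [ Genome     => 'LoadGenomeData'     ],
    [ Feature    => 'LoadFeatureData'    ],
    [ Coupling   => 'LoadCouplingData'   ],
    [ Subsystem  => 'LoadSubsystemData'  ],
    [ Property   => 'LoadPropertyData'   ],
    [ Diagram    => 'LoadDiagramData'    ],
    [ Annotation => 'LoadAnnotationData' ],
    [ BBH        => 'LoadBBHData'        ],
    [ Group      => 'LoadGroupData'      ],
    [ Source     => 'LoadSourceData'     ],
    [ External   => 'LoadExternalData'   ],
    [ Reaction   => 'LoadReactionData'   ],
);

# Hypothetical helper: return the loader method names for one group
# parameter. '*' selects every group; an unknown name is a fatal error.
sub methods_for_group {
    my ($group) = @_;
    my @methods = map  { $_->[1] }
                  grep { $group eq '*' || $_->[0] eq $group } @GROUPS;
    die "Invalid group name $group.\n" unless @methods;
    return @methods;
}

# Print the methods that would be called for each command-line parameter.
# In the real script the loop body would instead be something like:
#     $spl->$_() for methods_for_group($group);
for my $group (@ARGV) {
    print "$group: ", join(', ', methods_for_group($group)), "\n";
}
```

Keeping the table as an ordered list rather than a hash preserves the load order of the original `if` sequence when `*` is given.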
