1 |
package ERDB; |
package ERDB; |
2 |
|
|
3 |
use strict; |
use strict; |
|
use Carp; |
|
4 |
use Tracer; |
use Tracer; |
5 |
use DBKernel; |
use DBrtns; |
6 |
use Data::Dumper; |
use Data::Dumper; |
7 |
use XML::Simple; |
use XML::Simple; |
8 |
use DBQuery; |
use DBQuery; |
9 |
use DBObject; |
use DBObject; |
10 |
use Stats; |
use Stats; |
11 |
use Time::HiRes qw(gettimeofday); |
use Time::HiRes qw(gettimeofday); |
12 |
|
use Digest::MD5 qw(md5_base64); |
13 |
|
use FIG; |
14 |
|
|
15 |
=head1 Entity-Relationship Database Package |
=head1 Entity-Relationship Database Package |
16 |
|
|
34 |
relation that contains two fields-- the feature ID (C<id>) and the alias name (C<alias>). |
relation that contains two fields-- the feature ID (C<id>) and the alias name (C<alias>). |
35 |
The B<FEATURE> entity also contains an optional virulence number. This is implemented |
The B<FEATURE> entity also contains an optional virulence number. This is implemented |
36 |
as a separate relation C<FeatureVirulence> which contains an ID (C<id>) and a virulence number |
as a separate relation C<FeatureVirulence> which contains an ID (C<id>) and a virulence number |
37 |
(C<virulence>). If the virulence of a feature I<ABC> is known to be 6, there will be one row in the |
(C<virulence>). If the virulence of a feature I<ABC> is known to be 6, there will be one row in |
38 |
C<FeatureVirulence> relation possessing the value I<ABC> as its ID and 6 as its virulence number. |
the C<FeatureVirulence> relation possessing the value I<ABC> as its ID and 6 as its virulence |
39 |
If the virulence of I<ABC> is not known, there will not be any rows for it in C<FeatureVirulence>. |
number. If the virulence of I<ABC> is not known, there will not be any rows for it in |
40 |
|
C<FeatureVirulence>. |
41 |
|
|
42 |
Entities are connected by binary relationships implemented using single relations possessing the |
Entities are connected by binary relationships implemented using single relations possessing the |
43 |
same name as the relationship itself and that has an I<arity> of 1-to-1 (C<11>), 1-to-many (C<1M>), |
same name as the relationship itself and that has an I<arity> of 1-to-1 (C<11>), 1-to-many (C<1M>), |
69 |
was inserted by the L</InsertObject> method. |
was inserted by the L</InsertObject> method. |
70 |
|
|
71 |
To facilitate testing, the ERDB module supports automatic generation of test data. This process |
To facilitate testing, the ERDB module supports automatic generation of test data. This process |
72 |
is described in the L</GenerateEntity> and L</GenerateConnection> methods. |
is described in the L</GenerateEntity> and L</GenerateConnection> methods, though it is not yet |
73 |
|
fully implemented. |
74 |
|
|
75 |
|
=head2 XML Database Description |
76 |
|
|
77 |
|
=head3 Data Types |
78 |
|
|
79 |
|
The ERDB system supports the following data types. Note that there are numerous string |
80 |
|
types depending on the maximum length. Some database packages limit the total number of |
81 |
|
characters you have in an index key; to insure the database works in all environments, |
82 |
|
the type of string should be the shortest one possible that supports all the known values. |
83 |
|
|
84 |
|
=over 4 |
85 |
|
|
86 |
|
=item char |
87 |
|
|
88 |
|
single ASCII character |
89 |
|
|
90 |
|
=item int |
91 |
|
|
92 |
|
32-bit signed integer |
93 |
|
|
94 |
|
=item counter |
95 |
|
|
96 |
|
32-bit unsigned integer |
97 |
|
|
98 |
|
=item date |
99 |
|
|
100 |
|
64-bit unsigned integer, representing a PERL date/time value |
101 |
|
|
102 |
|
=item text |
103 |
|
|
104 |
|
long string; Text fields cannot be used in indexes or sorting and do not support the |
105 |
|
normal syntax of filter clauses, but can be up to a billion character in length |
106 |
|
|
107 |
|
=item float |
108 |
|
|
109 |
|
double-precision floating-point number |
110 |
|
|
111 |
|
=item boolean |
112 |
|
|
113 |
|
single-bit numeric value; The value is stored as a 16-bit signed integer (for |
114 |
|
compatability with certain database packages), but the only values supported are |
115 |
|
0 and 1. |
116 |
|
|
117 |
|
=item id-string |
118 |
|
|
119 |
|
variable-length string, maximum 25 characters |
120 |
|
|
121 |
|
=item key-string |
122 |
|
|
123 |
|
variable-length string, maximum 40 characters |
124 |
|
|
125 |
|
=item name-string |
126 |
|
|
127 |
|
variable-length string, maximum 80 characters |
128 |
|
|
129 |
|
=item medium-string |
130 |
|
|
131 |
|
variable-length string, maximum 160 characters |
132 |
|
|
133 |
|
=item string |
134 |
|
|
135 |
|
variable-length string, maximum 255 characters |
136 |
|
|
137 |
|
=item hash-string |
138 |
|
|
139 |
|
variable-length string, maximum 22 characters |
140 |
|
|
141 |
|
=back |
142 |
|
|
143 |
|
The hash-string data type has a special meaning. The actual key passed into the loader will |
144 |
|
be a string, but it will be digested into a 22-character MD5 code to save space. Although the |
145 |
|
MD5 algorithm is not perfect, it is extremely unlikely two strings will have the same |
146 |
|
digest. Therefore, it is presumed the keys will be unique. When the database is actually |
147 |
|
in use, the hashed keys will be presented rather than the original values. For this reason, |
148 |
|
they should not be used for entities where the key is meaningful. |
149 |
|
|
150 |
|
=head3 Global Tags |
151 |
|
|
152 |
|
The entire database definition must be inside a B<Database> tag. The display name of |
153 |
|
the database is given by the text associated with the B<Title> tag. The display name |
154 |
|
is only used in the automated documentation. It has no other effect. The entities and |
155 |
|
relationships are listed inside the B<Entities> and B<Relationships> tags, |
156 |
|
respectively. None of these tags have attributes. |
157 |
|
|
158 |
|
<Database> |
159 |
|
<Title>... display title here...</Title> |
160 |
|
<Entities> |
161 |
|
... entity definitions here ... |
162 |
|
</Entities> |
163 |
|
<Relationships> |
164 |
|
... relationship definitions here... |
165 |
|
</Relationships> |
166 |
|
</Database> |
167 |
|
|
168 |
|
Entities, relationships, indexes, and fields all allow a text tag called B<Notes>. |
169 |
|
The text inside the B<Notes> tag contains comments that will appear when the database |
170 |
|
documentation is generated. Within a B<Notes> tag, you may use C<[i]> and C<[/i]> for |
171 |
|
italics, C<[b]> and C<[/b]> for bold, and C<[p]> for a new paragraph. |
172 |
|
|
173 |
|
=head3 Fields |
174 |
|
|
175 |
|
Both entities and relationships have fields described by B<Field> tags. A B<Field> |
176 |
|
tag can have B<Notes> associated with it. The complete set of B<Field> tags for an |
177 |
|
object mus be inside B<Fields> tags. |
178 |
|
|
179 |
|
<Entity ... > |
180 |
|
<Fields> |
181 |
|
... Field tags ... |
182 |
|
</Fields> |
183 |
|
</Entity> |
184 |
|
|
185 |
|
The attributes for the B<Field> tag are as follows. |
186 |
|
|
187 |
|
=over 4 |
188 |
|
|
189 |
|
=item name |
190 |
|
|
191 |
|
Name of the field. The field name should contain only letters, digits, and hyphens (C<->), |
192 |
|
and the first character should be a letter. Most underlying databases are case-insensitive |
193 |
|
with the respect to field names, so a best practice is to use lower-case letters only. Finally, |
194 |
|
the name C<search-relevance> has special meaning for full-text searches and should not be |
195 |
|
used as a field name. |
196 |
|
|
197 |
|
=item type |
198 |
|
|
199 |
|
Data type of the field. The legal data types are given above. |
200 |
|
|
201 |
|
=item relation |
202 |
|
|
203 |
|
Name of the relation containing the field. This should only be specified for entity |
204 |
|
fields. The ERDB system does not support optional fields or multi-occurring fields |
205 |
|
in the primary relation of an entity. Instead, they are put into secondary relations. |
206 |
|
So, for example, in the C<Genome> entity, the C<group-name> field indicates a special |
207 |
|
grouping used to select a subset of the genomes. A given genome may not be in any |
208 |
|
groups or may be in multiple groups. Therefore, C<group-name> specifies a relation |
209 |
|
value. The relation name specified must be a valid table name. By convention, it is |
210 |
|
usually the entity name followed by a qualifying word (e.g. C<GenomeGroup>). In an |
211 |
|
entity, the fields without a relation attribute are said to belong to the |
212 |
|
I<primary relation>. This relation has the same name as the entity itself. |
213 |
|
|
214 |
|
=item searchable |
215 |
|
|
216 |
|
If specified, then the field is a candidate for full-text searching. A single full-text |
217 |
|
index will be created for each relation with at least one searchable field in it. |
218 |
|
For best results, this option should only be used for string or text fields. |
219 |
|
|
220 |
|
=back |
221 |
|
|
222 |
|
=head3 Indexes |
223 |
|
|
224 |
|
An entity can have multiple alternate indexes associated with it. The fields must |
225 |
|
be from the primary relation. The alternate indexes assist in ordering results |
226 |
|
from a query. A relationship can have up to two indexes-- a I<to-index> and a |
227 |
|
I<from-index>. These order the results when crossing the relationship. For |
228 |
|
example, in the relationship C<HasContig> from C<Genome> to C<Contig>, the |
229 |
|
from-index would order the contigs of a ganome, and the to-index would order |
230 |
|
the genomes of a contig. A relationship's index must specify only fields in |
231 |
|
the relationship. |
232 |
|
|
233 |
|
The indexes for an entity must be listed inside the B<Indexes> tag. The from-index |
234 |
|
of a relationship is specified using the B<FromIndex> tag; the to-index is specified |
235 |
|
using the B<ToIndex> tag. |
236 |
|
|
237 |
|
Each index can contain a B<Notes> tag. In addition, it will have an B<IndexFields> |
238 |
|
tag containing the B<IndexField> tags. These specify, in order, the fields used in |
239 |
|
the index. The attributes of an B<IndexField> tag are as follows. |
240 |
|
|
241 |
|
=over 4 |
242 |
|
|
243 |
|
=item name |
244 |
|
|
245 |
|
Name of the field. |
246 |
|
|
247 |
|
=item order |
248 |
|
|
249 |
|
Sort order of the field-- C<ascending> or C<descending>. |
250 |
|
|
251 |
|
=back |
252 |
|
|
253 |
|
The B<Index>, B<FromIndex>, and B<ToIndex> tags themselves have no attributes. |
254 |
|
|
255 |
|
=head3 Object and Field Names |
256 |
|
|
257 |
|
By convention entity and relationship names use capital casing (e.g. C<Genome> or |
258 |
|
C<HasRegionsIn>. Most underlying databases, however, are aggressively case-insensitive |
259 |
|
with respect to relation names, converting them internally to all-upper case or |
260 |
|
all-lower case. |
261 |
|
|
262 |
|
If syntax or parsing errors occur when you try to load or use an ERDB database, the |
263 |
|
most likely reason is that one of your objects has an SQL reserved word as its name. |
264 |
|
The list of SQL reserved words keeps increasing; however, most are unlikely to show |
265 |
|
up as a noun or declarative verb phrase. The exceptions are C<Group>, C<User>, |
266 |
|
C<Table>, C<Index>, C<Object>, C<Date>, C<Number>, C<Update>, C<Time>, C<Percent>, |
267 |
|
C<Memo>, C<Order>, and C<Sum>. This problem can crop up in field names as well. |
268 |
|
|
269 |
|
Every entity has a field called C<id> that acts as its primary key. Every relationship |
270 |
|
has fields called C<from-link> and C<to-link> that contain copies of the relevant |
271 |
|
entity IDs. These are essentially ERDB's reserved words, and should not be used |
272 |
|
for user-defined field names. |
273 |
|
|
274 |
|
=head3 Entities |
275 |
|
|
276 |
|
An entity is described by the B<Entity> tag. The entity can contain B<Notes>, an |
277 |
|
B<Indexes> tag containing one or more secondary indexes, and a B<Fields> tag |
278 |
|
containing one or more fields. The attributes of the B<Entity> tag are as follows. |
279 |
|
|
280 |
|
=over 4 |
281 |
|
|
282 |
|
=item name |
283 |
|
|
284 |
|
Name of the entity. The entity name, by convention, uses capital casing (e.g. C<Genome> |
285 |
|
or C<GroupBlock>) and should be a noun or noun phrase. |
286 |
|
|
287 |
|
=item keyType |
288 |
|
|
289 |
|
Data type of the primary key. The primary key is always named C<id>. |
290 |
|
|
291 |
|
=back |
292 |
|
|
293 |
|
=head3 Relationships |
294 |
|
|
295 |
|
A relationship is described by the C<Relationship> tag. Within a relationship, |
296 |
|
there can be a C<Notes> tag, a C<Fields> tag containing the intersection data |
297 |
|
fields, a C<FromIndex> tag containing the from-index, and a C<ToIndex> tag containing |
298 |
|
the to-index. |
299 |
|
|
300 |
|
The C<Relationship> tag has the following attributes. |
301 |
|
|
302 |
|
=over 4 |
303 |
|
|
304 |
|
=item name |
305 |
|
|
306 |
|
Name of the relationship. The relationship name, by convention, uses capital casing |
307 |
|
(e.g. C<ContainsRegionIn> or C<HasContig>), and should be a declarative verb |
308 |
|
phrase, designed to fit between the from-entity and the to-entity (e.g. |
309 |
|
Block C<ContainsRegionIn> Genome). |
310 |
|
|
311 |
|
=item from |
312 |
|
|
313 |
|
Name of the entity from which the relationship starts. |
314 |
|
|
315 |
|
=item to |
316 |
|
|
317 |
|
Name of the entity to which the relationship proceeds. |
318 |
|
|
319 |
|
=item arity |
320 |
|
|
321 |
|
Relationship type: C<1M> for one-to-many and C<MM> for many-to-many. |
322 |
|
|
323 |
|
=back |
324 |
|
|
325 |
=cut |
=cut |
326 |
|
|
329 |
# Table of information about our datatypes. "sqlType" is the corresponding SQL datatype string. |
# Table of information about our datatypes. "sqlType" is the corresponding SQL datatype string. |
330 |
# "maxLen" is the maximum permissible length of the incoming string data used to populate a field |
# "maxLen" is the maximum permissible length of the incoming string data used to populate a field |
331 |
# of the specified type. "dataGen" is PERL string that will be evaluated if no test data generation |
# of the specified type. "dataGen" is PERL string that will be evaluated if no test data generation |
332 |
#string is specified in the field definition. |
# string is specified in the field definition. "avgLen" is the average byte length for estimating |
333 |
my %TypeTable = ( char => { sqlType => 'CHAR(1)', maxLen => 1, dataGen => "StringGen('A')" }, |
# record sizes. "sort" is the key modifier for the sort command. |
334 |
int => { sqlType => 'INTEGER', maxLen => 20, dataGen => "IntGen(0, 99999999)" }, |
my %TypeTable = ( char => { sqlType => 'CHAR(1)', maxLen => 1, avgLen => 1, sort => "", dataGen => "StringGen('A')" }, |
335 |
string => { sqlType => 'VARCHAR(255)', maxLen => 255, dataGen => "StringGen(IntGen(10,250))" }, |
int => { sqlType => 'INTEGER', maxLen => 20, avgLen => 4, sort => "n", dataGen => "IntGen(0, 99999999)" }, |
336 |
text => { sqlType => 'TEXT', maxLen => 1000000000, dataGen => "StringGen(IntGen(80,1000))" }, |
counter => { sqlType => 'INTEGER UNSIGNED', maxLen => 20, avgLen => 4, sort => "n", dataGen => "IntGen(0, 99999999)" }, |
337 |
date => { sqlType => 'BIGINT', maxLen => 80, dataGen => "DateGen(-7, 7, IntGen(0,1400))" }, |
string => { sqlType => 'VARCHAR(255)', maxLen => 255, avgLen => 100, sort => "", dataGen => "StringGen(IntGen(10,250))" }, |
338 |
float => { sqlType => 'DOUBLE PRECISION', maxLen => 40, dataGen => "FloatGen(0.0, 100.0)" }, |
text => { sqlType => 'TEXT', maxLen => 1000000000, avgLen => 500, sort => "", dataGen => "StringGen(IntGen(80,1000))" }, |
339 |
boolean => { sqlType => 'SMALLINT', maxLen => 1, dataGen => "IntGen(0, 1)" }, |
date => { sqlType => 'BIGINT', maxLen => 80, avgLen => 8, sort => "n", dataGen => "DateGen(-7, 7, IntGen(0,1400))" }, |
340 |
|
float => { sqlType => 'DOUBLE PRECISION', maxLen => 40, avgLen => 8, sort => "g", dataGen => "FloatGen(0.0, 100.0)" }, |
341 |
|
boolean => { sqlType => 'SMALLINT', maxLen => 1, avgLen => 1, sort => "n", dataGen => "IntGen(0, 1)" }, |
342 |
|
'hash-string' => |
343 |
|
{ sqlType => 'VARCHAR(22)', maxLen => 22, avgLen => 22, sort => "", dataGen => "SringGen(22)" }, |
344 |
|
'id-string' => |
345 |
|
{ sqlType => 'VARCHAR(25)', maxLen => 25, avgLen => 25, sort => "", dataGen => "SringGen(22)" }, |
346 |
'key-string' => |
'key-string' => |
347 |
{ sqlType => 'VARCHAR(40)', maxLen => 40, dataGen => "StringGen(IntGen(10,40))" }, |
{ sqlType => 'VARCHAR(40)', maxLen => 40, avgLen => 10, sort => "", dataGen => "StringGen(IntGen(10,40))" }, |
348 |
'name-string' => |
'name-string' => |
349 |
{ sqlType => 'VARCHAR(80)', maxLen => 80, dataGen => "StringGen(IntGen(10,80))" }, |
{ sqlType => 'VARCHAR(80)', maxLen => 80, avgLen => 40, sort => "", dataGen => "StringGen(IntGen(10,80))" }, |
350 |
'medium-string' => |
'medium-string' => |
351 |
{ sqlType => 'VARCHAR(160)', maxLen => 160, dataGen => "StringGen(IntGen(10,160))" }, |
{ sqlType => 'VARCHAR(160)', maxLen => 160, avgLen => 40, sort => "", dataGen => "StringGen(IntGen(10,160))" }, |
352 |
); |
); |
353 |
|
|
354 |
# Table translating arities into natural language. |
# Table translating arities into natural language. |
370 |
|
|
371 |
=head3 new |
=head3 new |
372 |
|
|
373 |
C<< my $database = ERDB::new($dbh, $metaFileName); >> |
C<< my $database = ERDB->new($dbh, $metaFileName); >> |
374 |
|
|
375 |
Create a new ERDB object. |
Create a new ERDB object. |
376 |
|
|
395 |
my $metaData = _LoadMetaData($metaFileName); |
my $metaData = _LoadMetaData($metaFileName); |
396 |
# Create the object. |
# Create the object. |
397 |
my $self = { _dbh => $dbh, |
my $self = { _dbh => $dbh, |
398 |
_metaData => $metaData, |
_metaData => $metaData |
|
_options => $options, |
|
399 |
}; |
}; |
400 |
# Bless and return it. |
# Bless and return it. |
401 |
bless $self; |
bless $self, $class; |
402 |
return $self; |
return $self; |
403 |
} |
} |
404 |
|
|
405 |
=head3 ShowMetaData |
=head3 ShowMetaData |
406 |
|
|
407 |
C<< $database->ShowMetaData($fileName); >> |
C<< $erdb->ShowMetaData($fileName); >> |
408 |
|
|
409 |
This method outputs a description of the database. This description can be used to help users create |
This method outputs a description of the database. This description can be used to help users create |
410 |
the data to be loaded into the relations. |
the data to be loaded into the relations. |
431 |
my $relationshipList = $metadata->{Relationships}; |
my $relationshipList = $metadata->{Relationships}; |
432 |
# Open the output file. |
# Open the output file. |
433 |
open(HTMLOUT, ">$filename") || Confess("Could not open MetaData display file $filename: $!"); |
open(HTMLOUT, ">$filename") || Confess("Could not open MetaData display file $filename: $!"); |
434 |
|
Trace("Building MetaData table of contents.") if T(4); |
435 |
# Write the HTML heading stuff. |
# Write the HTML heading stuff. |
436 |
print HTMLOUT "<html>\n<head>\n<title>$title</title>\n"; |
print HTMLOUT "<html>\n<head>\n<title>$title</title>\n"; |
437 |
print HTMLOUT "</head>\n<body>\n"; |
print HTMLOUT "</head>\n<body>\n"; |
438 |
|
# Write the documentation. |
439 |
|
print HTMLOUT $self->DisplayMetaData(); |
440 |
|
# Close the document. |
441 |
|
print HTMLOUT "</body>\n</html>\n"; |
442 |
|
# Close the file. |
443 |
|
close HTMLOUT; |
444 |
|
} |
445 |
|
|
446 |
|
=head3 DisplayMetaData |
447 |
|
|
448 |
|
C<< my $html = $erdb->DisplayMetaData(); >> |
449 |
|
|
450 |
|
Return an HTML description of the database. This description can be used to help users create |
451 |
|
the data to be loaded into the relations and form queries. The output is raw includable HTML |
452 |
|
without any HEAD or BODY tags. |
453 |
|
|
454 |
|
=over 4 |
455 |
|
|
456 |
|
=item filename |
457 |
|
|
458 |
|
The name of the output file. |
459 |
|
|
460 |
|
=back |
461 |
|
|
462 |
|
=cut |
463 |
|
|
464 |
|
sub DisplayMetaData { |
465 |
|
# Get the parameters. |
466 |
|
my ($self) = @_; |
467 |
|
# Get the metadata and the title string. |
468 |
|
my $metadata = $self->{_metaData}; |
469 |
|
# Get the title string. |
470 |
|
my $title = $metadata->{Title}; |
471 |
|
# Get the entity and relationship lists. |
472 |
|
my $entityList = $metadata->{Entities}; |
473 |
|
my $relationshipList = $metadata->{Relationships}; |
474 |
|
# Declare the return variable. |
475 |
|
my $retVal = ""; |
476 |
|
# Open the output file. |
477 |
|
Trace("Building MetaData table of contents.") if T(4); |
478 |
# Here we do the table of contents. It starts as an unordered list of section names. Each |
# Here we do the table of contents. It starts as an unordered list of section names. Each |
479 |
# section contains an ordered list of entity or relationship subsections. |
# section contains an ordered list of entity or relationship subsections. |
480 |
print HTMLOUT "<ul>\n<li><a href=\"#EntitiesSection\">Entities</a>\n<ol>\n"; |
$retVal .= "<ul>\n<li><a href=\"#EntitiesSection\">Entities</a>\n<ol>\n"; |
481 |
# Loop through the Entities, displaying a list item for each. |
# Loop through the Entities, displaying a list item for each. |
482 |
foreach my $key (sort keys %{$entityList}) { |
foreach my $key (sort keys %{$entityList}) { |
483 |
# Display this item. |
# Display this item. |
484 |
print HTMLOUT "<li><a href=\"#$key\">$key</a></li>\n"; |
$retVal .= "<li><a href=\"#$key\">$key</a></li>\n"; |
485 |
} |
} |
486 |
# Close off the entity section and start the relationship section. |
# Close off the entity section and start the relationship section. |
487 |
print HTMLOUT "</ol></li>\n<li><a href=\"#RelationshipsSection\">Relationships</a>\n<ol>\n"; |
$retVal .= "</ol></li>\n<li><a href=\"#RelationshipsSection\">Relationships</a>\n<ol>\n"; |
488 |
# Loop through the Relationships. |
# Loop through the Relationships. |
489 |
foreach my $key (sort keys %{$relationshipList}) { |
foreach my $key (sort keys %{$relationshipList}) { |
490 |
# Display this item. |
# Display this item. |
491 |
my $relationshipTitle = _ComputeRelationshipSentence($key, $relationshipList->{$key}); |
my $relationshipTitle = _ComputeRelationshipSentence($key, $relationshipList->{$key}); |
492 |
print HTMLOUT "<li><a href=\"#$key\">$relationshipTitle</a></li>\n"; |
$retVal .= "<li><a href=\"#$key\">$relationshipTitle</a></li>\n"; |
493 |
} |
} |
494 |
# Close off the relationship section and list the join table section. |
# Close off the relationship section and list the join table section. |
495 |
print HTMLOUT "</ol></li>\n<li><a href=\"#JoinTable\">Join Table</a></li>\n"; |
$retVal .= "</ol></li>\n<li><a href=\"#JoinTable\">Join Table</a></li>\n"; |
496 |
# Close off the table of contents itself. |
# Close off the table of contents itself. |
497 |
print HTMLOUT "</ul>\n"; |
$retVal .= "</ul>\n"; |
498 |
# Now we start with the actual data. Denote we're starting the entity section. |
# Now we start with the actual data. Denote we're starting the entity section. |
499 |
print HTMLOUT "<a name=\"EntitiesSection\"></a><h2>Entities</h2>\n"; |
$retVal .= "<a name=\"EntitiesSection\"></a><h2>Entities</h2>\n"; |
500 |
# Loop through the entities. |
# Loop through the entities. |
501 |
for my $key (sort keys %{$entityList}) { |
for my $key (sort keys %{$entityList}) { |
502 |
|
Trace("Building MetaData entry for $key entity.") if T(4); |
503 |
# Create the entity header. It contains a bookmark and the entity name. |
# Create the entity header. It contains a bookmark and the entity name. |
504 |
print HTMLOUT "<a name=\"$key\"></a><h3>$key</h3>\n"; |
$retVal .= "<a name=\"$key\"></a><h3>$key</h3>\n"; |
505 |
# Get the entity data. |
# Get the entity data. |
506 |
my $entityData = $entityList->{$key}; |
my $entityData = $entityList->{$key}; |
507 |
# If there's descriptive text, display it. |
# If there's descriptive text, display it. |
508 |
if (my $notes = $entityData->{Notes}) { |
if (my $notes = $entityData->{Notes}) { |
509 |
print HTMLOUT "<p>" . _HTMLNote($notes->{content}) . "</p>\n"; |
$retVal .= "<p>" . _HTMLNote($notes->{content}) . "</p>\n"; |
510 |
} |
} |
511 |
# Now we want a list of the entity's relationships. First, we set up the relationship subsection. |
# Now we want a list of the entity's relationships. First, we set up the relationship subsection. |
512 |
print HTMLOUT "<h4>Relationships for <b>$key</b></h4>\n<ul>\n"; |
$retVal .= "<h4>Relationships for <b>$key</b></h4>\n<ul>\n"; |
513 |
# Loop through the relationships. |
# Loop through the relationships. |
514 |
for my $relationship (sort keys %{$relationshipList}) { |
for my $relationship (sort keys %{$relationshipList}) { |
515 |
# Get the relationship data. |
# Get the relationship data. |
519 |
# Get the relationship sentence and append the arity. |
# Get the relationship sentence and append the arity. |
520 |
my $relationshipDescription = _ComputeRelationshipSentence($relationship, $relationshipStructure); |
my $relationshipDescription = _ComputeRelationshipSentence($relationship, $relationshipStructure); |
521 |
# Display the relationship data. |
# Display the relationship data. |
522 |
print HTMLOUT "<li><a href=\"#$relationship\">$relationshipDescription</a></li>\n"; |
$retVal .= "<li><a href=\"#$relationship\">$relationshipDescription</a></li>\n"; |
523 |
} |
} |
524 |
} |
} |
525 |
# Close off the relationship list. |
# Close off the relationship list. |
526 |
print HTMLOUT "</ul>\n"; |
$retVal .= "</ul>\n"; |
527 |
# Get the entity's relations. |
# Get the entity's relations. |
528 |
my $relationList = $entityData->{Relations}; |
my $relationList = $entityData->{Relations}; |
529 |
# Create a header for the relation subsection. |
# Create a header for the relation subsection. |
530 |
print HTMLOUT "<h4>Relations for <b>$key</b></h4>\n"; |
$retVal .= "<h4>Relations for <b>$key</b></h4>\n"; |
531 |
# Loop through the relations, displaying them. |
# Loop through the relations, displaying them. |
532 |
for my $relation (sort keys %{$relationList}) { |
for my $relation (sort keys %{$relationList}) { |
533 |
my $htmlString = _ShowRelationTable($relation, $relationList->{$relation}); |
my $htmlString = _ShowRelationTable($relation, $relationList->{$relation}); |
534 |
print HTMLOUT $htmlString; |
$retVal .= $htmlString; |
535 |
} |
} |
536 |
} |
} |
537 |
# Denote we're starting the relationship section. |
# Denote we're starting the relationship section. |
538 |
print HTMLOUT "<a name=\"RelationshipsSection\"></a><h2>Relationships</h2>\n"; |
$retVal .= "<a name=\"RelationshipsSection\"></a><h2>Relationships</h2>\n"; |
539 |
# Loop through the relationships. |
# Loop through the relationships. |
540 |
for my $key (sort keys %{$relationshipList}) { |
for my $key (sort keys %{$relationshipList}) { |
541 |
|
Trace("Building MetaData entry for $key relationship.") if T(4); |
542 |
# Get the relationship's structure. |
# Get the relationship's structure. |
543 |
my $relationshipStructure = $relationshipList->{$key}; |
my $relationshipStructure = $relationshipList->{$key}; |
544 |
# Create the relationship header. |
# Create the relationship header. |
545 |
my $headerText = _ComputeRelationshipHeading($key, $relationshipStructure); |
my $headerText = _ComputeRelationshipHeading($key, $relationshipStructure); |
546 |
print HTMLOUT "<h3><a name=\"$key\"></a>$headerText</h3>\n"; |
$retVal .= "<h3><a name=\"$key\"></a>$headerText</h3>\n"; |
547 |
# Get the entity names. |
# Get the entity names. |
548 |
my $fromEntity = $relationshipStructure->{from}; |
my $fromEntity = $relationshipStructure->{from}; |
549 |
my $toEntity = $relationshipStructure->{to}; |
my $toEntity = $relationshipStructure->{to}; |
553 |
# since both sentences will say the same thing. |
# since both sentences will say the same thing. |
554 |
my $arity = $relationshipStructure->{arity}; |
my $arity = $relationshipStructure->{arity}; |
555 |
if ($arity eq "11") { |
if ($arity eq "11") { |
556 |
print HTMLOUT "<p>Each <b>$fromEntity</b> relates to at most one <b>$toEntity</b>.\n"; |
$retVal .= "<p>Each <b>$fromEntity</b> relates to at most one <b>$toEntity</b>.\n"; |
557 |
} else { |
} else { |
558 |
print HTMLOUT "<p>Each <b>$fromEntity</b> relates to multiple <b>$toEntity</b>s.\n"; |
$retVal .= "<p>Each <b>$fromEntity</b> relates to multiple <b>$toEntity</b>s.\n"; |
559 |
if ($arity eq "MM" && $fromEntity ne $toEntity) { |
if ($arity eq "MM" && $fromEntity ne $toEntity) { |
560 |
print HTMLOUT "Each <b>$toEntity</b> relates to multiple <b>$fromEntity</b>s.\n"; |
$retVal .= "Each <b>$toEntity</b> relates to multiple <b>$fromEntity</b>s.\n"; |
561 |
} |
} |
562 |
} |
} |
563 |
print HTMLOUT "</p>\n"; |
$retVal .= "</p>\n"; |
564 |
# If there are notes on this relationship, display them. |
# If there are notes on this relationship, display them. |
565 |
if (my $notes = $relationshipStructure->{Notes}) { |
if (my $notes = $relationshipStructure->{Notes}) { |
566 |
print HTMLOUT "<p>" . _HTMLNote($notes->{content}) . "</p>\n"; |
$retVal .= "<p>" . _HTMLNote($notes->{content}) . "</p>\n"; |
567 |
} |
} |
568 |
# Generate the relationship's relation table. |
# Generate the relationship's relation table. |
569 |
my $htmlString = _ShowRelationTable($key, $relationshipStructure->{Relations}->{$key}); |
my $htmlString = _ShowRelationTable($key, $relationshipStructure->{Relations}->{$key}); |
570 |
print HTMLOUT $htmlString; |
$retVal .= $htmlString; |
571 |
} |
} |
572 |
|
Trace("Building MetaData join table.") if T(4); |
573 |
# Denote we're starting the join table. |
# Denote we're starting the join table. |
574 |
print HTMLOUT "<a name=\"JoinTable\"></a><h3>Join Table</h3>\n"; |
$retVal .= "<a name=\"JoinTable\"></a><h3>Join Table</h3>\n"; |
575 |
# Create a table header. |
# Create a table header. |
576 |
print HTMLOUT _OpenTable("Join Table", "Source", "Target", "Join Condition"); |
$retVal .= _OpenTable("Join Table", "Source", "Target", "Join Condition"); |
577 |
# Loop through the joins. |
# Loop through the joins. |
578 |
my $joinTable = $metadata->{Joins}; |
my $joinTable = $metadata->{Joins}; |
579 |
for my $joinKey (sort keys %{$joinTable}) { |
my @joinKeys = keys %{$joinTable}; |
580 |
|
for my $joinKey (sort @joinKeys) { |
581 |
# Separate out the source, the target, and the join clause. |
# Separate out the source, the target, and the join clause. |
582 |
$joinKey =~ m!([^/]*)/(.*)$!; |
$joinKey =~ m!^([^/]+)/(.+)$!; |
583 |
my ($source, $target, $clause) = ($self->ComputeObjectSentence($1), |
my ($sourceRelation, $targetRelation) = ($1, $2); |
584 |
$self->ComputeObjectSentence($2), |
Trace("Join with key $joinKey is from $sourceRelation to $targetRelation.") if T(Joins => 4); |
585 |
$joinTable->{$joinKey}); |
my $source = $self->ComputeObjectSentence($sourceRelation); |
586 |
|
my $target = $self->ComputeObjectSentence($targetRelation); |
587 |
|
my $clause = $joinTable->{$joinKey}; |
588 |
# Display them in a table row. |
# Display them in a table row. |
589 |
print HTMLOUT "<tr><td>$source</td><td>$target</td><td>$clause</td></tr>\n"; |
$retVal .= "<tr><td>$source</td><td>$target</td><td>$clause</td></tr>\n"; |
590 |
} |
} |
591 |
# Close the table. |
# Close the table. |
592 |
print HTMLOUT _CloseTable(); |
$retVal .= _CloseTable(); |
593 |
# Close the document. |
Trace("Built MetaData HTML.") if T(3); |
594 |
print HTMLOUT "</body>\n</html>\n"; |
# Return the HTML. |
595 |
# Close the file. |
return $retVal; |
|
close HTMLOUT; |
|
596 |
} |
} |
597 |
|
|
598 |
=head3 DumpMetaData |
=head3 DumpMetaData |
599 |
|
|
600 |
C<< $database->DumpMetaData(); >> |
C<< $erdb->DumpMetaData(); >> |
601 |
|
|
602 |
Return a dump of the metadata structure. |
Return a dump of the metadata structure. |
603 |
|
|
612 |
|
|
613 |
=head3 CreateTables |
=head3 CreateTables |
614 |
|
|
615 |
C<< $datanase->CreateTables(); >> |
C<< $erdb->CreateTables(); >> |
616 |
|
|
617 |
This method creates the tables for the database from the metadata structure loaded by the |
This method creates the tables for the database from the metadata structure loaded by the |
618 |
constructor. It is expected this function will only be used on rare occasions, when the |
constructor. It is expected this function will only be used on rare occasions, when the |
624 |
sub CreateTables { |
sub CreateTables { |
625 |
# Get the parameters. |
# Get the parameters. |
626 |
my ($self) = @_; |
my ($self) = @_; |
627 |
my $metadata = $self->{_metaData}; |
# Get the relation names. |
628 |
my $dbh = $self->{_dbh}; |
my @relNames = $self->GetTableNames(); |
629 |
# Loop through the entities. |
# Loop through the relations. |
630 |
while (my ($entityName, $entityData) = each %{$metadata->{Entities}}) { |
for my $relationName (@relNames) { |
|
# Tell the user what we're doing. |
|
|
Trace("Creating relations for entity $entityName.") if T(1); |
|
|
# Loop through the entity's relations. |
|
|
for my $relationName (keys %{$entityData->{Relations}}) { |
|
631 |
# Create a table for this relation. |
# Create a table for this relation. |
632 |
$self->CreateTable($relationName); |
$self->CreateTable($relationName); |
633 |
Trace("Relation $relationName created.") if T(1); |
Trace("Relation $relationName created.") if T(2); |
|
} |
|
|
} |
|
|
# Loop through the relationships. |
|
|
my $relationshipTable = $metadata->{Relationships}; |
|
|
for my $relationshipName (keys %{$metadata->{Relationships}}) { |
|
|
# Create a table for this relationship. |
|
|
Trace("Creating relationship $relationshipName.") if T(1); |
|
|
$self->CreateTable($relationshipName); |
|
634 |
} |
} |
635 |
} |
} |
636 |
|
|
637 |
=head3 CreateTable |
=head3 CreateTable |
638 |
|
|
639 |
C<< $database->CreateTable($tableName, $indexFlag); >> |
C<< $erdb->CreateTable($tableName, $indexFlag, $estimatedRows); >> |
640 |
|
|
641 |
Create the table for a relation and optionally create its indexes. |
Create the table for a relation and optionally create its indexes. |
642 |
|
|
646 |
|
|
647 |
Name of the relation (which will also be the table name). |
Name of the relation (which will also be the table name). |
648 |
|
|
649 |
=item $indexFlag |
=item indexFlag |
650 |
|
|
651 |
TRUE if the indexes for the relation should be created, else FALSE. If FALSE, |
TRUE if the indexes for the relation should be created, else FALSE. If FALSE, |
652 |
L</CreateIndexes> must be called later to bring the indexes into existence. |
L</CreateIndexes> must be called later to bring the indexes into existence. |
653 |
|
|
654 |
|
=item estimatedRows (optional) |
655 |
|
|
656 |
|
If specified, the estimated maximum number of rows for the relation. This |
657 |
|
information allows the creation of tables using storage engines that are |
658 |
|
faster but require size estimates, such as MyISAM. |
659 |
|
|
660 |
=back |
=back |
661 |
|
|
662 |
=cut |
=cut |
663 |
|
|
664 |
sub CreateTable { |
sub CreateTable { |
665 |
# Get the parameters. |
# Get the parameters. |
666 |
my ($self, $relationName, $indexFlag) = @_; |
my ($self, $relationName, $indexFlag, $estimatedRows) = @_; |
667 |
# Get the database handle. |
# Get the database handle. |
668 |
my $dbh = $self->{_dbh}; |
my $dbh = $self->{_dbh}; |
669 |
# Get the relation data and determine whether or not the relation is primary. |
# Get the relation data and determine whether or not the relation is primary. |
687 |
# Insure the table is not already there. |
# Insure the table is not already there. |
688 |
$dbh->drop_table(tbl => $relationName); |
$dbh->drop_table(tbl => $relationName); |
689 |
Trace("Table $relationName dropped.") if T(2); |
Trace("Table $relationName dropped.") if T(2); |
690 |
|
# If there are estimated rows, create an estimate so we can take advantage of |
691 |
|
# faster DB technologies. |
692 |
|
my $estimation = undef; |
693 |
|
if ($estimatedRows) { |
694 |
|
$estimation = [$self->EstimateRowSize($relationName), $estimatedRows]; |
695 |
|
} |
696 |
# Create the table. |
# Create the table. |
697 |
Trace("Creating table $relationName: $fieldThing") if T(2); |
Trace("Creating table $relationName: $fieldThing") if T(2); |
698 |
$dbh->create_table(tbl => $relationName, flds => $fieldThing); |
$dbh->create_table(tbl => $relationName, flds => $fieldThing, estimates => $estimation); |
699 |
Trace("Relation $relationName created in database.") if T(2); |
Trace("Relation $relationName created in database.") if T(2); |
700 |
# If we want to build the indexes, we do it here. |
# If we want to build the indexes, we do it here. Note that the full-text search |
701 |
|
# index will not be built until the table has been loaded. |
702 |
if ($indexFlag) { |
if ($indexFlag) { |
703 |
$self->CreateIndex($relationName); |
$self->CreateIndex($relationName); |
704 |
} |
} |
705 |
} |
} |
706 |
|
|
707 |
|
=head3 VerifyFields |
708 |
|
|
709 |
|
C<< my $count = $erdb->VerifyFields($relName, \@fieldList); >> |
710 |
|
|
711 |
|
Run through the list of proposed field values, insuring that all the character fields are |
712 |
|
below the maximum length. If any fields are too long, they will be truncated in place. |
713 |
|
|
714 |
|
=over 4 |
715 |
|
|
716 |
|
=item relName |
717 |
|
|
718 |
|
Name of the relation for which the specified fields are destined. |
719 |
|
|
720 |
|
=item fieldList |
721 |
|
|
722 |
|
Reference to a list, in order, of the fields to be put into the relation. |
723 |
|
|
724 |
|
=item RETURN |
725 |
|
|
726 |
|
Returns the number of fields truncated. |
727 |
|
|
728 |
|
=back |
729 |
|
|
730 |
|
=cut |
731 |
|
|
732 |
|
sub VerifyFields { |
733 |
|
# Get the parameters. |
734 |
|
my ($self, $relName, $fieldList) = @_; |
735 |
|
# Initialize the return value. |
736 |
|
my $retVal = 0; |
737 |
|
# Get the relation definition. |
738 |
|
my $relData = $self->_FindRelation($relName); |
739 |
|
# Get the list of field descriptors. |
740 |
|
my $fieldTypes = $relData->{Fields}; |
741 |
|
my $fieldCount = scalar @{$fieldTypes}; |
742 |
|
# Loop through the two lists. |
743 |
|
for (my $i = 0; $i < $fieldCount; $i++) { |
744 |
|
# Get the type of the current field. |
745 |
|
my $fieldType = $fieldTypes->[$i]->{type}; |
746 |
|
# If it's a character field, verify the length. |
747 |
|
if ($fieldType =~ /string/) { |
748 |
|
my $maxLen = $TypeTable{$fieldType}->{maxLen}; |
749 |
|
my $oldString = $fieldList->[$i]; |
750 |
|
if (length($oldString) > $maxLen) { |
751 |
|
# Here it's too big, so we truncate it. |
752 |
|
Trace("Truncating field $i in relation $relName to $maxLen characters from \"$oldString\".") if T(1); |
753 |
|
$fieldList->[$i] = substr $oldString, 0, $maxLen; |
754 |
|
$retVal++; |
755 |
|
} |
756 |
|
} |
757 |
|
} |
758 |
|
# Return the truncation count. |
759 |
|
return $retVal; |
760 |
|
} |
761 |
|
|
762 |
|
=head3 DigestFields |
763 |
|
|
764 |
|
C<< $erdb->DigestFields($relName, $fieldList); >> |
765 |
|
|
766 |
|
Digest the strings in the field list that correspond to data type C<hash-string> in the |
767 |
|
specified relation. |
768 |
|
|
769 |
|
=over 4 |
770 |
|
|
771 |
|
=item relName |
772 |
|
|
773 |
|
Name of the relation to which the fields belong. |
774 |
|
|
775 |
|
=item fieldList |
776 |
|
|
777 |
|
List of field contents to be loaded into the relation. |
778 |
|
|
779 |
|
=back |
780 |
|
|
781 |
|
=cut |
782 |
|
#: Return Type ; |
783 |
|
sub DigestFields { |
784 |
|
# Get the parameters. |
785 |
|
my ($self, $relName, $fieldList) = @_; |
786 |
|
# Get the relation definition. |
787 |
|
my $relData = $self->_FindRelation($relName); |
788 |
|
# Get the list of field descriptors. |
789 |
|
my $fieldTypes = $relData->{Fields}; |
790 |
|
my $fieldCount = scalar @{$fieldTypes}; |
791 |
|
# Loop through the two lists. |
792 |
|
for (my $i = 0; $i < $fieldCount; $i++) { |
793 |
|
# Get the type of the current field. |
794 |
|
my $fieldType = $fieldTypes->[$i]->{type}; |
795 |
|
# If it's a hash string, digest it in place. |
796 |
|
if ($fieldType eq 'hash-string') { |
797 |
|
$fieldList->[$i] = $self->DigestKey($fieldList->[$i]); |
798 |
|
} |
799 |
|
} |
800 |
|
} |
801 |
|
|
802 |
|
=head3 DigestKey |
803 |
|
|
804 |
|
C<< my $digested = $erdb->DigestKey($keyValue); >> |
805 |
|
|
806 |
|
Return the digested value of a symbolic key. The digested value can then be plugged into a |
807 |
|
key-based search into a table with key-type hash-string. |
808 |
|
|
809 |
|
Currently the digesting process is independent of the database structure, but that may not |
810 |
|
always be the case, so this is an instance method instead of a static method. |
811 |
|
|
812 |
|
=over 4 |
813 |
|
|
814 |
|
=item keyValue |
815 |
|
|
816 |
|
Key value to digest. |
817 |
|
|
818 |
|
=item RETURN |
819 |
|
|
820 |
|
Digested value of the key. |
821 |
|
|
822 |
|
=back |
823 |
|
|
824 |
|
=cut |
825 |
|
|
826 |
|
sub DigestKey { |
827 |
|
# Get the parameters. |
828 |
|
my ($self, $keyValue) = @_; |
829 |
|
# Compute the digest. |
830 |
|
my $retVal = md5_base64($keyValue); |
831 |
|
# Return the result. |
832 |
|
return $retVal; |
833 |
|
} |
834 |
|
|
835 |
=head3 CreateIndex |
=head3 CreateIndex |
836 |
|
|
837 |
C<< $database->CreateIndex($relationName); >> |
C<< $erdb->CreateIndex($relationName); >> |
838 |
|
|
839 |
Create the indexes for a relation. If a table is being loaded from a large source file (as |
Create the indexes for a relation. If a table is being loaded from a large source file (as |
840 |
is the case in L</LoadTable>), it is best to create the indexes after the load. If that is |
is the case in L</LoadTable>), it is sometimes best to create the indexes after the load. |
841 |
the case, then L</CreateTable> should be called with the index flag set to FALSE, and this |
If that is the case, then L</CreateTable> should be called with the index flag set to |
842 |
method used after the load to create the indexes for the table. |
FALSE, and this method used after the load to create the indexes for the table. |
843 |
|
|
844 |
=cut |
=cut |
845 |
|
|
851 |
# Get the database handle. |
# Get the database handle. |
852 |
my $dbh = $self->{_dbh}; |
my $dbh = $self->{_dbh}; |
853 |
# Now we need to create this relation's indexes. We do this by looping through its index table. |
# Now we need to create this relation's indexes. We do this by looping through its index table. |
854 |
while (my ($indexName, $indexData) = each %{$relationData->{Indexes}}) { |
my $indexHash = $relationData->{Indexes}; |
855 |
|
for my $indexName (keys %{$indexHash}) { |
856 |
|
my $indexData = $indexHash->{$indexName}; |
857 |
# Get the index's field list. |
# Get the index's field list. |
858 |
my @fieldList = _FixNames(@{$indexData->{IndexFields}}); |
my @fieldList = _FixNames(@{$indexData->{IndexFields}}); |
859 |
my $flds = join(', ', @fieldList); |
my $flds = join(', ', @fieldList); |
860 |
# Get the index's uniqueness flag. |
# Get the index's uniqueness flag. |
861 |
my $unique = (exists $indexData->{Unique} ? $indexData->{Unique} : 'false'); |
my $unique = (exists $indexData->{Unique} ? 'unique' : undef); |
862 |
# Create the index. |
# Create the index. |
863 |
$dbh->create_index(idx => $indexName, tbl => $relationName, flds => $flds, unique => $unique); |
my $rv = $dbh->create_index(idx => $indexName, tbl => $relationName, |
864 |
|
flds => $flds, kind => $unique); |
865 |
|
if ($rv) { |
866 |
Trace("Index created: $indexName for $relationName ($flds)") if T(1); |
Trace("Index created: $indexName for $relationName ($flds)") if T(1); |
867 |
|
} else { |
868 |
|
Confess("Error creating index $indexName for $relationName using ($flds): " . $dbh->error_message()); |
869 |
|
} |
870 |
} |
} |
871 |
} |
} |
872 |
|
|
873 |
=head3 LoadTables |
=head3 LoadTables |
874 |
|
|
875 |
C<< my $stats = $database->LoadTables($directoryName, $rebuild); >> |
C<< my $stats = $erdb->LoadTables($directoryName, $rebuild); >> |
876 |
|
|
877 |
This method will load the database tables from a directory. The tables must already have been created |
This method will load the database tables from a directory. The tables must already have been created |
878 |
in the database. (This can be done by calling L</CreateTables>.) The caller passes in a directory name; |
in the database. (This can be done by calling L</CreateTables>.) The caller passes in a directory name; |
915 |
$directoryName =~ s!/\\$!!; |
$directoryName =~ s!/\\$!!; |
916 |
# Declare the return variable. |
# Declare the return variable. |
917 |
my $retVal = Stats->new(); |
my $retVal = Stats->new(); |
918 |
# Get the metadata structure. |
# Get the relation names. |
919 |
my $metaData = $self->{_metaData}; |
my @relNames = $self->GetTableNames(); |
920 |
# Loop through the entities. |
for my $relationName (@relNames) { |
|
for my $entity (values %{$metaData->{Entities}}) { |
|
|
# Loop through the entity's relations. |
|
|
for my $relationName (keys %{$entity->{Relations}}) { |
|
921 |
# Try to load this relation. |
# Try to load this relation. |
922 |
my $result = $self->_LoadRelation($directoryName, $relationName, $rebuild); |
my $result = $self->_LoadRelation($directoryName, $relationName, $rebuild); |
923 |
# Accumulate the statistics. |
# Accumulate the statistics. |
924 |
$retVal->Accumulate($result); |
$retVal->Accumulate($result); |
925 |
} |
} |
|
} |
|
|
# Loop through the relationships. |
|
|
for my $relationshipName (keys %{$metaData->{Relationships}}) { |
|
|
# Try to load this relationship's relation. |
|
|
my $result = $self->_LoadRelation($directoryName, $relationshipName, $rebuild); |
|
|
# Accumulate the statistics. |
|
|
$retVal->Accumulate($result); |
|
|
} |
|
926 |
# Add the duration of the load to the statistical object. |
# Add the duration of the load to the statistical object. |
927 |
$retVal->Add('duration', gettimeofday - $startTime); |
$retVal->Add('duration', gettimeofday - $startTime); |
928 |
# Return the accumulated statistics. |
# Return the accumulated statistics. |
929 |
return $retVal; |
return $retVal; |
930 |
} |
} |
931 |
|
|
932 |
|
|
933 |
=head3 GetTableNames |
=head3 GetTableNames |
934 |
|
|
935 |
C<< my @names = $database->GetTableNames; >> |
C<< my @names = $erdb->GetTableNames; >> |
936 |
|
|
937 |
Return a list of the relations required to implement this database. |
Return a list of the relations required to implement this database. |
938 |
|
|
949 |
|
|
950 |
=head3 GetEntityTypes |
=head3 GetEntityTypes |
951 |
|
|
952 |
C<< my @names = $database->GetEntityTypes; >> |
C<< my @names = $erdb->GetEntityTypes; >> |
953 |
|
|
954 |
Return a list of the entity type names. |
Return a list of the entity type names. |
955 |
|
|
964 |
return sort keys %{$entityList}; |
return sort keys %{$entityList}; |
965 |
} |
} |
966 |
|
|
967 |
|
=head3 IsEntity |
968 |
|
|
969 |
|
C<< my $flag = $erdb->IsEntity($entityName); >> |
970 |
|
|
971 |
|
Return TRUE if the parameter is an entity name, else FALSE. |
972 |
|
|
973 |
|
=over 4 |
974 |
|
|
975 |
|
=item entityName |
976 |
|
|
977 |
|
Object name to be tested. |
978 |
|
|
979 |
|
=item RETURN |
980 |
|
|
981 |
|
Returns TRUE if the specified string is an entity name, else FALSE. |
982 |
|
|
983 |
|
=back |
984 |
|
|
985 |
|
=cut |
986 |
|
|
987 |
|
sub IsEntity { |
988 |
|
# Get the parameters. |
989 |
|
my ($self, $entityName) = @_; |
990 |
|
# Test to see if it's an entity. |
991 |
|
return exists $self->{_metaData}->{Entities}->{$entityName}; |
992 |
|
} |
993 |
|
|
994 |
=head3 Get |
=head3 Get |
995 |
|
|
996 |
C<< my $query = $database->Get(\@objectNames, $filterClause, $param1, $param2, ..., $paramN); >> |
C<< my $query = $erdb->Get(\@objectNames, $filterClause, \@params); >> |
997 |
|
|
998 |
This method returns a query object for entities of a specified type using a specified filter. |
This method returns a query object for entities of a specified type using a specified filter. |
999 |
The filter is a standard WHERE/ORDER BY clause with question marks as parameter markers and each |
The filter is a standard WHERE/ORDER BY clause with question marks as parameter markers and each |
1001 |
following call requests all B<Genome> objects for the genus specified in the variable |
following call requests all B<Genome> objects for the genus specified in the variable |
1002 |
$genus. |
$genus. |
1003 |
|
|
1004 |
C<< $query = $sprout->Get(['Genome'], "Genome(genus) = ?", $genus); >> |
C<< $query = $erdb->Get(['Genome'], "Genome(genus) = ?", [$genus]); >> |
1005 |
|
|
1006 |
The WHERE clause contains a single question mark, so there is a single additional |
The WHERE clause contains a single question mark, so there is a single additional |
1007 |
parameter representing the parameter value. It would also be possible to code |
parameter representing the parameter value. It would also be possible to code |
1008 |
|
|
1009 |
C<< $query = $sprout->Get(['Genome'], "Genome(genus) = \'$genus\'"); >> |
C<< $query = $erdb->Get(['Genome'], "Genome(genus) = \'$genus\'"); >> |
1010 |
|
|
1011 |
however, this version of the call would generate a syntax error if there were any quote |
however, this version of the call would generate a syntax error if there were any quote |
1012 |
characters inside the variable C<$genus>. |
characters inside the variable C<$genus>. |
1018 |
It is possible to specify multiple entity and relationship names in order to retrieve more than |
It is possible to specify multiple entity and relationship names in order to retrieve more than |
1019 |
one object's data at the same time, which allows highly complex joined queries. For example, |
one object's data at the same time, which allows highly complex joined queries. For example, |
1020 |
|
|
1021 |
C<< $query = $sprout->Get(['Genome', 'ComesFrom', 'Source'], "Genome(genus) = ?", $genus); >> |
C<< $query = $erdb->Get(['Genome', 'ComesFrom', 'Source'], "Genome(genus) = ?", [$genus]); >> |
1022 |
|
|
1023 |
If multiple names are specified, then the query processor will automatically determine a |
If multiple names are specified, then the query processor will automatically determine a |
1024 |
join path between the entities and relationships. The algorithm used is very simplistic. |
join path between the entities and relationships. The algorithm used is very simplistic. |
1025 |
In particular, you can't specify any entity or relationship more than once, and if a |
In particular, if a relationship is recursive, the path is determined by the order in which |
1026 |
relationship is recursive, the path is determined by the order in which the entity |
the entity and the relationship appear. For example, consider a recursive relationship |
1027 |
and the relationship appear. For example, consider a recursive relationship B<IsParentOf> |
B<IsParentOf> which relates B<People> objects to other B<People> objects. If the join path is |
|
which relates B<People> objects to other B<People> objects. If the join path is |
|
1028 |
coded as C<['People', 'IsParentOf']>, then the people returned will be parents. If, however, |
coded as C<['People', 'IsParentOf']>, then the people returned will be parents. If, however, |
1029 |
the join path is C<['IsParentOf', 'People']>, then the people returned will be children. |
the join path is C<['IsParentOf', 'People']>, then the people returned will be children. |
1030 |
|
|
1031 |
|
If an entity or relationship is mentioned twice, the name for the second occurrence will |
1032 |
|
be suffixed with C<2>, the third occurrence will be suffixed with C<3>, and so forth. So, |
1033 |
|
for example, if we have C<['Feature', 'HasContig', 'Contig', 'HasContig']>, then the |
1034 |
|
B<to-link> field of the first B<HasContig> is specified as C<HasContig(to-link)>, while |
1035 |
|
the B<to-link> field of the second B<HasContig> is specified as C<HasContig2(to-link)>. |
1036 |
|
|
1037 |
=over 4 |
=over 4 |
1038 |
|
|
1039 |
=item objectNames |
=item objectNames |
1056 |
|
|
1057 |
C<< "Genome(genus) = ? ORDER BY Genome(species)" >> |
C<< "Genome(genus) = ? ORDER BY Genome(species)" >> |
1058 |
|
|
1059 |
|
Note that the case is important. Only an uppercase "ORDER BY" with a single space will |
1060 |
|
be processed. The idea is to make it less likely to find the verb by accident. |
1061 |
|
|
1062 |
The rules for field references in a sort order are the same as those for field references in the |
The rules for field references in a sort order are the same as those for field references in the |
1063 |
filter clause in general; however, odd things may happen if a sort field is from a secondary |
filter clause in general; however, odd things may happen if a sort field is from a secondary |
1064 |
relation. |
relation. |
1065 |
|
|
1066 |
=item param1, param2, ..., paramN |
Finally, you can limit the number of rows returned by adding a LIMIT clause. The LIMIT must |
1067 |
|
be the last thing in the filter clause, and it contains only the word "LIMIT" followed by |
1068 |
|
a positive number. So, for example |
1069 |
|
|
1070 |
|
C<< "Genome(genus) = ? ORDER BY Genome(species) LIMIT 10" >> |
1071 |
|
|
1072 |
|
will only return the first ten genomes for the specified genus. The ORDER BY clause is not |
1073 |
|
required. For example, to just get the first 10 genomes in the B<Genome> table, you could |
1074 |
|
use |
1075 |
|
|
1076 |
Parameter values to be substituted into the filter clause. |
C<< "LIMIT 10" >> |
1077 |
|
|
1078 |
|
=item params |
1079 |
|
|
1080 |
|
Reference to a list of parameter values to be substituted into the filter clause. |
1081 |
|
|
1082 |
=item RETURN |
=item RETURN |
1083 |
|
|
1089 |
|
|
1090 |
sub Get { |
sub Get { |
1091 |
# Get the parameters. |
# Get the parameters. |
1092 |
my ($self, $objectNames, $filterClause, @params) = @_; |
my ($self, $objectNames, $filterClause, $params) = @_; |
1093 |
# Construct the SELECT statement. The general pattern is |
# Process the SQL stuff. |
1094 |
# |
my ($suffix, $mappedNameListRef, $mappedNameHashRef) = |
1095 |
# SELECT name1.*, name2.*, ... nameN.* FROM name1, name2, ... nameN |
$self->_SetupSQL($objectNames, $filterClause); |
1096 |
# |
# Create the query. |
1097 |
my $dbh = $self->{_dbh}; |
my $command = "SELECT DISTINCT " . join(".*, ", @{$mappedNameListRef}) . |
1098 |
my $command = "SELECT DISTINCT " . join('.*, ', @{$objectNames}) . ".* FROM " . |
".* $suffix"; |
1099 |
join(', ', @{$objectNames}); |
my $sth = $self->_GetStatementHandle($command, $params); |
1100 |
# Check for a filter clause. |
# Now we create the relation map, which enables DBQuery to determine the order, name |
1101 |
if ($filterClause) { |
# and mapped name for each object in the query. |
1102 |
# Here we have one, so we convert its field names and add it to the query. First, |
my @relationMap = (); |
1103 |
# We create a copy of the filter string we can work with. |
for my $mappedName (@{$mappedNameListRef}) { |
1104 |
my $filterString = $filterClause; |
push @relationMap, [$mappedName, $mappedNameHashRef->{$mappedName}]; |
|
# Next, we sort the object names by length. This helps protect us from finding |
|
|
# object names inside other object names when we're doing our search and replace. |
|
|
my @sortedNames = sort { length($b) - length($a) } @{$objectNames}; |
|
|
# We will also keep a list of conditions to add to the WHERE clause in order to link |
|
|
# entities and relationships as well as primary relations to secondary ones. |
|
|
my @joinWhere = (); |
|
|
# The final preparatory step is to create a hash table of relation names. The |
|
|
# table begins with the relation names already in the SELECT command. |
|
|
my %fromNames = (); |
|
|
for my $objectName (@sortedNames) { |
|
|
$fromNames{$objectName} = 1; |
|
1105 |
} |
} |
1106 |
# We are ready to begin. We loop through the object names, replacing each |
# Return the statement object. |
1107 |
# object name's field references by the corresponding SQL field reference. |
my $retVal = DBQuery::_new($self, $sth, \@relationMap); |
1108 |
# Along the way, if we find a secondary relation, we will need to add it |
return $retVal; |
1109 |
# to the FROM clause. |
} |
1110 |
for my $objectName (@sortedNames) { |
|
1111 |
# Get the length of the object name plus 2. This is the value we add to the |
=head3 Search |
1112 |
# size of the field name to determine the size of the field reference as a |
|
1113 |
# whole. |
C<< my $query = $erdb->Search($searchExpression, $idx, \@objectNames, $filterClause, \@params); >> |
1114 |
my $nameLength = 2 + length $objectName; |
|
1115 |
# Get the object's field list. |
Perform a full text search with filtering. The search will be against a specified object |
1116 |
my $fieldList = $self->_GetFieldTable($objectName); |
in the object name list. That object will get an extra field containing the search |
1117 |
# Find the field references for this object. |
relevance. Note that except for the search expression, the parameters of this method are |
1118 |
while ($filterString =~ m/$objectName\(([^)]*)\)/g) { |
the same as those for L</Get> and follow the same rules. |
1119 |
# At this point, $1 contains the field name, and the current position |
|
1120 |
# is set immediately after the final parenthesis. We pull out the name of |
=over 4 |
1121 |
# the field and the position and length of the field reference as a whole. |
|
1122 |
my $fieldName = $1; |
=item searchExpression |
1123 |
my $len = $nameLength + length $fieldName; |
|
1124 |
my $pos = pos($filterString) - $len; |
Boolean search expression for the text fields of the target object. |
1125 |
# Insure the field exists. |
|
1126 |
if (!exists $fieldList->{$fieldName}) { |
=item idx |
1127 |
Confess("Field $fieldName not found for object $objectName."); |
|
1128 |
|
Index in the I<$objectNames> list of the table to be searched in full-text mode. |
1129 |
|
|
1130 |
|
=item objectNames |
1131 |
|
|
1132 |
|
List containing the names of the entity and relationship objects to be retrieved. |
1133 |
|
|
1134 |
|
=item filterClause |
1135 |
|
|
1136 |
|
WHERE clause (without the WHERE) to be used to filter and sort the query. The WHERE clause can |
1137 |
|
be parameterized with parameter markers (C<?>). Each field used in the WHERE clause must be |
1138 |
|
specified in the standard form B<I<objectName>(I<fieldName>)>. Any parameters specified |
1139 |
|
in the filter clause should be added to the parameter list as additional parameters. The |
1140 |
|
fields in a filter clause can come from primary entity relations, relationship relations, |
1141 |
|
or secondary entity relations; however, all of the entities and relationships involved must |
1142 |
|
be included in the list of object names. |
1143 |
|
|
1144 |
|
=item params |
1145 |
|
|
1146 |
|
Reference to a list of parameter values to be substituted into the filter clause. |
1147 |
|
|
1148 |
|
=item RETURN |
1149 |
|
|
1150 |
|
Returns a query object for the specified search. |
1151 |
|
|
1152 |
|
=back |
1153 |
|
|
1154 |
|
=cut |
1155 |
|
|
1156 |
|
sub Search { |
1157 |
|
# Get the parameters. |
1158 |
|
my ($self, $searchExpression, $idx, $objectNames, $filterClause, $params) = @_; |
1159 |
|
# Declare the return variable. |
1160 |
|
my $retVal; |
1161 |
|
# Create a safety copy of the parameter list. |
1162 |
|
my @myParams = @{$params}; |
1163 |
|
# Get the first object's structure so we have access to the searchable fields. |
1164 |
|
my $object1Name = $objectNames->[$idx]; |
1165 |
|
my $object1Structure = $self->_GetStructure($object1Name); |
1166 |
|
# Get the field list. |
1167 |
|
if (! exists $object1Structure->{searchFields}) { |
1168 |
|
Confess("No searchable index for $object1Name."); |
1169 |
} else { |
} else { |
1170 |
# Get the field's relation. |
# Get the field list. |
1171 |
my $relationName = $fieldList->{$fieldName}->{relation}; |
my @fields = @{$object1Structure->{searchFields}}; |
1172 |
# Insure the relation is in the FROM clause. |
# We need two match expressions, one for the filter clause and one in the |
1173 |
if (!exists $fromNames{$relationName}) { |
# query itself. Both will use a parameter mark, so we need to push the |
1174 |
# Add the relation to the FROM clause. |
# search expression onto the front of the parameter list twice. |
1175 |
$command .= ", $relationName"; |
unshift @myParams, $searchExpression, $searchExpression; |
1176 |
# Create its join sub-clause. |
# Build the match expression. |
1177 |
push @joinWhere, "$objectName.id = $relationName.id"; |
my @matchFilterFields = map { "$object1Name." . _FixName($_) } @fields; |
1178 |
# Denote we have it available for future fields. |
my $matchClause = "MATCH (" . join(", ", @matchFilterFields) . ") AGAINST (? IN BOOLEAN MODE)"; |
1179 |
$fromNames{$relationName} = 1; |
# Process the SQL stuff. |
1180 |
|
my ($suffix, $mappedNameListRef, $mappedNameHashRef) = |
1181 |
|
$self->_SetupSQL($objectNames, $filterClause, $matchClause); |
1182 |
|
# Create the query. Note that the match clause is inserted at the front of |
1183 |
|
# the select fields. |
1184 |
|
my $command = "SELECT DISTINCT $matchClause, " . join(".*, ", @{$mappedNameListRef}) . |
1185 |
|
".* $suffix"; |
1186 |
|
my $sth = $self->_GetStatementHandle($command, \@myParams); |
1187 |
|
# Now we create the relation map, which enables DBQuery to determine the order, name |
1188 |
|
# and mapped name for each object in the query. |
1189 |
|
my @relationMap = _RelationMap($mappedNameHashRef, $mappedNameListRef); |
1190 |
|
# Return the statement object. |
1191 |
|
$retVal = DBQuery::_new($self, $sth, \@relationMap, $object1Name); |
1192 |
} |
} |
1193 |
# Form an SQL field reference from the relation name and the field name. |
return $retVal; |
|
my $sqlReference = "$relationName." . _FixName($fieldName); |
|
|
# Put it into the filter string in place of the old value. |
|
|
substr($filterString, $pos, $len) = $sqlReference; |
|
|
# Reposition the search. |
|
|
pos $filterString = $pos + length $sqlReference; |
|
1194 |
} |
} |
1195 |
|
|
1196 |
|
=head3 GetFlat |
1197 |
|
|
1198 |
|
C<< my @list = $erdb->GetFlat(\@objectNames, $filterClause, \@parameterList, $field); >> |
1199 |
|
|
1200 |
|
This is a variation of L</GetAll> that asks for only a single field per record and |
1201 |
|
returns a single flattened list. |
1202 |
|
|
1203 |
|
=over 4 |
1204 |
|
|
1205 |
|
=item objectNames |
1206 |
|
|
1207 |
|
List containing the names of the entity and relationship objects to be retrieved. |
1208 |
|
|
1209 |
|
=item filterClause |
1210 |
|
|
1211 |
|
WHERE/ORDER BY clause (without the WHERE) to be used to filter and sort the query. The WHERE clause can |
1212 |
|
be parameterized with parameter markers (C<?>). Each field used must be specified in the standard form |
1213 |
|
B<I<objectName>(I<fieldName>)>. Any parameters specified in the filter clause should be added to the |
1214 |
|
parameter list as additional parameters. The fields in a filter clause can come from primary |
1215 |
|
entity relations, relationship relations, or secondary entity relations; however, all of the |
1216 |
|
entities and relationships involved must be included in the list of object names. |
1217 |
|
|
1218 |
|
=item parameterList |
1219 |
|
|
1220 |
|
List of the parameters to be substituted in for the parameters marks in the filter clause. |
1221 |
|
|
1222 |
|
=item field |
1223 |
|
|
1224 |
|
Name of the field to be used to get the elements of the list returned. |
1225 |
|
|
1226 |
|
=item RETURN |
1227 |
|
|
1228 |
|
Returns a list of values. |
1229 |
|
|
1230 |
|
=back |
1231 |
|
|
1232 |
|
=cut |
1233 |
|
#: Return Type @; |
1234 |
|
sub GetFlat { |
1235 |
|
# Get the parameters. |
1236 |
|
my ($self, $objectNames, $filterClause, $parameterList, $field) = @_; |
1237 |
|
# Construct the query. |
1238 |
|
my $query = $self->Get($objectNames, $filterClause, $parameterList); |
1239 |
|
# Create the result list. |
1240 |
|
my @retVal = (); |
1241 |
|
# Loop through the records, adding the field values found to the result list. |
1242 |
|
while (my $row = $query->Fetch()) { |
1243 |
|
push @retVal, $row->Value($field); |
1244 |
} |
} |
1245 |
|
# Return the list created. |
1246 |
|
return @retVal; |
1247 |
} |
} |
1248 |
# The next step is to join the objects together. We only need to do this if there |
|
1249 |
# is more than one object in the object list. We start with the first object and |
=head3 Delete |
1250 |
# run through the objects after it. Note also that we make a safety copy of the |
|
1251 |
# list before running through it. |
C<< my $stats = $erdb->Delete($entityName, $objectID); >> |
1252 |
my @objectList = @{$objectNames}; |
|
1253 |
my $lastObject = shift @objectList; |
Delete an entity instance from the database. The instance is deleted along with all entity and |
1254 |
# Get the join table. |
relationship instances dependent on it. The idea of dependence here is recursive. An object is |
1255 |
my $joinTable = $self->{_metaData}->{Joins}; |
always dependent on itself. An object is dependent if it is a 1-to-many or many-to-many |
1256 |
# Loop through the object list. |
relationship connected to a dependent entity or the "to" entity connected to a 1-to-many |
1257 |
for my $thisObject (@objectList) { |
dependent relationship. |
1258 |
# Look for a join. |
|
1259 |
my $joinKey = "$lastObject/$thisObject"; |
=over 4 |
1260 |
if (!exists $joinTable->{$joinKey}) { |
|
1261 |
# Here there's no join, so we throw an error. |
=item entityName |
1262 |
Confess("No join exists to connect from $lastObject to $thisObject."); |
|
1263 |
|
Name of the entity type for the instance being deleted. |
1264 |
|
|
1265 |
|
=item objectID |
1266 |
|
|
1267 |
|
ID of the entity instance to be deleted. If the ID contains a wild card character (C<%>), |
1268 |
|
then it is presumed to by a LIKE pattern. |
1269 |
|
|
1270 |
|
=item testFlag |
1271 |
|
|
1272 |
|
If TRUE, the delete statements will be traced without being executed. |
1273 |
|
|
1274 |
|
=item RETURN |
1275 |
|
|
1276 |
|
Returns a statistics object indicating how many records of each particular table were |
1277 |
|
deleted. |
1278 |
|
|
1279 |
|
=back |
1280 |
|
|
1281 |
|
=cut |
1282 |
|
#: Return Type $%; |
1283 |
|
sub Delete { |
1284 |
|
# Get the parameters. |
1285 |
|
my ($self, $entityName, $objectID, $testFlag) = @_; |
1286 |
|
# Declare the return variable. |
1287 |
|
my $retVal = Stats->new(); |
1288 |
|
# Get the DBKernel object. |
1289 |
|
my $db = $self->{_dbh}; |
1290 |
|
# We're going to generate all the paths branching out from the starting entity. One of |
1291 |
|
# the things we have to be careful about is preventing loops. We'll use a hash to |
1292 |
|
# determine if we've hit a loop. |
1293 |
|
my %alreadyFound = (); |
1294 |
|
# These next lists will serve as our result stack. We start by pushing object lists onto |
1295 |
|
# the stack, and then popping them off to do the deletes. This means the deletes will |
1296 |
|
# start with the longer paths before getting to the shorter ones. That, in turn, makes |
1297 |
|
# sure we don't delete records that might be needed to forge relationships back to the |
1298 |
|
# original item. We have two lists-- one for TO-relationships, and one for |
1299 |
|
# FROM-relationships and entities. |
1300 |
|
my @fromPathList = (); |
1301 |
|
my @toPathList = (); |
1302 |
|
# This final hash is used to remember what work still needs to be done. We push paths |
1303 |
|
# onto the list, then pop them off to extend the paths. We prime it with the starting |
1304 |
|
# point. Note that we will work hard to insure that the last item on a path in the |
1305 |
|
# TODO list is always an entity. |
1306 |
|
my @todoList = ([$entityName]); |
1307 |
|
while (@todoList) { |
1308 |
|
# Get the current path. |
1309 |
|
my $current = pop @todoList; |
1310 |
|
# Copy it into a list. |
1311 |
|
my @stackedPath = @{$current}; |
1312 |
|
# Pull off the last item on the path. It will always be an entity. |
1313 |
|
my $entityName = pop @stackedPath; |
1314 |
|
# Add it to the alreadyFound list. |
1315 |
|
$alreadyFound{$entityName} = 1; |
1316 |
|
# Get the entity data. |
1317 |
|
my $entityData = $self->_GetStructure($entityName); |
1318 |
|
# The first task is to loop through the entity's relation. A DELETE command will |
1319 |
|
# be needed for each of them. |
1320 |
|
my $relations = $entityData->{Relations}; |
1321 |
|
for my $relation (keys %{$relations}) { |
1322 |
|
my @augmentedList = (@stackedPath, $relation); |
1323 |
|
push @fromPathList, \@augmentedList; |
1324 |
|
} |
1325 |
|
# Now we need to look for relationships connected to this entity. |
1326 |
|
my $relationshipList = $self->{_metaData}->{Relationships}; |
1327 |
|
for my $relationshipName (keys %{$relationshipList}) { |
1328 |
|
my $relationship = $relationshipList->{$relationshipName}; |
1329 |
|
# Check the FROM field. We're only interested if it's us. |
1330 |
|
if ($relationship->{from} eq $entityName) { |
1331 |
|
# Add the path to this relationship. |
1332 |
|
my @augmentedList = (@stackedPath, $entityName, $relationshipName); |
1333 |
|
push @fromPathList, \@augmentedList; |
1334 |
|
# Check the arity. If it's MM we're done. If it's 1M |
1335 |
|
# and the target hasn't been seen yet, we want to |
1336 |
|
# stack the entity for future processing. |
1337 |
|
if ($relationship->{arity} eq '1M') { |
1338 |
|
my $toEntity = $relationship->{to}; |
1339 |
|
if (! exists $alreadyFound{$toEntity}) { |
1340 |
|
# Here we have a new entity that's dependent on |
1341 |
|
# the current entity, so we need to stack it. |
1342 |
|
my @stackList = (@augmentedList, $toEntity); |
1343 |
|
push @fromPathList, \@stackList; |
1344 |
} else { |
} else { |
1345 |
# Get the join clause and add it to the WHERE list. |
Trace("$toEntity ignored because it occurred previously.") if T(4); |
|
push @joinWhere, $joinTable->{$joinKey}; |
|
|
# Save this object as the last object for the next iteration. |
|
|
$lastObject = $thisObject; |
|
1346 |
} |
} |
1347 |
} |
} |
|
# Now we need to handle the whole ORDER BY thing. We'll put the order by clause |
|
|
# in the following variable. |
|
|
my $orderClause = ""; |
|
|
# Locate the ORDER BY verb (if any). |
|
|
if ($filterString =~ m/^(.*)ORDER BY/g) { |
|
|
# Here we have an ORDER BY verb. Split it off of the filter string. |
|
|
my $pos = pos $filterString; |
|
|
$orderClause = substr($filterString, $pos); |
|
|
$filterString = $1; |
|
1348 |
} |
} |
1349 |
# Add the filter and the join clauses (if any) to the SELECT command. |
# Now check the TO field. In this case only the relationship needs |
1350 |
if ($filterString) { |
# deletion. |
1351 |
push @joinWhere, "($filterString)"; |
if ($relationship->{to} eq $entityName) { |
1352 |
|
my @augmentedList = (@stackedPath, $entityName, $relationshipName); |
1353 |
|
push @toPathList, \@augmentedList; |
1354 |
} |
} |
|
if (@joinWhere) { |
|
|
$command .= " WHERE " . join(' AND ', @joinWhere); |
|
1355 |
} |
} |
|
# Add the sort clause (if any) to the SELECT command. |
|
|
if ($orderClause) { |
|
|
$command .= " ORDER BY $orderClause"; |
|
1356 |
} |
} |
1357 |
|
# Create the first qualifier for the WHERE clause. This selects the |
1358 |
|
# keys of the primary entity records to be deleted. When we're deleting |
1359 |
|
# from a dependent table, we construct a join page from the first qualifier |
1360 |
|
# to the table containing the dependent records to delete. |
1361 |
|
my $qualifier = ($objectID =~ /%/ ? "LIKE ?" : "= ?"); |
1362 |
|
# We need to make two passes. The first is through the to-list, and |
1363 |
|
# the second through the from-list. The from-list is second because |
1364 |
|
# the to-list may need to pass through some of the entities the |
1365 |
|
# from-list would delete. |
1366 |
|
my %stackList = ( from_link => \@fromPathList, to_link => \@toPathList ); |
1367 |
|
# Now it's time to do the deletes. We do it in two passes. |
1368 |
|
for my $keyName ('to_link', 'from_link') { |
1369 |
|
# Get the list for this key. |
1370 |
|
my @pathList = @{$stackList{$keyName}}; |
1371 |
|
Trace(scalar(@pathList) . " entries in path list for $keyName.") if T(3); |
1372 |
|
# Loop through this list. |
1373 |
|
while (my $path = pop @pathList) { |
1374 |
|
# Get the table whose rows are to be deleted. |
1375 |
|
my @pathTables = @{$path}; |
1376 |
|
# Start the DELETE statement. We need to call DBKernel because the |
1377 |
|
# syntax of a DELETE-USING varies among DBMSs. |
1378 |
|
my $target = $pathTables[$#pathTables]; |
1379 |
|
my $stmt = $db->SetUsing(@pathTables); |
1380 |
|
# Now start the WHERE. The first thing is the ID field from the starting table. That |
1381 |
|
# starting table will either be the entity relation or one of the entity's |
1382 |
|
# sub-relations. |
1383 |
|
$stmt .= " WHERE $pathTables[0].id $qualifier"; |
1384 |
|
# Now we run through the remaining entities in the path, connecting them up. |
1385 |
|
for (my $i = 1; $i <= $#pathTables; $i += 2) { |
1386 |
|
# Connect the current relationship to the preceding entity. |
1387 |
|
my ($entity, $rel) = @pathTables[$i-1,$i]; |
1388 |
|
# The style of connection depends on the direction of the relationship. |
1389 |
|
$stmt .= " AND $entity.id = $rel.$keyName"; |
1390 |
|
if ($i + 1 <= $#pathTables) { |
1391 |
|
# Here there's a next entity, so connect that to the relationship's |
1392 |
|
# to-link. |
1393 |
|
my $entity2 = $pathTables[$i+1]; |
1394 |
|
$stmt .= " AND $rel.to_link = $entity2.id"; |
1395 |
|
} |
1396 |
|
} |
1397 |
|
# Now we have our desired DELETE statement. |
1398 |
|
if ($testFlag) { |
1399 |
|
# Here the user wants to trace without executing. |
1400 |
|
Trace($stmt) if T(0); |
1401 |
|
} else { |
1402 |
|
# Here we can delete. Note that the SQL method dies with a confessing |
1403 |
|
# if an error occurs, so we just go ahead and do it. |
1404 |
|
Trace("Executing delete from $target using '$objectID'.") if T(3); |
1405 |
|
my $rv = $db->SQL($stmt, 0, $objectID); |
1406 |
|
# Accumulate the statistics for this delete. The only rows deleted |
1407 |
|
# are from the target table, so we use its name to record the |
1408 |
|
# statistic. |
1409 |
|
$retVal->Add($target, $rv); |
1410 |
} |
} |
1411 |
Trace("SQL query: $command") if T(2); |
} |
1412 |
Trace("PARMS: '" . (join "', '", @params) . "'") if (T(3) && (@params > 0)); |
} |
1413 |
my $sth = $dbh->prepare_command($command); |
# Return the result. |
|
# Execute it with the parameters bound in. |
|
|
$sth->execute(@params) || Confess("SELECT error" . $sth->errstr()); |
|
|
# Return the statement object. |
|
|
my $retVal = DBQuery::_new($self, $sth, @{$objectNames}); |
|
1414 |
return $retVal; |
return $retVal; |
1415 |
} |
} |
1416 |
|
|
1417 |
=head3 ComputeObjectSentence |
=head3 SortNeeded |
1418 |
|
|
1419 |
C<< my $sentence = $database->ComputeObjectSentence($objectName); >> |
C<< my $parms = $erdb->SortNeeded($relationName); >> |
1420 |
|
|
1421 |
Check an object name, and if it is a relationship convert it to a relationship sentence. |
Return the pipe command for the sort that should be applied to the specified |
1422 |
|
relation when creating the load file. |
1423 |
|
|
1424 |
|
For example, if the load file should be sorted ascending by the first |
1425 |
|
field, this method would return |
1426 |
|
|
1427 |
|
sort -k1 -t"\t" |
1428 |
|
|
1429 |
|
If the first field is numeric, the method would return |
1430 |
|
|
1431 |
|
sort -k1n -t"\t" |
1432 |
|
|
1433 |
|
Unfortunately, due to a bug in the C<sort> command, we cannot eliminate duplicate |
1434 |
|
keys using a sort. |
1435 |
|
|
1436 |
=over 4 |
=over 4 |
1437 |
|
|
1438 |
=item objectName |
=item relationName |
1439 |
|
|
1440 |
Name of the entity or relationship. |
Name of the relation to be examined. |
1441 |
|
|
1442 |
=item RETURN |
=item |
1443 |
|
|
1444 |
Returns a string containing the entity name or a relationship sentence. |
Returns the sort command to use for sorting the relation, suitable for piping. |
1445 |
|
|
1446 |
=back |
=back |
1447 |
|
|
1448 |
=cut |
=cut |
1449 |
|
#: Return Type $; |
1450 |
sub ComputeObjectSentence { |
sub SortNeeded { |
1451 |
# Get the parameters. |
# Get the parameters. |
1452 |
my ($self, $objectName) = @_; |
my ($self, $relationName) = @_; |
1453 |
# Set the default return value. |
# Declare a descriptor to hold the names of the key fields. |
1454 |
my $retVal = $objectName; |
my @keyNames = (); |
1455 |
# Look for the object as a relationship. |
# Get the relation structure. |
1456 |
my $relTable = $self->{_metaData}->{Relationships}; |
my $relationData = $self->_FindRelation($relationName); |
1457 |
if (exists $relTable->{$objectName}) { |
# Find out if the relation is a primary entity relation, |
1458 |
# Get the relationship sentence. |
# a relationship relation, or a secondary entity relation. |
1459 |
$retVal = _ComputeRelationshipSentence($objectName, $relTable->{$objectName}); |
my $entityTable = $self->{_metaData}->{Entities}; |
1460 |
|
my $relationshipTable = $self->{_metaData}->{Relationships}; |
1461 |
|
if (exists $entityTable->{$relationName}) { |
1462 |
|
# Here we have a primary entity relation. |
1463 |
|
push @keyNames, "id"; |
1464 |
|
} elsif (exists $relationshipTable->{$relationName}) { |
1465 |
|
# Here we have a relationship. We sort using the FROM index. |
1466 |
|
my $relationshipData = $relationshipTable->{$relationName}; |
1467 |
|
my $index = $relationData->{Indexes}->{"idx${relationName}From"}; |
1468 |
|
push @keyNames, @{$index->{IndexFields}}; |
1469 |
|
} else { |
1470 |
|
# Here we have a secondary entity relation, so we have a sort on the ID field. |
1471 |
|
push @keyNames, "id"; |
1472 |
|
} |
1473 |
|
# Now we parse the key names into sort parameters. First, we prime the return |
1474 |
|
# string. |
1475 |
|
my $retVal = "sort -t\"\t\" "; |
1476 |
|
# Get the relation's field list. |
1477 |
|
my @fields = @{$relationData->{Fields}}; |
1478 |
|
# Loop through the keys. |
1479 |
|
for my $keyData (@keyNames) { |
1480 |
|
# Get the key and the ordering. |
1481 |
|
my ($keyName, $ordering); |
1482 |
|
if ($keyData =~ /^([^ ]+) DESC/) { |
1483 |
|
($keyName, $ordering) = ($1, "descending"); |
1484 |
|
} else { |
1485 |
|
($keyName, $ordering) = ($keyData, "ascending"); |
1486 |
|
} |
1487 |
|
# Find the key's position and type. |
1488 |
|
my $fieldSpec; |
1489 |
|
for (my $i = 0; $i <= $#fields && ! $fieldSpec; $i++) { |
1490 |
|
my $thisField = $fields[$i]; |
1491 |
|
if ($thisField->{name} eq $keyName) { |
1492 |
|
# Get the sort modifier for this field type. The modifier |
1493 |
|
# decides whether we're using a character, numeric, or |
1494 |
|
# floating-point sort. |
1495 |
|
my $modifier = $TypeTable{$thisField->{type}}->{sort}; |
1496 |
|
# If the index is descending for this field, denote we want |
1497 |
|
# to reverse the sort order on this field. |
1498 |
|
if ($ordering eq 'descending') { |
1499 |
|
$modifier .= "r"; |
1500 |
|
} |
1501 |
|
# Store the position and modifier into the field spec, which |
1502 |
|
# will stop the inner loop. Note that the field number is |
1503 |
|
# 1-based in the sort command, so we have to increment the |
1504 |
|
# index. |
1505 |
|
$fieldSpec = ($i + 1) . $modifier; |
1506 |
|
} |
1507 |
|
} |
1508 |
|
# Add this field to the sort command. |
1509 |
|
$retVal .= " -k$fieldSpec"; |
1510 |
} |
} |
1511 |
# Return the result. |
# Return the result. |
1512 |
return $retVal; |
return $retVal; |
1513 |
} |
} |
1514 |
|
|
1515 |
=head3 DumpRelations |
=head3 GetList |
1516 |
|
|
1517 |
C<< $database->DumpRelations($outputDirectory); >> |
C<< my @dbObjects = $erdb->GetList(\@objectNames, $filterClause, \@params); >> |
1518 |
|
|
1519 |
Write the contents of all the relations to tab-delimited files in the specified directory. |
Return a list of object descriptors for the specified objects as determined by the |
1520 |
Each file will have the same name as the relation dumped, with an extension of DTX. |
specified filter clause. |
1521 |
|
|
1522 |
|
This method is essentially the same as L</Get> except it returns a list of objects rather |
1523 |
|
than a query object that can be used to get the results one record at a time. |
1524 |
|
|
1525 |
=over 4 |
=over 4 |
1526 |
|
|
1527 |
=item outputDirectory |
=item objectNames |
1528 |
|
|
1529 |
Name of the directory into which the relation files should be dumped. |
List containing the names of the entity and relationship objects to be retrieved. |
1530 |
|
|
1531 |
|
=item filterClause |
1532 |
|
|
1533 |
|
WHERE clause (without the WHERE) to be used to filter and sort the query. The WHERE clause can |
1534 |
|
be parameterized with parameter markers (C<?>). Each field used in the WHERE clause must be |
1535 |
|
specified in the standard form B<I<objectName>(I<fieldName>)>. Any parameters specified |
1536 |
|
in the filter clause should be added to the parameter list as additional parameters. The |
1537 |
|
fields in a filter clause can come from primary entity relations, relationship relations, |
1538 |
|
or secondary entity relations; however, all of the entities and relationships involved must |
1539 |
|
be included in the list of object names. |
1540 |
|
|
1541 |
|
The filter clause can also specify a sort order. To do this, simply follow the filter string |
1542 |
|
with an ORDER BY clause. For example, the following filter string gets all genomes for a |
1543 |
|
particular genus and sorts them by species name. |
1544 |
|
|
1545 |
|
C<< "Genome(genus) = ? ORDER BY Genome(species)" >> |
1546 |
|
|
1547 |
|
The rules for field references in a sort order are the same as those for field references in the |
1548 |
|
filter clause in general; however, odd things may happen if a sort field is from a secondary |
1549 |
|
relation. |
1550 |
|
|
1551 |
|
=item params |
1552 |
|
|
1553 |
|
Reference to a list of parameter values to be substituted into the filter clause. |
1554 |
|
|
1555 |
|
=item RETURN |
1556 |
|
|
1557 |
|
Returns a list of B<DBObject>s that satisfy the query conditions. |
1558 |
|
|
1559 |
=back |
=back |
1560 |
|
|
1561 |
=cut |
=cut |
1562 |
|
#: Return Type @% |
1563 |
sub DumpRelations { |
sub GetList { |
1564 |
# Get the parameters. |
# Get the parameters. |
1565 |
my ($self, $outputDirectory) = @_; |
my ($self, $objectNames, $filterClause, $params) = @_; |
1566 |
# Now we need to run through all the relations. First, we loop through the entities. |
# Declare the return variable. |
1567 |
my $metaData = $self->{_metaData}; |
my @retVal = (); |
1568 |
my $entities = $metaData->{Entities}; |
# Perform the query. |
1569 |
while (my ($entityName, $entityStructure) = each %{$entities}) { |
my $query = $self->Get($objectNames, $filterClause, $params); |
1570 |
# Get the entity's relations. |
# Loop through the results. |
1571 |
my $relationList = $entityStructure->{Relations}; |
while (my $object = $query->Fetch) { |
1572 |
# Loop through the relations, dumping them. |
push @retVal, $object; |
|
while (my ($relationName, $relation) = each %{$relationList}) { |
|
|
$self->_DumpRelation($outputDirectory, $relationName, $relation); |
|
|
} |
|
|
} |
|
|
# Next, we loop through the relationships. |
|
|
my $relationships = $metaData->{Relationships}; |
|
|
while (my ($relationshipName, $relationshipStructure) = each %{$relationships}) { |
|
|
# Dump this relationship's relation. |
|
|
$self->_DumpRelation($outputDirectory, $relationshipName, $relationshipStructure->{Relations}->{$relationshipName}); |
|
1573 |
} |
} |
1574 |
|
# Return the result. |
1575 |
|
return @retVal; |
1576 |
} |
} |
1577 |
|
|
1578 |
=head3 InsertObject |
=head3 GetCount |
1579 |
|
|
1580 |
C<< my $ok = $database->InsertObject($objectType, \%fieldHash); >> |
C<< my $count = $erdb->GetCount(\@objectNames, $filter, \@params); >> |
1581 |
|
|
1582 |
Insert an object into the database. The object is defined by a type name and then a hash |
Return the number of rows found by a specified query. This method would |
1583 |
of field names to values. Field values in the primary relation are represented by scalars. |
normally be used to count the records in a single table. For example, in a |
1584 |
(Note that for relationships, the primary relation is the B<only> relation.) |
genetics database |
|
Field values for the other relations comprising the entity are always list references. For |
|
|
example, the following line inserts an inactive PEG feature named C<fig|188.1.peg.1> with aliases |
|
|
C<ZP_00210270.1> and C<gi|46206278>. |
|
1585 |
|
|
1586 |
C<< $database->InsertObject('Feature', { id => 'fig|188.1.peg.1', active => 0, feature-type => 'peg', alias => ['ZP_00210270.1', 'gi|46206278']}); >> |
my $count = $erdb->GetCount(['Genome'], 'Genome(genus-species) LIKE ?', ['homo %']); |
1587 |
|
|
1588 |
The next statement inserts a C<HasProperty> relationship between feature C<fig|158879.1.peg.1> and |
would return the number of genomes for the genus I<homo>. It is conceivable, however, |
1589 |
property C<4> with an evidence URL of C<http://seedu.uchicago.edu/query.cgi?article_id=142>. |
to use it to return records based on a join. For example, |
1590 |
|
|
1591 |
|
my $count = $erdb->GetCount(['HasFeature', 'Genome'], 'Genome(genus-species) LIKE ?', |
1592 |
|
['homo %']); |
1593 |
|
|
1594 |
C<< $database->InsertObject('HasProperty', { 'from-link' => 'fig|158879.1.peg.1', 'to-link' => 4, evidence = 'http://seedu.uchicago.edu/query.cgi?article_id=142'}); >> |
would return the number of features for genomes in the genus I<homo>. Note that |
1595 |
|
only the rows from the first table are counted. If the above command were |
1596 |
|
|
1597 |
|
my $count = $erdb->GetCount(['Genome', 'Feature'], 'Genome(genus-species) LIKE ?', |
1598 |
|
['homo %']); |
1599 |
|
|
1600 |
|
it would return the number of genomes, not the number of genome/feature pairs. |
1601 |
|
|
1602 |
=over 4 |
=over 4 |
1603 |
|
|
1604 |
=item newObjectType |
=item objectNames |
1605 |
|
|
1606 |
Type name of the object to insert. |
Reference to a list of the objects (entities and relationships) included in the |
1607 |
|
query. |
1608 |
|
|
1609 |
=item fieldHash |
=item filter |
1610 |
|
|
1611 |
Hash of field names to values. |
A filter clause for restricting the query. The rules are the same as for the L</Get> |
1612 |
|
method. |
1613 |
|
|
1614 |
|
=item params |
1615 |
|
|
1616 |
|
Reference to a list of the parameter values to be substituted for the parameter marks |
1617 |
|
in the filter. |
1618 |
|
|
1619 |
=item RETURN |
=item RETURN |
1620 |
|
|
1621 |
Returns 1 if successful, 0 if an error occurred. |
Returns a count of the number of records in the first table that would satisfy |
1622 |
|
the query. |
1623 |
|
|
1624 |
=back |
=back |
1625 |
|
|
1626 |
=cut |
=cut |
1627 |
|
|
1628 |
sub InsertObject { |
sub GetCount { |
1629 |
# Get the parameters. |
# Get the parameters. |
1630 |
my ($self, $newObjectType, $fieldHash) = @_; |
my ($self, $objectNames, $filter, $params) = @_; |
1631 |
# Denote that so far we appear successful. |
# Insure the params argument is an array reference if the caller left it off. |
1632 |
my $retVal = 1; |
if (! defined($params)) { |
1633 |
# Get the database handle. |
$params = []; |
1634 |
my $dbh = $self->{_dbh}; |
} |
1635 |
# Get the relation list. |
# Declare the return variable. |
1636 |
my $relationTable = $self->_GetRelationTable($newObjectType); |
my $retVal; |
1637 |
# Loop through the relations. We'll build insert statements for each one. If a relation is |
# Find out if we're counting an entity or a relationship. |
1638 |
# secondary, we may end up generating multiple insert statements. If an error occurs, we |
my $countedField; |
1639 |
|
if ($self->IsEntity($objectNames->[0])) { |
1640 |
|
$countedField = "id"; |
1641 |
|
} else { |
1642 |
|
# For a relationship we count the to-link because it's usually more |
1643 |
|
# numerous. Note we're automatically converting to the SQL form |
1644 |
|
# of the field name (to_link vs. to-link). |
1645 |
|
$countedField = "to_link"; |
1646 |
|
} |
1647 |
|
# Create the SQL command suffix to get the desired records. |
1648 |
|
my ($suffix, $mappedNameListRef, $mappedNameHashRef) = $self->_SetupSQL($objectNames, |
1649 |
|
$filter); |
1650 |
|
# Prefix it with text telling it we want a record count. |
1651 |
|
my $firstObject = $mappedNameListRef->[0]; |
1652 |
|
my $command = "SELECT COUNT($firstObject.$countedField) $suffix"; |
1653 |
|
# Prepare and execute the command. |
1654 |
|
my $sth = $self->_GetStatementHandle($command, $params); |
1655 |
|
# Get the count value. |
1656 |
|
($retVal) = $sth->fetchrow_array(); |
1657 |
|
# Check for a problem. |
1658 |
|
if (! defined($retVal)) { |
1659 |
|
if ($sth->err) { |
1660 |
|
# Here we had an SQL error. |
1661 |
|
Confess("Error retrieving row count: " . $sth->errstr()); |
1662 |
|
} else { |
1663 |
|
# Here we have no result. |
1664 |
|
Confess("No result attempting to retrieve row count."); |
1665 |
|
} |
1666 |
|
} |
1667 |
|
# Return the result. |
1668 |
|
return $retVal; |
1669 |
|
} |
1670 |
|
|
1671 |
|
=head3 ComputeObjectSentence |
1672 |
|
|
1673 |
|
C<< my $sentence = $erdb->ComputeObjectSentence($objectName); >> |
1674 |
|
|
1675 |
|
Check an object name, and if it is a relationship convert it to a relationship sentence. |
1676 |
|
|
1677 |
|
=over 4 |
1678 |
|
|
1679 |
|
=item objectName |
1680 |
|
|
1681 |
|
Name of the entity or relationship. |
1682 |
|
|
1683 |
|
=item RETURN |
1684 |
|
|
1685 |
|
Returns a string containing the entity name or a relationship sentence. |
1686 |
|
|
1687 |
|
=back |
1688 |
|
|
1689 |
|
=cut |
1690 |
|
|
1691 |
|
sub ComputeObjectSentence { |
1692 |
|
# Get the parameters. |
1693 |
|
my ($self, $objectName) = @_; |
1694 |
|
# Set the default return value. |
1695 |
|
my $retVal = $objectName; |
1696 |
|
# Look for the object as a relationship. |
1697 |
|
my $relTable = $self->{_metaData}->{Relationships}; |
1698 |
|
if (exists $relTable->{$objectName}) { |
1699 |
|
# Get the relationship sentence. |
1700 |
|
$retVal = _ComputeRelationshipSentence($objectName, $relTable->{$objectName}); |
1701 |
|
} |
1702 |
|
# Return the result. |
1703 |
|
return $retVal; |
1704 |
|
} |
1705 |
|
|
1706 |
|
=head3 DumpRelations |
1707 |
|
|
1708 |
|
C<< $erdb->DumpRelations($outputDirectory); >> |
1709 |
|
|
1710 |
|
Write the contents of all the relations to tab-delimited files in the specified directory. |
1711 |
|
Each file will have the same name as the relation dumped, with an extension of DTX. |
1712 |
|
|
1713 |
|
=over 4 |
1714 |
|
|
1715 |
|
=item outputDirectory |
1716 |
|
|
1717 |
|
Name of the directory into which the relation files should be dumped. |
1718 |
|
|
1719 |
|
=back |
1720 |
|
|
1721 |
|
=cut |
1722 |
|
|
1723 |
|
sub DumpRelations { |
1724 |
|
# Get the parameters. |
1725 |
|
my ($self, $outputDirectory) = @_; |
1726 |
|
# Now we need to run through all the relations. First, we loop through the entities. |
1727 |
|
my $metaData = $self->{_metaData}; |
1728 |
|
my $entities = $metaData->{Entities}; |
1729 |
|
for my $entityName (keys %{$entities}) { |
1730 |
|
my $entityStructure = $entities->{$entityName}; |
1731 |
|
# Get the entity's relations. |
1732 |
|
my $relationList = $entityStructure->{Relations}; |
1733 |
|
# Loop through the relations, dumping them. |
1734 |
|
for my $relationName (keys %{$relationList}) { |
1735 |
|
my $relation = $relationList->{$relationName}; |
1736 |
|
$self->_DumpRelation($outputDirectory, $relationName, $relation); |
1737 |
|
} |
1738 |
|
} |
1739 |
|
# Next, we loop through the relationships. |
1740 |
|
my $relationships = $metaData->{Relationships}; |
1741 |
|
for my $relationshipName (keys %{$relationships}) { |
1742 |
|
my $relationshipStructure = $relationships->{$relationshipName}; |
1743 |
|
# Dump this relationship's relation. |
1744 |
|
$self->_DumpRelation($outputDirectory, $relationshipName, $relationshipStructure->{Relations}->{$relationshipName}); |
1745 |
|
} |
1746 |
|
} |
1747 |
|
|
1748 |
|
=head3 InsertValue |
1749 |
|
|
1750 |
|
C<< $erdb->InsertValue($entityID, $fieldName, $value); >> |
1751 |
|
|
1752 |
|
This method will insert a new value into the database. The value must be one |
1753 |
|
associated with a secondary relation, since primary values cannot be inserted: |
1754 |
|
they occur exactly once. Secondary values, on the other hand, can be missing |
1755 |
|
or multiply-occurring. |
1756 |
|
|
1757 |
|
=over 4 |
1758 |
|
|
1759 |
|
=item entityID |
1760 |
|
|
1761 |
|
ID of the object that is to receive the new value. |
1762 |
|
|
1763 |
|
=item fieldName |
1764 |
|
|
1765 |
|
Field name for the new value-- this includes the entity name, since |
1766 |
|
field names are of the format I<objectName>C<(>I<fieldName>C<)>. |
1767 |
|
|
1768 |
|
=item value |
1769 |
|
|
1770 |
|
New value to be put in the field. |
1771 |
|
|
1772 |
|
=back |
1773 |
|
|
1774 |
|
=cut |
1775 |
|
|
1776 |
|
sub InsertValue { |
1777 |
|
# Get the parameters. |
1778 |
|
my ($self, $entityID, $fieldName, $value) = @_; |
1779 |
|
# Parse the entity name and the real field name. |
1780 |
|
if ($fieldName =~ /^([^(]+)\(([^)]+)\)/) { |
1781 |
|
my $entityName = $1; |
1782 |
|
my $fieldTitle = $2; |
1783 |
|
# Get its descriptor. |
1784 |
|
if (!$self->IsEntity($entityName)) { |
1785 |
|
Confess("$entityName is not a valid entity."); |
1786 |
|
} else { |
1787 |
|
my $entityData = $self->{_metaData}->{Entities}->{$entityName}; |
1788 |
|
# Find the relation containing this field. |
1789 |
|
my $fieldHash = $entityData->{Fields}; |
1790 |
|
if (! exists $fieldHash->{$fieldTitle}) { |
1791 |
|
Confess("$fieldTitle not found in $entityName."); |
1792 |
|
} else { |
1793 |
|
my $relation = $fieldHash->{$fieldTitle}->{relation}; |
1794 |
|
if ($relation eq $entityName) { |
1795 |
|
Confess("Cannot do InsertValue on primary field $fieldTitle of $entityName."); |
1796 |
|
} else { |
1797 |
|
# Now we can create an INSERT statement. |
1798 |
|
my $dbh = $self->{_dbh}; |
1799 |
|
my $fixedName = _FixName($fieldTitle); |
1800 |
|
my $statement = "INSERT INTO $relation (id, $fixedName) VALUES(?, ?)"; |
1801 |
|
# Execute the command. |
1802 |
|
$dbh->SQL($statement, 0, $entityID, $value); |
1803 |
|
} |
1804 |
|
} |
1805 |
|
} |
1806 |
|
} else { |
1807 |
|
Confess("$fieldName is not a valid field name."); |
1808 |
|
} |
1809 |
|
} |
1810 |
|
|
1811 |
|
=head3 InsertObject |
1812 |
|
|
1813 |
|
C<< my $ok = $erdb->InsertObject($objectType, \%fieldHash); >> |
1814 |
|
|
1815 |
|
Insert an object into the database. The object is defined by a type name and then a hash |
1816 |
|
of field names to values. Field values in the primary relation are represented by scalars. |
1817 |
|
(Note that for relationships, the primary relation is the B<only> relation.) |
1818 |
|
Field values for the other relations comprising the entity are always list references. For |
1819 |
|
example, the following line inserts an inactive PEG feature named C<fig|188.1.peg.1> with aliases |
1820 |
|
C<ZP_00210270.1> and C<gi|46206278>. |
1821 |
|
|
1822 |
|
C<< $erdb->InsertObject('Feature', { id => 'fig|188.1.peg.1', active => 0, feature-type => 'peg', alias => ['ZP_00210270.1', 'gi|46206278']}); >> |
1823 |
|
|
1824 |
|
The next statement inserts a C<HasProperty> relationship between feature C<fig|158879.1.peg.1> and |
1825 |
|
property C<4> with an evidence URL of C<http://seedu.uchicago.edu/query.cgi?article_id=142>. |
1826 |
|
|
1827 |
|
C<< $erdb->InsertObject('HasProperty', { 'from-link' => 'fig|158879.1.peg.1', 'to-link' => 4, evidence => 'http://seedu.uchicago.edu/query.cgi?article_id=142'}); >> |
1828 |
|
|
1829 |
|
=over 4 |
1830 |
|
|
1831 |
|
=item newObjectType |
1832 |
|
|
1833 |
|
Type name of the object to insert. |
1834 |
|
|
1835 |
|
=item fieldHash |
1836 |
|
|
1837 |
|
Hash of field names to values. |
1838 |
|
|
1839 |
|
=item RETURN |
1840 |
|
|
1841 |
|
Returns 1 if successful, 0 if an error occurred. |
1842 |
|
|
1843 |
|
=back |
1844 |
|
|
1845 |
|
=cut |
1846 |
|
|
1847 |
|
sub InsertObject { |
1848 |
|
# Get the parameters. |
1849 |
|
my ($self, $newObjectType, $fieldHash) = @_; |
1850 |
|
# Denote that so far we appear successful. |
1851 |
|
my $retVal = 1; |
1852 |
|
# Get the database handle. |
1853 |
|
my $dbh = $self->{_dbh}; |
1854 |
|
# Get the relation list. |
1855 |
|
my $relationTable = $self->_GetRelationTable($newObjectType); |
1856 |
|
# Loop through the relations. We'll build insert statements for each one. If a relation is |
1857 |
|
# secondary, we may end up generating multiple insert statements. If an error occurs, we |
1858 |
# stop the loop. |
# stop the loop. |
1859 |
while ($retVal && (my ($relationName, $relationDefinition) = each %{$relationTable})) { |
my @relationList = keys %{$relationTable}; |
1860 |
|
for (my $i = 0; $retVal && $i <= $#relationList; $i++) { |
1861 |
|
my $relationName = $relationList[$i]; |
1862 |
|
my $relationDefinition = $relationTable->{$relationName}; |
1863 |
# Get the relation's fields. For each field we will collect a value in the corresponding |
# Get the relation's fields. For each field we will collect a value in the corresponding |
1864 |
# position of the @valueList array. If one of the fields is missing, we will add it to the |
# position of the @valueList array. If one of the fields is missing, we will add it to the |
1865 |
# @missing list. |
# @missing list. |
1934 |
push @parameterList, $value; |
push @parameterList, $value; |
1935 |
} |
} |
1936 |
} |
} |
1937 |
# Execute the INSERT statement with the specified parameter list. |
# Execute the INSERT statement with the specified parameter list. |
1938 |
$retVal = $sth->execute(@parameterList); |
$retVal = $sth->execute(@parameterList); |
1939 |
if (!$retVal) { |
if (!$retVal) { |
1940 |
my $errorString = $sth->errstr(); |
my $errorString = $sth->errstr(); |
1941 |
Trace("Insert error: $errorString.") if T(0); |
Trace("Insert error: $errorString.") if T(0); |
1942 |
|
} |
1943 |
|
} |
1944 |
|
} |
1945 |
|
} |
1946 |
|
# Return the success indicator. |
1947 |
|
return $retVal; |
1948 |
|
} |
1949 |
|
|
1950 |
|
=head3 LoadTable |
1951 |
|
|
1952 |
|
C<< my %results = $erdb->LoadTable($fileName, $relationName, $truncateFlag); >> |
1953 |
|
|
1954 |
|
Load data from a tab-delimited file into a specified table, optionally re-creating the table |
1955 |
|
first. |
1956 |
|
|
1957 |
|
=over 4 |
1958 |
|
|
1959 |
|
=item fileName |
1960 |
|
|
1961 |
|
Name of the file from which the table data should be loaded. |
1962 |
|
|
1963 |
|
=item relationName |
1964 |
|
|
1965 |
|
Name of the relation to be loaded. This is the same as the table name. |
1966 |
|
|
1967 |
|
=item truncateFlag |
1968 |
|
|
1969 |
|
TRUE if the table should be dropped and re-created, else FALSE |
1970 |
|
|
1971 |
|
=item RETURN |
1972 |
|
|
1973 |
|
Returns a statistical object containing a list of the error messages. |
1974 |
|
|
1975 |
|
=back |
1976 |
|
|
1977 |
|
=cut |
1978 |
|
sub LoadTable { |
1979 |
|
# Get the parameters. |
1980 |
|
my ($self, $fileName, $relationName, $truncateFlag) = @_; |
1981 |
|
# Create the statistical return object. |
1982 |
|
my $retVal = _GetLoadStats(); |
1983 |
|
# Trace the fact of the load. |
1984 |
|
Trace("Loading table $relationName from $fileName") if T(2); |
1985 |
|
# Get the database handle. |
1986 |
|
my $dbh = $self->{_dbh}; |
1987 |
|
# Get the input file size. |
1988 |
|
my $fileSize = -s $fileName; |
1989 |
|
# Get the relation data. |
1990 |
|
my $relation = $self->_FindRelation($relationName); |
1991 |
|
# Check the truncation flag. |
1992 |
|
if ($truncateFlag) { |
1993 |
|
Trace("Creating table $relationName") if T(2); |
1994 |
|
# Compute the row count estimate. We take the size of the load file, |
1995 |
|
# divide it by the estimated row size, and then multiply by 1.5 to |
1996 |
|
# leave extra room. We postulate a minimum row count of 1000 to |
1997 |
|
# prevent problems with incoming empty load files. |
1998 |
|
my $rowSize = $self->EstimateRowSize($relationName); |
1999 |
|
my $estimate = FIG::max($fileSize * 1.5 / $rowSize, 1000); |
2000 |
|
# Re-create the table without its index. |
2001 |
|
$self->CreateTable($relationName, 0, $estimate); |
2002 |
|
# If this is a pre-index DBMS, create the index here. |
2003 |
|
if ($dbh->{_preIndex}) { |
2004 |
|
eval { |
2005 |
|
$self->CreateIndex($relationName); |
2006 |
|
}; |
2007 |
|
if ($@) { |
2008 |
|
$retVal->AddMessage($@); |
2009 |
|
} |
2010 |
|
} |
2011 |
|
} |
2012 |
|
# Load the table. |
2013 |
|
my $rv; |
2014 |
|
eval { |
2015 |
|
$rv = $dbh->load_table(file => $fileName, tbl => $relationName); |
2016 |
|
}; |
2017 |
|
if (!defined $rv) { |
2018 |
|
$retVal->AddMessage($@) if ($@); |
2019 |
|
$retVal->AddMessage("Table load failed for $relationName using $fileName."); |
2020 |
|
Trace("Table load failed for $relationName.") if T(1); |
2021 |
|
} else { |
2022 |
|
# Here we successfully loaded the table. |
2023 |
|
$retVal->Add("tables"); |
2024 |
|
my $size = -s $fileName; |
2025 |
|
Trace("$size bytes loaded into $relationName.") if T(2); |
2026 |
|
# If we're rebuilding, we need to create the table indexes. |
2027 |
|
if ($truncateFlag) { |
2028 |
|
# Indexes are created here for PostGres. For PostGres, indexes are |
2029 |
|
# best built at the end. For MySQL, the reverse is true. |
2030 |
|
if (! $dbh->{_preIndex}) { |
2031 |
|
eval { |
2032 |
|
$self->CreateIndex($relationName); |
2033 |
|
}; |
2034 |
|
if ($@) { |
2035 |
|
$retVal->AddMessage($@); |
2036 |
|
} |
2037 |
|
} |
2038 |
|
# The full-text index (if any) is always built last, even for MySQL. |
2039 |
|
# First we need to see if this table has a full-text index. Only |
2040 |
|
# primary relations are allowed that privilege. |
2041 |
|
if ($self->_IsPrimary($relationName)) { |
2042 |
|
# Get the relation's entity/relationship structure. |
2043 |
|
my $structure = $self->_GetStructure($relationName); |
2044 |
|
# Check for a searchable fields list. |
2045 |
|
if (exists $structure->{searchFields}) { |
2046 |
|
# Here we know that we need to create a full-text search index. |
2047 |
|
# Get an SQL-formatted field name list. |
2048 |
|
my $fields = join(", ", $self->_FixNames(@{$structure->{searchFields}})); |
2049 |
|
# Create the index. |
2050 |
|
$dbh->create_index(tbl => $relationName, idx => "search_idx_$relationName", |
2051 |
|
flds => $fields, kind => 'fulltext'); |
2052 |
|
} |
2053 |
|
} |
2054 |
|
} |
2055 |
|
} |
2056 |
|
# Analyze the table to improve performance. |
2057 |
|
Trace("Analyzing and compacting $relationName.") if T(3); |
2058 |
|
$dbh->vacuum_it($relationName); |
2059 |
|
Trace("$relationName load completed.") if T(3); |
2060 |
|
# Return the statistics. |
2061 |
|
return $retVal; |
2062 |
|
} |
2063 |
|
|
2064 |
|
=head3 GenerateEntity |
2065 |
|
|
2066 |
|
C<< my $fieldHash = $erdb->GenerateEntity($id, $type, \%values); >> |
2067 |
|
|
2068 |
|
Generate the data for a new entity instance. This method creates a field hash suitable for |
2069 |
|
passing as a parameter to L</InsertObject>. The ID is specified by the callr, but the rest |
2070 |
|
of the fields are generated using information in the database schema. |
2071 |
|
|
2072 |
|
Each data type has a default algorithm for generating random test data. This can be overridden |
2073 |
|
by including a B<DataGen> element in the field. If this happens, the content of the element is |
2074 |
|
executed as a PERL program in the context of this module. The element may make use of a C<$this> |
2075 |
|
variable which contains the field hash as it has been built up to the current point. If any |
2076 |
|
fields are dependent on other fields, the C<pass> attribute can be used to control the order |
2077 |
|
in which the fields are generated. A field with a high data pass number will be generated after |
2078 |
|
a field with a lower one. If any external values are needed, they should be passed in via the |
2079 |
|
optional third parameter, which will be available to the data generation script under the name |
2080 |
|
C<$value>. Several useful utility methods are provided for generating random values, including |
2081 |
|
L</IntGen>, L</StringGen>, L</FloatGen>, and L</DateGen>. Note that dates are stored and generated |
2082 |
|
in the form of a timestamp number rather than a string. |
2083 |
|
|
2084 |
|
=over 4 |
2085 |
|
|
2086 |
|
=item id |
2087 |
|
|
2088 |
|
ID to assign to the new entity. |
2089 |
|
|
2090 |
|
=item type |
2091 |
|
|
2092 |
|
Type name for the new entity. |
2093 |
|
|
2094 |
|
=item values |
2095 |
|
|
2096 |
|
Hash containing additional values that might be needed by the data generation methods (optional). |
2097 |
|
|
2098 |
|
=back |
2099 |
|
|
2100 |
|
=cut |
2101 |
|
|
2102 |
|
sub GenerateEntity { |
2103 |
|
# Get the parameters. |
2104 |
|
my ($self, $id, $type, $values) = @_; |
2105 |
|
# Create the return hash. |
2106 |
|
my $this = { id => $id }; |
2107 |
|
# Get the metadata structure. |
2108 |
|
my $metadata = $self->{_metaData}; |
2109 |
|
# Get this entity's list of fields. |
2110 |
|
if (!exists $metadata->{Entities}->{$type}) { |
2111 |
|
Confess("Unrecognized entity type $type in GenerateEntity."); |
2112 |
|
} else { |
2113 |
|
my $entity = $metadata->{Entities}->{$type}; |
2114 |
|
my $fields = $entity->{Fields}; |
2115 |
|
# Generate data from the fields. |
2116 |
|
_GenerateFields($this, $fields, $type, $values); |
2117 |
|
} |
2118 |
|
# Return the hash created. |
2119 |
|
return $this; |
2120 |
|
} |
2121 |
|
|
2122 |
|
=head3 GetEntity |
2123 |
|
|
2124 |
|
C<< my $entityObject = $erdb->GetEntity($entityType, $ID); >> |
2125 |
|
|
2126 |
|
Return an object describing the entity instance with a specified ID. |
2127 |
|
|
2128 |
|
=over 4 |
2129 |
|
|
2130 |
|
=item entityType |
2131 |
|
|
2132 |
|
Entity type name. |
2133 |
|
|
2134 |
|
=item ID |
2135 |
|
|
2136 |
|
ID of the desired entity. |
2137 |
|
|
2138 |
|
=item RETURN |
2139 |
|
|
2140 |
|
Returns a B<DBObject> representing the desired entity instance, or an undefined value if no |
2141 |
|
instance is found with the specified key. |
2142 |
|
|
2143 |
|
=back |
2144 |
|
|
2145 |
|
=cut |
2146 |
|
|
2147 |
|
sub GetEntity { |
2148 |
|
# Get the parameters. |
2149 |
|
my ($self, $entityType, $ID) = @_; |
2150 |
|
# Create a query. |
2151 |
|
my $query = $self->Get([$entityType], "$entityType(id) = ?", [$ID]); |
2152 |
|
# Get the first (and only) object. |
2153 |
|
my $retVal = $query->Fetch(); |
2154 |
|
# Return the result. |
2155 |
|
return $retVal; |
2156 |
|
} |
2157 |
|
|
2158 |
|
=head3 GetChoices |
2159 |
|
|
2160 |
|
C<< my @values = $erdb->GetChoices($entityName, $fieldName); >> |
2161 |
|
|
2162 |
|
Return a list of all the values for the specified field that are represented in the |
2163 |
|
specified entity. |
2164 |
|
|
2165 |
|
Note that if the field is not indexed, then this will be a very slow operation. |
2166 |
|
|
2167 |
|
=over 4 |
2168 |
|
|
2169 |
|
=item entityName |
2170 |
|
|
2171 |
|
Name of an entity in the database. |
2172 |
|
|
2173 |
|
=item fieldName |
2174 |
|
|
2175 |
|
Name of a field belonging to the entity. This is a raw field name without |
2176 |
|
the standard parenthesized notation used in most calls. |
2177 |
|
|
2178 |
|
=item RETURN |
2179 |
|
|
2180 |
|
Returns a list of the distinct values for the specified field in the database. |
2181 |
|
|
2182 |
|
=back |
2183 |
|
|
2184 |
|
=cut |
2185 |
|
|
2186 |
|
sub GetChoices { |
2187 |
|
# Get the parameters. |
2188 |
|
my ($self, $entityName, $fieldName) = @_; |
2189 |
|
# Declare the return variable. |
2190 |
|
my @retVal; |
2191 |
|
# Get the entity data structure. |
2192 |
|
my $entityData = $self->_GetStructure($entityName); |
2193 |
|
# Get the field. |
2194 |
|
my $fieldHash = $entityData->{Fields}; |
2195 |
|
if (! exists $fieldHash->{$fieldName}) { |
2196 |
|
Confess("$fieldName not found in $entityName."); |
2197 |
|
} else { |
2198 |
|
# Get the name of the relation containing the field. |
2199 |
|
my $relation = $fieldHash->{$fieldName}->{relation}; |
2200 |
|
# Fix up the field name. |
2201 |
|
my $realName = _FixName($fieldName); |
2202 |
|
# Get the database handle. |
2203 |
|
my $dbh = $self->{_dbh}; |
2204 |
|
# Query the database. |
2205 |
|
my $results = $dbh->SQL("SELECT DISTINCT $realName FROM $relation"); |
2206 |
|
# Clean the results. They are stored as a list of lists, and we just want the one list. |
2207 |
|
@retVal = sort map { $_->[0] } @{$results}; |
2208 |
|
} |
2209 |
|
# Return the result. |
2210 |
|
return @retVal; |
2211 |
|
} |
2212 |
|
|
2213 |
|
=head3 GetEntityValues |
2214 |
|
|
2215 |
|
C<< my @values = $erdb->GetEntityValues($entityType, $ID, \@fields); >> |
2216 |
|
|
2217 |
|
Return a list of values from a specified entity instance. If the entity instance |
2218 |
|
does not exist, an empty list is returned. |
2219 |
|
|
2220 |
|
=over 4 |
2221 |
|
|
2222 |
|
=item entityType |
2223 |
|
|
2224 |
|
Entity type name. |
2225 |
|
|
2226 |
|
=item ID |
2227 |
|
|
2228 |
|
ID of the desired entity. |
2229 |
|
|
2230 |
|
=item fields |
2231 |
|
|
2232 |
|
List of field names, each of the form I<objectName>C<(>I<fieldName>C<)>. |
2233 |
|
|
2234 |
|
=item RETURN |
2235 |
|
|
2236 |
|
Returns a flattened list of the values of the specified fields for the specified entity. |
2237 |
|
|
2238 |
|
=back |
2239 |
|
|
2240 |
|
=cut |
2241 |
|
|
2242 |
|
sub GetEntityValues { |
2243 |
|
# Get the parameters. |
2244 |
|
my ($self, $entityType, $ID, $fields) = @_; |
2245 |
|
# Get the specified entity. |
2246 |
|
my $entity = $self->GetEntity($entityType, $ID); |
2247 |
|
# Declare the return list. |
2248 |
|
my @retVal = (); |
2249 |
|
# If we found the entity, push the values into the return list. |
2250 |
|
if ($entity) { |
2251 |
|
push @retVal, $entity->Values($fields); |
2252 |
|
} |
2253 |
|
# Return the result. |
2254 |
|
return @retVal; |
2255 |
|
} |
2256 |
|
|
2257 |
|
=head3 GetAll |
2258 |
|
|
2259 |
|
C<< my @list = $erdb->GetAll(\@objectNames, $filterClause, \@parameters, \@fields, $count); >> |
2260 |
|
|
2261 |
|
Return a list of values taken from the objects returned by a query. The first three |
2262 |
|
parameters correspond to the parameters of the L</Get> method. The final parameter is |
2263 |
|
a list of the fields desired from each record found by the query. The field name |
2264 |
|
syntax is the standard syntax used for fields in the B<ERDB> system-- |
2265 |
|
B<I<objectName>(I<fieldName>)>-- where I<objectName> is the name of the relevant entity |
2266 |
|
or relationship and I<fieldName> is the name of the field. |
2267 |
|
|
2268 |
|
The list returned will be a list of lists. Each element of the list will contain |
2269 |
|
the values returned for the fields specified in the fourth parameter. If one of the |
2270 |
|
fields specified returns multiple values, they are flattened in with the rest. For |
2271 |
|
example, the following call will return a list of the features in a particular |
2272 |
|
spreadsheet cell, and each feature will be represented by a list containing the |
2273 |
|
feature ID followed by all of its aliases. |
2274 |
|
|
2275 |
|
C<< $query = $erdb->Get(['ContainsFeature', 'Feature'], "ContainsFeature(from-link) = ?", [$ssCellID], ['Feature(id)', 'Feature(alias)']); >> |
2276 |
|
|
2277 |
|
=over 4 |
2278 |
|
|
2279 |
|
=item objectNames |
2280 |
|
|
2281 |
|
List containing the names of the entity and relationship objects to be retrieved. |
2282 |
|
|
2283 |
|
=item filterClause |
2284 |
|
|
2285 |
|
WHERE/ORDER BY clause (without the WHERE) to be used to filter and sort the query. The WHERE clause can |
2286 |
|
be parameterized with parameter markers (C<?>). Each field used must be specified in the standard form |
2287 |
|
B<I<objectName>(I<fieldName>)>. Any parameters specified in the filter clause should be added to the |
2288 |
|
parameter list as additional parameters. The fields in a filter clause can come from primary |
2289 |
|
entity relations, relationship relations, or secondary entity relations; however, all of the |
2290 |
|
entities and relationships involved must be included in the list of object names. |
2291 |
|
|
2292 |
|
=item parameterList |
2293 |
|
|
2294 |
|
List of the parameters to be substituted in for the parameters marks in the filter clause. |
2295 |
|
|
2296 |
|
=item fields |
2297 |
|
|
2298 |
|
List of the fields to be returned in each element of the list returned. |
2299 |
|
|
2300 |
|
=item count |
2301 |
|
|
2302 |
|
Maximum number of records to return. If omitted or 0, all available records will be returned. |
2303 |
|
|
2304 |
|
=item RETURN |
2305 |
|
|
2306 |
|
Returns a list of list references. Each element of the return list contains the values for the |
2307 |
|
fields specified in the B<fields> parameter. |
2308 |
|
|
2309 |
|
=back |
2310 |
|
|
2311 |
|
=cut |
2312 |
|
#: Return Type @@; |
2313 |
|
sub GetAll { |
2314 |
|
# Get the parameters. |
2315 |
|
my ($self, $objectNames, $filterClause, $parameterList, $fields, $count) = @_; |
2316 |
|
# Translate the parameters from a list reference to a list. If the parameter |
2317 |
|
# list is a scalar we convert it into a singleton list. |
2318 |
|
my @parmList = (); |
2319 |
|
if (ref $parameterList eq "ARRAY") { |
2320 |
|
Trace("GetAll parm list is an array.") if T(4); |
2321 |
|
@parmList = @{$parameterList}; |
2322 |
|
} else { |
2323 |
|
Trace("GetAll parm list is a scalar: $parameterList.") if T(4); |
2324 |
|
push @parmList, $parameterList; |
2325 |
|
} |
2326 |
|
# Insure the counter has a value. |
2327 |
|
if (!defined $count) { |
2328 |
|
$count = 0; |
2329 |
|
} |
2330 |
|
# Add the row limit to the filter clause. |
2331 |
|
if ($count > 0) { |
2332 |
|
$filterClause .= " LIMIT $count"; |
2333 |
|
} |
2334 |
|
# Create the query. |
2335 |
|
my $query = $self->Get($objectNames, $filterClause, \@parmList); |
2336 |
|
# Set up a counter of the number of records read. |
2337 |
|
my $fetched = 0; |
2338 |
|
# Loop through the records returned, extracting the fields. Note that if the |
2339 |
|
# counter is non-zero, we stop when the number of records read hits the count. |
2340 |
|
my @retVal = (); |
2341 |
|
while (($count == 0 || $fetched < $count) && (my $row = $query->Fetch())) { |
2342 |
|
my @rowData = $row->Values($fields); |
2343 |
|
push @retVal, \@rowData; |
2344 |
|
$fetched++; |
2345 |
|
} |
2346 |
|
Trace("$fetched rows returned in GetAll.") if T(SQL => 4); |
2347 |
|
# Return the resulting list. |
2348 |
|
return @retVal; |
2349 |
|
} |
2350 |
|
|
2351 |
|
=head3 Exists |
2352 |
|
|
2353 |
|
C<< my $found = $sprout->Exists($entityName, $entityID); >> |
2354 |
|
|
2355 |
|
Return TRUE if an entity exists, else FALSE. |
2356 |
|
|
2357 |
|
=over 4 |
2358 |
|
|
2359 |
|
=item entityName |
2360 |
|
|
2361 |
|
Name of the entity type (e.g. C<Feature>) relevant to the existence check. |
2362 |
|
|
2363 |
|
=item entityID |
2364 |
|
|
2365 |
|
ID of the entity instance whose existence is to be checked. |
2366 |
|
|
2367 |
|
=item RETURN |
2368 |
|
|
2369 |
|
Returns TRUE if the entity instance exists, else FALSE. |
2370 |
|
|
2371 |
|
=back |
2372 |
|
|
2373 |
|
=cut |
2374 |
|
#: Return Type $; |
2375 |
|
sub Exists { |
2376 |
|
# Get the parameters. |
2377 |
|
my ($self, $entityName, $entityID) = @_; |
2378 |
|
# Check for the entity instance. |
2379 |
|
Trace("Checking existence of $entityName with ID=$entityID.") if T(4); |
2380 |
|
my $testInstance = $self->GetEntity($entityName, $entityID); |
2381 |
|
# Return an existence indicator. |
2382 |
|
my $retVal = ($testInstance ? 1 : 0); |
2383 |
|
return $retVal; |
2384 |
|
} |
2385 |
|
|
2386 |
|
=head3 EstimateRowSize |
2387 |
|
|
2388 |
|
C<< my $rowSize = $erdb->EstimateRowSize($relName); >> |
2389 |
|
|
2390 |
|
Estimate the row size of the specified relation. The estimated row size is computed by adding |
2391 |
|
up the average length for each data type. |
2392 |
|
|
2393 |
|
=over 4 |
2394 |
|
|
2395 |
|
=item relName |
2396 |
|
|
2397 |
|
Name of the relation whose estimated row size is desired. |
2398 |
|
|
2399 |
|
=item RETURN |
2400 |
|
|
2401 |
|
Returns an estimate of the row size for the specified relation. |
2402 |
|
|
2403 |
|
=back |
2404 |
|
|
2405 |
|
=cut |
2406 |
|
#: Return Type $; |
2407 |
|
sub EstimateRowSize { |
2408 |
|
# Get the parameters. |
2409 |
|
my ($self, $relName) = @_; |
2410 |
|
# Declare the return variable. |
2411 |
|
my $retVal = 0; |
2412 |
|
# Find the relation descriptor. |
2413 |
|
my $relation = $self->_FindRelation($relName); |
2414 |
|
# Get the list of fields. |
2415 |
|
for my $fieldData (@{$relation->{Fields}}) { |
2416 |
|
# Get the field type and add its length. |
2417 |
|
my $fieldLen = $TypeTable{$fieldData->{type}}->{avgLen}; |
2418 |
|
$retVal += $fieldLen; |
2419 |
|
} |
2420 |
|
# Return the result. |
2421 |
|
return $retVal; |
2422 |
|
} |
2423 |
|
|
2424 |
|
=head3 GetFieldTable |
2425 |
|
|
2426 |
|
C<< my $fieldHash = $self->GetFieldTable($objectnName); >> |
2427 |
|
|
2428 |
|
Get the field structure for a specified entity or relationship. |
2429 |
|
|
2430 |
|
=over 4 |
2431 |
|
|
2432 |
|
=item objectName |
2433 |
|
|
2434 |
|
Name of the desired entity or relationship. |
2435 |
|
|
2436 |
|
=item RETURN |
2437 |
|
|
2438 |
|
The table containing the field descriptors for the specified object. |
2439 |
|
|
2440 |
|
=back |
2441 |
|
|
2442 |
|
=cut |
2443 |
|
|
2444 |
|
sub GetFieldTable { |
2445 |
|
# Get the parameters. |
2446 |
|
my ($self, $objectName) = @_; |
2447 |
|
# Get the descriptor from the metadata. |
2448 |
|
my $objectData = $self->_GetStructure($objectName); |
2449 |
|
# Return the object's field table. |
2450 |
|
return $objectData->{Fields}; |
2451 |
|
} |
2452 |
|
|
2453 |
|
=head2 Data Mining Methods |
2454 |
|
|
2455 |
|
=head3 GetUsefulCrossValues |
2456 |
|
|
2457 |
|
C<< my @attrNames = $sprout->GetUsefulCrossValues($sourceEntity, $relationship); >> |
2458 |
|
|
2459 |
|
Return a list of the useful attributes that would be returned by a B<Cross> call |
2460 |
|
from an entity of the source entity type through the specified relationship. This |
2461 |
|
means it will return the fields of the target entity type and the intersection data |
2462 |
|
fields in the relationship. Only primary table fields are returned. In other words, |
2463 |
|
the field names returned will be for fields where there is always one and only one |
2464 |
|
value. |
2465 |
|
|
2466 |
|
=over 4 |
2467 |
|
|
2468 |
|
=item sourceEntity |
2469 |
|
|
2470 |
|
Name of the entity from which the relationship crossing will start. |
2471 |
|
|
2472 |
|
=item relationship |
2473 |
|
|
2474 |
|
Name of the relationship being crossed. |
2475 |
|
|
2476 |
|
=item RETURN |
2477 |
|
|
2478 |
|
Returns a list of field names in Sprout field format (I<objectName>C<(>I<fieldName>C<)>. |
2479 |
|
|
2480 |
|
=back |
2481 |
|
|
2482 |
|
=cut |
2483 |
|
#: Return Type @; |
2484 |
|
sub GetUsefulCrossValues { |
2485 |
|
# Get the parameters. |
2486 |
|
my ($self, $sourceEntity, $relationship) = @_; |
2487 |
|
# Declare the return variable. |
2488 |
|
my @retVal = (); |
2489 |
|
# Determine the target entity for the relationship. This is whichever entity is not |
2490 |
|
# the source entity. So, if the source entity is the FROM, we'll get the name of |
2491 |
|
# the TO, and vice versa. |
2492 |
|
my $relStructure = $self->_GetStructure($relationship); |
2493 |
|
my $targetEntityType = ($relStructure->{from} eq $sourceEntity ? "to" : "from"); |
2494 |
|
my $targetEntity = $relStructure->{$targetEntityType}; |
2495 |
|
# Get the field table for the entity. |
2496 |
|
my $entityFields = $self->GetFieldTable($targetEntity); |
2497 |
|
# The field table is a hash. The hash key is the field name. The hash value is a structure. |
2498 |
|
# For the entity fields, the key aspect of the target structure is that the {relation} value |
2499 |
|
# must match the entity name. |
2500 |
|
my @fieldList = map { "$targetEntity($_)" } grep { $entityFields->{$_}->{relation} eq $targetEntity } |
2501 |
|
keys %{$entityFields}; |
2502 |
|
# Push the fields found onto the return variable. |
2503 |
|
push @retVal, sort @fieldList; |
2504 |
|
# Get the field table for the relationship. |
2505 |
|
my $relationshipFields = $self->GetFieldTable($relationship); |
2506 |
|
# Here we have a different rule. We want all the fields other than "from-link" and "to-link". |
2507 |
|
# This may end up being an empty set. |
2508 |
|
my @fieldList2 = map { "$relationship($_)" } grep { $_ ne "from-link" && $_ ne "to-link" } |
2509 |
|
keys %{$relationshipFields}; |
2510 |
|
# Push these onto the return list. |
2511 |
|
push @retVal, sort @fieldList2; |
2512 |
|
# Return the result. |
2513 |
|
return @retVal; |
2514 |
|
} |
2515 |
|
|
2516 |
|
=head3 FindColumn |
2517 |
|
|
2518 |
|
C<< my $colIndex = ERDB::FindColumn($headerLine, $columnIdentifier); >> |
2519 |
|
|
2520 |
|
Return the location a desired column in a data mining header line. The data |
2521 |
|
mining header line is a tab-separated list of column names. The column |
2522 |
|
identifier is either the numerical index of a column or the actual column |
2523 |
|
name. |
2524 |
|
|
2525 |
|
=over 4 |
2526 |
|
|
2527 |
|
=item headerLine |
2528 |
|
|
2529 |
|
The header line from a data mining command, which consists of a tab-separated |
2530 |
|
list of column names. |
2531 |
|
|
2532 |
|
=item columnIdentifier |
2533 |
|
|
2534 |
|
Either the ordinal number of the desired column (1-based), or the name of the |
2535 |
|
desired column. |
2536 |
|
|
2537 |
|
=item RETURN |
2538 |
|
|
2539 |
|
Returns the array index (0-based) of the desired column. |
2540 |
|
|
2541 |
|
=back |
2542 |
|
|
2543 |
|
=cut |
2544 |
|
|
2545 |
|
sub FindColumn { |
2546 |
|
# Get the parameters. |
2547 |
|
my ($headerLine, $columnIdentifier) = @_; |
2548 |
|
# Declare the return variable. |
2549 |
|
my $retVal; |
2550 |
|
# Split the header line into column names. |
2551 |
|
my @headers = ParseColumns($headerLine); |
2552 |
|
# Determine whether we have a number or a name. |
2553 |
|
if ($columnIdentifier =~ /^\d+$/) { |
2554 |
|
# Here we have a number. Subtract 1 and validate the result. |
2555 |
|
$retVal = $columnIdentifier - 1; |
2556 |
|
if ($retVal < 0 || $retVal > $#headers) { |
2557 |
|
Confess("Invalid column identifer \"$columnIdentifier\": value out of range."); |
2558 |
|
} |
2559 |
|
} else { |
2560 |
|
# Here we have a name. We need to find it in the list. |
2561 |
|
for (my $i = 0; $i <= $#headers && ! defined($retVal); $i++) { |
2562 |
|
if ($headers[$i] eq $columnIdentifier) { |
2563 |
|
$retVal = $i; |
2564 |
} |
} |
2565 |
} |
} |
2566 |
|
if (! defined($retVal)) { |
2567 |
|
Confess("Invalid column identifier \"$columnIdentifier\": value not found."); |
2568 |
} |
} |
2569 |
} |
} |
2570 |
# Return the success indicator. |
# Return the result. |
2571 |
return $retVal; |
return $retVal; |
2572 |
} |
} |
2573 |
|
|
2574 |
=head3 LoadTable |
=head3 ParseColumns |
2575 |
|
|
2576 |
C<< my %results = $database->LoadTable($fileName, $relationName, $truncateFlag); >> |
C<< my @columns = ERDB::ParseColumns($line); >> |
2577 |
|
|
2578 |
Load data from a tab-delimited file into a specified table, optionally re-creating the table first. |
Convert the specified data line to a list of columns. |
2579 |
|
|
2580 |
=over 4 |
=over 4 |
2581 |
|
|
2582 |
=item fileName |
=item line |
2583 |
|
|
2584 |
Name of the file from which the table data should be loaded. |
A data mining input, consisting of a tab-separated list of columns terminated by a |
2585 |
|
new-line. |
2586 |
|
|
2587 |
=item relationName |
=item RETURN |
2588 |
|
|
2589 |
Name of the relation to be loaded. This is the same as the table name. |
Returns a list consisting of the column values. |
2590 |
|
|
2591 |
=item truncateFlag |
=back |
2592 |
|
|
2593 |
TRUE if the table should be dropped and re-created, else FALSE |
=cut |
2594 |
|
|
2595 |
|
sub ParseColumns { |
2596 |
|
# Get the parameters. |
2597 |
|
my ($line) = @_; |
2598 |
|
# Chop off the line-end. |
2599 |
|
chomp $line; |
2600 |
|
# Split it into a list. |
2601 |
|
my @retVal = split(/\t/, $line); |
2602 |
|
# Return the result. |
2603 |
|
return @retVal; |
2604 |
|
} |
2605 |
|
|
2606 |
|
=head2 Internal Utility Methods |
2607 |
|
|
2608 |
|
=head3 _RelationMap |
2609 |
|
|
2610 |
|
C<< my @relationMap = _RelationMap($mappedNameHashRef, $mappedNameListRef); >> |
2611 |
|
|
2612 |
|
Create the relation map for an SQL query. The relation map is used by B<DBObject> |
2613 |
|
to determine how to interpret the results of the query. |
2614 |
|
|
2615 |
|
=over 4 |
2616 |
|
|
2617 |
|
=item mappedNameHashRef |
2618 |
|
|
2619 |
|
Reference to a hash that maps modified object names to real object names. |
2620 |
|
|
2621 |
|
=item mappedNameListRef |
2622 |
|
|
2623 |
|
Reference to a list of modified object names in the order they appear in the |
2624 |
|
SELECT list. |
2625 |
|
|
2626 |
=item RETURN |
=item RETURN |
2627 |
|
|
2628 |
Returns a statistical object containing the number of records read and a list of the error messages. |
Returns a list of 2-tuples. Each tuple consists of an object name as used in the |
2629 |
|
query followed by the actual name of that object. This enables the B<DBObject> to |
2630 |
|
determine the order of the tables in the query and which object name belongs to each |
2631 |
|
mapped object name. Most of the time these two values are the same; however, if a |
2632 |
|
relation occurs twice in the query, the relation name in the field list and WHERE |
2633 |
|
clause will use a mapped name (generally the actual relation name with a numeric |
2634 |
|
suffix) that does not match the actual relation name. |
2635 |
|
|
2636 |
=back |
=back |
2637 |
|
|
2638 |
=cut |
=cut |
2639 |
sub LoadTable { |
|
2640 |
|
sub _RelationMap { |
2641 |
# Get the parameters. |
# Get the parameters. |
2642 |
my ($self, $fileName, $relationName, $truncateFlag) = @_; |
my ($mappedNameHashRef, $mappedNameListRef) = @_; |
2643 |
# Create the statistical return object. |
# Declare the return variable. |
2644 |
my $retVal = _GetLoadStats(); |
my @retVal = (); |
2645 |
# Trace the fact of the load. |
# Build the map. |
2646 |
Trace("Loading table $relationName from $fileName") if T(1); |
for my $mappedName (@{$mappedNameListRef}) { |
2647 |
# Get the database handle. |
push @retVal, [$mappedName, $mappedNameHashRef->{$mappedName}]; |
|
my $dbh = $self->{_dbh}; |
|
|
# Get the relation data. |
|
|
my $relation = $self->_FindRelation($relationName); |
|
|
# Check the truncation flag. |
|
|
if ($truncateFlag) { |
|
|
Trace("Creating table $relationName") if T(1); |
|
|
# Re-create the table without its index. |
|
|
$self->CreateTable($relationName, 0); |
|
|
} |
|
|
# Determine whether or not this is a primary relation. Primary relations have an extra |
|
|
# field indicating whether or not a given object is new or was loaded from the flat files. |
|
|
my $primary = $self->_IsPrimary($relationName); |
|
|
# Get the number of fields in this relation. |
|
|
my @fieldList = @{$relation->{Fields}}; |
|
|
my $fieldCount = @fieldList; |
|
|
# Record the number of expected fields. |
|
|
my $expectedFields = $fieldCount + ($primary ? 1 : 0); |
|
|
# Start a database transaction. |
|
|
$dbh->begin_tran; |
|
|
# Open the relation file. We need to create a cleaned-up copy before loading. |
|
|
open TABLEIN, '<', $fileName; |
|
|
my $tempName = "$fileName.tbl"; |
|
|
open TABLEOUT, '>', $tempName; |
|
|
# Loop through the file. |
|
|
while (<TABLEIN>) { |
|
|
# Chop off the new-line character. |
|
|
my $record = $_; |
|
|
chomp $record; |
|
|
# Only proceed if the record is non-blank. |
|
|
if ($record) { |
|
|
# Escape all the backslashes found in the line. |
|
|
$record =~ s/\\/\\\\/g; |
|
|
# Eliminate any trailing tabs. |
|
|
chop $record while substr($record, -1) eq "\t"; |
|
|
# If this is a primary relation, add a 0 for the new-record flag (indicating that |
|
|
# this record is not new, but part of the original load). |
|
|
if ($primary) { |
|
|
$record .= "\t0"; |
|
|
} |
|
|
# Write the record. |
|
|
print TABLEOUT "$record\n"; |
|
|
# Count the record read. |
|
|
my $count = $retVal->Add('records'); |
|
|
my $len = length $record; |
|
|
Trace("Record $count written with $len characters.") if T(4); |
|
|
} |
|
|
} |
|
|
# Close the files. |
|
|
close TABLEIN; |
|
|
close TABLEOUT; |
|
|
Trace("Temporary file $tempName created.") if T(4); |
|
|
# Load the table. |
|
|
my $rv; |
|
|
eval { |
|
|
$rv = $dbh->load_table(file => $tempName, tbl => $relationName); |
|
|
}; |
|
|
if (!defined $rv) { |
|
|
$retVal->AddMessage($@) if ($@); |
|
|
$retVal->AddMessage("Table load failed for $relationName using $tempName."); |
|
|
Trace("Table load failed for $relationName.") if T(1); |
|
|
} else { |
|
|
# Here we successfully loaded the table. Trace the number of records loaded. |
|
|
Trace("$retVal->{records} records read for $relationName.") if T(1); |
|
|
# If we're rebuilding, we need to create the table indexes. |
|
|
if ($truncateFlag) { |
|
|
eval { |
|
|
$self->CreateIndex($relationName); |
|
|
}; |
|
|
if ($@) { |
|
|
$retVal->AddMessage($@); |
|
|
} |
|
|
} |
|
2648 |
} |
} |
2649 |
# Commit the database changes. |
# Return it. |
2650 |
$dbh->commit_tran; |
return @retVal; |
|
# Delete the temporary file. |
|
|
unlink $tempName; |
|
|
# Return the statistics. |
|
|
return $retVal; |
|
2651 |
} |
} |
2652 |
|
|
|
=head3 GenerateEntity |
|
2653 |
|
|
2654 |
C<< my $fieldHash = $database->GenerateEntity($id, $type, \%values); >> |
=head3 _SetupSQL |
2655 |
|
|
2656 |
Generate the data for a new entity instance. This method creates a field hash suitable for |
Process a list of object names and a filter clause so that they can be used to |
2657 |
passing as a parameter to L</InsertObject>. The ID is specified by the callr, but the rest |
build an SQL statement. This method takes in a reference to a list of object names |
2658 |
of the fields are generated using information in the database schema. |
and a filter clause. It will return a corrected filter clause, a list of mapped |
2659 |
|
names and the mapped name hash. |
2660 |
|
|
2661 |
Each data type has a default algorithm for generating random test data. This can be overridden |
This is an instance method. |
|
by including a B<DataGen> element in the field. If this happens, the content of the element is |
|
|
executed as a PERL program in the context of this module. The element may make use of a C<$this> |
|
|
variable which contains the field hash as it has been built up to the current point. If any |
|
|
fields are dependent on other fields, the C<pass> attribute can be used to control the order |
|
|
in which the fields are generated. A field with a high data pass number will be generated after |
|
|
a field with a lower one. If any external values are needed, they should be passed in via the |
|
|
optional third parameter, which will be available to the data generation script under the name |
|
|
C<$value>. Several useful utility methods are provided for generating random values, including |
|
|
L</IntGen>, L</StringGen>, L</FloatGen>, and L</DateGen>. Note that dates are stored and generated |
|
|
in the form of a timestamp number rather than a string. |
|
2662 |
|
|
2663 |
=over 4 |
=over 4 |
2664 |
|
|
2665 |
=item id |
=item objectNames |
2666 |
|
|
2667 |
ID to assign to the new entity. |
Reference to a list of the object names to be included in the query. |
2668 |
|
|
2669 |
=item type |
=item filterClause |
2670 |
|
|
2671 |
Type name for the new entity. |
A string containing the WHERE clause for the query (without the C<WHERE>) and also |
2672 |
|
optionally the C<ORDER BY> and C<LIMIT> clauses. |
2673 |
|
|
2674 |
=item values |
=item matchClause |
2675 |
|
|
2676 |
Hash containing additional values that might be needed by the data generation methods (optional). |
An optional full-text search clause. If specified, it will be inserted at the |
2677 |
|
front of the WHERE clause. It should already be SQL-formatted; that is, the |
2678 |
|
field names should be in the form I<table>C<.>I<fieldName>. |
2679 |
|
|
2680 |
|
=item RETURN |
2681 |
|
|
2682 |
|
Returns a three-element list. The first element is the SQL statement suffix, beginning |
2683 |
|
with the FROM clause. The second element is a reference to a list of the names to be |
2684 |
|
used in retrieving the fields. The third element is a hash mapping the names to the |
2685 |
|
objects they represent. |
2686 |
|
|
2687 |
=back |
=back |
2688 |
|
|
2689 |
=cut |
=cut |
2690 |
|
|
2691 |
sub GenerateEntity { |
sub _SetupSQL { |
2692 |
# Get the parameters. |
my ($self, $objectNames, $filterClause, $matchClause) = @_; |
2693 |
my ($self, $id, $type, $values) = @_; |
# Adjust the list of object names to account for multiple occurrences of the |
2694 |
# Create the return hash. |
# same object. We start with a hash table keyed on object name that will |
2695 |
my $this = { id => $id }; |
# return the object suffix. The first time an object is encountered it will |
2696 |
# Get the metadata structure. |
# not be found in the hash. The next time the hash will map the object name |
2697 |
my $metadata = $self->{_metaData}; |
# to 2, then 3, and so forth. |
2698 |
# Get this entity's list of fields. |
my %objectHash = (); |
2699 |
if (!exists $metadata->{Entities}->{$type}) { |
# This list will contain the object names as they are to appear in the |
2700 |
Confess("Unrecognized entity type $type in GenerateEntity."); |
# FROM list. |
2701 |
|
my @fromList = (); |
2702 |
|
# This list contains the suffixed object name for each object. It is exactly |
2703 |
|
# parallel to the list in the $objectNames parameter. |
2704 |
|
my @mappedNameList = (); |
2705 |
|
# Finally, this hash translates from a mapped name to its original object name. |
2706 |
|
my %mappedNameHash = (); |
2707 |
|
# Now we create the lists. Note that for every single name we push something into |
2708 |
|
# @fromList and @mappedNameList. This insures that those two arrays are exactly |
2709 |
|
# parallel to $objectNames. |
2710 |
|
for my $objectName (@{$objectNames}) { |
2711 |
|
# Get the next suffix for this object. |
2712 |
|
my $suffix = $objectHash{$objectName}; |
2713 |
|
if (! $suffix) { |
2714 |
|
# Here we are seeing the object for the first time. The object name |
2715 |
|
# is used as is. |
2716 |
|
push @mappedNameList, $objectName; |
2717 |
|
push @fromList, $objectName; |
2718 |
|
$mappedNameHash{$objectName} = $objectName; |
2719 |
|
# Denote the next suffix will be 2. |
2720 |
|
$objectHash{$objectName} = 2; |
2721 |
} else { |
} else { |
2722 |
my $entity = $metadata->{Entities}->{$type}; |
# Here we've seen the object before. We construct a new name using |
2723 |
my $fields = $entity->{Fields}; |
# the suffix from the hash and update the hash. |
2724 |
# Generate data from the fields. |
my $mappedName = "$objectName$suffix"; |
2725 |
_GenerateFields($this, $fields, $type, $values); |
$objectHash{$objectName} = $suffix + 1; |
2726 |
|
# The FROM list has the object name followed by the mapped name. This |
2727 |
|
# tells SQL it's still the same table, but we're using a different name |
2728 |
|
# for it to avoid confusion. |
2729 |
|
push @fromList, "$objectName $mappedName"; |
2730 |
|
# The mapped-name list contains the real mapped name. |
2731 |
|
push @mappedNameList, $mappedName; |
2732 |
|
# Finally, enable us to get back from the mapped name to the object name. |
2733 |
|
$mappedNameHash{$mappedName} = $objectName; |
2734 |
} |
} |
2735 |
# Return the hash created. |
} |
2736 |
return $this; |
# Begin the SELECT suffix. It starts with |
2737 |
|
# |
2738 |
|
# FROM name1, name2, ... nameN |
2739 |
|
# |
2740 |
|
my $suffix = "FROM " . join(', ', @fromList); |
2741 |
|
# Now for the WHERE. First, we need a place for the filter string. |
2742 |
|
my $filterString = ""; |
2743 |
|
# We will also keep a list of conditions to add to the WHERE clause in order to link |
2744 |
|
# entities and relationships as well as primary relations to secondary ones. |
2745 |
|
my @joinWhere = (); |
2746 |
|
# Check for a filter clause. |
2747 |
|
if ($filterClause) { |
2748 |
|
# Here we have one, so we convert its field names and add it to the query. First, |
2749 |
|
# We create a copy of the filter string we can work with. |
2750 |
|
$filterString = $filterClause; |
2751 |
|
# Next, we sort the object names by length. This helps protect us from finding |
2752 |
|
# object names inside other object names when we're doing our search and replace. |
2753 |
|
my @sortedNames = sort { length($b) - length($a) } @mappedNameList; |
2754 |
|
# The final preparatory step is to create a hash table of relation names. The |
2755 |
|
# table begins with the relation names already in the SELECT command. We may |
2756 |
|
# need to add relations later if there is filtering on a field in a secondary |
2757 |
|
# relation. The secondary relations are the ones that contain multiply- |
2758 |
|
# occurring or optional fields. |
2759 |
|
my %fromNames = map { $_ => 1 } @sortedNames; |
2760 |
|
# We are ready to begin. We loop through the object names, replacing each |
2761 |
|
# object name's field references by the corresponding SQL field reference. |
2762 |
|
# Along the way, if we find a secondary relation, we will need to add it |
2763 |
|
# to the FROM clause. |
2764 |
|
for my $mappedName (@sortedNames) { |
2765 |
|
# Get the length of the object name plus 2. This is the value we add to the |
2766 |
|
# size of the field name to determine the size of the field reference as a |
2767 |
|
# whole. |
2768 |
|
my $nameLength = 2 + length $mappedName; |
2769 |
|
# Get the real object name for this mapped name. |
2770 |
|
my $objectName = $mappedNameHash{$mappedName}; |
2771 |
|
Trace("Processing $mappedName for object $objectName.") if T(4); |
2772 |
|
# Get the object's field list. |
2773 |
|
my $fieldList = $self->GetFieldTable($objectName); |
2774 |
|
# Find the field references for this object. |
2775 |
|
while ($filterString =~ m/$mappedName\(([^)]*)\)/g) { |
2776 |
|
# At this point, $1 contains the field name, and the current position |
2777 |
|
# is set immediately after the final parenthesis. We pull out the name of |
2778 |
|
# the field and the position and length of the field reference as a whole. |
2779 |
|
my $fieldName = $1; |
2780 |
|
my $len = $nameLength + length $fieldName; |
2781 |
|
my $pos = pos($filterString) - $len; |
2782 |
|
# Insure the field exists. |
2783 |
|
if (!exists $fieldList->{$fieldName}) { |
2784 |
|
Confess("Field $fieldName not found for object $objectName."); |
2785 |
|
} else { |
2786 |
|
Trace("Processing $fieldName at position $pos.") if T(4); |
2787 |
|
# Get the field's relation. |
2788 |
|
my $relationName = $fieldList->{$fieldName}->{relation}; |
2789 |
|
# Now we have a secondary relation. We need to insure it matches the |
2790 |
|
# mapped name of the primary relation. First we peel off the suffix |
2791 |
|
# from the mapped name. |
2792 |
|
my $mappingSuffix = substr $mappedName, length($objectName); |
2793 |
|
# Put the mapping suffix onto the relation name to get the |
2794 |
|
# mapped relation name. |
2795 |
|
my $mappedRelationName = "$relationName$mappingSuffix"; |
2796 |
|
# Insure the relation is in the FROM clause. |
2797 |
|
if (!exists $fromNames{$mappedRelationName}) { |
2798 |
|
# Add the relation to the FROM clause. |
2799 |
|
if ($mappedRelationName eq $relationName) { |
2800 |
|
# The name is un-mapped, so we add it without |
2801 |
|
# any frills. |
2802 |
|
$suffix .= ", $relationName"; |
2803 |
|
push @joinWhere, "$objectName.id = $relationName.id"; |
2804 |
|
} else { |
2805 |
|
# Here we have a mapping situation. |
2806 |
|
$suffix .= ", $relationName $mappedRelationName"; |
2807 |
|
push @joinWhere, "$mappedRelationName.id = $mappedName.id"; |
2808 |
|
} |
2809 |
|
# Denote we have this relation available for future fields. |
2810 |
|
$fromNames{$mappedRelationName} = 1; |
2811 |
|
} |
2812 |
|
# Form an SQL field reference from the relation name and the field name. |
2813 |
|
my $sqlReference = "$mappedRelationName." . _FixName($fieldName); |
2814 |
|
# Put it into the filter string in place of the old value. |
2815 |
|
substr($filterString, $pos, $len) = $sqlReference; |
2816 |
|
# Reposition the search. |
2817 |
|
pos $filterString = $pos + length $sqlReference; |
2818 |
|
} |
2819 |
|
} |
2820 |
|
} |
2821 |
|
} |
2822 |
|
# The next step is to join the objects together. We only need to do this if there |
2823 |
|
# is more than one object in the object list. We start with the first object and |
2824 |
|
# run through the objects after it. Note also that we make a safety copy of the |
2825 |
|
# list before running through it, because we shift off the first object before |
2826 |
|
# processing the rest. |
2827 |
|
my @mappedObjectList = @mappedNameList; |
2828 |
|
my $lastMappedObject = shift @mappedObjectList; |
2829 |
|
# Get the join table. |
2830 |
|
my $joinTable = $self->{_metaData}->{Joins}; |
2831 |
|
# Loop through the object list. |
2832 |
|
for my $thisMappedObject (@mappedObjectList) { |
2833 |
|
# Look for a join using the real object names. |
2834 |
|
my $lastObject = $mappedNameHash{$lastMappedObject}; |
2835 |
|
my $thisObject = $mappedNameHash{$thisMappedObject}; |
2836 |
|
my $joinKey = "$lastObject/$thisObject"; |
2837 |
|
if (!exists $joinTable->{$joinKey}) { |
2838 |
|
# Here there's no join, so we throw an error. |
2839 |
|
Confess("No join exists to connect from $lastMappedObject to $thisMappedObject."); |
2840 |
|
} else { |
2841 |
|
# Get the join clause. |
2842 |
|
my $unMappedJoin = $joinTable->{$joinKey}; |
2843 |
|
# Fix the names. |
2844 |
|
$unMappedJoin =~ s/$lastObject/$lastMappedObject/; |
2845 |
|
$unMappedJoin =~ s/$thisObject/$thisMappedObject/; |
2846 |
|
push @joinWhere, $unMappedJoin; |
2847 |
|
# Save this object as the last object for the next iteration. |
2848 |
|
$lastMappedObject = $thisMappedObject; |
2849 |
|
} |
2850 |
|
} |
2851 |
|
# Now we need to handle the whole ORDER BY / LIMIT thing. The important part |
2852 |
|
# here is we want the filter clause to be empty if there's no WHERE filter. |
2853 |
|
# We'll put the ORDER BY / LIMIT clauses in the following variable. |
2854 |
|
my $orderClause = ""; |
2855 |
|
# This is only necessary if we have a filter string in which the ORDER BY |
2856 |
|
# and LIMIT clauses can live. |
2857 |
|
if ($filterString) { |
2858 |
|
# Locate the ORDER BY or LIMIT verbs (if any). We use a non-greedy |
2859 |
|
# operator so that we find the first occurrence of either verb. |
2860 |
|
if ($filterString =~ m/^(.*?)\s*(ORDER BY|LIMIT)/g) { |
2861 |
|
# Here we have an ORDER BY or LIMIT verb. Split it off of the filter string. |
2862 |
|
my $pos = pos $filterString; |
2863 |
|
$orderClause = $2 . substr($filterString, $pos); |
2864 |
|
$filterString = $1; |
2865 |
|
} |
2866 |
|
} |
2867 |
|
# All the things that are supposed to be in the WHERE clause of the |
2868 |
|
# SELECT command need to be put into @joinWhere so we can string them |
2869 |
|
# together. We begin with the match clause. This is important, |
2870 |
|
# because the match clause's parameter mark must precede any parameter |
2871 |
|
# marks in the filter string. |
2872 |
|
if ($matchClause) { |
2873 |
|
push @joinWhere, $matchClause; |
2874 |
|
} |
2875 |
|
# Add the filter string. We put it in parentheses to avoid operator |
2876 |
|
# precedence problems with the match clause or the joins. |
2877 |
|
if ($filterString) { |
2878 |
|
Trace("Filter string is \"$filterString\".") if T(4); |
2879 |
|
push @joinWhere, "($filterString)"; |
2880 |
|
} |
2881 |
|
# String it all together into a big filter clause. |
2882 |
|
if (@joinWhere) { |
2883 |
|
$suffix .= " WHERE " . join(' AND ', @joinWhere); |
2884 |
|
} |
2885 |
|
# Add the sort or limit clause (if any). |
2886 |
|
if ($orderClause) { |
2887 |
|
$suffix .= " $orderClause"; |
2888 |
|
} |
2889 |
|
# Return the suffix, the mapped name list, and the mapped name hash. |
2890 |
|
return ($suffix, \@mappedNameList, \%mappedNameHash); |
2891 |
} |
} |
2892 |
|
|
2893 |
|
=head3 _GetStatementHandle |
2894 |
|
|
2895 |
=head2 Internal Utility Methods |
This method will prepare and execute an SQL query, returning the statement handle. |
2896 |
|
The main reason for doing this here is so that everybody who does SQL queries gets |
2897 |
|
the benefit of tracing. |
2898 |
|
|
2899 |
|
This is an instance method. |
2900 |
|
|
2901 |
|
=over 4 |
2902 |
|
|
2903 |
|
=item command |
2904 |
|
|
2905 |
|
Command to prepare and execute. |
2906 |
|
|
2907 |
|
=item params |
2908 |
|
|
2909 |
|
Reference to a list of the values to be substituted in for the parameter marks. |
2910 |
|
|
2911 |
|
=item RETURN |
2912 |
|
|
2913 |
|
Returns a prepared and executed statement handle from which the caller can extract |
2914 |
|
results. |
2915 |
|
|
2916 |
|
=back |
2917 |
|
|
2918 |
=head3 GetLoadStats |
=cut |
2919 |
|
|
2920 |
|
sub _GetStatementHandle { |
2921 |
|
# Get the parameters. |
2922 |
|
my ($self, $command, $params) = @_; |
2923 |
|
# Trace the query. |
2924 |
|
Trace("SQL query: $command") if T(SQL => 3); |
2925 |
|
Trace("PARMS: '" . (join "', '", @{$params}) . "'") if (T(SQL => 4) && (@{$params} > 0)); |
2926 |
|
# Get the database handle. |
2927 |
|
my $dbh = $self->{_dbh}; |
2928 |
|
# Prepare the command. |
2929 |
|
my $sth = $dbh->prepare_command($command); |
2930 |
|
# Execute it with the parameters bound in. |
2931 |
|
$sth->execute(@{$params}) || Confess("SELECT error" . $sth->errstr()); |
2932 |
|
# Return the statement handle. |
2933 |
|
return $sth; |
2934 |
|
} |
2935 |
|
|
2936 |
|
=head3 _GetLoadStats |
2937 |
|
|
2938 |
Return a blank statistics object for use by the load methods. |
Return a blank statistics object for use by the load methods. |
2939 |
|
|
2942 |
=cut |
=cut |
2943 |
|
|
2944 |
sub _GetLoadStats { |
sub _GetLoadStats { |
2945 |
return Stats->new('records'); |
return Stats->new(); |
2946 |
} |
} |
2947 |
|
|
2948 |
=head3 GenerateFields |
=head3 _GenerateFields |
2949 |
|
|
2950 |
Generate field values from a field structure and store in a specified table. The field names |
Generate field values from a field structure and store in a specified table. The field names |
2951 |
are first sorted by pass count, certain pre-defined fields are removed from the list, and |
are first sorted by pass count, certain pre-defined fields are removed from the list, and |
3019 |
} |
} |
3020 |
} |
} |
3021 |
|
|
3022 |
=head3 DumpRelation |
=head3 _DumpRelation |
3023 |
|
|
3024 |
Dump the specified relation's to the specified output file in tab-delimited format. |
Dump the specified relation's to the specified output file in tab-delimited format. |
3025 |
|
|
3069 |
close DTXOUT; |
close DTXOUT; |
3070 |
} |
} |
3071 |
|
|
3072 |
=head3 GetStructure |
=head3 _GetStructure |
3073 |
|
|
3074 |
Get the data structure for a specified entity or relationship. |
Get the data structure for a specified entity or relationship. |
3075 |
|
|
3108 |
return $retVal; |
return $retVal; |
3109 |
} |
} |
3110 |
|
|
3111 |
=head3 GetRelationTable |
|
3112 |
|
|
3113 |
|
=head3 _GetRelationTable |
3114 |
|
|
3115 |
Get the list of relations for a specified entity or relationship. |
Get the list of relations for a specified entity or relationship. |
3116 |
|
|
3139 |
return $objectData->{Relations}; |
return $objectData->{Relations}; |
3140 |
} |
} |
3141 |
|
|
3142 |
=head3 GetFieldTable |
=head3 _ValidateFieldNames |
|
|
|
|
Get the field structure for a specified entity or relationship. |
|
|
|
|
|
This is an instance method. |
|
|
|
|
|
=over 4 |
|
|
|
|
|
=item objectName |
|
|
|
|
|
Name of the desired entity or relationship. |
|
|
|
|
|
=item RETURN |
|
|
|
|
|
The table containing the field descriptors for the specified object. |
|
|
|
|
|
=back |
|
|
|
|
|
=cut |
|
|
|
|
|
sub _GetFieldTable { |
|
|
# Get the parameters. |
|
|
my ($self, $objectName) = @_; |
|
|
# Get the descriptor from the metadata. |
|
|
my $objectData = $self->_GetStructure($objectName); |
|
|
# Return the object's field table. |
|
|
return $objectData->{Fields}; |
|
|
} |
|
|
|
|
|
=head3 ValidateFieldNames |
|
3143 |
|
|
3144 |
Determine whether or not the field names are valid. A description of the problems with the names |
Determine whether or not the field names are valid. A description of the problems with the names |
3145 |
will be written to the standard error output. If there is an error, this method will abort. This is |
will be written to the standard error output. If there is an error, this method will abort. This is |
3194 |
} |
} |
3195 |
} |
} |
3196 |
|
|
3197 |
=head3 LoadRelation |
=head3 _LoadRelation |
3198 |
|
|
3199 |
Load a relation from the data in a tab-delimited disk file. The load will only take place if a disk |
Load a relation from the data in a tab-delimited disk file. The load will only take place if a disk |
3200 |
file with the same name as the relation exists in the specified directory. |
file with the same name as the relation exists in the specified directory. |
3254 |
return $retVal; |
return $retVal; |
3255 |
} |
} |
3256 |
|
|
3257 |
=head3 LoadMetaData |
=head3 _LoadMetaData |
3258 |
|
|
3259 |
This method loads the data describing this database from an XML file into a metadata structure. |
This method loads the data describing this database from an XML file into a metadata structure. |
3260 |
The resulting structure is a set of nested hash tables containing all the information needed to |
The resulting structure is a set of nested hash tables containing all the information needed to |
3279 |
sub _LoadMetaData { |
sub _LoadMetaData { |
3280 |
# Get the parameters. |
# Get the parameters. |
3281 |
my ($filename) = @_; |
my ($filename) = @_; |
3282 |
|
Trace("Reading Sprout DBD from $filename.") if T(2); |
3283 |
# Slurp the XML file into a variable. Extensive use of options is used to insure we |
# Slurp the XML file into a variable. Extensive use of options is used to insure we |
3284 |
# get the exact structure we want. |
# get the exact structure we want. |
3285 |
my $metadata = XML::Simple::XMLin($filename, |
my $metadata = XML::Simple::XMLin($filename, |
3304 |
my %masterRelationTable = (); |
my %masterRelationTable = (); |
3305 |
# Loop through the entities. |
# Loop through the entities. |
3306 |
my $entityList = $metadata->{Entities}; |
my $entityList = $metadata->{Entities}; |
3307 |
while (my ($entityName, $entityStructure) = each %{$entityList}) { |
for my $entityName (keys %{$entityList}) { |
3308 |
|
my $entityStructure = $entityList->{$entityName}; |
3309 |
# |
# |
3310 |
# The first step is to run creating all the entity's default values. For C<Field> elements, |
# The first step is to create all the entity's default values. For C<Field> elements, |
3311 |
# the relation name must be added where it is not specified. For relationships, |
# the relation name must be added where it is not specified. For relationships, |
3312 |
# the B<from-link> and B<to-link> fields must be inserted, and for entities an B<id> |
# the B<from-link> and B<to-link> fields must be inserted, and for entities an B<id> |
3313 |
# field must be added to each relation. Finally, each field will have a C<PrettySort> attribute |
# field must be added to each relation. Finally, each field will have a C<PrettySort> attribute |
3353 |
# to a list of fields. First, we need the ID field itself. |
# to a list of fields. First, we need the ID field itself. |
3354 |
my $idField = $fieldList->{id}; |
my $idField = $fieldList->{id}; |
3355 |
# Loop through the relations. |
# Loop through the relations. |
3356 |
while (my ($relationName, $relation) = each %{$relationTable}) { |
for my $relationName (keys %{$relationTable}) { |
3357 |
|
my $relation = $relationTable->{$relationName}; |
3358 |
# Get the relation's field list. |
# Get the relation's field list. |
3359 |
my $relationFieldList = $relation->{Fields}; |
my $relationFieldList = $relation->{Fields}; |
3360 |
# Add the ID field to it. If the field's already there, it will not make any |
# Add the ID field to it. If the field's already there, it will not make any |
3404 |
# The next step is to insure that each relation has at least one index that begins with the ID field. |
# The next step is to insure that each relation has at least one index that begins with the ID field. |
3405 |
# After that, we convert each relation's index list to an index table. We first need to loop through |
# After that, we convert each relation's index list to an index table. We first need to loop through |
3406 |
# the relations. |
# the relations. |
3407 |
while (my ($relationName, $relation) = each %{$relationTable}) { |
for my $relationName (keys %{$relationTable}) { |
3408 |
|
my $relation = $relationTable->{$relationName}; |
3409 |
# Get the relation's index list. |
# Get the relation's index list. |
3410 |
my $indexList = $relation->{Indexes}; |
my $indexList = $relation->{Indexes}; |
3411 |
# Insure this relation has an ID index. |
# Insure this relation has an ID index. |
3436 |
# Loop through the relationships. Relationships actually turn out to be much simpler than entities. |
# Loop through the relationships. Relationships actually turn out to be much simpler than entities. |
3437 |
# For one thing, there is only a single constituent relation. |
# For one thing, there is only a single constituent relation. |
3438 |
my $relationshipList = $metadata->{Relationships}; |
my $relationshipList = $metadata->{Relationships}; |
3439 |
while (my ($relationshipName, $relationshipStructure) = each %{$relationshipList}) { |
for my $relationshipName (keys %{$relationshipList}) { |
3440 |
|
my $relationshipStructure = $relationshipList->{$relationshipName}; |
3441 |
# Fix up this relationship. |
# Fix up this relationship. |
3442 |
_FixupFields($relationshipStructure, $relationshipName, 2, 3); |
_FixupFields($relationshipStructure, $relationshipName, 2, 3); |
3443 |
# Format a description for the FROM field. |
# Format a description for the FROM field. |
3486 |
my @fromList = (); |
my @fromList = (); |
3487 |
my @toList = (); |
my @toList = (); |
3488 |
my @bothList = (); |
my @bothList = (); |
3489 |
while (my ($relationshipName, $relationship) = each %{$relationshipList}) { |
Trace("Join table build for $entityName.") if T(metadata => 4); |
3490 |
|
for my $relationshipName (keys %{$relationshipList}) { |
3491 |
|
my $relationship = $relationshipList->{$relationshipName}; |
3492 |
# Determine if this relationship has our entity in one of its link fields. |
# Determine if this relationship has our entity in one of its link fields. |
3493 |
if ($relationship->{from} eq $entityName) { |
my $fromEntity = $relationship->{from}; |
3494 |
if ($relationship->{to} eq $entityName) { |
my $toEntity = $relationship->{to}; |
3495 |
|
Trace("Join check for relationship $relationshipName from $fromEntity to $toEntity.") if T(Joins => 4); |
3496 |
|
if ($fromEntity eq $entityName) { |
3497 |
|
if ($toEntity eq $entityName) { |
3498 |
# Here the relationship is recursive. |
# Here the relationship is recursive. |
3499 |
push @bothList, $relationshipName; |
push @bothList, $relationshipName; |
3500 |
|
Trace("Relationship $relationshipName put in both-list.") if T(metadata => 4); |
3501 |
} else { |
} else { |
3502 |
# Here the relationship comes from the entity. |
# Here the relationship comes from the entity. |
3503 |
push @fromList, $relationshipName; |
push @fromList, $relationshipName; |
3504 |
|
Trace("Relationship $relationshipName put in from-list.") if T(metadata => 4); |
3505 |
} |
} |
3506 |
} elsif ($relationship->{to} eq $entityName) { |
} elsif ($toEntity eq $entityName) { |
3507 |
# Here the relationship goes to the entity. |
# Here the relationship goes to the entity. |
3508 |
push @toList, $relationshipName; |
push @toList, $relationshipName; |
3509 |
|
Trace("Relationship $relationshipName put in to-list.") if T(metadata => 4); |
3510 |
} |
} |
3511 |
} |
} |
3512 |
# Create the nonrecursive joins. Note that we build two hashes for running |
# Create the nonrecursive joins. Note that we build two hashes for running |
3515 |
# hash table at the same time. |
# hash table at the same time. |
3516 |
my %directRelationships = ( from => \@fromList, to => \@toList ); |
my %directRelationships = ( from => \@fromList, to => \@toList ); |
3517 |
my %otherRelationships = ( from => \@fromList, to => \@toList ); |
my %otherRelationships = ( from => \@fromList, to => \@toList ); |
3518 |
while (my ($linkType, $relationships) = each %directRelationships) { |
for my $linkType (keys %directRelationships) { |
3519 |
|
my $relationships = $directRelationships{$linkType}; |
3520 |
# Loop through all the relationships. |
# Loop through all the relationships. |
3521 |
for my $relationshipName (@{$relationships}) { |
for my $relationshipName (@{$relationships}) { |
3522 |
# Create joins between the entity and this relationship. |
# Create joins between the entity and this relationship. |
3523 |
my $linkField = "$relationshipName.${linkType}_link"; |
my $linkField = "$relationshipName.${linkType}_link"; |
3524 |
my $joinClause = "$entityName.id = $linkField"; |
my $joinClause = "$entityName.id = $linkField"; |
3525 |
|
Trace("Entity join clause is $joinClause for $entityName and $relationshipName.") if T(metadata => 4); |
3526 |
$joinTable{"$entityName/$relationshipName"} = $joinClause; |
$joinTable{"$entityName/$relationshipName"} = $joinClause; |
3527 |
$joinTable{"$relationshipName/$entityName"} = $joinClause; |
$joinTable{"$relationshipName/$entityName"} = $joinClause; |
3528 |
# Create joins between this relationship and the other relationships. |
# Create joins between this relationship and the other relationships. |
3529 |
while (my ($otherType, $otherships) = each %otherRelationships) { |
for my $otherType (keys %otherRelationships) { |
3530 |
|
my $otherships = $otherRelationships{$otherType}; |
3531 |
for my $otherName (@{$otherships}) { |
for my $otherName (@{$otherships}) { |
3532 |
# Get the key for this join. |
# Get the key for this join. |
3533 |
my $joinKey = "$otherName/$relationshipName"; |
my $joinKey = "$otherName/$relationshipName"; |
3537 |
# path is ambiguous. We delete the join from the join |
# path is ambiguous. We delete the join from the join |
3538 |
# table to prevent it from being used. |
# table to prevent it from being used. |
3539 |
delete $joinTable{$joinKey}; |
delete $joinTable{$joinKey}; |
3540 |
|
Trace("Deleting ambiguous join $joinKey.") if T(4); |
3541 |
} elsif ($otherName ne $relationshipName) { |
} elsif ($otherName ne $relationshipName) { |
3542 |
# Here we have a valid join. Note that joins between a |
# Here we have a valid join. Note that joins between a |
3543 |
# relationship and itself are prohibited. |
# relationship and itself are prohibited. |
3544 |
$joinTable{$joinKey} = "$otherName.${otherType}_link = $linkField"; |
my $relJoinClause = "$otherName.${otherType}_link = $linkField"; |
3545 |
|
$joinTable{$joinKey} = $relJoinClause; |
3546 |
|
Trace("Relationship join clause is $relJoinClause for $joinKey.") if T(metadata => 4); |
3547 |
} |
} |
3548 |
} |
} |
3549 |
} |
} |
3552 |
# relationship can only be ambiguous with another recursive relationship, |
# relationship can only be ambiguous with another recursive relationship, |
3553 |
# and the incoming relationship from the outer loop is never recursive. |
# and the incoming relationship from the outer loop is never recursive. |
3554 |
for my $otherName (@bothList) { |
for my $otherName (@bothList) { |
3555 |
|
Trace("Setting up relationship joins to recursive relationship $otherName with $relationshipName.") if T(metadata => 4); |
3556 |
# Join from the left. |
# Join from the left. |
3557 |
$joinTable{"$relationshipName/$otherName"} = |
$joinTable{"$relationshipName/$otherName"} = |
3558 |
"$linkField = $otherName.from_link"; |
"$linkField = $otherName.from_link"; |
3567 |
# rise to situations where we can't create the path we want; however, it is always |
# rise to situations where we can't create the path we want; however, it is always |
3568 |
# possible to get the same effect using multiple queries. |
# possible to get the same effect using multiple queries. |
3569 |
for my $relationshipName (@bothList) { |
for my $relationshipName (@bothList) { |
3570 |
|
Trace("Setting up entity joins to recursive relationship $relationshipName with $entityName.") if T(metadata => 4); |
3571 |
# Join to the entity from each direction. |
# Join to the entity from each direction. |
3572 |
$joinTable{"$entityName/$relationshipName"} = |
$joinTable{"$entityName/$relationshipName"} = |
3573 |
"$entityName.id = $relationshipName.from_link"; |
"$entityName.id = $relationshipName.from_link"; |
3581 |
return $metadata; |
return $metadata; |
3582 |
} |
} |
3583 |
|
|
3584 |
=head3 CreateRelationshipIndex |
=head3 _CreateRelationshipIndex |
3585 |
|
|
3586 |
Create an index for a relationship's relation. |
Create an index for a relationship's relation. |
3587 |
|
|
3618 |
# index descriptor does not exist, it will be created automatically so we can add |
# index descriptor does not exist, it will be created automatically so we can add |
3619 |
# the field to it. |
# the field to it. |
3620 |
unshift @{$newIndex->{IndexFields}}, $firstField; |
unshift @{$newIndex->{IndexFields}}, $firstField; |
3621 |
|
# If this is a one-to-many relationship, the "To" index is unique. |
3622 |
|
if ($relationshipStructure->{arity} eq "1M" && $indexKey eq "To") { |
3623 |
|
$newIndex->{Unique} = 'true'; |
3624 |
|
} |
3625 |
# Add the index to the relation. |
# Add the index to the relation. |
3626 |
_AddIndex("idx$relationshipName$indexKey", $relationStructure, $newIndex); |
_AddIndex("idx$relationshipName$indexKey", $relationStructure, $newIndex); |
3627 |
} |
} |
3628 |
|
|
3629 |
=head3 AddIndex |
=head3 _AddIndex |
3630 |
|
|
3631 |
Add an index to a relation structure. |
Add an index to a relation structure. |
3632 |
|
|
3672 |
$relationStructure->{Indexes}->{$indexName} = $newIndex; |
$relationStructure->{Indexes}->{$indexName} = $newIndex; |
3673 |
} |
} |
3674 |
|
|
3675 |
=head3 FixupFields |
=head3 _FixupFields |
3676 |
|
|
3677 |
This method fixes the field list for an entity or relationship. It will add the caller-specified |
This method fixes the field list for an entity or relationship. It will add the caller-specified |
3678 |
relation name to fields that do not have a name and set the C<PrettySort> value as specified. |
relation name to fields that do not have a name and set the C<PrettySort> value as specified. |
3710 |
# Here it doesn't, so we create a new one. |
# Here it doesn't, so we create a new one. |
3711 |
$structure->{Fields} = { }; |
$structure->{Fields} = { }; |
3712 |
} else { |
} else { |
3713 |
# Here we have a field list. Loop through its fields. |
# Here we have a field list. We need to track the searchable fields, so we |
3714 |
while (my ($fieldName, $fieldData) = each %{$structure->{Fields}}) { |
# create a list for stashing them. |
3715 |
|
my @textFields = (); |
3716 |
|
# Loop through the fields. |
3717 |
|
my $fieldStructures = $structure->{Fields}; |
3718 |
|
for my $fieldName (keys %{$fieldStructures}) { |
3719 |
|
Trace("Processing field $fieldName of $defaultRelationName.") if T(4); |
3720 |
|
my $fieldData = $fieldStructures->{$fieldName}; |
3721 |
# Get the field type. |
# Get the field type. |
3722 |
my $type = $fieldData->{type}; |
my $type = $fieldData->{type}; |
3723 |
# Plug in a relation name if it is needed. |
# Plug in a relation name if it is needed. |
3727 |
# The data generator will use the default for the field's type. |
# The data generator will use the default for the field's type. |
3728 |
$fieldData->{DataGen} = { content => $TypeTable{$type}->{dataGen} }; |
$fieldData->{DataGen} = { content => $TypeTable{$type}->{dataGen} }; |
3729 |
} |
} |
3730 |
|
# Check for searchability. |
3731 |
|
if ($fieldData->{searchable}) { |
3732 |
|
# Only allow this for a primary relation. |
3733 |
|
if ($fieldData->{relation} ne $defaultRelationName) { |
3734 |
|
Confess("Field $fieldName of $defaultRelationName is in secondary relations and cannot be searchable."); |
3735 |
|
} else { |
3736 |
|
push @textFields, $fieldName; |
3737 |
|
} |
3738 |
|
} |
3739 |
# Plug in the defaults for the optional data generation parameters. |
# Plug in the defaults for the optional data generation parameters. |
3740 |
Tracer::MergeOptions($fieldData->{DataGen}, { testCount => 1, pass => 0 }); |
Tracer::MergeOptions($fieldData->{DataGen}, { testCount => 1, pass => 0 }); |
3741 |
# Add the PrettySortValue. |
# Add the PrettySortValue. |
3742 |
$fieldData->{PrettySort} = (($type eq "text") ? $textPrettySortValue : $prettySortValue); |
$fieldData->{PrettySort} = (($type eq "text") ? $textPrettySortValue : $prettySortValue); |
3743 |
} |
} |
3744 |
|
# If there are searchable fields, remember the fact. |
3745 |
|
if (@textFields) { |
3746 |
|
$structure->{searchFields} = \@textFields; |
3747 |
|
} |
3748 |
} |
} |
3749 |
} |
} |
3750 |
|
|
3751 |
=head3 FixName |
=head3 _FixName |
3752 |
|
|
3753 |
Fix the incoming field name so that it is a legal SQL column name. |
Fix the incoming field name so that it is a legal SQL column name. |
3754 |
|
|
3777 |
return $fieldName; |
return $fieldName; |
3778 |
} |
} |
3779 |
|
|
3780 |
=head3 FixNames |
=head3 _FixNames |
3781 |
|
|
3782 |
Fix all the field names in a list. |
Fix all the field names in a list. |
3783 |
|
|
3808 |
return @result; |
return @result; |
3809 |
} |
} |
3810 |
|
|
3811 |
=head3 AddField |
=head3 _AddField |
3812 |
|
|
3813 |
Add a field to a field list. |
Add a field to a field list. |
3814 |
|
|
3843 |
$fieldList->{$fieldName} = $fieldStructure; |
$fieldList->{$fieldName} = $fieldStructure; |
3844 |
} |
} |
3845 |
|
|
3846 |
=head3 ReOrderRelationTable |
=head3 _ReOrderRelationTable |
3847 |
|
|
3848 |
This method will take a relation table and re-sort it according to the implicit ordering of the |
This method will take a relation table and re-sort it according to the implicit ordering of the |
3849 |
C<PrettySort> property. Instead of a hash based on field names, it will return a list of fields. |
C<PrettySort> property. Instead of a hash based on field names, it will return a list of fields. |
3904 |
|
|
3905 |
} |
} |
3906 |
|
|
3907 |
=head3 IsPrimary |
=head3 _IsPrimary |
3908 |
|
|
3909 |
Return TRUE if a specified relation is a primary relation, else FALSE. A relation is primary |
Return TRUE if a specified relation is a primary relation, else FALSE. A relation is primary |
3910 |
if it has the same name as an entity or relationship. |
if it has the same name as an entity or relationship. |
3940 |
return $retVal; |
return $retVal; |
3941 |
} |
} |
3942 |
|
|
3943 |
=head3 FindRelation |
=head3 _FindRelation |
3944 |
|
|
3945 |
Return the descriptor for the specified relation. |
Return the descriptor for the specified relation. |
3946 |
|
|
3971 |
|
|
3972 |
=head2 HTML Documentation Utility Methods |
=head2 HTML Documentation Utility Methods |
3973 |
|
|
3974 |
=head3 ComputeRelationshipSentence |
=head3 _ComputeRelationshipSentence |
3975 |
|
|
3976 |
The relationship sentence consists of the relationship name between the names of the |
The relationship sentence consists of the relationship name between the names of the |
3977 |
two related entities and an arity indicator. |
two related entities and an arity indicator. |
4009 |
return $result; |
return $result; |
4010 |
} |
} |
4011 |
|
|
4012 |
=head3 ComputeRelationshipHeading |
=head3 _ComputeRelationshipHeading |
4013 |
|
|
4014 |
The relationship heading is the L<relationship sentence|/ComputeRelationshipSentence> with the entity |
The relationship heading is the L<relationship sentence|/ComputeRelationshipSentence> with the entity |
4015 |
names hyperlinked to the appropriate entity sections of the document. |
names hyperlinked to the appropriate entity sections of the document. |
4046 |
return $result; |
return $result; |
4047 |
} |
} |
4048 |
|
|
4049 |
=head3 ShowRelationTable |
=head3 _ShowRelationTable |
4050 |
|
|
4051 |
Generate the HTML string for a particular relation. The relation's data will be formatted as an HTML |
Generate the HTML string for a particular relation. The relation's data will be formatted as an HTML |
4052 |
table with three columns-- the field name, the field type, and the field description. |
table with three columns-- the field name, the field type, and the field description. |
4089 |
my $indexData = $indexTable->{$indexName}; |
my $indexData = $indexTable->{$indexName}; |
4090 |
# Determine whether or not the index is unique. |
# Determine whether or not the index is unique. |
4091 |
my $fullName = $indexName; |
my $fullName = $indexName; |
4092 |
if ($indexData->{Unique} eq "true") { |
if (exists $indexData->{Unique} && $indexData->{Unique} eq "true") { |
4093 |
$fullName .= " (unique)"; |
$fullName .= " (unique)"; |
4094 |
} |
} |
4095 |
# Start an HTML list item for this index. |
# Start an HTML list item for this index. |
4107 |
$htmlString .= "</ul>\n"; |
$htmlString .= "</ul>\n"; |
4108 |
} |
} |
4109 |
|
|
4110 |
=head3 OpenFieldTable |
=head3 _OpenFieldTable |
4111 |
|
|
4112 |
This method creates the header string for the field table generated by L</ShowMetaData>. |
This method creates the header string for the field table generated by L</ShowMetaData>. |
4113 |
|
|
4132 |
return _OpenTable($tablename, 'Field', 'Type', 'Description'); |
return _OpenTable($tablename, 'Field', 'Type', 'Description'); |
4133 |
} |
} |
4134 |
|
|
4135 |
=head3 OpenTable |
=head3 _OpenTable |
4136 |
|
|
4137 |
This method creates the header string for an HTML table. |
This method creates the header string for an HTML table. |
4138 |
|
|
4172 |
return $htmlString; |
return $htmlString; |
4173 |
} |
} |
4174 |
|
|
4175 |
=head3 CloseTable |
=head3 _CloseTable |
4176 |
|
|
4177 |
This method returns the HTML for closing a table. |
This method returns the HTML for closing a table. |
4178 |
|
|
4184 |
return "</table></p>\n"; |
return "</table></p>\n"; |
4185 |
} |
} |
4186 |
|
|
4187 |
=head3 ShowField |
=head3 _ShowField |
4188 |
|
|
4189 |
This method returns the HTML for displaying a row of field information in a field table. |
This method returns the HTML for displaying a row of field information in a field table. |
4190 |
|
|
4219 |
return $htmlString; |
return $htmlString; |
4220 |
} |
} |
4221 |
|
|
4222 |
=head3 HTMLNote |
=head3 _HTMLNote |
4223 |
|
|
4224 |
Convert a note or comment to HTML by replacing some bulletin-board codes with HTML. The codes |
Convert a note or comment to HTML by replacing some bulletin-board codes with HTML. The codes |
4225 |
supported are C<[b]> for B<bold>, C<[i]> for I<italics>, and C<[p]> for a new paragraph. |
supported are C<[b]> for B<bold>, C<[i]> for I<italics>, and C<[p]> for a new paragraph. |