Parent Directory
|
Revision Log
Revision 1.65 - (view) (download) (as text)
1 : | parrello | 1.1 | package ERDB; |
2 : | |||
3 : | parrello | 1.10 | use strict; |
4 : | use Tracer; | ||
5 : | olson | 1.14 | use DBrtns; |
6 : | parrello | 1.10 | use Data::Dumper; |
7 : | use XML::Simple; | ||
8 : | use DBQuery; | ||
9 : | use DBObject; | ||
10 : | use Stats; | ||
11 : | use Time::HiRes qw(gettimeofday); | ||
12 : | parrello | 1.42 | use Digest::MD5 qw(md5_base64); |
13 : | parrello | 1.19 | use FIG; |
14 : | parrello | 1.1 | |
15 : | =head1 Entity-Relationship Database Package | ||
16 : | |||
17 : | =head2 Introduction | ||
18 : | |||
19 : | The Entity-Relationship Database Package allows the client to create an easily-configurable | ||
20 : | database of Entities connected by Relationships. Each entity is represented by one or more | ||
21 : | relations in an underlying SQL database. Each relationship is represented by a single | ||
22 : | relation that connects two entities. | ||
23 : | |||
24 : | Although this package is designed for general use, all examples are derived from the | ||
25 : | Sprout database, which is the first database implemented using this package. | ||
26 : | |||
27 : | Each entity has at least one relation, the I<primary relation>, that has the same name as | ||
28 : | the entity. The primary relation contains a field named C<id> that contains the unique | ||
29 : | identifier of each entity instance. An entity may have additional relations that contain | ||
30 : | fields which are optional or can occur more than once. For example, the B<FEATURE> entity | ||
31 : | has a B<feature-type> attribute that occurs exactly once for each feature. This attribute | ||
32 : | is implemented by a C<feature_type> column in the primary relation C<Feature>. In addition, | ||
33 : | however, a feature may have zero or more aliases. These are implemented using a C<FeatureAlias> | ||
34 : | relation that contains two fields-- the feature ID (C<id>) and the alias name (C<alias>). | ||
35 : | The B<FEATURE> entity also contains an optional virulence number. This is implemented | ||
36 : | as a separate relation C<FeatureVirulence> which contains an ID (C<id>) and a virulence number | ||
37 : | parrello | 1.8 | (C<virulence>). If the virulence of a feature I<ABC> is known to be 6, there will be one row in |
38 : | the C<FeatureVirulence> relation possessing the value I<ABC> as its ID and 6 as its virulence | ||
39 : | number. If the virulence of I<ABC> is not known, there will not be any rows for it in | ||
40 : | C<FeatureVirulence>. | ||
41 : | parrello | 1.1 | |
42 : | Entities are connected by binary relationships implemented using single relations possessing the | ||
43 : | same name as the relationship itself and that has an I<arity> of 1-to-1 (C<11>), 1-to-many (C<1M>), | ||
44 : | or many-to-many (C<MM>). Each relationship's relation contains a C<from-link> field that contains the | ||
45 : | ID of the source entity and a C<to-link> field that contains the ID of the target entity. The name | ||
46 : | of the relationship is generally a verb phrase with the source entity as the subject and the | ||
47 : | target entity as the object. So, for example, the B<ComesFrom> relationship connects the B<GENOME> | ||
48 : | and B<SOURCE> entities, and indicates that a particular source organization participated in the | ||
49 : | mapping of the genome. A source organization frequently participates in the mapping | ||
50 : | of many genomes, and many source organizations can cooperate in the mapping of a single genome, so | ||
51 : | this relationship has an arity of many-to-many (C<MM>). The relation that implements the B<ComesFrom> | ||
52 : | relationship is called C<ComesFrom> and contains two fields-- C<from-link>, which contains a genome ID, | ||
53 : | and C<to-link>, which contains a source ID. | ||
54 : | |||
55 : | A relationship may itself have attributes. These attributes, known as I<intersection data attributes>, | ||
56 : | are implemented as additional fields in the relationship's relation. So, for example, the | ||
57 : | B<IsMadeUpOf> relationship connects the B<Contig> entity to the B<Sequence> entity, and is used | ||
58 : | to determine which sequences make up a contig. The relationship has as an attribute the | ||
59 : | B<start-position>, which indicates where in the contig that the sequence begins. This attribute | ||
60 : | is implemented as the C<start_position> field in the C<IsMadeUpOf> relation. | ||
61 : | |||
62 : | The database itself is described by an XML file using the F<ERDatabase.xsd> schema. In addition to | ||
63 : | all the data required to define the entities, relationships, and attributes, the schema provides | ||
64 : | space for notes describing the data and what it means. These notes are used by L</ShowMetaData> | ||
65 : | to generate documentation for the database. | ||
66 : | |||
67 : | Finally, every entity and relationship object has a flag indicating if it is new or old. The object | ||
68 : | is considered I<old> if it was loaded by the L</LoadTables> method. It is considered I<new> if it | ||
69 : | was inserted by the L</InsertObject> method. | ||
70 : | |||
71 : | To facilitate testing, the ERDB module supports automatic generation of test data. This process | ||
72 : | parrello | 1.5 | is described in the L</GenerateEntity> and L</GenerateConnection> methods, though it is not yet |
73 : | fully implemented. | ||
74 : | parrello | 1.1 | |
75 : | parrello | 1.8 | =head2 XML Database Description |
76 : | |||
77 : | =head3 Data Types | ||
78 : | |||
79 : | The ERDB system supports the following data types. Note that there are numerous string | ||
80 : | types depending on the maximum length. Some database packages limit the total number of | ||
81 : | characters you have in an index key; to insure the database works in all environments, | ||
82 : | the type of string should be the shortest one possible that supports all the known values. | ||
83 : | |||
84 : | =over 4 | ||
85 : | |||
86 : | =item char | ||
87 : | |||
88 : | single ASCII character | ||
89 : | |||
90 : | =item int | ||
91 : | |||
92 : | 32-bit signed integer | ||
93 : | |||
94 : | parrello | 1.60 | =item counter |
95 : | |||
96 : | 32-bit unsigned integer | ||
97 : | |||
98 : | parrello | 1.8 | =item date |
99 : | |||
100 : | 64-bit unsigned integer, representing a PERL date/time value | ||
101 : | |||
102 : | =item text | ||
103 : | |||
104 : | long string; Text fields cannot be used in indexes or sorting and do not support the | ||
105 : | normal syntax of filter clauses, but can be up to a billion character in length | ||
106 : | |||
107 : | =item float | ||
108 : | |||
109 : | double-precision floating-point number | ||
110 : | |||
111 : | =item boolean | ||
112 : | |||
113 : | single-bit numeric value; The value is stored as a 16-bit signed integer (for | ||
114 : | compatability with certain database packages), but the only values supported are | ||
115 : | 0 and 1. | ||
116 : | |||
117 : | parrello | 1.44 | =item id-string |
118 : | |||
119 : | variable-length string, maximum 25 characters | ||
120 : | |||
121 : | parrello | 1.8 | =item key-string |
122 : | |||
123 : | variable-length string, maximum 40 characters | ||
124 : | |||
125 : | =item name-string | ||
126 : | |||
127 : | variable-length string, maximum 80 characters | ||
128 : | |||
129 : | =item medium-string | ||
130 : | |||
131 : | variable-length string, maximum 160 characters | ||
132 : | |||
133 : | =item string | ||
134 : | |||
135 : | variable-length string, maximum 255 characters | ||
136 : | |||
137 : | parrello | 1.42 | =item hash-string |
138 : | |||
139 : | variable-length string, maximum 22 characters | ||
140 : | |||
141 : | parrello | 1.8 | =back |
142 : | |||
143 : | parrello | 1.42 | The hash-string data type has a special meaning. The actual key passed into the loader will |
144 : | be a string, but it will be digested into a 22-character MD5 code to save space. Although the | ||
145 : | MD5 algorithm is not perfect, it is extremely unlikely two strings will have the same | ||
146 : | digest. Therefore, it is presumed the keys will be unique. When the database is actually | ||
147 : | in use, the hashed keys will be presented rather than the original values. For this reason, | ||
148 : | they should not be used for entities where the key is meaningful. | ||
149 : | |||
150 : | parrello | 1.8 | =head3 Global Tags |
151 : | |||
152 : | The entire database definition must be inside a B<Database> tag. The display name of | ||
153 : | the database is given by the text associated with the B<Title> tag. The display name | ||
154 : | is only used in the automated documentation. It has no other effect. The entities and | ||
155 : | relationships are listed inside the B<Entities> and B<Relationships> tags, | ||
156 : | respectively. None of these tags have attributes. | ||
157 : | |||
158 : | parrello | 1.10 | <Database> |
159 : | <Title>... display title here...</Title> | ||
160 : | <Entities> | ||
161 : | ... entity definitions here ... | ||
162 : | </Entities> | ||
163 : | <Relationships> | ||
164 : | ... relationship definitions here... | ||
165 : | </Relationships> | ||
166 : | </Database> | ||
167 : | parrello | 1.8 | |
168 : | Entities, relationships, indexes, and fields all allow a text tag called B<Notes>. | ||
169 : | The text inside the B<Notes> tag contains comments that will appear when the database | ||
170 : | documentation is generated. Within a B<Notes> tag, you may use C<[i]> and C<[/i]> for | ||
171 : | italics, C<[b]> and C<[/b]> for bold, and C<[p]> for a new paragraph. | ||
172 : | |||
173 : | =head3 Fields | ||
174 : | |||
175 : | Both entities and relationships have fields described by B<Field> tags. A B<Field> | ||
176 : | tag can have B<Notes> associated with it. The complete set of B<Field> tags for an | ||
177 : | object mus be inside B<Fields> tags. | ||
178 : | |||
179 : | parrello | 1.10 | <Entity ... > |
180 : | <Fields> | ||
181 : | ... Field tags ... | ||
182 : | </Fields> | ||
183 : | </Entity> | ||
184 : | parrello | 1.8 | |
185 : | The attributes for the B<Field> tag are as follows. | ||
186 : | |||
187 : | =over 4 | ||
188 : | |||
189 : | =item name | ||
190 : | |||
191 : | Name of the field. The field name should contain only letters, digits, and hyphens (C<->), | ||
192 : | and the first character should be a letter. Most underlying databases are case-insensitive | ||
193 : | with the respect to field names, so a best practice is to use lower-case letters only. | ||
194 : | |||
195 : | =item type | ||
196 : | |||
197 : | Data type of the field. The legal data types are given above. | ||
198 : | |||
199 : | =item relation | ||
200 : | |||
201 : | Name of the relation containing the field. This should only be specified for entity | ||
202 : | fields. The ERDB system does not support optional fields or multi-occurring fields | ||
203 : | in the primary relation of an entity. Instead, they are put into secondary relations. | ||
204 : | So, for example, in the C<Genome> entity, the C<group-name> field indicates a special | ||
205 : | grouping used to select a subset of the genomes. A given genome may not be in any | ||
206 : | groups or may be in multiple groups. Therefore, C<group-name> specifies a relation | ||
207 : | value. The relation name specified must be a valid table name. By convention, it is | ||
208 : | usually the entity name followed by a qualifying word (e.g. C<GenomeGroup>). In an | ||
209 : | entity, the fields without a relation attribute are said to belong to the | ||
210 : | I<primary relation>. This relation has the same name as the entity itself. | ||
211 : | |||
212 : | =back | ||
213 : | |||
214 : | =head3 Indexes | ||
215 : | |||
216 : | An entity can have multiple alternate indexes associated with it. The fields must | ||
217 : | be from the primary relation. The alternate indexes assist in ordering results | ||
218 : | from a query. A relationship can have up to two indexes-- a I<to-index> and a | ||
219 : | I<from-index>. These order the results when crossing the relationship. For | ||
220 : | example, in the relationship C<HasContig> from C<Genome> to C<Contig>, the | ||
221 : | from-index would order the contigs of a ganome, and the to-index would order | ||
222 : | the genomes of a contig. A relationship's index must specify only fields in | ||
223 : | the relationship. | ||
224 : | |||
225 : | The indexes for an entity must be listed inside the B<Indexes> tag. The from-index | ||
226 : | of a relationship is specified using the B<FromIndex> tag; the to-index is specified | ||
227 : | using the B<ToIndex> tag. | ||
228 : | |||
229 : | Each index can contain a B<Notes> tag. In addition, it will have an B<IndexFields> | ||
230 : | tag containing the B<IndexField> tags. These specify, in order, the fields used in | ||
231 : | the index. The attributes of an B<IndexField> tag are as follows. | ||
232 : | |||
233 : | =over 4 | ||
234 : | |||
235 : | =item name | ||
236 : | |||
237 : | Name of the field. | ||
238 : | |||
239 : | =item order | ||
240 : | |||
241 : | Sort order of the field-- C<ascending> or C<descending>. | ||
242 : | |||
243 : | =back | ||
244 : | |||
245 : | The B<Index>, B<FromIndex>, and B<ToIndex> tags themselves have no attributes. | ||
246 : | |||
247 : | =head3 Object and Field Names | ||
248 : | |||
249 : | By convention entity and relationship names use capital casing (e.g. C<Genome> or | ||
250 : | C<HasRegionsIn>. Most underlying databases, however, are aggressively case-insensitive | ||
251 : | with respect to relation names, converting them internally to all-upper case or | ||
252 : | all-lower case. | ||
253 : | |||
254 : | If syntax or parsing errors occur when you try to load or use an ERDB database, the | ||
255 : | most likely reason is that one of your objects has an SQL reserved word as its name. | ||
256 : | The list of SQL reserved words keeps increasing; however, most are unlikely to show | ||
257 : | up as a noun or declarative verb phrase. The exceptions are C<Group>, C<User>, | ||
258 : | C<Table>, C<Index>, C<Object>, C<Date>, C<Number>, C<Update>, C<Time>, C<Percent>, | ||
259 : | C<Memo>, C<Order>, and C<Sum>. This problem can crop up in field names as well. | ||
260 : | |||
261 : | Every entity has a field called C<id> that acts as its primary key. Every relationship | ||
262 : | has fields called C<from-link> and C<to-link> that contain copies of the relevant | ||
263 : | entity IDs. These are essentially ERDB's reserved words, and should not be used | ||
264 : | for user-defined field names. | ||
265 : | |||
266 : | =head3 Entities | ||
267 : | |||
268 : | An entity is described by the B<Entity> tag. The entity can contain B<Notes>, an | ||
269 : | B<Indexes> tag containing one or more secondary indexes, and a B<Fields> tag | ||
270 : | containing one or more fields. The attributes of the B<Entity> tag are as follows. | ||
271 : | |||
272 : | =over 4 | ||
273 : | |||
274 : | =item name | ||
275 : | |||
276 : | Name of the entity. The entity name, by convention, uses capital casing (e.g. C<Genome> | ||
277 : | or C<GroupBlock>) and should be a noun or noun phrase. | ||
278 : | |||
279 : | =item keyType | ||
280 : | |||
281 : | Data type of the primary key. The primary key is always named C<id>. | ||
282 : | |||
283 : | =back | ||
284 : | |||
285 : | =head3 Relationships | ||
286 : | |||
287 : | A relationship is described by the C<Relationship> tag. Within a relationship, | ||
288 : | there can be a C<Notes> tag, a C<Fields> tag containing the intersection data | ||
289 : | fields, a C<FromIndex> tag containing the from-index, and a C<ToIndex> tag containing | ||
290 : | the to-index. | ||
291 : | |||
292 : | The C<Relationship> tag has the following attributes. | ||
293 : | |||
294 : | =over 4 | ||
295 : | |||
296 : | =item name | ||
297 : | |||
298 : | Name of the relationship. The relationship name, by convention, uses capital casing | ||
299 : | (e.g. C<ContainsRegionIn> or C<HasContig>), and should be a declarative verb | ||
300 : | phrase, designed to fit between the from-entity and the to-entity (e.g. | ||
301 : | Block C<ContainsRegionIn> Genome). | ||
302 : | |||
303 : | =item from | ||
304 : | |||
305 : | Name of the entity from which the relationship starts. | ||
306 : | |||
307 : | =item to | ||
308 : | |||
309 : | Name of the entity to which the relationship proceeds. | ||
310 : | |||
311 : | =item arity | ||
312 : | |||
313 : | Relationship type: C<1M> for one-to-many and C<MM> for many-to-many. | ||
314 : | |||
315 : | =back | ||
316 : | |||
317 : | parrello | 1.1 | =cut |
318 : | |||
319 : | # GLOBALS | ||
320 : | |||
321 : | # Table of information about our datatypes. "sqlType" is the corresponding SQL datatype string. | ||
322 : | # "maxLen" is the maximum permissible length of the incoming string data used to populate a field | ||
323 : | # of the specified type. "dataGen" is PERL string that will be evaluated if no test data generation | ||
324 : | parrello | 1.18 | # string is specified in the field definition. "avgLen" is the average byte length for estimating |
325 : | parrello | 1.60 | # record sizes. "sort" is the key modifier for the sort command. |
326 : | my %TypeTable = ( char => { sqlType => 'CHAR(1)', maxLen => 1, avgLen => 1, sort => "", dataGen => "StringGen('A')" }, | ||
327 : | int => { sqlType => 'INTEGER', maxLen => 20, avgLen => 4, sort => "n", dataGen => "IntGen(0, 99999999)" }, | ||
328 : | counter => { sqlType => 'INTEGER UNSIGNED', maxLen => 20, avgLen => 4, sort => "n", dataGen => "IntGen(0, 99999999)" }, | ||
329 : | string => { sqlType => 'VARCHAR(255)', maxLen => 255, avgLen => 100, sort => "", dataGen => "StringGen(IntGen(10,250))" }, | ||
330 : | text => { sqlType => 'TEXT', maxLen => 1000000000, avgLen => 500, sort => "", dataGen => "StringGen(IntGen(80,1000))" }, | ||
331 : | date => { sqlType => 'BIGINT', maxLen => 80, avgLen => 8, sort => "n", dataGen => "DateGen(-7, 7, IntGen(0,1400))" }, | ||
332 : | float => { sqlType => 'DOUBLE PRECISION', maxLen => 40, avgLen => 8, sort => "g", dataGen => "FloatGen(0.0, 100.0)" }, | ||
333 : | boolean => { sqlType => 'SMALLINT', maxLen => 1, avgLen => 1, sort => "n", dataGen => "IntGen(0, 1)" }, | ||
334 : | parrello | 1.42 | 'hash-string' => |
335 : | parrello | 1.60 | { sqlType => 'VARCHAR(22)', maxLen => 22, avgLen => 22, sort => "", dataGen => "SringGen(22)" }, |
336 : | parrello | 1.44 | 'id-string' => |
337 : | parrello | 1.60 | { sqlType => 'VARCHAR(25)', maxLen => 25, avgLen => 25, sort => "", dataGen => "SringGen(22)" }, |
338 : | parrello | 1.10 | 'key-string' => |
339 : | parrello | 1.60 | { sqlType => 'VARCHAR(40)', maxLen => 40, avgLen => 10, sort => "", dataGen => "StringGen(IntGen(10,40))" }, |
340 : | parrello | 1.10 | 'name-string' => |
341 : | parrello | 1.60 | { sqlType => 'VARCHAR(80)', maxLen => 80, avgLen => 40, sort => "", dataGen => "StringGen(IntGen(10,80))" }, |
342 : | parrello | 1.10 | 'medium-string' => |
343 : | parrello | 1.60 | { sqlType => 'VARCHAR(160)', maxLen => 160, avgLen => 40, sort => "", dataGen => "StringGen(IntGen(10,160))" }, |
344 : | parrello | 1.10 | ); |
345 : | parrello | 1.1 | |
346 : | # Table translating arities into natural language. | ||
347 : | my %ArityTable = ( '11' => 'one-to-one', | ||
348 : | parrello | 1.10 | '1M' => 'one-to-many', |
349 : | 'MM' => 'many-to-many' | ||
350 : | ); | ||
351 : | parrello | 1.1 | |
352 : | # Table for interpreting string patterns. | ||
353 : | |||
354 : | my %PictureTable = ( 'A' => "abcdefghijklmnopqrstuvwxyz", | ||
355 : | parrello | 1.10 | '9' => "0123456789", |
356 : | 'X' => "abcdefghijklmnopqrstuvwxyz0123456789", | ||
357 : | 'V' => "aeiou", | ||
358 : | 'K' => "bcdfghjklmnoprstvwxyz" | ||
359 : | ); | ||
360 : | parrello | 1.1 | |
361 : | =head2 Public Methods | ||
362 : | |||
363 : | =head3 new | ||
364 : | |||
365 : | parrello | 1.5 | C<< my $database = ERDB->new($dbh, $metaFileName); >> |
366 : | parrello | 1.1 | |
367 : | Create a new ERDB object. | ||
368 : | |||
369 : | =over 4 | ||
370 : | |||
371 : | =item dbh | ||
372 : | |||
373 : | DBKernel database object for the target database. | ||
374 : | |||
375 : | =item metaFileName | ||
376 : | |||
377 : | Name of the XML file containing the metadata. | ||
378 : | |||
379 : | =back | ||
380 : | |||
381 : | =cut | ||
382 : | |||
383 : | sub new { | ||
384 : | parrello | 1.10 | # Get the parameters. |
385 : | my ($class, $dbh, $metaFileName, $options) = @_; | ||
386 : | # Load the meta-data. | ||
387 : | my $metaData = _LoadMetaData($metaFileName); | ||
388 : | # Create the object. | ||
389 : | my $self = { _dbh => $dbh, | ||
390 : | _metaData => $metaData | ||
391 : | }; | ||
392 : | # Bless and return it. | ||
393 : | bless $self, $class; | ||
394 : | return $self; | ||
395 : | parrello | 1.1 | } |
396 : | |||
397 : | =head3 ShowMetaData | ||
398 : | |||
399 : | parrello | 1.18 | C<< $erdb->ShowMetaData($fileName); >> |
400 : | parrello | 1.1 | |
401 : | This method outputs a description of the database. This description can be used to help users create | ||
402 : | the data to be loaded into the relations. | ||
403 : | |||
404 : | =over 4 | ||
405 : | |||
406 : | =item filename | ||
407 : | |||
408 : | The name of the output file. | ||
409 : | |||
410 : | =back | ||
411 : | |||
412 : | =cut | ||
413 : | |||
414 : | sub ShowMetaData { | ||
415 : | parrello | 1.10 | # Get the parameters. |
416 : | my ($self, $filename) = @_; | ||
417 : | # Get the metadata and the title string. | ||
418 : | my $metadata = $self->{_metaData}; | ||
419 : | # Get the title string. | ||
420 : | my $title = $metadata->{Title}; | ||
421 : | # Get the entity and relationship lists. | ||
422 : | my $entityList = $metadata->{Entities}; | ||
423 : | my $relationshipList = $metadata->{Relationships}; | ||
424 : | # Open the output file. | ||
425 : | open(HTMLOUT, ">$filename") || Confess("Could not open MetaData display file $filename: $!"); | ||
426 : | Trace("Building MetaData table of contents.") if T(4); | ||
427 : | # Write the HTML heading stuff. | ||
428 : | print HTMLOUT "<html>\n<head>\n<title>$title</title>\n"; | ||
429 : | print HTMLOUT "</head>\n<body>\n"; | ||
430 : | parrello | 1.45 | # Write the documentation. |
431 : | print HTMLOUT $self->DisplayMetaData(); | ||
432 : | # Close the document. | ||
433 : | print HTMLOUT "</body>\n</html>\n"; | ||
434 : | # Close the file. | ||
435 : | close HTMLOUT; | ||
436 : | } | ||
437 : | |||
438 : | =head3 DisplayMetaData | ||
439 : | |||
440 : | C<< my $html = $erdb->DisplayMetaData(); >> | ||
441 : | |||
442 : | Return an HTML description of the database. This description can be used to help users create | ||
443 : | the data to be loaded into the relations and form queries. The output is raw includable HTML | ||
444 : | without any HEAD or BODY tags. | ||
445 : | |||
446 : | =over 4 | ||
447 : | |||
448 : | =item filename | ||
449 : | |||
450 : | The name of the output file. | ||
451 : | |||
452 : | =back | ||
453 : | |||
454 : | =cut | ||
455 : | |||
456 : | sub DisplayMetaData { | ||
457 : | # Get the parameters. | ||
458 : | my ($self) = @_; | ||
459 : | # Get the metadata and the title string. | ||
460 : | my $metadata = $self->{_metaData}; | ||
461 : | # Get the title string. | ||
462 : | my $title = $metadata->{Title}; | ||
463 : | # Get the entity and relationship lists. | ||
464 : | my $entityList = $metadata->{Entities}; | ||
465 : | my $relationshipList = $metadata->{Relationships}; | ||
466 : | # Declare the return variable. | ||
467 : | my $retVal = ""; | ||
468 : | # Open the output file. | ||
469 : | Trace("Building MetaData table of contents.") if T(4); | ||
470 : | parrello | 1.10 | # Here we do the table of contents. It starts as an unordered list of section names. Each |
471 : | # section contains an ordered list of entity or relationship subsections. | ||
472 : | parrello | 1.45 | $retVal .= "<ul>\n<li><a href=\"#EntitiesSection\">Entities</a>\n<ol>\n"; |
473 : | parrello | 1.10 | # Loop through the Entities, displaying a list item for each. |
474 : | foreach my $key (sort keys %{$entityList}) { | ||
475 : | # Display this item. | ||
476 : | parrello | 1.45 | $retVal .= "<li><a href=\"#$key\">$key</a></li>\n"; |
477 : | parrello | 1.10 | } |
478 : | # Close off the entity section and start the relationship section. | ||
479 : | parrello | 1.45 | $retVal .= "</ol></li>\n<li><a href=\"#RelationshipsSection\">Relationships</a>\n<ol>\n"; |
480 : | parrello | 1.10 | # Loop through the Relationships. |
481 : | foreach my $key (sort keys %{$relationshipList}) { | ||
482 : | # Display this item. | ||
483 : | my $relationshipTitle = _ComputeRelationshipSentence($key, $relationshipList->{$key}); | ||
484 : | parrello | 1.45 | $retVal .= "<li><a href=\"#$key\">$relationshipTitle</a></li>\n"; |
485 : | parrello | 1.10 | } |
486 : | # Close off the relationship section and list the join table section. | ||
487 : | parrello | 1.45 | $retVal .= "</ol></li>\n<li><a href=\"#JoinTable\">Join Table</a></li>\n"; |
488 : | parrello | 1.10 | # Close off the table of contents itself. |
489 : | parrello | 1.45 | $retVal .= "</ul>\n"; |
490 : | parrello | 1.10 | # Now we start with the actual data. Denote we're starting the entity section. |
491 : | parrello | 1.45 | $retVal .= "<a name=\"EntitiesSection\"></a><h2>Entities</h2>\n"; |
492 : | parrello | 1.10 | # Loop through the entities. |
493 : | for my $key (sort keys %{$entityList}) { | ||
494 : | Trace("Building MetaData entry for $key entity.") if T(4); | ||
495 : | # Create the entity header. It contains a bookmark and the entity name. | ||
496 : | parrello | 1.45 | $retVal .= "<a name=\"$key\"></a><h3>$key</h3>\n"; |
497 : | parrello | 1.10 | # Get the entity data. |
498 : | my $entityData = $entityList->{$key}; | ||
499 : | # If there's descriptive text, display it. | ||
500 : | if (my $notes = $entityData->{Notes}) { | ||
501 : | parrello | 1.45 | $retVal .= "<p>" . _HTMLNote($notes->{content}) . "</p>\n"; |
502 : | parrello | 1.10 | } |
503 : | # Now we want a list of the entity's relationships. First, we set up the relationship subsection. | ||
504 : | parrello | 1.45 | $retVal .= "<h4>Relationships for <b>$key</b></h4>\n<ul>\n"; |
505 : | parrello | 1.10 | # Loop through the relationships. |
506 : | for my $relationship (sort keys %{$relationshipList}) { | ||
507 : | # Get the relationship data. | ||
508 : | my $relationshipStructure = $relationshipList->{$relationship}; | ||
509 : | # Only use the relationship if if has this entity in its FROM or TO fields. | ||
510 : | if ($relationshipStructure->{from} eq $key || $relationshipStructure->{to} eq $key) { | ||
511 : | # Get the relationship sentence and append the arity. | ||
512 : | my $relationshipDescription = _ComputeRelationshipSentence($relationship, $relationshipStructure); | ||
513 : | # Display the relationship data. | ||
514 : | parrello | 1.45 | $retVal .= "<li><a href=\"#$relationship\">$relationshipDescription</a></li>\n"; |
515 : | parrello | 1.10 | } |
516 : | } | ||
517 : | # Close off the relationship list. | ||
518 : | parrello | 1.45 | $retVal .= "</ul>\n"; |
519 : | parrello | 1.10 | # Get the entity's relations. |
520 : | my $relationList = $entityData->{Relations}; | ||
521 : | # Create a header for the relation subsection. | ||
522 : | parrello | 1.45 | $retVal .= "<h4>Relations for <b>$key</b></h4>\n"; |
523 : | parrello | 1.10 | # Loop through the relations, displaying them. |
524 : | for my $relation (sort keys %{$relationList}) { | ||
525 : | my $htmlString = _ShowRelationTable($relation, $relationList->{$relation}); | ||
526 : | parrello | 1.45 | $retVal .= $htmlString; |
527 : | parrello | 1.10 | } |
528 : | } | ||
529 : | # Denote we're starting the relationship section. | ||
530 : | parrello | 1.45 | $retVal .= "<a name=\"RelationshipsSection\"></a><h2>Relationships</h2>\n"; |
531 : | parrello | 1.10 | # Loop through the relationships. |
532 : | for my $key (sort keys %{$relationshipList}) { | ||
533 : | Trace("Building MetaData entry for $key relationship.") if T(4); | ||
534 : | # Get the relationship's structure. | ||
535 : | my $relationshipStructure = $relationshipList->{$key}; | ||
536 : | # Create the relationship header. | ||
537 : | my $headerText = _ComputeRelationshipHeading($key, $relationshipStructure); | ||
538 : | parrello | 1.45 | $retVal .= "<h3><a name=\"$key\"></a>$headerText</h3>\n"; |
539 : | parrello | 1.10 | # Get the entity names. |
540 : | my $fromEntity = $relationshipStructure->{from}; | ||
541 : | my $toEntity = $relationshipStructure->{to}; | ||
542 : | # Describe the relationship arity. Note there's a bit of trickiness involving recursive | ||
543 : | # many-to-many relationships. In a normal many-to-many we use two sentences to describe | ||
544 : | # the arity (one for each direction). This is a bad idea for a recursive relationship, | ||
545 : | # since both sentences will say the same thing. | ||
546 : | my $arity = $relationshipStructure->{arity}; | ||
547 : | if ($arity eq "11") { | ||
548 : | parrello | 1.45 | $retVal .= "<p>Each <b>$fromEntity</b> relates to at most one <b>$toEntity</b>.\n"; |
549 : | parrello | 1.10 | } else { |
550 : | parrello | 1.45 | $retVal .= "<p>Each <b>$fromEntity</b> relates to multiple <b>$toEntity</b>s.\n"; |
551 : | parrello | 1.10 | if ($arity eq "MM" && $fromEntity ne $toEntity) { |
552 : | parrello | 1.45 | $retVal .= "Each <b>$toEntity</b> relates to multiple <b>$fromEntity</b>s.\n"; |
553 : | parrello | 1.10 | } |
554 : | } | ||
555 : | parrello | 1.45 | $retVal .= "</p>\n"; |
556 : | parrello | 1.10 | # If there are notes on this relationship, display them. |
557 : | if (my $notes = $relationshipStructure->{Notes}) { | ||
558 : | parrello | 1.45 | $retVal .= "<p>" . _HTMLNote($notes->{content}) . "</p>\n"; |
559 : | parrello | 1.10 | } |
560 : | # Generate the relationship's relation table. | ||
561 : | my $htmlString = _ShowRelationTable($key, $relationshipStructure->{Relations}->{$key}); | ||
562 : | parrello | 1.45 | $retVal .= $htmlString; |
563 : | parrello | 1.10 | } |
564 : | Trace("Building MetaData join table.") if T(4); | ||
565 : | # Denote we're starting the join table. | ||
566 : | parrello | 1.45 | $retVal .= "<a name=\"JoinTable\"></a><h3>Join Table</h3>\n"; |
567 : | parrello | 1.10 | # Create a table header. |
568 : | parrello | 1.45 | $retVal .= _OpenTable("Join Table", "Source", "Target", "Join Condition"); |
569 : | parrello | 1.10 | # Loop through the joins. |
570 : | my $joinTable = $metadata->{Joins}; | ||
571 : | my @joinKeys = keys %{$joinTable}; | ||
572 : | for my $joinKey (sort @joinKeys) { | ||
573 : | # Separate out the source, the target, and the join clause. | ||
574 : | $joinKey =~ m!^([^/]+)/(.+)$!; | ||
575 : | my ($sourceRelation, $targetRelation) = ($1, $2); | ||
576 : | parrello | 1.30 | Trace("Join with key $joinKey is from $sourceRelation to $targetRelation.") if T(Joins => 4); |
577 : | parrello | 1.10 | my $source = $self->ComputeObjectSentence($sourceRelation); |
578 : | my $target = $self->ComputeObjectSentence($targetRelation); | ||
579 : | my $clause = $joinTable->{$joinKey}; | ||
580 : | # Display them in a table row. | ||
581 : | parrello | 1.45 | $retVal .= "<tr><td>$source</td><td>$target</td><td>$clause</td></tr>\n"; |
582 : | parrello | 1.10 | } |
583 : | # Close the table. | ||
584 : | parrello | 1.45 | $retVal .= _CloseTable(); |
585 : | Trace("Built MetaData HTML.") if T(3); | ||
586 : | # Return the HTML. | ||
587 : | return $retVal; | ||
588 : | parrello | 1.1 | } |
589 : | |||
590 : | =head3 DumpMetaData | ||
591 : | |||
592 : | parrello | 1.18 | C<< $erdb->DumpMetaData(); >> |
593 : | parrello | 1.1 | |
594 : | Return a dump of the metadata structure. | ||
595 : | |||
596 : | =cut | ||
597 : | |||
598 : | sub DumpMetaData { | ||
599 : | parrello | 1.10 | # Get the parameters. |
600 : | my ($self) = @_; | ||
601 : | # Dump the meta-data. | ||
602 : | return Data::Dumper::Dumper($self->{_metaData}); | ||
603 : | parrello | 1.1 | } |
604 : | |||
605 : | =head3 CreateTables | ||
606 : | |||
607 : | parrello | 1.18 | C<< $erdb->CreateTables(); >> |
608 : | parrello | 1.1 | |
609 : | This method creates the tables for the database from the metadata structure loaded by the | ||
610 : | constructor. It is expected this function will only be used on rare occasions, when the | ||
611 : | parrello | 1.2 | user needs to start with an empty database. Otherwise, the L</LoadTables> method can be |
612 : | parrello | 1.1 | used by itself with the truncate flag turned on. |
613 : | |||
614 : | =cut | ||
615 : | |||
616 : | sub CreateTables { | ||
617 : | parrello | 1.10 | # Get the parameters. |
618 : | my ($self) = @_; | ||
619 : | parrello | 1.23 | # Get the relation names. |
620 : | my @relNames = $self->GetTableNames(); | ||
621 : | # Loop through the relations. | ||
622 : | for my $relationName (@relNames) { | ||
623 : | # Create a table for this relation. | ||
624 : | $self->CreateTable($relationName); | ||
625 : | Trace("Relation $relationName created.") if T(2); | ||
626 : | parrello | 1.10 | } |
627 : | parrello | 1.1 | } |
628 : | |||
629 : | =head3 CreateTable | ||
630 : | |||
631 : | parrello | 1.18 | C<< $erdb->CreateTable($tableName, $indexFlag, $estimatedRows); >> |
632 : | parrello | 1.1 | |
633 : | Create the table for a relation and optionally create its indexes. | ||
634 : | |||
635 : | =over 4 | ||
636 : | |||
637 : | =item relationName | ||
638 : | |||
639 : | Name of the relation (which will also be the table name). | ||
640 : | |||
641 : | parrello | 1.18 | =item indexFlag |
642 : | parrello | 1.1 | |
643 : | TRUE if the indexes for the relation should be created, else FALSE. If FALSE, | ||
644 : | L</CreateIndexes> must be called later to bring the indexes into existence. | ||
645 : | |||
646 : | parrello | 1.18 | =item estimatedRows (optional) |
647 : | |||
648 : | If specified, the estimated maximum number of rows for the relation. This | ||
649 : | information allows the creation of tables using storage engines that are | ||
650 : | faster but require size estimates, such as MyISAM. | ||
651 : | |||
652 : | parrello | 1.1 | =back |
653 : | |||
654 : | =cut | ||
655 : | |||
656 : | sub CreateTable { | ||
657 : | parrello | 1.10 | # Get the parameters. |
658 : | parrello | 1.18 | my ($self, $relationName, $indexFlag, $estimatedRows) = @_; |
659 : | parrello | 1.10 | # Get the database handle. |
660 : | my $dbh = $self->{_dbh}; | ||
661 : | # Get the relation data and determine whether or not the relation is primary. | ||
662 : | my $relationData = $self->_FindRelation($relationName); | ||
663 : | my $rootFlag = $self->_IsPrimary($relationName); | ||
664 : | # Create a list of the field data. | ||
665 : | my @fieldList; | ||
666 : | for my $fieldData (@{$relationData->{Fields}}) { | ||
667 : | # Assemble the field name and type. | ||
668 : | my $fieldName = _FixName($fieldData->{name}); | ||
669 : | my $fieldString = "$fieldName $TypeTable{$fieldData->{type}}->{sqlType} NOT NULL "; | ||
670 : | # Push the result into the field list. | ||
671 : | push @fieldList, $fieldString; | ||
672 : | } | ||
673 : | # If this is a root table, add the "new_record" flag. It defaults to 0, so | ||
674 : | if ($rootFlag) { | ||
675 : | push @fieldList, "new_record $TypeTable{boolean}->{sqlType} NOT NULL DEFAULT 0"; | ||
676 : | } | ||
677 : | # Convert the field list into a comma-delimited string. | ||
678 : | my $fieldThing = join(', ', @fieldList); | ||
679 : | # Insure the table is not already there. | ||
680 : | $dbh->drop_table(tbl => $relationName); | ||
681 : | Trace("Table $relationName dropped.") if T(2); | ||
682 : | parrello | 1.18 | # If there are estimated rows, create an estimate so we can take advantage of |
683 : | # faster DB technologies. | ||
684 : | my $estimation = undef; | ||
685 : | if ($estimatedRows) { | ||
686 : | $estimation = [$self->EstimateRowSize($relationName), $estimatedRows]; | ||
687 : | } | ||
688 : | parrello | 1.10 | # Create the table. |
689 : | Trace("Creating table $relationName: $fieldThing") if T(2); | ||
690 : | parrello | 1.18 | $dbh->create_table(tbl => $relationName, flds => $fieldThing, estimates => $estimation); |
691 : | parrello | 1.10 | Trace("Relation $relationName created in database.") if T(2); |
692 : | # If we want to build the indexes, we do it here. | ||
693 : | if ($indexFlag) { | ||
694 : | $self->CreateIndex($relationName); | ||
695 : | } | ||
696 : | parrello | 1.1 | } |
697 : | |||
698 : | parrello | 1.31 | =head3 VerifyFields |
699 : | |||
700 : | C<< my $count = $erdb->VerifyFields($relName, \@fieldList); >> | ||
701 : | |||
702 : | Run through the list of proposed field values, insuring that all the character fields are | ||
703 : | below the maximum length. If any fields are too long, they will be truncated in place. | ||
704 : | |||
705 : | =over 4 | ||
706 : | |||
707 : | =item relName | ||
708 : | |||
709 : | Name of the relation for which the specified fields are destined. | ||
710 : | |||
711 : | =item fieldList | ||
712 : | |||
713 : | Reference to a list, in order, of the fields to be put into the relation. | ||
714 : | |||
715 : | =item RETURN | ||
716 : | |||
717 : | Returns the number of fields truncated. | ||
718 : | |||
719 : | =back | ||
720 : | |||
721 : | =cut | ||
722 : | |||
723 : | sub VerifyFields { | ||
724 : | # Get the parameters. | ||
725 : | my ($self, $relName, $fieldList) = @_; | ||
726 : | # Initialize the return value. | ||
727 : | my $retVal = 0; | ||
728 : | # Get the relation definition. | ||
729 : | my $relData = $self->_FindRelation($relName); | ||
730 : | # Get the list of field descriptors. | ||
731 : | my $fieldTypes = $relData->{Fields}; | ||
732 : | my $fieldCount = scalar @{$fieldTypes}; | ||
733 : | # Loop through the two lists. | ||
734 : | for (my $i = 0; $i < $fieldCount; $i++) { | ||
735 : | # Get the type of the current field. | ||
736 : | my $fieldType = $fieldTypes->[$i]->{type}; | ||
737 : | # If it's a character field, verify the length. | ||
738 : | if ($fieldType =~ /string/) { | ||
739 : | my $maxLen = $TypeTable{$fieldType}->{maxLen}; | ||
740 : | my $oldString = $fieldList->[$i]; | ||
741 : | if (length($oldString) > $maxLen) { | ||
742 : | # Here it's too big, so we truncate it. | ||
743 : | Trace("Truncating field $i in relation $relName to $maxLen characters from \"$oldString\".") if T(1); | ||
744 : | $fieldList->[$i] = substr $oldString, 0, $maxLen; | ||
745 : | $retVal++; | ||
746 : | } | ||
747 : | } | ||
748 : | } | ||
749 : | # Return the truncation count. | ||
750 : | return $retVal; | ||
751 : | } | ||
752 : | |||
753 : | parrello | 1.42 | =head3 DigestFields |
754 : | |||
755 : | C<< $erdb->DigestFields($relName, $fieldList); >> | ||
756 : | |||
757 : | Digest the strings in the field list that correspond to data type C<hash-string> in the | ||
758 : | specified relation. | ||
759 : | |||
760 : | =over 4 | ||
761 : | |||
762 : | =item relName | ||
763 : | |||
764 : | Name of the relation to which the fields belong. | ||
765 : | |||
766 : | =item fieldList | ||
767 : | |||
768 : | List of field contents to be loaded into the relation. | ||
769 : | |||
770 : | =back | ||
771 : | |||
772 : | =cut | ||
773 : | #: Return Type ; | ||
774 : | sub DigestFields { | ||
775 : | # Get the parameters. | ||
776 : | my ($self, $relName, $fieldList) = @_; | ||
777 : | # Get the relation definition. | ||
778 : | my $relData = $self->_FindRelation($relName); | ||
779 : | # Get the list of field descriptors. | ||
780 : | my $fieldTypes = $relData->{Fields}; | ||
781 : | my $fieldCount = scalar @{$fieldTypes}; | ||
782 : | # Loop through the two lists. | ||
783 : | for (my $i = 0; $i < $fieldCount; $i++) { | ||
784 : | # Get the type of the current field. | ||
785 : | my $fieldType = $fieldTypes->[$i]->{type}; | ||
786 : | # If it's a hash string, digest it in place. | ||
787 : | if ($fieldType eq 'hash-string') { | ||
788 : | parrello | 1.46 | $fieldList->[$i] = $self->DigestKey($fieldList->[$i]); |
789 : | parrello | 1.42 | } |
790 : | } | ||
791 : | } | ||
792 : | |||
793 : | parrello | 1.46 | =head3 DigestKey |
794 : | |||
795 : | C<< my $digested = $erdb->DigestKey($keyValue); >> | ||
796 : | |||
797 : | Return the digested value of a symbolic key. The digested value can then be plugged into a | ||
798 : | key-based search into a table with key-type hash-string. | ||
799 : | |||
800 : | Currently the digesting process is independent of the database structure, but that may not | ||
801 : | always be the case, so this is an instance method instead of a static method. | ||
802 : | |||
803 : | =over 4 | ||
804 : | |||
805 : | =item keyValue | ||
806 : | |||
807 : | Key value to digest. | ||
808 : | |||
809 : | =item RETURN | ||
810 : | |||
811 : | parrello | 1.56 | Digested value of the key. |
812 : | parrello | 1.46 | |
813 : | =back | ||
814 : | |||
815 : | =cut | ||
816 : | |||
817 : | sub DigestKey { | ||
818 : | # Get the parameters. | ||
819 : | my ($self, $keyValue) = @_; | ||
820 : | # Compute the digest. | ||
821 : | my $retVal = md5_base64($keyValue); | ||
822 : | # Return the result. | ||
823 : | return $retVal; | ||
824 : | } | ||
825 : | |||
826 : | parrello | 1.1 | =head3 CreateIndex |
827 : | |||
828 : | parrello | 1.18 | C<< $erdb->CreateIndex($relationName); >> |
829 : | parrello | 1.1 | |
830 : | Create the indexes for a relation. If a table is being loaded from a large source file (as | ||
831 : | parrello | 1.12 | is the case in L</LoadTable>), it is sometimes best to create the indexes after the load. |
832 : | If that is the case, then L</CreateTable> should be called with the index flag set to | ||
833 : | FALSE, and this method used after the load to create the indexes for the table. | ||
834 : | parrello | 1.1 | |
835 : | =cut | ||
836 : | |||
837 : | sub CreateIndex { | ||
838 : | parrello | 1.10 | # Get the parameters. |
839 : | my ($self, $relationName) = @_; | ||
840 : | # Get the relation's descriptor. | ||
841 : | my $relationData = $self->_FindRelation($relationName); | ||
842 : | # Get the database handle. | ||
843 : | my $dbh = $self->{_dbh}; | ||
844 : | # Now we need to create this relation's indexes. We do this by looping through its index table. | ||
845 : | my $indexHash = $relationData->{Indexes}; | ||
846 : | for my $indexName (keys %{$indexHash}) { | ||
847 : | my $indexData = $indexHash->{$indexName}; | ||
848 : | # Get the index's field list. | ||
849 : | my @fieldList = _FixNames(@{$indexData->{IndexFields}}); | ||
850 : | my $flds = join(', ', @fieldList); | ||
851 : | # Get the index's uniqueness flag. | ||
852 : | my $unique = (exists $indexData->{Unique} ? $indexData->{Unique} : 'false'); | ||
853 : | # Create the index. | ||
854 : | parrello | 1.24 | my $rv = $dbh->create_index(idx => $indexName, tbl => $relationName, |
855 : | flds => $flds, unique => $unique); | ||
856 : | if ($rv) { | ||
857 : | Trace("Index created: $indexName for $relationName ($flds)") if T(1); | ||
858 : | } else { | ||
859 : | Confess("Error creating index $indexName for $relationName using ($flds): " . $dbh->error_message()); | ||
860 : | } | ||
861 : | parrello | 1.10 | } |
862 : | parrello | 1.1 | } |
863 : | |||
864 : | =head3 LoadTables | ||
865 : | |||
866 : | parrello | 1.18 | C<< my $stats = $erdb->LoadTables($directoryName, $rebuild); >> |
867 : | parrello | 1.1 | |
868 : | This method will load the database tables from a directory. The tables must already have been created | ||
869 : | in the database. (This can be done by calling L</CreateTables>.) The caller passes in a directory name; | ||
870 : | all of the relations to be loaded must have a file in the directory with the same name as the relation | ||
871 : | (optionally with a suffix of C<.dtx>). Each file must be a tab-delimited table of field values. Each | ||
872 : | line of the file will be loaded as a row of the target relation table. The field values should be in | ||
873 : | the same order as the fields in the relation tables generated by L</ShowMetaData>. The old data is | ||
874 : | erased before the new data is loaded in. | ||
875 : | |||
876 : | A certain amount of translation automatically takes place. Ctrl-M characters are deleted, and | ||
877 : | tab and new-line characters inside a field are escaped as C<\t> and C<\n>, respectively. Dates must | ||
878 : | be entered as a Unix timestamp, that is, as an integer number of seconds since the base epoch. | ||
879 : | |||
880 : | =over 4 | ||
881 : | |||
882 : | =item directoryName | ||
883 : | |||
884 : | Name of the directory containing the relation files to be loaded. | ||
885 : | |||
886 : | =item rebuild | ||
887 : | |||
888 : | TRUE if the tables should be dropped and rebuilt, else FALSE. This is, unfortunately, the | ||
889 : | only way to erase existing data in the tables, since the TRUNCATE command is not supported | ||
890 : | by all of the DB engines we use. | ||
891 : | |||
892 : | =item RETURN | ||
893 : | |||
894 : | Returns a statistical object describing the number of records read and a list of the error messages. | ||
895 : | |||
896 : | =back | ||
897 : | |||
898 : | =cut | ||
899 : | |||
900 : | sub LoadTables { | ||
901 : | parrello | 1.10 | # Get the parameters. |
902 : | my ($self, $directoryName, $rebuild) = @_; | ||
903 : | # Start the timer. | ||
904 : | my $startTime = gettimeofday; | ||
905 : | # Clean any trailing slash from the directory name. | ||
906 : | $directoryName =~ s!/\\$!!; | ||
907 : | # Declare the return variable. | ||
908 : | my $retVal = Stats->new(); | ||
909 : | parrello | 1.23 | # Get the relation names. |
910 : | my @relNames = $self->GetTableNames(); | ||
911 : | for my $relationName (@relNames) { | ||
912 : | # Try to load this relation. | ||
913 : | my $result = $self->_LoadRelation($directoryName, $relationName, $rebuild); | ||
914 : | parrello | 1.10 | # Accumulate the statistics. |
915 : | $retVal->Accumulate($result); | ||
916 : | } | ||
917 : | # Add the duration of the load to the statistical object. | ||
918 : | $retVal->Add('duration', gettimeofday - $startTime); | ||
919 : | # Return the accumulated statistics. | ||
920 : | return $retVal; | ||
921 : | parrello | 1.1 | } |
922 : | |||
923 : | parrello | 1.23 | |
924 : | parrello | 1.1 | =head3 GetTableNames |
925 : | |||
926 : | parrello | 1.18 | C<< my @names = $erdb->GetTableNames; >> |
927 : | parrello | 1.1 | |
928 : | Return a list of the relations required to implement this database. | ||
929 : | |||
930 : | =cut | ||
931 : | |||
932 : | sub GetTableNames { | ||
933 : | parrello | 1.10 | # Get the parameters. |
934 : | my ($self) = @_; | ||
935 : | # Get the relation list from the metadata. | ||
936 : | my $relationTable = $self->{_metaData}->{RelationTable}; | ||
937 : | # Return the relation names. | ||
938 : | return keys %{$relationTable}; | ||
939 : | parrello | 1.1 | } |
940 : | |||
941 : | =head3 GetEntityTypes | ||
942 : | |||
943 : | parrello | 1.18 | C<< my @names = $erdb->GetEntityTypes; >> |
944 : | parrello | 1.1 | |
945 : | Return a list of the entity type names. | ||
946 : | |||
947 : | =cut | ||
948 : | |||
949 : | sub GetEntityTypes { | ||
950 : | parrello | 1.10 | # Get the database object. |
951 : | my ($self) = @_; | ||
952 : | # Get the entity list from the metadata object. | ||
953 : | my $entityList = $self->{_metaData}->{Entities}; | ||
954 : | # Return the list of entity names in alphabetical order. | ||
955 : | return sort keys %{$entityList}; | ||
956 : | parrello | 1.1 | } |
957 : | |||
958 : | parrello | 1.20 | =head3 IsEntity |
959 : | |||
960 : | C<< my $flag = $erdb->IsEntity($entityName); >> | ||
961 : | |||
962 : | Return TRUE if the parameter is an entity name, else FALSE. | ||
963 : | |||
964 : | =over 4 | ||
965 : | |||
966 : | =item entityName | ||
967 : | |||
968 : | Object name to be tested. | ||
969 : | |||
970 : | =item RETURN | ||
971 : | |||
972 : | Returns TRUE if the specified string is an entity name, else FALSE. | ||
973 : | |||
974 : | =back | ||
975 : | |||
976 : | =cut | ||
977 : | |||
978 : | sub IsEntity { | ||
979 : | # Get the parameters. | ||
980 : | my ($self, $entityName) = @_; | ||
981 : | # Test to see if it's an entity. | ||
982 : | return exists $self->{_metaData}->{Entities}->{$entityName}; | ||
983 : | } | ||
984 : | |||
985 : | parrello | 1.1 | =head3 Get |
986 : | |||
987 : | parrello | 1.45 | C<< my $query = $erdb->Get(\@objectNames, $filterClause, \@params); >> |
988 : | parrello | 1.1 | |
989 : | This method returns a query object for entities of a specified type using a specified filter. | ||
990 : | The filter is a standard WHERE/ORDER BY clause with question marks as parameter markers and each | ||
991 : | field name represented in the form B<I<objectName>(I<fieldName>)>. For example, the | ||
992 : | following call requests all B<Genome> objects for the genus specified in the variable | ||
993 : | $genus. | ||
994 : | |||
995 : | parrello | 1.45 | C<< $query = $erdb->Get(['Genome'], "Genome(genus) = ?", [$genus]); >> |
996 : | parrello | 1.1 | |
997 : | The WHERE clause contains a single question mark, so there is a single additional | ||
998 : | parameter representing the parameter value. It would also be possible to code | ||
999 : | |||
1000 : | parrello | 1.18 | C<< $query = $erdb->Get(['Genome'], "Genome(genus) = \'$genus\'"); >> |
1001 : | parrello | 1.1 | |
1002 : | however, this version of the call would generate a syntax error if there were any quote | ||
1003 : | characters inside the variable C<$genus>. | ||
1004 : | |||
1005 : | The use of the strange parenthesized notation for field names enables us to distinguish | ||
1006 : | hyphens contained within field names from minus signs that participate in the computation | ||
1007 : | of the WHERE clause. All of the methods that manipulate fields will use this same notation. | ||
1008 : | |||
1009 : | It is possible to specify multiple entity and relationship names in order to retrieve more than | ||
1010 : | one object's data at the same time, which allows highly complex joined queries. For example, | ||
1011 : | |||
1012 : | parrello | 1.45 | C<< $query = $erdb->Get(['Genome', 'ComesFrom', 'Source'], "Genome(genus) = ?", [$genus]); >> |
1013 : | parrello | 1.1 | |
1014 : | If multiple names are specified, then the query processor will automatically determine a | ||
1015 : | join path between the entities and relationships. The algorithm used is very simplistic. | ||
1016 : | parrello | 1.39 | In particular, if a relationship is recursive, the path is determined by the order in which |
1017 : | the entity and the relationship appear. For example, consider a recursive relationship | ||
1018 : | B<IsParentOf> which relates B<People> objects to other B<People> objects. If the join path is | ||
1019 : | parrello | 1.1 | coded as C<['People', 'IsParentOf']>, then the people returned will be parents. If, however, |
1020 : | the join path is C<['IsParentOf', 'People']>, then the people returned will be children. | ||
1021 : | |||
1022 : | parrello | 1.39 | If an entity or relationship is mentioned twice, the name for the second occurrence will |
1023 : | be suffixed with C<2>, the third occurrence will be suffixed with C<3>, and so forth. So, | ||
1024 : | for example, if we have C<['Feature', 'HasContig', 'Contig', 'HasContig']>, then the | ||
1025 : | B<to-link> field of the first B<HasContig> is specified as C<HasContig(to-link)>, while | ||
1026 : | the B<to-link> field of the second B<HasContig> is specified as C<HasContig2(to-link)>. | ||
1027 : | |||
1028 : | parrello | 1.1 | =over 4 |
1029 : | |||
1030 : | =item objectNames | ||
1031 : | |||
1032 : | List containing the names of the entity and relationship objects to be retrieved. | ||
1033 : | |||
1034 : | =item filterClause | ||
1035 : | |||
1036 : | WHERE clause (without the WHERE) to be used to filter and sort the query. The WHERE clause can | ||
1037 : | be parameterized with parameter markers (C<?>). Each field used in the WHERE clause must be | ||
1038 : | specified in the standard form B<I<objectName>(I<fieldName>)>. Any parameters specified | ||
1039 : | in the filter clause should be added to the parameter list as additional parameters. The | ||
1040 : | fields in a filter clause can come from primary entity relations, relationship relations, | ||
1041 : | or secondary entity relations; however, all of the entities and relationships involved must | ||
1042 : | be included in the list of object names. | ||
1043 : | |||
1044 : | The filter clause can also specify a sort order. To do this, simply follow the filter string | ||
1045 : | with an ORDER BY clause. For example, the following filter string gets all genomes for a | ||
1046 : | particular genus and sorts them by species name. | ||
1047 : | |||
1048 : | C<< "Genome(genus) = ? ORDER BY Genome(species)" >> | ||
1049 : | |||
1050 : | parrello | 1.30 | Note that the case is important. Only an uppercase "ORDER BY" with a single space will |
1051 : | be processed. The idea is to make it less likely to find the verb by accident. | ||
1052 : | |||
1053 : | parrello | 1.1 | The rules for field references in a sort order are the same as those for field references in the |
1054 : | filter clause in general; however, odd things may happen if a sort field is from a secondary | ||
1055 : | relation. | ||
1056 : | |||
1057 : | parrello | 1.39 | Finally, you can limit the number of rows returned by adding a LIMIT clause. The LIMIT must |
1058 : | be the last thing in the filter clause, and it contains only the word "LIMIT" followed by | ||
1059 : | a positive number. So, for example | ||
1060 : | |||
1061 : | C<< "Genome(genus) = ? ORDER BY Genome(species) LIMIT 10" >> | ||
1062 : | |||
1063 : | will only return the first ten genomes for the specified genus. The ORDER BY clause is not | ||
1064 : | required. For example, to just get the first 10 genomes in the B<Genome> table, you could | ||
1065 : | use | ||
1066 : | |||
1067 : | C<< "LIMIT 10" >> | ||
1068 : | |||
1069 : | parrello | 1.45 | =item params |
1070 : | parrello | 1.1 | |
1071 : | parrello | 1.45 | Reference to a list of parameter values to be substituted into the filter clause. |
1072 : | parrello | 1.1 | |
1073 : | =item RETURN | ||
1074 : | |||
1075 : | Returns a B<DBQuery> that can be used to iterate through all of the results. | ||
1076 : | |||
1077 : | =back | ||
1078 : | |||
1079 : | =cut | ||
1080 : | |||
1081 : | sub Get { | ||
1082 : | parrello | 1.10 | # Get the parameters. |
1083 : | parrello | 1.45 | my ($self, $objectNames, $filterClause, $params) = @_; |
1084 : | # Process the SQL stuff. | ||
1085 : | my ($suffix, $mappedNameListRef, $mappedNameHashRef) = | ||
1086 : | $self->_SetupSQL($objectNames, $filterClause); | ||
1087 : | # Create the query. | ||
1088 : | my $command = "SELECT DISTINCT " . join(".*, ", @{$mappedNameListRef}) . | ||
1089 : | ".* $suffix"; | ||
1090 : | my $sth = $self->_GetStatementHandle($command, $params); | ||
1091 : | parrello | 1.39 | # Now we create the relation map, which enables DBQuery to determine the order, name |
1092 : | # and mapped name for each object in the query. | ||
1093 : | my @relationMap = (); | ||
1094 : | parrello | 1.45 | for my $mappedName (@{$mappedNameListRef}) { |
1095 : | push @relationMap, [$mappedName, $mappedNameHashRef->{$mappedName}]; | ||
1096 : | parrello | 1.39 | } |
1097 : | parrello | 1.10 | # Return the statement object. |
1098 : | parrello | 1.39 | my $retVal = DBQuery::_new($self, $sth, \@relationMap); |
1099 : | parrello | 1.10 | return $retVal; |
1100 : | parrello | 1.1 | } |
1101 : | |||
1102 : | parrello | 1.45 | =head3 GetFlat |
1103 : | |||
1104 : | C<< my @list = $erdb->GetFlat(\@objectNames, $filterClause, \@parameterList, $field); >> | ||
1105 : | |||
1106 : | This is a variation of L</GetAll> that asks for only a single field per record and | ||
1107 : | returns a single flattened list. | ||
1108 : | |||
1109 : | =over 4 | ||
1110 : | |||
1111 : | =item objectNames | ||
1112 : | |||
1113 : | List containing the names of the entity and relationship objects to be retrieved. | ||
1114 : | |||
1115 : | =item filterClause | ||
1116 : | |||
1117 : | WHERE/ORDER BY clause (without the WHERE) to be used to filter and sort the query. The WHERE clause can | ||
1118 : | be parameterized with parameter markers (C<?>). Each field used must be specified in the standard form | ||
1119 : | B<I<objectName>(I<fieldName>)>. Any parameters specified in the filter clause should be added to the | ||
1120 : | parameter list as additional parameters. The fields in a filter clause can come from primary | ||
1121 : | entity relations, relationship relations, or secondary entity relations; however, all of the | ||
1122 : | entities and relationships involved must be included in the list of object names. | ||
1123 : | |||
1124 : | =item parameterList | ||
1125 : | |||
1126 : | List of the parameters to be substituted in for the parameters marks in the filter clause. | ||
1127 : | |||
1128 : | =item field | ||
1129 : | |||
1130 : | Name of the field to be used to get the elements of the list returned. | ||
1131 : | |||
1132 : | =item RETURN | ||
1133 : | |||
1134 : | Returns a list of values. | ||
1135 : | |||
1136 : | =back | ||
1137 : | |||
1138 : | =cut | ||
1139 : | #: Return Type @; | ||
1140 : | sub GetFlat { | ||
1141 : | # Get the parameters. | ||
1142 : | my ($self, $objectNames, $filterClause, $parameterList, $field) = @_; | ||
1143 : | # Construct the query. | ||
1144 : | my $query = $self->Get($objectNames, $filterClause, $parameterList); | ||
1145 : | # Create the result list. | ||
1146 : | my @retVal = (); | ||
1147 : | # Loop through the records, adding the field values found to the result list. | ||
1148 : | while (my $row = $query->Fetch()) { | ||
1149 : | push @retVal, $row->Value($field); | ||
1150 : | } | ||
1151 : | # Return the list created. | ||
1152 : | return @retVal; | ||
1153 : | } | ||
1154 : | |||
1155 : | parrello | 1.32 | =head3 Delete |
1156 : | |||
1157 : | C<< my $stats = $erdb->Delete($entityName, $objectID); >> | ||
1158 : | |||
1159 : | Delete an entity instance from the database. The instance is deleted along with all entity and | ||
1160 : | relationship instances dependent on it. The idea of dependence here is recursive. An object is | ||
1161 : | always dependent on itself. An object is dependent if it is a 1-to-many or many-to-many | ||
1162 : | relationship connected to a dependent entity or the "to" entity connected to a 1-to-many | ||
1163 : | dependent relationship. | ||
1164 : | |||
1165 : | =over 4 | ||
1166 : | |||
1167 : | =item entityName | ||
1168 : | |||
1169 : | Name of the entity type for the instance being deleted. | ||
1170 : | |||
1171 : | =item objectID | ||
1172 : | |||
1173 : | ID of the entity instance to be deleted. If the ID contains a wild card character (C<%>), | ||
1174 : | then it is presumed to by a LIKE pattern. | ||
1175 : | |||
1176 : | =item testFlag | ||
1177 : | |||
1178 : | If TRUE, the delete statements will be traced without being executed. | ||
1179 : | |||
1180 : | =item RETURN | ||
1181 : | |||
1182 : | Returns a statistics object indicating how many records of each particular table were | ||
1183 : | deleted. | ||
1184 : | |||
1185 : | =back | ||
1186 : | |||
1187 : | =cut | ||
1188 : | #: Return Type $%; | ||
1189 : | sub Delete { | ||
1190 : | # Get the parameters. | ||
1191 : | my ($self, $entityName, $objectID, $testFlag) = @_; | ||
1192 : | # Declare the return variable. | ||
1193 : | my $retVal = Stats->new(); | ||
1194 : | # Get the DBKernel object. | ||
1195 : | my $db = $self->{_dbh}; | ||
1196 : | # We're going to generate all the paths branching out from the starting entity. One of | ||
1197 : | # the things we have to be careful about is preventing loops. We'll use a hash to | ||
1198 : | # determine if we've hit a loop. | ||
1199 : | my %alreadyFound = (); | ||
1200 : | parrello | 1.33 | # These next lists will serve as our result stack. We start by pushing object lists onto |
1201 : | parrello | 1.32 | # the stack, and then popping them off to do the deletes. This means the deletes will |
1202 : | # start with the longer paths before getting to the shorter ones. That, in turn, makes | ||
1203 : | # sure we don't delete records that might be needed to forge relationships back to the | ||
1204 : | parrello | 1.33 | # original item. We have two lists-- one for TO-relationships, and one for |
1205 : | # FROM-relationships and entities. | ||
1206 : | my @fromPathList = (); | ||
1207 : | my @toPathList = (); | ||
1208 : | parrello | 1.32 | # This final hash is used to remember what work still needs to be done. We push paths |
1209 : | # onto the list, then pop them off to extend the paths. We prime it with the starting | ||
1210 : | # point. Note that we will work hard to insure that the last item on a path in the | ||
1211 : | # TODO list is always an entity. | ||
1212 : | my @todoList = ([$entityName]); | ||
1213 : | while (@todoList) { | ||
1214 : | # Get the current path. | ||
1215 : | my $current = pop @todoList; | ||
1216 : | # Copy it into a list. | ||
1217 : | my @stackedPath = @{$current}; | ||
1218 : | # Pull off the last item on the path. It will always be an entity. | ||
1219 : | my $entityName = pop @stackedPath; | ||
1220 : | # Add it to the alreadyFound list. | ||
1221 : | $alreadyFound{$entityName} = 1; | ||
1222 : | # Get the entity data. | ||
1223 : | my $entityData = $self->_GetStructure($entityName); | ||
1224 : | # The first task is to loop through the entity's relation. A DELETE command will | ||
1225 : | # be needed for each of them. | ||
1226 : | my $relations = $entityData->{Relations}; | ||
1227 : | for my $relation (keys %{$relations}) { | ||
1228 : | my @augmentedList = (@stackedPath, $relation); | ||
1229 : | parrello | 1.33 | push @fromPathList, \@augmentedList; |
1230 : | parrello | 1.32 | } |
1231 : | # Now we need to look for relationships connected to this entity. | ||
1232 : | my $relationshipList = $self->{_metaData}->{Relationships}; | ||
1233 : | for my $relationshipName (keys %{$relationshipList}) { | ||
1234 : | my $relationship = $relationshipList->{$relationshipName}; | ||
1235 : | # Check the FROM field. We're only interested if it's us. | ||
1236 : | if ($relationship->{from} eq $entityName) { | ||
1237 : | # Add the path to this relationship. | ||
1238 : | my @augmentedList = (@stackedPath, $entityName, $relationshipName); | ||
1239 : | parrello | 1.33 | push @fromPathList, \@augmentedList; |
1240 : | parrello | 1.32 | # Check the arity. If it's MM we're done. If it's 1M |
1241 : | # and the target hasn't been seen yet, we want to | ||
1242 : | # stack the entity for future processing. | ||
1243 : | if ($relationship->{arity} eq '1M') { | ||
1244 : | my $toEntity = $relationship->{to}; | ||
1245 : | if (! exists $alreadyFound{$toEntity}) { | ||
1246 : | # Here we have a new entity that's dependent on | ||
1247 : | # the current entity, so we need to stack it. | ||
1248 : | my @stackList = (@augmentedList, $toEntity); | ||
1249 : | parrello | 1.33 | push @fromPathList, \@stackList; |
1250 : | parrello | 1.34 | } else { |
1251 : | Trace("$toEntity ignored because it occurred previously.") if T(4); | ||
1252 : | parrello | 1.32 | } |
1253 : | } | ||
1254 : | } | ||
1255 : | # Now check the TO field. In this case only the relationship needs | ||
1256 : | parrello | 1.33 | # deletion. |
1257 : | parrello | 1.32 | if ($relationship->{to} eq $entityName) { |
1258 : | my @augmentedList = (@stackedPath, $entityName, $relationshipName); | ||
1259 : | parrello | 1.33 | push @toPathList, \@augmentedList; |
1260 : | parrello | 1.32 | } |
1261 : | } | ||
1262 : | } | ||
1263 : | # Create the first qualifier for the WHERE clause. This selects the | ||
1264 : | # keys of the primary entity records to be deleted. When we're deleting | ||
1265 : | # from a dependent table, we construct a join page from the first qualifier | ||
1266 : | # to the table containing the dependent records to delete. | ||
1267 : | my $qualifier = ($objectID =~ /%/ ? "LIKE ?" : "= ?"); | ||
1268 : | parrello | 1.33 | # We need to make two passes. The first is through the to-list, and |
1269 : | # the second through the from-list. The from-list is second because | ||
1270 : | # the to-list may need to pass through some of the entities the | ||
1271 : | # from-list would delete. | ||
1272 : | my %stackList = ( from_link => \@fromPathList, to_link => \@toPathList ); | ||
1273 : | # Now it's time to do the deletes. We do it in two passes. | ||
1274 : | for my $keyName ('to_link', 'from_link') { | ||
1275 : | # Get the list for this key. | ||
1276 : | my @pathList = @{$stackList{$keyName}}; | ||
1277 : | parrello | 1.34 | Trace(scalar(@pathList) . " entries in path list for $keyName.") if T(3); |
1278 : | parrello | 1.33 | # Loop through this list. |
1279 : | while (my $path = pop @pathList) { | ||
1280 : | # Get the table whose rows are to be deleted. | ||
1281 : | my @pathTables = @{$path}; | ||
1282 : | parrello | 1.37 | # Start the DELETE statement. We need to call DBKernel because the |
1283 : | # syntax of a DELETE-USING varies among DBMSs. | ||
1284 : | parrello | 1.33 | my $target = $pathTables[$#pathTables]; |
1285 : | parrello | 1.37 | my $stmt = $db->SetUsing(@pathTables); |
1286 : | parrello | 1.33 | # Now start the WHERE. The first thing is the ID field from the starting table. That |
1287 : | # starting table will either be the entity relation or one of the entity's | ||
1288 : | # sub-relations. | ||
1289 : | $stmt .= " WHERE $pathTables[0].id $qualifier"; | ||
1290 : | # Now we run through the remaining entities in the path, connecting them up. | ||
1291 : | for (my $i = 1; $i <= $#pathTables; $i += 2) { | ||
1292 : | # Connect the current relationship to the preceding entity. | ||
1293 : | my ($entity, $rel) = @pathTables[$i-1,$i]; | ||
1294 : | # The style of connection depends on the direction of the relationship. | ||
1295 : | parrello | 1.35 | $stmt .= " AND $entity.id = $rel.$keyName"; |
1296 : | parrello | 1.32 | if ($i + 1 <= $#pathTables) { |
1297 : | # Here there's a next entity, so connect that to the relationship's | ||
1298 : | # to-link. | ||
1299 : | my $entity2 = $pathTables[$i+1]; | ||
1300 : | parrello | 1.35 | $stmt .= " AND $rel.to_link = $entity2.id"; |
1301 : | parrello | 1.32 | } |
1302 : | } | ||
1303 : | parrello | 1.33 | # Now we have our desired DELETE statement. |
1304 : | if ($testFlag) { | ||
1305 : | # Here the user wants to trace without executing. | ||
1306 : | Trace($stmt) if T(0); | ||
1307 : | } else { | ||
1308 : | # Here we can delete. Note that the SQL method dies with a confessing | ||
1309 : | # if an error occurs, so we just go ahead and do it. | ||
1310 : | parrello | 1.36 | Trace("Executing delete from $target using '$objectID'.") if T(3); |
1311 : | my $rv = $db->SQL($stmt, 0, $objectID); | ||
1312 : | parrello | 1.33 | # Accumulate the statistics for this delete. The only rows deleted |
1313 : | # are from the target table, so we use its name to record the | ||
1314 : | # statistic. | ||
1315 : | $retVal->Add($target, $rv); | ||
1316 : | } | ||
1317 : | parrello | 1.32 | } |
1318 : | } | ||
1319 : | # Return the result. | ||
1320 : | return $retVal; | ||
1321 : | } | ||
1322 : | |||
1323 : | parrello | 1.6 | =head3 GetList |
1324 : | |||
1325 : | parrello | 1.45 | C<< my @dbObjects = $erdb->GetList(\@objectNames, $filterClause, \@params); >> |
1326 : | parrello | 1.6 | |
1327 : | Return a list of object descriptors for the specified objects as determined by the | ||
1328 : | specified filter clause. | ||
1329 : | |||
1330 : | This method is essentially the same as L</Get> except it returns a list of objects rather | ||
1331 : | parrello | 1.7 | than a query object that can be used to get the results one record at a time. |
1332 : | parrello | 1.6 | |
1333 : | =over 4 | ||
1334 : | |||
1335 : | =item objectNames | ||
1336 : | |||
1337 : | List containing the names of the entity and relationship objects to be retrieved. | ||
1338 : | |||
1339 : | =item filterClause | ||
1340 : | |||
1341 : | WHERE clause (without the WHERE) to be used to filter and sort the query. The WHERE clause can | ||
1342 : | be parameterized with parameter markers (C<?>). Each field used in the WHERE clause must be | ||
1343 : | specified in the standard form B<I<objectName>(I<fieldName>)>. Any parameters specified | ||
1344 : | in the filter clause should be added to the parameter list as additional parameters. The | ||
1345 : | fields in a filter clause can come from primary entity relations, relationship relations, | ||
1346 : | or secondary entity relations; however, all of the entities and relationships involved must | ||
1347 : | be included in the list of object names. | ||
1348 : | |||
1349 : | The filter clause can also specify a sort order. To do this, simply follow the filter string | ||
1350 : | with an ORDER BY clause. For example, the following filter string gets all genomes for a | ||
1351 : | particular genus and sorts them by species name. | ||
1352 : | |||
1353 : | C<< "Genome(genus) = ? ORDER BY Genome(species)" >> | ||
1354 : | |||
1355 : | The rules for field references in a sort order are the same as those for field references in the | ||
1356 : | filter clause in general; however, odd things may happen if a sort field is from a secondary | ||
1357 : | relation. | ||
1358 : | |||
1359 : | parrello | 1.45 | =item params |
1360 : | parrello | 1.6 | |
1361 : | parrello | 1.45 | Reference to a list of parameter values to be substituted into the filter clause. |
1362 : | parrello | 1.6 | |
1363 : | =item RETURN | ||
1364 : | |||
1365 : | Returns a list of B<DBObject>s that satisfy the query conditions. | ||
1366 : | |||
1367 : | =back | ||
1368 : | |||
1369 : | =cut | ||
1370 : | #: Return Type @% | ||
1371 : | sub GetList { | ||
1372 : | # Get the parameters. | ||
1373 : | parrello | 1.45 | my ($self, $objectNames, $filterClause, $params) = @_; |
1374 : | parrello | 1.10 | # Declare the return variable. |
1375 : | my @retVal = (); | ||
1376 : | # Perform the query. | ||
1377 : | parrello | 1.45 | my $query = $self->Get($objectNames, $filterClause, $params); |
1378 : | parrello | 1.10 | # Loop through the results. |
1379 : | while (my $object = $query->Fetch) { | ||
1380 : | push @retVal, $object; | ||
1381 : | } | ||
1382 : | parrello | 1.6 | # Return the result. |
1383 : | return @retVal; | ||
1384 : | } | ||
1385 : | |||
1386 : | parrello | 1.45 | =head3 GetCount |
1387 : | |||
1388 : | C<< my $count = $erdb->GetCount(\@objectNames, $filter, \@params); >> | ||
1389 : | |||
1390 : | Return the number of rows found by a specified query. This method would | ||
1391 : | normally be used to count the records in a single table. For example, in a | ||
1392 : | genetics database | ||
1393 : | |||
1394 : | my $count = $erdb->GetCount(['Genome'], 'Genome(genus-species) LIKE ?', ['homo %']); | ||
1395 : | |||
1396 : | would return the number of genomes for the genus I<homo>. It is conceivable, however, | ||
1397 : | to use it to return records based on a join. For example, | ||
1398 : | |||
1399 : | parrello | 1.47 | my $count = $erdb->GetCount(['HasFeature', 'Genome'], 'Genome(genus-species) LIKE ?', |
1400 : | parrello | 1.45 | ['homo %']); |
1401 : | |||
1402 : | would return the number of features for genomes in the genus I<homo>. Note that | ||
1403 : | only the rows from the first table are counted. If the above command were | ||
1404 : | |||
1405 : | my $count = $erdb->GetCount(['Genome', 'Feature'], 'Genome(genus-species) LIKE ?', | ||
1406 : | ['homo %']); | ||
1407 : | |||
1408 : | it would return the number of genomes, not the number of genome/feature pairs. | ||
1409 : | |||
1410 : | =over 4 | ||
1411 : | |||
1412 : | =item objectNames | ||
1413 : | |||
1414 : | Reference to a list of the objects (entities and relationships) included in the | ||
1415 : | query. | ||
1416 : | |||
1417 : | =item filter | ||
1418 : | |||
1419 : | A filter clause for restricting the query. The rules are the same as for the L</Get> | ||
1420 : | method. | ||
1421 : | |||
1422 : | =item params | ||
1423 : | |||
1424 : | Reference to a list of the parameter values to be substituted for the parameter marks | ||
1425 : | in the filter. | ||
1426 : | |||
1427 : | =item RETURN | ||
1428 : | |||
1429 : | Returns a count of the number of records in the first table that would satisfy | ||
1430 : | the query. | ||
1431 : | |||
1432 : | =back | ||
1433 : | |||
1434 : | =cut | ||
1435 : | |||
1436 : | sub GetCount { | ||
1437 : | # Get the parameters. | ||
1438 : | my ($self, $objectNames, $filter, $params) = @_; | ||
1439 : | # Declare the return variable. | ||
1440 : | my $retVal; | ||
1441 : | parrello | 1.47 | # Find out if we're counting an entity or a relationship. |
1442 : | my $countedField; | ||
1443 : | if ($self->IsEntity($objectNames->[0])) { | ||
1444 : | $countedField = "id"; | ||
1445 : | } else { | ||
1446 : | # For a relationship we count the to-link because it's usually more | ||
1447 : | # numerous. Note we're automatically converting to the SQL form | ||
1448 : | # of the field name (to_link vs. to-link). | ||
1449 : | $countedField = "to_link"; | ||
1450 : | } | ||
1451 : | parrello | 1.45 | # Create the SQL command suffix to get the desired records. |
1452 : | my ($suffix, $mappedNameListRef, $mappedNameHashRef) = $self->_SetupSQL($objectNames, | ||
1453 : | $filter); | ||
1454 : | # Prefix it with text telling it we want a record count. | ||
1455 : | my $firstObject = $mappedNameListRef->[0]; | ||
1456 : | parrello | 1.47 | my $command = "SELECT COUNT($firstObject.$countedField) $suffix"; |
1457 : | parrello | 1.45 | # Prepare and execute the command. |
1458 : | my $sth = $self->_GetStatementHandle($command, $params); | ||
1459 : | # Get the count value. | ||
1460 : | ($retVal) = $sth->fetchrow_array(); | ||
1461 : | # Check for a problem. | ||
1462 : | if (! defined($retVal)) { | ||
1463 : | if ($sth->err) { | ||
1464 : | # Here we had an SQL error. | ||
1465 : | Confess("Error retrieving row count: " . $sth->errstr()); | ||
1466 : | } else { | ||
1467 : | # Here we have no result. | ||
1468 : | Confess("No result attempting to retrieve row count."); | ||
1469 : | } | ||
1470 : | } | ||
1471 : | # Return the result. | ||
1472 : | return $retVal; | ||
1473 : | } | ||
1474 : | |||
1475 : | parrello | 1.1 | =head3 ComputeObjectSentence |
1476 : | |||
1477 : | parrello | 1.18 | C<< my $sentence = $erdb->ComputeObjectSentence($objectName); >> |
1478 : | parrello | 1.1 | |
1479 : | Check an object name, and if it is a relationship convert it to a relationship sentence. | ||
1480 : | |||
1481 : | =over 4 | ||
1482 : | |||
1483 : | =item objectName | ||
1484 : | |||
1485 : | Name of the entity or relationship. | ||
1486 : | |||
1487 : | =item RETURN | ||
1488 : | |||
1489 : | Returns a string containing the entity name or a relationship sentence. | ||
1490 : | |||
1491 : | =back | ||
1492 : | |||
1493 : | =cut | ||
1494 : | |||
1495 : | sub ComputeObjectSentence { | ||
1496 : | parrello | 1.10 | # Get the parameters. |
1497 : | my ($self, $objectName) = @_; | ||
1498 : | # Set the default return value. | ||
1499 : | my $retVal = $objectName; | ||
1500 : | # Look for the object as a relationship. | ||
1501 : | my $relTable = $self->{_metaData}->{Relationships}; | ||
1502 : | if (exists $relTable->{$objectName}) { | ||
1503 : | # Get the relationship sentence. | ||
1504 : | $retVal = _ComputeRelationshipSentence($objectName, $relTable->{$objectName}); | ||
1505 : | } | ||
1506 : | # Return the result. | ||
1507 : | return $retVal; | ||
1508 : | parrello | 1.1 | } |
1509 : | |||
1510 : | =head3 DumpRelations | ||
1511 : | |||
1512 : | parrello | 1.18 | C<< $erdb->DumpRelations($outputDirectory); >> |
1513 : | parrello | 1.1 | |
1514 : | Write the contents of all the relations to tab-delimited files in the specified directory. | ||
1515 : | Each file will have the same name as the relation dumped, with an extension of DTX. | ||
1516 : | |||
1517 : | =over 4 | ||
1518 : | |||
1519 : | =item outputDirectory | ||
1520 : | |||
1521 : | Name of the directory into which the relation files should be dumped. | ||
1522 : | |||
1523 : | =back | ||
1524 : | |||
1525 : | =cut | ||
1526 : | |||
1527 : | sub DumpRelations { | ||
1528 : | parrello | 1.10 | # Get the parameters. |
1529 : | my ($self, $outputDirectory) = @_; | ||
1530 : | # Now we need to run through all the relations. First, we loop through the entities. | ||
1531 : | my $metaData = $self->{_metaData}; | ||
1532 : | my $entities = $metaData->{Entities}; | ||
1533 : | for my $entityName (keys %{$entities}) { | ||
1534 : | my $entityStructure = $entities->{$entityName}; | ||
1535 : | # Get the entity's relations. | ||
1536 : | my $relationList = $entityStructure->{Relations}; | ||
1537 : | # Loop through the relations, dumping them. | ||
1538 : | for my $relationName (keys %{$relationList}) { | ||
1539 : | my $relation = $relationList->{$relationName}; | ||
1540 : | $self->_DumpRelation($outputDirectory, $relationName, $relation); | ||
1541 : | } | ||
1542 : | } | ||
1543 : | # Next, we loop through the relationships. | ||
1544 : | my $relationships = $metaData->{Relationships}; | ||
1545 : | for my $relationshipName (keys %{$relationships}) { | ||
1546 : | my $relationshipStructure = $relationships->{$relationshipName}; | ||
1547 : | # Dump this relationship's relation. | ||
1548 : | $self->_DumpRelation($outputDirectory, $relationshipName, $relationshipStructure->{Relations}->{$relationshipName}); | ||
1549 : | } | ||
1550 : | parrello | 1.1 | } |
1551 : | |||
1552 : | parrello | 1.51 | =head3 InsertValue |
1553 : | |||
1554 : | C<< $erdb->InsertValue($entityID, $fieldName, $value); >> | ||
1555 : | |||
1556 : | This method will insert a new value into the database. The value must be one | ||
1557 : | associated with a secondary relation, since primary values cannot be inserted: | ||
1558 : | they occur exactly once. Secondary values, on the other hand, can be missing | ||
1559 : | or multiply-occurring. | ||
1560 : | |||
1561 : | =over 4 | ||
1562 : | |||
1563 : | =item entityID | ||
1564 : | |||
1565 : | ID of the object that is to receive the new value. | ||
1566 : | |||
1567 : | =item fieldName | ||
1568 : | |||
1569 : | Field name for the new value-- this includes the entity name, since | ||
1570 : | field names are of the format I<objectName>C<(>I<fieldName>C<)>. | ||
1571 : | |||
1572 : | =item value | ||
1573 : | |||
1574 : | New value to be put in the field. | ||
1575 : | |||
1576 : | =back | ||
1577 : | |||
1578 : | =cut | ||
1579 : | |||
1580 : | sub InsertValue { | ||
1581 : | # Get the parameters. | ||
1582 : | my ($self, $entityID, $fieldName, $value) = @_; | ||
1583 : | # Parse the entity name and the real field name. | ||
1584 : | if ($fieldName =~ /^([^(]+)\(([^)]+)\)/) { | ||
1585 : | my $entityName = $1; | ||
1586 : | my $fieldTitle = $2; | ||
1587 : | # Get its descriptor. | ||
1588 : | if (!$self->IsEntity($entityName)) { | ||
1589 : | Confess("$entityName is not a valid entity."); | ||
1590 : | } else { | ||
1591 : | my $entityData = $self->{_metaData}->{Entities}->{$entityName}; | ||
1592 : | # Find the relation containing this field. | ||
1593 : | my $fieldHash = $entityData->{Fields}; | ||
1594 : | parrello | 1.52 | if (! exists $fieldHash->{$fieldTitle}) { |
1595 : | parrello | 1.51 | Confess("$fieldTitle not found in $entityName."); |
1596 : | } else { | ||
1597 : | my $relation = $fieldHash->{$fieldTitle}->{relation}; | ||
1598 : | if ($relation eq $entityName) { | ||
1599 : | Confess("Cannot do InsertValue on primary field $fieldTitle of $entityName."); | ||
1600 : | } else { | ||
1601 : | # Now we can create an INSERT statement. | ||
1602 : | my $dbh = $self->{_dbh}; | ||
1603 : | my $fixedName = _FixName($fieldTitle); | ||
1604 : | parrello | 1.53 | my $statement = "INSERT INTO $relation (id, $fixedName) VALUES(?, ?)"; |
1605 : | parrello | 1.51 | # Execute the command. |
1606 : | $dbh->SQL($statement, 0, $entityID, $value); | ||
1607 : | } | ||
1608 : | } | ||
1609 : | } | ||
1610 : | } else { | ||
1611 : | Confess("$fieldName is not a valid field name."); | ||
1612 : | } | ||
1613 : | } | ||
1614 : | |||
1615 : | parrello | 1.1 | =head3 InsertObject |
1616 : | |||
1617 : | parrello | 1.18 | C<< my $ok = $erdb->InsertObject($objectType, \%fieldHash); >> |
1618 : | parrello | 1.1 | |
1619 : | Insert an object into the database. The object is defined by a type name and then a hash | ||
1620 : | of field names to values. Field values in the primary relation are represented by scalars. | ||
1621 : | (Note that for relationships, the primary relation is the B<only> relation.) | ||
1622 : | Field values for the other relations comprising the entity are always list references. For | ||
1623 : | example, the following line inserts an inactive PEG feature named C<fig|188.1.peg.1> with aliases | ||
1624 : | C<ZP_00210270.1> and C<gi|46206278>. | ||
1625 : | |||
1626 : | parrello | 1.18 | C<< $erdb->InsertObject('Feature', { id => 'fig|188.1.peg.1', active => 0, feature-type => 'peg', alias => ['ZP_00210270.1', 'gi|46206278']}); >> |
1627 : | parrello | 1.1 | |
1628 : | The next statement inserts a C<HasProperty> relationship between feature C<fig|158879.1.peg.1> and | ||
1629 : | property C<4> with an evidence URL of C<http://seedu.uchicago.edu/query.cgi?article_id=142>. | ||
1630 : | |||
1631 : | parrello | 1.57 | C<< $erdb->InsertObject('HasProperty', { 'from-link' => 'fig|158879.1.peg.1', 'to-link' => 4, evidence => 'http://seedu.uchicago.edu/query.cgi?article_id=142'}); >> |
1632 : | parrello | 1.1 | |
1633 : | =over 4 | ||
1634 : | |||
1635 : | =item newObjectType | ||
1636 : | |||
1637 : | Type name of the object to insert. | ||
1638 : | |||
1639 : | =item fieldHash | ||
1640 : | |||
1641 : | Hash of field names to values. | ||
1642 : | |||
1643 : | =item RETURN | ||
1644 : | |||
1645 : | Returns 1 if successful, 0 if an error occurred. | ||
1646 : | |||
1647 : | =back | ||
1648 : | |||
1649 : | =cut | ||
1650 : | |||
1651 : | sub InsertObject { | ||
1652 : | parrello | 1.10 | # Get the parameters. |
1653 : | my ($self, $newObjectType, $fieldHash) = @_; | ||
1654 : | # Denote that so far we appear successful. | ||
1655 : | my $retVal = 1; | ||
1656 : | # Get the database handle. | ||
1657 : | my $dbh = $self->{_dbh}; | ||
1658 : | # Get the relation list. | ||
1659 : | my $relationTable = $self->_GetRelationTable($newObjectType); | ||
1660 : | # Loop through the relations. We'll build insert statements for each one. If a relation is | ||
1661 : | # secondary, we may end up generating multiple insert statements. If an error occurs, we | ||
1662 : | # stop the loop. | ||
1663 : | my @relationList = keys %{$relationTable}; | ||
1664 : | for (my $i = 0; $retVal && $i <= $#relationList; $i++) { | ||
1665 : | my $relationName = $relationList[$i]; | ||
1666 : | my $relationDefinition = $relationTable->{$relationName}; | ||
1667 : | # Get the relation's fields. For each field we will collect a value in the corresponding | ||
1668 : | # position of the @valueList array. If one of the fields is missing, we will add it to the | ||
1669 : | # @missing list. | ||
1670 : | my @fieldList = @{$relationDefinition->{Fields}}; | ||
1671 : | my @fieldNameList = (); | ||
1672 : | my @valueList = (); | ||
1673 : | my @missing = (); | ||
1674 : | my $recordCount = 1; | ||
1675 : | for my $fieldDescriptor (@fieldList) { | ||
1676 : | # Get the field name and save it. Note we need to fix it up so the hyphens | ||
1677 : | # are converted to underscores. | ||
1678 : | my $fieldName = $fieldDescriptor->{name}; | ||
1679 : | push @fieldNameList, _FixName($fieldName); | ||
1680 : | # Look for the named field in the incoming structure. Note that we are looking | ||
1681 : | # for the real field name, not the fixed-up one! | ||
1682 : | if (exists $fieldHash->{$fieldName}) { | ||
1683 : | # Here we found the field. Stash it in the value list. | ||
1684 : | my $value = $fieldHash->{$fieldName}; | ||
1685 : | push @valueList, $value; | ||
1686 : | # If the value is a list, we may need to increment the record count. | ||
1687 : | if (ref $value eq "ARRAY") { | ||
1688 : | my $thisCount = @{$value}; | ||
1689 : | if ($recordCount == 1) { | ||
1690 : | # Here we have our first list, so we save its count. | ||
1691 : | $recordCount = $thisCount; | ||
1692 : | } elsif ($recordCount != $thisCount) { | ||
1693 : | # Here we have a second list, so its length has to match the | ||
1694 : | # previous lists. | ||
1695 : | Trace("Field $value in new $newObjectType object has an invalid list length $thisCount. Expected $recordCount.") if T(0); | ||
1696 : | $retVal = 0; | ||
1697 : | } | ||
1698 : | } | ||
1699 : | } else { | ||
1700 : | # Here the field is not present. Flag it as missing. | ||
1701 : | push @missing, $fieldName; | ||
1702 : | } | ||
1703 : | } | ||
1704 : | # If we are the primary relation, add the new-record flag. | ||
1705 : | if ($relationName eq $newObjectType) { | ||
1706 : | push @valueList, 1; | ||
1707 : | push @fieldNameList, "new_record"; | ||
1708 : | } | ||
1709 : | # Only proceed if there are no missing fields. | ||
1710 : | if (@missing > 0) { | ||
1711 : | Trace("Relation $relationName for $newObjectType skipped due to missing fields: " . | ||
1712 : | join(' ', @missing)) if T(1); | ||
1713 : | } else { | ||
1714 : | # Build the INSERT statement. | ||
1715 : | my $statement = "INSERT INTO $relationName (" . join (', ', @fieldNameList) . | ||
1716 : | ") VALUES ("; | ||
1717 : | # Create a marker list of the proper size and put it in the statement. | ||
1718 : | my @markers = (); | ||
1719 : | while (@markers < @fieldNameList) { push @markers, '?'; } | ||
1720 : | $statement .= join(', ', @markers) . ")"; | ||
1721 : | # We have the insert statement, so prepare it. | ||
1722 : | my $sth = $dbh->prepare_command($statement); | ||
1723 : | Trace("Insert statement prepared: $statement") if T(3); | ||
1724 : | # Now we loop through the values. If a value is scalar, we use it unmodified. If it's | ||
1725 : | # a list, we use the current element. The values are stored in the @parameterList array. | ||
1726 : | my $done = 0; | ||
1727 : | for (my $i = 0; $i < $recordCount; $i++) { | ||
1728 : | # Clear the parameter list array. | ||
1729 : | my @parameterList = (); | ||
1730 : | # Loop through the values. | ||
1731 : | for my $value (@valueList) { | ||
1732 : | # Check to see if this is a scalar value. | ||
1733 : | if (ref $value eq "ARRAY") { | ||
1734 : | # Here we have a list value. Pull the current entry. | ||
1735 : | push @parameterList, $value->[$i]; | ||
1736 : | } else { | ||
1737 : | # Here we have a scalar value. Use it unmodified. | ||
1738 : | push @parameterList, $value; | ||
1739 : | } | ||
1740 : | } | ||
1741 : | # Execute the INSERT statement with the specified parameter list. | ||
1742 : | $retVal = $sth->execute(@parameterList); | ||
1743 : | if (!$retVal) { | ||
1744 : | my $errorString = $sth->errstr(); | ||
1745 : | Trace("Insert error: $errorString.") if T(0); | ||
1746 : | } | ||
1747 : | } | ||
1748 : | } | ||
1749 : | } | ||
1750 : | # Return the success indicator. | ||
1751 : | return $retVal; | ||
1752 : | parrello | 1.1 | } |
1753 : | |||
1754 : | =head3 LoadTable | ||
1755 : | |||
1756 : | parrello | 1.18 | C<< my %results = $erdb->LoadTable($fileName, $relationName, $truncateFlag); >> |
1757 : | parrello | 1.1 | |
1758 : | parrello | 1.9 | Load data from a tab-delimited file into a specified table, optionally re-creating the table |
1759 : | first. | ||
1760 : | parrello | 1.1 | |
1761 : | =over 4 | ||
1762 : | |||
1763 : | =item fileName | ||
1764 : | |||
1765 : | Name of the file from which the table data should be loaded. | ||
1766 : | |||
1767 : | =item relationName | ||
1768 : | |||
1769 : | Name of the relation to be loaded. This is the same as the table name. | ||
1770 : | |||
1771 : | =item truncateFlag | ||
1772 : | |||
1773 : | TRUE if the table should be dropped and re-created, else FALSE | ||
1774 : | |||
1775 : | =item RETURN | ||
1776 : | |||
1777 : | parrello | 1.28 | Returns a statistical object containing a list of the error messages. |
1778 : | parrello | 1.1 | |
1779 : | =back | ||
1780 : | |||
1781 : | =cut | ||
1782 : | sub LoadTable { | ||
1783 : | parrello | 1.10 | # Get the parameters. |
1784 : | my ($self, $fileName, $relationName, $truncateFlag) = @_; | ||
1785 : | # Create the statistical return object. | ||
1786 : | my $retVal = _GetLoadStats(); | ||
1787 : | # Trace the fact of the load. | ||
1788 : | Trace("Loading table $relationName from $fileName") if T(2); | ||
1789 : | # Get the database handle. | ||
1790 : | my $dbh = $self->{_dbh}; | ||
1791 : | parrello | 1.22 | # Get the input file size. |
1792 : | my $fileSize = -s $fileName; | ||
1793 : | parrello | 1.10 | # Get the relation data. |
1794 : | my $relation = $self->_FindRelation($relationName); | ||
1795 : | # Check the truncation flag. | ||
1796 : | if ($truncateFlag) { | ||
1797 : | Trace("Creating table $relationName") if T(2); | ||
1798 : | parrello | 1.19 | # Compute the row count estimate. We take the size of the load file, |
1799 : | # divide it by the estimated row size, and then multiply by 1.5 to | ||
1800 : | # leave extra room. We postulate a minimum row count of 1000 to | ||
1801 : | # prevent problems with incoming empty load files. | ||
1802 : | my $rowSize = $self->EstimateRowSize($relationName); | ||
1803 : | my $estimate = FIG::max($fileSize * 1.5 / $rowSize, 1000); | ||
1804 : | parrello | 1.10 | # Re-create the table without its index. |
1805 : | parrello | 1.19 | $self->CreateTable($relationName, 0, $estimate); |
1806 : | parrello | 1.10 | # If this is a pre-index DBMS, create the index here. |
1807 : | if ($dbh->{_preIndex}) { | ||
1808 : | eval { | ||
1809 : | $self->CreateIndex($relationName); | ||
1810 : | }; | ||
1811 : | if ($@) { | ||
1812 : | $retVal->AddMessage($@); | ||
1813 : | } | ||
1814 : | } | ||
1815 : | } | ||
1816 : | parrello | 1.3 | # Load the table. |
1817 : | parrello | 1.10 | my $rv; |
1818 : | eval { | ||
1819 : | parrello | 1.20 | $rv = $dbh->load_table(file => $fileName, tbl => $relationName); |
1820 : | parrello | 1.10 | }; |
1821 : | if (!defined $rv) { | ||
1822 : | parrello | 1.3 | $retVal->AddMessage($@) if ($@); |
1823 : | parrello | 1.20 | $retVal->AddMessage("Table load failed for $relationName using $fileName."); |
1824 : | parrello | 1.10 | Trace("Table load failed for $relationName.") if T(1); |
1825 : | } else { | ||
1826 : | parrello | 1.22 | # Here we successfully loaded the table. |
1827 : | $retVal->Add("tables"); | ||
1828 : | my $size = -s $fileName; | ||
1829 : | Trace("$size bytes loaded into $relationName.") if T(2); | ||
1830 : | parrello | 1.10 | # If we're rebuilding, we need to create the table indexes. |
1831 : | if ($truncateFlag && ! $dbh->{_preIndex}) { | ||
1832 : | eval { | ||
1833 : | $self->CreateIndex($relationName); | ||
1834 : | }; | ||
1835 : | if ($@) { | ||
1836 : | $retVal->AddMessage($@); | ||
1837 : | } | ||
1838 : | } | ||
1839 : | } | ||
1840 : | parrello | 1.20 | # Analyze the table to improve performance. |
1841 : | parrello | 1.61 | Trace("Analyzing and compacting $relationName.") if T(3); |
1842 : | olson | 1.16 | $dbh->vacuum_it($relationName); |
1843 : | parrello | 1.61 | Trace("$relationName load completed.") if T(3); |
1844 : | parrello | 1.10 | # Return the statistics. |
1845 : | return $retVal; | ||
1846 : | parrello | 1.1 | } |
1847 : | |||
1848 : | =head3 GenerateEntity | ||
1849 : | |||
1850 : | parrello | 1.18 | C<< my $fieldHash = $erdb->GenerateEntity($id, $type, \%values); >> |
1851 : | parrello | 1.1 | |
1852 : | Generate the data for a new entity instance. This method creates a field hash suitable for | ||
1853 : | passing as a parameter to L</InsertObject>. The ID is specified by the callr, but the rest | ||
1854 : | of the fields are generated using information in the database schema. | ||
1855 : | |||
1856 : | Each data type has a default algorithm for generating random test data. This can be overridden | ||
1857 : | by including a B<DataGen> element in the field. If this happens, the content of the element is | ||
1858 : | executed as a PERL program in the context of this module. The element may make use of a C<$this> | ||
1859 : | variable which contains the field hash as it has been built up to the current point. If any | ||
1860 : | fields are dependent on other fields, the C<pass> attribute can be used to control the order | ||
1861 : | in which the fields are generated. A field with a high data pass number will be generated after | ||
1862 : | a field with a lower one. If any external values are needed, they should be passed in via the | ||
1863 : | optional third parameter, which will be available to the data generation script under the name | ||
1864 : | C<$value>. Several useful utility methods are provided for generating random values, including | ||
1865 : | L</IntGen>, L</StringGen>, L</FloatGen>, and L</DateGen>. Note that dates are stored and generated | ||
1866 : | in the form of a timestamp number rather than a string. | ||
1867 : | |||
1868 : | =over 4 | ||
1869 : | |||
1870 : | =item id | ||
1871 : | |||
1872 : | ID to assign to the new entity. | ||
1873 : | |||
1874 : | =item type | ||
1875 : | |||
1876 : | Type name for the new entity. | ||
1877 : | |||
1878 : | =item values | ||
1879 : | |||
1880 : | Hash containing additional values that might be needed by the data generation methods (optional). | ||
1881 : | |||
1882 : | =back | ||
1883 : | |||
1884 : | =cut | ||
1885 : | |||
1886 : | sub GenerateEntity { | ||
1887 : | parrello | 1.10 | # Get the parameters. |
1888 : | my ($self, $id, $type, $values) = @_; | ||
1889 : | # Create the return hash. | ||
1890 : | my $this = { id => $id }; | ||
1891 : | # Get the metadata structure. | ||
1892 : | my $metadata = $self->{_metaData}; | ||
1893 : | # Get this entity's list of fields. | ||
1894 : | if (!exists $metadata->{Entities}->{$type}) { | ||
1895 : | Confess("Unrecognized entity type $type in GenerateEntity."); | ||
1896 : | } else { | ||
1897 : | my $entity = $metadata->{Entities}->{$type}; | ||
1898 : | my $fields = $entity->{Fields}; | ||
1899 : | # Generate data from the fields. | ||
1900 : | _GenerateFields($this, $fields, $type, $values); | ||
1901 : | } | ||
1902 : | # Return the hash created. | ||
1903 : | return $this; | ||
1904 : | parrello | 1.1 | } |
1905 : | |||
1906 : | parrello | 1.6 | =head3 GetEntity |
1907 : | |||
1908 : | parrello | 1.18 | C<< my $entityObject = $erdb->GetEntity($entityType, $ID); >> |
1909 : | parrello | 1.6 | |
1910 : | Return an object describing the entity instance with a specified ID. | ||
1911 : | |||
1912 : | =over 4 | ||
1913 : | |||
1914 : | =item entityType | ||
1915 : | |||
1916 : | Entity type name. | ||
1917 : | |||
1918 : | =item ID | ||
1919 : | |||
1920 : | ID of the desired entity. | ||
1921 : | |||
1922 : | =item RETURN | ||
1923 : | |||
1924 : | Returns a B<DBObject> representing the desired entity instance, or an undefined value if no | ||
1925 : | instance is found with the specified key. | ||
1926 : | |||
1927 : | =back | ||
1928 : | |||
1929 : | =cut | ||
1930 : | |||
1931 : | sub GetEntity { | ||
1932 : | parrello | 1.10 | # Get the parameters. |
1933 : | my ($self, $entityType, $ID) = @_; | ||
1934 : | # Create a query. | ||
1935 : | parrello | 1.45 | my $query = $self->Get([$entityType], "$entityType(id) = ?", [$ID]); |
1936 : | parrello | 1.10 | # Get the first (and only) object. |
1937 : | my $retVal = $query->Fetch(); | ||
1938 : | # Return the result. | ||
1939 : | return $retVal; | ||
1940 : | parrello | 1.6 | } |
1941 : | |||
1942 : | =head3 GetEntityValues | ||
1943 : | |||
1944 : | parrello | 1.18 | C<< my @values = $erdb->GetEntityValues($entityType, $ID, \@fields); >> |
1945 : | parrello | 1.6 | |
1946 : | Return a list of values from a specified entity instance. | ||
1947 : | |||
1948 : | =over 4 | ||
1949 : | |||
1950 : | =item entityType | ||
1951 : | |||
1952 : | Entity type name. | ||
1953 : | |||
1954 : | =item ID | ||
1955 : | |||
1956 : | ID of the desired entity. | ||
1957 : | |||
1958 : | =item fields | ||
1959 : | |||
1960 : | List of field names, each of the form I<objectName>C<(>I<fieldName>C<)>. | ||
1961 : | |||
1962 : | =item RETURN | ||
1963 : | |||
1964 : | Returns a flattened list of the values of the specified fields for the specified entity. | ||
1965 : | |||
1966 : | =back | ||
1967 : | |||
1968 : | =cut | ||
1969 : | |||
1970 : | sub GetEntityValues { | ||
1971 : | parrello | 1.10 | # Get the parameters. |
1972 : | my ($self, $entityType, $ID, $fields) = @_; | ||
1973 : | # Get the specified entity. | ||
1974 : | my $entity = $self->GetEntity($entityType, $ID); | ||
1975 : | # Declare the return list. | ||
1976 : | my @retVal = (); | ||
1977 : | # If we found the entity, push the values into the return list. | ||
1978 : | if ($entity) { | ||
1979 : | push @retVal, $entity->Values($fields); | ||
1980 : | } | ||
1981 : | # Return the result. | ||
1982 : | return @retVal; | ||
1983 : | parrello | 1.6 | } |
1984 : | parrello | 1.1 | |
1985 : | parrello | 1.7 | =head3 GetAll |
1986 : | |||
1987 : | parrello | 1.18 | C<< my @list = $erdb->GetAll(\@objectNames, $filterClause, \@parameters, \@fields, $count); >> |
1988 : | parrello | 1.7 | |
1989 : | Return a list of values taken from the objects returned by a query. The first three | ||
1990 : | parameters correspond to the parameters of the L</Get> method. The final parameter is | ||
1991 : | a list of the fields desired from each record found by the query. The field name | ||
1992 : | syntax is the standard syntax used for fields in the B<ERDB> system-- | ||
1993 : | B<I<objectName>(I<fieldName>)>-- where I<objectName> is the name of the relevant entity | ||
1994 : | or relationship and I<fieldName> is the name of the field. | ||
1995 : | |||
1996 : | The list returned will be a list of lists. Each element of the list will contain | ||
1997 : | the values returned for the fields specified in the fourth parameter. If one of the | ||
1998 : | fields specified returns multiple values, they are flattened in with the rest. For | ||
1999 : | example, the following call will return a list of the features in a particular | ||
2000 : | spreadsheet cell, and each feature will be represented by a list containing the | ||
2001 : | feature ID followed by all of its aliases. | ||
2002 : | |||
2003 : | parrello | 1.18 | C<< $query = $erdb->Get(['ContainsFeature', 'Feature'], "ContainsFeature(from-link) = ?", [$ssCellID], ['Feature(id)', 'Feature(alias)']); >> |
2004 : | parrello | 1.7 | |
2005 : | =over 4 | ||
2006 : | |||
2007 : | =item objectNames | ||
2008 : | |||
2009 : | List containing the names of the entity and relationship objects to be retrieved. | ||
2010 : | |||
2011 : | =item filterClause | ||
2012 : | |||
2013 : | WHERE/ORDER BY clause (without the WHERE) to be used to filter and sort the query. The WHERE clause can | ||
2014 : | be parameterized with parameter markers (C<?>). Each field used must be specified in the standard form | ||
2015 : | B<I<objectName>(I<fieldName>)>. Any parameters specified in the filter clause should be added to the | ||
2016 : | parameter list as additional parameters. The fields in a filter clause can come from primary | ||
2017 : | entity relations, relationship relations, or secondary entity relations; however, all of the | ||
2018 : | entities and relationships involved must be included in the list of object names. | ||
2019 : | |||
2020 : | =item parameterList | ||
2021 : | |||
2022 : | List of the parameters to be substituted in for the parameters marks in the filter clause. | ||
2023 : | |||
2024 : | =item fields | ||
2025 : | |||
2026 : | List of the fields to be returned in each element of the list returned. | ||
2027 : | |||
2028 : | =item count | ||
2029 : | |||
2030 : | Maximum number of records to return. If omitted or 0, all available records will be returned. | ||
2031 : | |||
2032 : | =item RETURN | ||
2033 : | |||
2034 : | Returns a list of list references. Each element of the return list contains the values for the | ||
2035 : | fields specified in the B<fields> parameter. | ||
2036 : | |||
2037 : | =back | ||
2038 : | |||
2039 : | =cut | ||
2040 : | #: Return Type @@; | ||
2041 : | sub GetAll { | ||
2042 : | parrello | 1.10 | # Get the parameters. |
2043 : | my ($self, $objectNames, $filterClause, $parameterList, $fields, $count) = @_; | ||
2044 : | # Translate the parameters from a list reference to a list. If the parameter | ||
2045 : | # list is a scalar we convert it into a singleton list. | ||
2046 : | my @parmList = (); | ||
2047 : | if (ref $parameterList eq "ARRAY") { | ||
2048 : | parrello | 1.45 | Trace("GetAll parm list is an array.") if T(4); |
2049 : | parrello | 1.10 | @parmList = @{$parameterList}; |
2050 : | } else { | ||
2051 : | parrello | 1.45 | Trace("GetAll parm list is a scalar: $parameterList.") if T(4); |
2052 : | parrello | 1.10 | push @parmList, $parameterList; |
2053 : | } | ||
2054 : | parrello | 1.30 | # Insure the counter has a value. |
2055 : | if (!defined $count) { | ||
2056 : | $count = 0; | ||
2057 : | } | ||
2058 : | # Add the row limit to the filter clause. | ||
2059 : | if ($count > 0) { | ||
2060 : | $filterClause .= " LIMIT $count"; | ||
2061 : | } | ||
2062 : | parrello | 1.10 | # Create the query. |
2063 : | parrello | 1.45 | my $query = $self->Get($objectNames, $filterClause, \@parmList); |
2064 : | parrello | 1.10 | # Set up a counter of the number of records read. |
2065 : | my $fetched = 0; | ||
2066 : | # Loop through the records returned, extracting the fields. Note that if the | ||
2067 : | # counter is non-zero, we stop when the number of records read hits the count. | ||
2068 : | my @retVal = (); | ||
2069 : | while (($count == 0 || $fetched < $count) && (my $row = $query->Fetch())) { | ||
2070 : | my @rowData = $row->Values($fields); | ||
2071 : | push @retVal, \@rowData; | ||
2072 : | $fetched++; | ||
2073 : | } | ||
2074 : | parrello | 1.54 | Trace("$fetched rows returned in GetAll.") if T(SQL => 4); |
2075 : | parrello | 1.10 | # Return the resulting list. |
2076 : | return @retVal; | ||
2077 : | parrello | 1.7 | } |
2078 : | |||
2079 : | parrello | 1.55 | =head3 Exists |
2080 : | |||
2081 : | C<< my $found = $sprout->Exists($entityName, $entityID); >> | ||
2082 : | |||
2083 : | Return TRUE if an entity exists, else FALSE. | ||
2084 : | |||
2085 : | =over 4 | ||
2086 : | |||
2087 : | =item entityName | ||
2088 : | |||
2089 : | Name of the entity type (e.g. C<Feature>) relevant to the existence check. | ||
2090 : | |||
2091 : | =item entityID | ||
2092 : | |||
2093 : | ID of the entity instance whose existence is to be checked. | ||
2094 : | |||
2095 : | =item RETURN | ||
2096 : | |||
2097 : | Returns TRUE if the entity instance exists, else FALSE. | ||
2098 : | |||
2099 : | =back | ||
2100 : | |||
2101 : | =cut | ||
2102 : | #: Return Type $; | ||
2103 : | sub Exists { | ||
2104 : | # Get the parameters. | ||
2105 : | my ($self, $entityName, $entityID) = @_; | ||
2106 : | # Check for the entity instance. | ||
2107 : | Trace("Checking existence of $entityName with ID=$entityID.") if T(4); | ||
2108 : | my $testInstance = $self->GetEntity($entityName, $entityID); | ||
2109 : | # Return an existence indicator. | ||
2110 : | my $retVal = ($testInstance ? 1 : 0); | ||
2111 : | return $retVal; | ||
2112 : | } | ||
2113 : | |||
2114 : | parrello | 1.18 | =head3 EstimateRowSize |
2115 : | |||
2116 : | C<< my $rowSize = $erdb->EstimateRowSize($relName); >> | ||
2117 : | |||
2118 : | Estimate the row size of the specified relation. The estimated row size is computed by adding | ||
2119 : | up the average length for each data type. | ||
2120 : | |||
2121 : | =over 4 | ||
2122 : | |||
2123 : | =item relName | ||
2124 : | |||
2125 : | Name of the relation whose estimated row size is desired. | ||
2126 : | |||
2127 : | =item RETURN | ||
2128 : | |||
2129 : | Returns an estimate of the row size for the specified relation. | ||
2130 : | |||
2131 : | =back | ||
2132 : | |||
2133 : | =cut | ||
2134 : | #: Return Type $; | ||
2135 : | sub EstimateRowSize { | ||
2136 : | # Get the parameters. | ||
2137 : | my ($self, $relName) = @_; | ||
2138 : | # Declare the return variable. | ||
2139 : | my $retVal = 0; | ||
2140 : | # Find the relation descriptor. | ||
2141 : | my $relation = $self->_FindRelation($relName); | ||
2142 : | # Get the list of fields. | ||
2143 : | for my $fieldData (@{$relation->{Fields}}) { | ||
2144 : | # Get the field type and add its length. | ||
2145 : | my $fieldLen = $TypeTable{$fieldData->{type}}->{avgLen}; | ||
2146 : | $retVal += $fieldLen; | ||
2147 : | } | ||
2148 : | # Return the result. | ||
2149 : | return $retVal; | ||
2150 : | } | ||
2151 : | |||
2152 : | parrello | 1.38 | =head3 GetFieldTable |
2153 : | |||
2154 : | C<< my $fieldHash = $self->GetFieldTable($objectnName); >> | ||
2155 : | |||
2156 : | Get the field structure for a specified entity or relationship. | ||
2157 : | |||
2158 : | =over 4 | ||
2159 : | |||
2160 : | =item objectName | ||
2161 : | |||
2162 : | Name of the desired entity or relationship. | ||
2163 : | |||
2164 : | =item RETURN | ||
2165 : | |||
2166 : | The table containing the field descriptors for the specified object. | ||
2167 : | |||
2168 : | =back | ||
2169 : | |||
2170 : | =cut | ||
2171 : | |||
2172 : | sub GetFieldTable { | ||
2173 : | # Get the parameters. | ||
2174 : | my ($self, $objectName) = @_; | ||
2175 : | # Get the descriptor from the metadata. | ||
2176 : | my $objectData = $self->_GetStructure($objectName); | ||
2177 : | # Return the object's field table. | ||
2178 : | return $objectData->{Fields}; | ||
2179 : | } | ||
2180 : | |||
2181 : | parrello | 1.48 | =head2 Data Mining Methods |
2182 : | |||
2183 : | parrello | 1.38 | =head3 GetUsefulCrossValues |
2184 : | |||
2185 : | C<< my @attrNames = $sprout->GetUsefulCrossValues($sourceEntity, $relationship); >> | ||
2186 : | |||
2187 : | Return a list of the useful attributes that would be returned by a B<Cross> call | ||
2188 : | from an entity of the source entity type through the specified relationship. This | ||
2189 : | means it will return the fields of the target entity type and the intersection data | ||
2190 : | fields in the relationship. Only primary table fields are returned. In other words, | ||
2191 : | the field names returned will be for fields where there is always one and only one | ||
2192 : | value. | ||
2193 : | |||
2194 : | =over 4 | ||
2195 : | |||
2196 : | =item sourceEntity | ||
2197 : | |||
2198 : | Name of the entity from which the relationship crossing will start. | ||
2199 : | |||
2200 : | =item relationship | ||
2201 : | |||
2202 : | Name of the relationship being crossed. | ||
2203 : | |||
2204 : | =item RETURN | ||
2205 : | |||
2206 : | Returns a list of field names in Sprout field format (I<objectName>C<(>I<fieldName>C<)>. | ||
2207 : | |||
2208 : | =back | ||
2209 : | |||
2210 : | =cut | ||
2211 : | #: Return Type @; | ||
2212 : | sub GetUsefulCrossValues { | ||
2213 : | # Get the parameters. | ||
2214 : | my ($self, $sourceEntity, $relationship) = @_; | ||
2215 : | # Declare the return variable. | ||
2216 : | my @retVal = (); | ||
2217 : | # Determine the target entity for the relationship. This is whichever entity is not | ||
2218 : | # the source entity. So, if the source entity is the FROM, we'll get the name of | ||
2219 : | # the TO, and vice versa. | ||
2220 : | my $relStructure = $self->_GetStructure($relationship); | ||
2221 : | my $targetEntityType = ($relStructure->{from} eq $sourceEntity ? "to" : "from"); | ||
2222 : | my $targetEntity = $relStructure->{$targetEntityType}; | ||
2223 : | # Get the field table for the entity. | ||
2224 : | my $entityFields = $self->GetFieldTable($targetEntity); | ||
2225 : | # The field table is a hash. The hash key is the field name. The hash value is a structure. | ||
2226 : | # For the entity fields, the key aspect of the target structure is that the {relation} value | ||
2227 : | # must match the entity name. | ||
2228 : | my @fieldList = map { "$targetEntity($_)" } grep { $entityFields->{$_}->{relation} eq $targetEntity } | ||
2229 : | keys %{$entityFields}; | ||
2230 : | # Push the fields found onto the return variable. | ||
2231 : | push @retVal, sort @fieldList; | ||
2232 : | # Get the field table for the relationship. | ||
2233 : | my $relationshipFields = $self->GetFieldTable($relationship); | ||
2234 : | # Here we have a different rule. We want all the fields other than "from-link" and "to-link". | ||
2235 : | # This may end up being an empty set. | ||
2236 : | my @fieldList2 = map { "$relationship($_)" } grep { $_ ne "from-link" && $_ ne "to-link" } | ||
2237 : | keys %{$relationshipFields}; | ||
2238 : | # Push these onto the return list. | ||
2239 : | push @retVal, sort @fieldList2; | ||
2240 : | # Return the result. | ||
2241 : | return @retVal; | ||
2242 : | } | ||
2243 : | |||
2244 : | parrello | 1.48 | =head3 FindColumn |
2245 : | |||
2246 : | C<< my $colIndex = ERDB::FindColumn($headerLine, $columnIdentifier); >> | ||
2247 : | |||
2248 : | Return the location a desired column in a data mining header line. The data | ||
2249 : | mining header line is a tab-separated list of column names. The column | ||
2250 : | identifier is either the numerical index of a column or the actual column | ||
2251 : | name. | ||
2252 : | |||
2253 : | =over 4 | ||
2254 : | |||
2255 : | =item headerLine | ||
2256 : | |||
2257 : | The header line from a data mining command, which consists of a tab-separated | ||
2258 : | list of column names. | ||
2259 : | |||
2260 : | =item columnIdentifier | ||
2261 : | |||
2262 : | Either the ordinal number of the desired column (1-based), or the name of the | ||
2263 : | desired column. | ||
2264 : | |||
2265 : | =item RETURN | ||
2266 : | |||
2267 : | Returns the array index (0-based) of the desired column. | ||
2268 : | |||
2269 : | =back | ||
2270 : | |||
2271 : | =cut | ||
2272 : | |||
2273 : | sub FindColumn { | ||
2274 : | # Get the parameters. | ||
2275 : | my ($headerLine, $columnIdentifier) = @_; | ||
2276 : | # Declare the return variable. | ||
2277 : | my $retVal; | ||
2278 : | # Split the header line into column names. | ||
2279 : | my @headers = ParseColumns($headerLine); | ||
2280 : | # Determine whether we have a number or a name. | ||
2281 : | if ($columnIdentifier =~ /^\d+$/) { | ||
2282 : | # Here we have a number. Subtract 1 and validate the result. | ||
2283 : | $retVal = $columnIdentifier - 1; | ||
2284 : | if ($retVal < 0 || $retVal > $#headers) { | ||
2285 : | Confess("Invalid column identifer \"$columnIdentifier\": value out of range."); | ||
2286 : | } | ||
2287 : | } else { | ||
2288 : | # Here we have a name. We need to find it in the list. | ||
2289 : | for (my $i = 0; $i <= $#headers && ! defined($retVal); $i++) { | ||
2290 : | if ($headers[$i] eq $columnIdentifier) { | ||
2291 : | $retVal = $i; | ||
2292 : | } | ||
2293 : | } | ||
2294 : | if (! defined($retVal)) { | ||
2295 : | Confess("Invalid column identifier \"$columnIdentifier\": value not found."); | ||
2296 : | } | ||
2297 : | } | ||
2298 : | # Return the result. | ||
2299 : | return $retVal; | ||
2300 : | } | ||
2301 : | |||
2302 : | =head3 ParseColumns | ||
2303 : | |||
2304 : | parrello | 1.50 | C<< my @columns = ERDB::ParseColumns($line); >> |
2305 : | parrello | 1.48 | |
2306 : | Convert the specified data line to a list of columns. | ||
2307 : | |||
2308 : | =over 4 | ||
2309 : | |||
2310 : | =item line | ||
2311 : | |||
2312 : | A data mining input, consisting of a tab-separated list of columns terminated by a | ||
2313 : | new-line. | ||
2314 : | |||
2315 : | =item RETURN | ||
2316 : | |||
2317 : | Returns a list consisting of the column values. | ||
2318 : | |||
2319 : | =back | ||
2320 : | |||
2321 : | =cut | ||
2322 : | |||
2323 : | sub ParseColumns { | ||
2324 : | # Get the parameters. | ||
2325 : | parrello | 1.50 | my ($line) = @_; |
2326 : | parrello | 1.48 | # Chop off the line-end. |
2327 : | chomp $line; | ||
2328 : | # Split it into a list. | ||
2329 : | my @retVal = split(/\t/, $line); | ||
2330 : | # Return the result. | ||
2331 : | return @retVal; | ||
2332 : | } | ||
2333 : | |||
2334 : | parrello | 1.1 | =head2 Internal Utility Methods |
2335 : | |||
2336 : | parrello | 1.45 | =head3 SetupSQL |
2337 : | |||
2338 : | Process a list of object names and a filter clause so that they can be used to | ||
2339 : | build an SQL statement. This method takes in a reference to a list of object names | ||
2340 : | and a filter clause. It will return a corrected filter clause, a list of mapped | ||
2341 : | names and the mapped name hash. | ||
2342 : | |||
2343 : | This is an instance method. | ||
2344 : | |||
2345 : | =over 4 | ||
2346 : | |||
2347 : | =item objectNames | ||
2348 : | |||
2349 : | Reference to a list of the object names to be included in the query. | ||
2350 : | |||
2351 : | =item filterClause | ||
2352 : | |||
2353 : | A string containing the WHERE clause for the query (without the C<WHERE>) and also | ||
2354 : | optionally the C<ORDER BY> and C<LIMIT> clauses. | ||
2355 : | |||
2356 : | =item RETURN | ||
2357 : | |||
2358 : | Returns a three-element list. The first element is the SQL statement suffix, beginning | ||
2359 : | with the FROM clause. The second element is a reference to a list of the names to be | ||
2360 : | used in retrieving the fields. The third element is a hash mapping the names to the | ||
2361 : | objects they represent. | ||
2362 : | |||
2363 : | =back | ||
2364 : | |||
2365 : | =cut | ||
2366 : | |||
2367 : | sub _SetupSQL { | ||
2368 : | my ($self, $objectNames, $filterClause) = @_; | ||
2369 : | # Adjust the list of object names to account for multiple occurrences of the | ||
2370 : | # same object. We start with a hash table keyed on object name that will | ||
2371 : | # return the object suffix. The first time an object is encountered it will | ||
2372 : | # not be found in the hash. The next time the hash will map the object name | ||
2373 : | # to 2, then 3, and so forth. | ||
2374 : | my %objectHash = (); | ||
2375 : | # This list will contain the object names as they are to appear in the | ||
2376 : | # FROM list. | ||
2377 : | my @fromList = (); | ||
2378 : | # This list contains the suffixed object name for each object. It is exactly | ||
2379 : | # parallel to the list in the $objectNames parameter. | ||
2380 : | my @mappedNameList = (); | ||
2381 : | # Finally, this hash translates from a mapped name to its original object name. | ||
2382 : | my %mappedNameHash = (); | ||
2383 : | # Now we create the lists. Note that for every single name we push something into | ||
2384 : | # @fromList and @mappedNameList. This insures that those two arrays are exactly | ||
2385 : | # parallel to $objectNames. | ||
2386 : | for my $objectName (@{$objectNames}) { | ||
2387 : | # Get the next suffix for this object. | ||
2388 : | my $suffix = $objectHash{$objectName}; | ||
2389 : | if (! $suffix) { | ||
2390 : | # Here we are seeing the object for the first time. The object name | ||
2391 : | # is used as is. | ||
2392 : | push @mappedNameList, $objectName; | ||
2393 : | push @fromList, $objectName; | ||
2394 : | $mappedNameHash{$objectName} = $objectName; | ||
2395 : | # Denote the next suffix will be 2. | ||
2396 : | $objectHash{$objectName} = 2; | ||
2397 : | } else { | ||
2398 : | # Here we've seen the object before. We construct a new name using | ||
2399 : | # the suffix from the hash and update the hash. | ||
2400 : | my $mappedName = "$objectName$suffix"; | ||
2401 : | $objectHash{$objectName} = $suffix + 1; | ||
2402 : | # The FROM list has the object name followed by the mapped name. This | ||
2403 : | # tells SQL it's still the same table, but we're using a different name | ||
2404 : | # for it to avoid confusion. | ||
2405 : | push @fromList, "$objectName $mappedName"; | ||
2406 : | # The mapped-name list contains the real mapped name. | ||
2407 : | push @mappedNameList, $mappedName; | ||
2408 : | # Finally, enable us to get back from the mapped name to the object name. | ||
2409 : | $mappedNameHash{$mappedName} = $objectName; | ||
2410 : | } | ||
2411 : | } | ||
2412 : | # Begin the SELECT suffix. It starts with | ||
2413 : | # | ||
2414 : | # FROM name1, name2, ... nameN | ||
2415 : | # | ||
2416 : | my $suffix = "FROM " . join(', ', @fromList); | ||
2417 : | # Check for a filter clause. | ||
2418 : | if ($filterClause) { | ||
2419 : | # Here we have one, so we convert its field names and add it to the query. First, | ||
2420 : | # We create a copy of the filter string we can work with. | ||
2421 : | my $filterString = $filterClause; | ||
2422 : | # Next, we sort the object names by length. This helps protect us from finding | ||
2423 : | # object names inside other object names when we're doing our search and replace. | ||
2424 : | my @sortedNames = sort { length($b) - length($a) } @mappedNameList; | ||
2425 : | # We will also keep a list of conditions to add to the WHERE clause in order to link | ||
2426 : | # entities and relationships as well as primary relations to secondary ones. | ||
2427 : | my @joinWhere = (); | ||
2428 : | # The final preparatory step is to create a hash table of relation names. The | ||
2429 : | # table begins with the relation names already in the SELECT command. We may | ||
2430 : | # need to add relations later if there is filtering on a field in a secondary | ||
2431 : | # relation. The secondary relations are the ones that contain multiply- | ||
2432 : | # occurring or optional fields. | ||
2433 : | my %fromNames = map { $_ => 1 } @sortedNames; | ||
2434 : | # We are ready to begin. We loop through the object names, replacing each | ||
2435 : | # object name's field references by the corresponding SQL field reference. | ||
2436 : | # Along the way, if we find a secondary relation, we will need to add it | ||
2437 : | # to the FROM clause. | ||
2438 : | for my $mappedName (@sortedNames) { | ||
2439 : | # Get the length of the object name plus 2. This is the value we add to the | ||
2440 : | # size of the field name to determine the size of the field reference as a | ||
2441 : | # whole. | ||
2442 : | my $nameLength = 2 + length $mappedName; | ||
2443 : | # Get the real object name for this mapped name. | ||
2444 : | my $objectName = $mappedNameHash{$mappedName}; | ||
2445 : | Trace("Processing $mappedName for object $objectName.") if T(4); | ||
2446 : | # Get the object's field list. | ||
2447 : | my $fieldList = $self->GetFieldTable($objectName); | ||
2448 : | # Find the field references for this object. | ||
2449 : | while ($filterString =~ m/$mappedName\(([^)]*)\)/g) { | ||
2450 : | # At this point, $1 contains the field name, and the current position | ||
2451 : | # is set immediately after the final parenthesis. We pull out the name of | ||
2452 : | # the field and the position and length of the field reference as a whole. | ||
2453 : | my $fieldName = $1; | ||
2454 : | my $len = $nameLength + length $fieldName; | ||
2455 : | my $pos = pos($filterString) - $len; | ||
2456 : | # Insure the field exists. | ||
2457 : | if (!exists $fieldList->{$fieldName}) { | ||
2458 : | Confess("Field $fieldName not found for object $objectName."); | ||
2459 : | } else { | ||
2460 : | Trace("Processing $fieldName at position $pos.") if T(4); | ||
2461 : | # Get the field's relation. | ||
2462 : | my $relationName = $fieldList->{$fieldName}->{relation}; | ||
2463 : | # Now we have a secondary relation. We need to insure it matches the | ||
2464 : | # mapped name of the primary relation. First we peel off the suffix | ||
2465 : | # from the mapped name. | ||
2466 : | my $mappingSuffix = substr $mappedName, length($objectName); | ||
2467 : | # Put the mapping suffix onto the relation name to get the | ||
2468 : | # mapped relation name. | ||
2469 : | my $mappedRelationName = "$relationName$mappingSuffix"; | ||
2470 : | # Insure the relation is in the FROM clause. | ||
2471 : | if (!exists $fromNames{$mappedRelationName}) { | ||
2472 : | # Add the relation to the FROM clause. | ||
2473 : | if ($mappedRelationName eq $relationName) { | ||
2474 : | # The name is un-mapped, so we add it without | ||
2475 : | # any frills. | ||
2476 : | $suffix .= ", $relationName"; | ||
2477 : | push @joinWhere, "$objectName.id = $relationName.id"; | ||
2478 : | } else { | ||
2479 : | # Here we have a mapping situation. | ||
2480 : | $suffix .= ", $relationName $mappedRelationName"; | ||
2481 : | push @joinWhere, "$mappedRelationName.id = $mappedName.id"; | ||
2482 : | } | ||
2483 : | # Denote we have this relation available for future fields. | ||
2484 : | $fromNames{$mappedRelationName} = 1; | ||
2485 : | } | ||
2486 : | # Form an SQL field reference from the relation name and the field name. | ||
2487 : | my $sqlReference = "$mappedRelationName." . _FixName($fieldName); | ||
2488 : | # Put it into the filter string in place of the old value. | ||
2489 : | substr($filterString, $pos, $len) = $sqlReference; | ||
2490 : | # Reposition the search. | ||
2491 : | pos $filterString = $pos + length $sqlReference; | ||
2492 : | } | ||
2493 : | } | ||
2494 : | } | ||
2495 : | # The next step is to join the objects together. We only need to do this if there | ||
2496 : | # is more than one object in the object list. We start with the first object and | ||
2497 : | # run through the objects after it. Note also that we make a safety copy of the | ||
2498 : | # list before running through it. | ||
2499 : | my @mappedObjectList = @mappedNameList; | ||
2500 : | my $lastMappedObject = shift @mappedObjectList; | ||
2501 : | # Get the join table. | ||
2502 : | my $joinTable = $self->{_metaData}->{Joins}; | ||
2503 : | # Loop through the object list. | ||
2504 : | for my $thisMappedObject (@mappedObjectList) { | ||
2505 : | # Look for a join using the real object names. | ||
2506 : | my $lastObject = $mappedNameHash{$lastMappedObject}; | ||
2507 : | my $thisObject = $mappedNameHash{$thisMappedObject}; | ||
2508 : | my $joinKey = "$lastObject/$thisObject"; | ||
2509 : | if (!exists $joinTable->{$joinKey}) { | ||
2510 : | # Here there's no join, so we throw an error. | ||
2511 : | Confess("No join exists to connect from $lastMappedObject to $thisMappedObject."); | ||
2512 : | } else { | ||
2513 : | # Get the join clause. | ||
2514 : | my $unMappedJoin = $joinTable->{$joinKey}; | ||
2515 : | # Fix the names. | ||
2516 : | $unMappedJoin =~ s/$lastObject/$lastMappedObject/; | ||
2517 : | $unMappedJoin =~ s/$thisObject/$thisMappedObject/; | ||
2518 : | push @joinWhere, $unMappedJoin; | ||
2519 : | # Save this object as the last object for the next iteration. | ||
2520 : | $lastMappedObject = $thisMappedObject; | ||
2521 : | } | ||
2522 : | } | ||
2523 : | # Now we need to handle the whole ORDER BY / LIMIT thing. The important part | ||
2524 : | # here is we want the filter clause to be empty if there's no WHERE filter. | ||
2525 : | # We'll put the ORDER BY / LIMIT clauses in the following variable. | ||
2526 : | my $orderClause = ""; | ||
2527 : | # Locate the ORDER BY or LIMIT verbs (if any). We use a non-greedy | ||
2528 : | # operator so that we find the first occurrence of either verb. | ||
2529 : | if ($filterString =~ m/^(.*?)\s*(ORDER BY|LIMIT)/g) { | ||
2530 : | # Here we have an ORDER BY or LIMIT verb. Split it off of the filter string. | ||
2531 : | my $pos = pos $filterString; | ||
2532 : | $orderClause = $2 . substr($filterString, $pos); | ||
2533 : | $filterString = $1; | ||
2534 : | } | ||
2535 : | # Add the filter and the join clauses (if any) to the SELECT command. | ||
2536 : | if ($filterString) { | ||
2537 : | Trace("Filter string is \"$filterString\".") if T(4); | ||
2538 : | push @joinWhere, "($filterString)"; | ||
2539 : | } | ||
2540 : | if (@joinWhere) { | ||
2541 : | $suffix .= " WHERE " . join(' AND ', @joinWhere); | ||
2542 : | } | ||
2543 : | # Add the sort or limit clause (if any) to the SELECT command. | ||
2544 : | if ($orderClause) { | ||
2545 : | $suffix .= " $orderClause"; | ||
2546 : | } | ||
2547 : | } | ||
2548 : | # Return the suffix, the mapped name list, and the mapped name hash. | ||
2549 : | return ($suffix, \@mappedNameList, \%mappedNameHash); | ||
2550 : | } | ||
2551 : | |||
2552 : | =head3 GetStatementHandle | ||
2553 : | |||
2554 : | This method will prepare and execute an SQL query, returning the statement handle. | ||
2555 : | The main reason for doing this here is so that everybody who does SQL queries gets | ||
2556 : | the benefit of tracing. | ||
2557 : | |||
2558 : | This is an instance method. | ||
2559 : | |||
2560 : | =over 4 | ||
2561 : | |||
2562 : | =item command | ||
2563 : | |||
2564 : | Command to prepare and execute. | ||
2565 : | |||
2566 : | =item params | ||
2567 : | |||
2568 : | Reference to a list of the values to be substituted in for the parameter marks. | ||
2569 : | |||
2570 : | =item RETURN | ||
2571 : | |||
2572 : | Returns a prepared and executed statement handle from which the caller can extract | ||
2573 : | results. | ||
2574 : | |||
2575 : | =back | ||
2576 : | |||
2577 : | =cut | ||
2578 : | |||
2579 : | sub _GetStatementHandle { | ||
2580 : | # Get the parameters. | ||
2581 : | my ($self, $command, $params) = @_; | ||
2582 : | # Trace the query. | ||
2583 : | Trace("SQL query: $command") if T(SQL => 3); | ||
2584 : | Trace("PARMS: '" . (join "', '", @{$params}) . "'") if (T(SQL => 4) && (@{$params} > 0)); | ||
2585 : | # Get the database handle. | ||
2586 : | my $dbh = $self->{_dbh}; | ||
2587 : | # Prepare the command. | ||
2588 : | my $sth = $dbh->prepare_command($command); | ||
2589 : | # Execute it with the parameters bound in. | ||
2590 : | $sth->execute(@{$params}) || Confess("SELECT error" . $sth->errstr()); | ||
2591 : | # Return the statement handle. | ||
2592 : | return $sth; | ||
2593 : | } | ||
2594 : | |||
2595 : | parrello | 1.1 | =head3 GetLoadStats |
2596 : | |||
2597 : | Return a blank statistics object for use by the load methods. | ||
2598 : | |||
2599 : | This is a static method. | ||
2600 : | |||
2601 : | =cut | ||
2602 : | |||
2603 : | parrello | 1.29 | sub _GetLoadStats{ |
2604 : | parrello | 1.28 | return Stats->new(); |
2605 : | parrello | 1.1 | } |
2606 : | |||
2607 : | =head3 GenerateFields | ||
2608 : | |||
2609 : | Generate field values from a field structure and store in a specified table. The field names | ||
2610 : | are first sorted by pass count, certain pre-defined fields are removed from the list, and | ||
2611 : | then we rip through them evaluation the data generation string. Fields in the primary relation | ||
2612 : | are stored as scalars; fields in secondary relations are stored as value lists. | ||
2613 : | |||
2614 : | This is a static method. | ||
2615 : | |||
2616 : | =over 4 | ||
2617 : | |||
2618 : | =item this | ||
2619 : | |||
2620 : | Hash table into which the field values should be placed. | ||
2621 : | |||
2622 : | =item fields | ||
2623 : | |||
2624 : | Field structure from which the field descriptors should be taken. | ||
2625 : | |||
2626 : | =item type | ||
2627 : | |||
2628 : | Type name of the object whose fields are being generated. | ||
2629 : | |||
2630 : | =item values (optional) | ||
2631 : | |||
2632 : | Reference to a value structure from which additional values can be taken. | ||
2633 : | |||
2634 : | =item from (optiona) | ||
2635 : | |||
2636 : | Reference to the source entity instance if relationship data is being generated. | ||
2637 : | |||
2638 : | =item to (optional) | ||
2639 : | |||
2640 : | Reference to the target entity instance if relationship data is being generated. | ||
2641 : | |||
2642 : | =back | ||
2643 : | |||
2644 : | =cut | ||
2645 : | |||
2646 : | sub _GenerateFields { | ||
2647 : | parrello | 1.10 | # Get the parameters. |
2648 : | my ($this, $fields, $type, $values, $from, $to) = @_; | ||
2649 : | # Sort the field names by pass number. | ||
2650 : | my @fieldNames = sort { $fields->{$a}->{DataGen}->{pass} <=> $fields->{$b}->{DataGen}->{pass} } keys %{$fields}; | ||
2651 : | # Loop through the field names, generating data. | ||
2652 : | for my $name (@fieldNames) { | ||
2653 : | # Only proceed if this field needs to be generated. | ||
2654 : | if (!exists $this->{$name}) { | ||
2655 : | # Get this field's data generation descriptor. | ||
2656 : | my $fieldDescriptor = $fields->{$name}; | ||
2657 : | my $data = $fieldDescriptor->{DataGen}; | ||
2658 : | # Get the code to generate the field value. | ||
2659 : | my $codeString = $data->{content}; | ||
2660 : | # Determine whether or not this field is in the primary relation. | ||
2661 : | if ($fieldDescriptor->{relation} eq $type) { | ||
2662 : | # Here we have a primary relation field. Store the field value as | ||
2663 : | # a scalar. | ||
2664 : | $this->{$name} = eval($codeString); | ||
2665 : | } else { | ||
2666 : | # Here we have a secondary relation field. Create a null list | ||
2667 : | # and push the desired number of field values onto it. | ||
2668 : | my @fieldValues = (); | ||
2669 : | my $count = IntGen(0,$data->{testCount}); | ||
2670 : | for (my $i = 0; $i < $count; $i++) { | ||
2671 : | my $newValue = eval($codeString); | ||
2672 : | push @fieldValues, $newValue; | ||
2673 : | } | ||
2674 : | # Store the value list in the main hash. | ||
2675 : | $this->{$name} = \@fieldValues; | ||
2676 : | } | ||
2677 : | } | ||
2678 : | } | ||
2679 : | parrello | 1.1 | } |
2680 : | |||
2681 : | =head3 DumpRelation | ||
2682 : | |||
2683 : | Dump the specified relation's to the specified output file in tab-delimited format. | ||
2684 : | |||
2685 : | This is an instance method. | ||
2686 : | |||
2687 : | =over 4 | ||
2688 : | |||
2689 : | =item outputDirectory | ||
2690 : | |||
2691 : | Directory to contain the output file. | ||
2692 : | |||
2693 : | =item relationName | ||
2694 : | |||
2695 : | Name of the relation to dump. | ||
2696 : | |||
2697 : | =item relation | ||
2698 : | |||
2699 : | Structure describing the relation to be dumped. | ||
2700 : | |||
2701 : | =back | ||
2702 : | |||
2703 : | =cut | ||
2704 : | |||
2705 : | sub _DumpRelation { | ||
2706 : | parrello | 1.10 | # Get the parameters. |
2707 : | my ($self, $outputDirectory, $relationName, $relation) = @_; | ||
2708 : | # Open the output file. | ||
2709 : | my $fileName = "$outputDirectory/$relationName.dtx"; | ||
2710 : | open(DTXOUT, ">$fileName") || Confess("Could not open dump file $fileName: $!"); | ||
2711 : | # Create a query for the specified relation. | ||
2712 : | my $dbh = $self->{_dbh}; | ||
2713 : | my $query = $dbh->prepare_command("SELECT * FROM $relationName"); | ||
2714 : | # Execute the query. | ||
2715 : | $query->execute() || Confess("SELECT error dumping $relationName."); | ||
2716 : | # Loop through the results. | ||
2717 : | while (my @row = $query->fetchrow) { | ||
2718 : | # Escape any tabs or new-lines in the row text. | ||
2719 : | for my $field (@row) { | ||
2720 : | $field =~ s/\n/\\n/g; | ||
2721 : | $field =~ s/\t/\\t/g; | ||
2722 : | } | ||
2723 : | # Tab-join the row and write it to the output file. | ||
2724 : | my $rowText = join("\t", @row); | ||
2725 : | print DTXOUT "$rowText\n"; | ||
2726 : | } | ||
2727 : | # Close the output file. | ||
2728 : | close DTXOUT; | ||
2729 : | parrello | 1.1 | } |
2730 : | |||
2731 : | =head3 GetStructure | ||
2732 : | |||
2733 : | Get the data structure for a specified entity or relationship. | ||
2734 : | |||
2735 : | This is an instance method. | ||
2736 : | |||
2737 : | =over 4 | ||
2738 : | |||
2739 : | =item objectName | ||
2740 : | |||
2741 : | Name of the desired entity or relationship. | ||
2742 : | |||
2743 : | =item RETURN | ||
2744 : | |||
2745 : | The descriptor for the specified object. | ||
2746 : | |||
2747 : | =back | ||
2748 : | |||
2749 : | =cut | ||
2750 : | |||
2751 : | sub _GetStructure { | ||
2752 : | parrello | 1.10 | # Get the parameters. |
2753 : | my ($self, $objectName) = @_; | ||
2754 : | # Get the metadata structure. | ||
2755 : | my $metadata = $self->{_metaData}; | ||
2756 : | # Declare the variable to receive the descriptor. | ||
2757 : | my $retVal; | ||
2758 : | # Get the descriptor from the metadata. | ||
2759 : | if (exists $metadata->{Entities}->{$objectName}) { | ||
2760 : | $retVal = $metadata->{Entities}->{$objectName}; | ||
2761 : | } elsif (exists $metadata->{Relationships}->{$objectName}) { | ||
2762 : | $retVal = $metadata->{Relationships}->{$objectName}; | ||
2763 : | } else { | ||
2764 : | Confess("Object $objectName not found in database."); | ||
2765 : | } | ||
2766 : | # Return the descriptor. | ||
2767 : | return $retVal; | ||
2768 : | parrello | 1.1 | } |
2769 : | |||
2770 : | =head3 GetRelationTable | ||
2771 : | |||
2772 : | Get the list of relations for a specified entity or relationship. | ||
2773 : | |||
2774 : | This is an instance method. | ||
2775 : | |||
2776 : | =over 4 | ||
2777 : | |||
2778 : | =item objectName | ||
2779 : | |||
2780 : | Name of the desired entity or relationship. | ||
2781 : | |||
2782 : | =item RETURN | ||
2783 : | |||
2784 : | A table containing the relations for the specified object. | ||
2785 : | |||
2786 : | =back | ||
2787 : | |||
2788 : | =cut | ||
2789 : | |||
2790 : | sub _GetRelationTable { | ||
2791 : | parrello | 1.10 | # Get the parameters. |
2792 : | my ($self, $objectName) = @_; | ||
2793 : | # Get the descriptor from the metadata. | ||
2794 : | my $objectData = $self->_GetStructure($objectName); | ||
2795 : | # Return the object's relation list. | ||
2796 : | return $objectData->{Relations}; | ||
2797 : | parrello | 1.1 | } |
2798 : | |||
2799 : | =head3 ValidateFieldNames | ||
2800 : | |||
2801 : | Determine whether or not the field names are valid. A description of the problems with the names | ||
2802 : | will be written to the standard error output. If there is an error, this method will abort. This is | ||
2803 : | a static method. | ||
2804 : | |||
2805 : | =over 4 | ||
2806 : | |||
2807 : | =item metadata | ||
2808 : | |||
2809 : | Metadata structure loaded from the XML data definition. | ||
2810 : | |||
2811 : | =back | ||
2812 : | |||
2813 : | =cut | ||
2814 : | |||
2815 : | sub _ValidateFieldNames { | ||
2816 : | parrello | 1.10 | # Get the object. |
2817 : | my ($metadata) = @_; | ||
2818 : | # Declare the return value. We assume success. | ||
2819 : | my $retVal = 1; | ||
2820 : | # Loop through the sections of the database definition. | ||
2821 : | for my $section ('Entities', 'Relationships') { | ||
2822 : | # Loop through the objects in this section. | ||
2823 : | for my $object (values %{$metadata->{$section}}) { | ||
2824 : | # Loop through the object's fields. | ||
2825 : | for my $fieldName (keys %{$object->{Fields}}) { | ||
2826 : | # Now we make some initial validations. | ||
2827 : | if ($fieldName =~ /--/) { | ||
2828 : | # Here we have a doubled minus sign. | ||
2829 : | print STDERR "Field name $fieldName has a doubled hyphen.\n"; | ||
2830 : | $retVal = 0; | ||
2831 : | } elsif ($fieldName !~ /^[A-Za-z]/) { | ||
2832 : | # Here the field name is missing the initial letter. | ||
2833 : | print STDERR "Field name $fieldName does not begin with a letter.\n"; | ||
2834 : | $retVal = 0; | ||
2835 : | } else { | ||
2836 : | # Strip out the minus signs. Everything remaining must be a letter | ||
2837 : | # or digit. | ||
2838 : | my $strippedName = $fieldName; | ||
2839 : | $strippedName =~ s/-//g; | ||
2840 : | if ($strippedName !~ /^[A-Za-z0-9]+$/) { | ||
2841 : | print STDERR "Field name $fieldName contains illegal characters.\n"; | ||
2842 : | $retVal = 0; | ||
2843 : | } | ||
2844 : | } | ||
2845 : | } | ||
2846 : | } | ||
2847 : | } | ||
2848 : | # If an error was found, fail. | ||
2849 : | if ($retVal == 0) { | ||
2850 : | Confess("Errors found in field names."); | ||
2851 : | } | ||
2852 : | parrello | 1.1 | } |
2853 : | |||
2854 : | =head3 LoadRelation | ||
2855 : | |||
2856 : | Load a relation from the data in a tab-delimited disk file. The load will only take place if a disk | ||
2857 : | file with the same name as the relation exists in the specified directory. | ||
2858 : | |||
2859 : | This is an instance method. | ||
2860 : | |||
2861 : | =over 4 | ||
2862 : | |||
2863 : | =item dbh | ||
2864 : | |||
2865 : | DBKernel object for accessing the database. | ||
2866 : | |||
2867 : | =item directoryName | ||
2868 : | |||
2869 : | Name of the directory containing the tab-delimited data files. | ||
2870 : | |||
2871 : | =item relationName | ||
2872 : | |||
2873 : | Name of the relation to load. | ||
2874 : | |||
2875 : | =item rebuild | ||
2876 : | |||
2877 : | TRUE if the table should be dropped and re-created before loading. | ||
2878 : | |||
2879 : | =item RETURN | ||
2880 : | |||
2881 : | Returns a statistical object describing the number of records read and a list of error messages. | ||
2882 : | |||
2883 : | =back | ||
2884 : | |||
2885 : | =cut | ||
2886 : | |||
2887 : | sub _LoadRelation { | ||
2888 : | parrello | 1.10 | # Get the parameters. |
2889 : | my ($self, $directoryName, $relationName, $rebuild) = @_; | ||
2890 : | # Create the file name. | ||
2891 : | my $fileName = "$directoryName/$relationName"; | ||
2892 : | # If the file doesn't exist, try adding the .dtx suffix. | ||
2893 : | if (! -e $fileName) { | ||
2894 : | $fileName .= ".dtx"; | ||
2895 : | if (! -e $fileName) { | ||
2896 : | $fileName = ""; | ||
2897 : | } | ||
2898 : | } | ||
2899 : | # Create the return object. | ||
2900 : | my $retVal = _GetLoadStats(); | ||
2901 : | # If a file exists to load the table, its name will be in $fileName. Otherwise, $fileName will | ||
2902 : | # be a null string. | ||
2903 : | if ($fileName ne "") { | ||
2904 : | # Load the relation from the file. | ||
2905 : | $retVal = $self->LoadTable($fileName, $relationName, $rebuild); | ||
2906 : | } elsif ($rebuild) { | ||
2907 : | # Here we are rebuilding, but no file exists, so we just re-create the table. | ||
2908 : | $self->CreateTable($relationName, 1); | ||
2909 : | } | ||
2910 : | # Return the statistics from the load. | ||
2911 : | return $retVal; | ||
2912 : | parrello | 1.1 | } |
2913 : | |||
2914 : | =head3 LoadMetaData | ||
2915 : | |||
2916 : | This method loads the data describing this database from an XML file into a metadata structure. | ||
2917 : | The resulting structure is a set of nested hash tables containing all the information needed to | ||
2918 : | load or use the database. The schema for the XML file is F<ERDatabase.xml>. | ||
2919 : | |||
2920 : | This is a static method. | ||
2921 : | |||
2922 : | =over 4 | ||
2923 : | |||
2924 : | =item filename | ||
2925 : | |||
2926 : | Name of the file containing the database definition. | ||
2927 : | |||
2928 : | =item RETURN | ||
2929 : | |||
2930 : | Returns a structure describing the database. | ||
2931 : | |||
2932 : | =back | ||
2933 : | |||
2934 : | =cut | ||
2935 : | |||
2936 : | sub _LoadMetaData { | ||
2937 : | parrello | 1.10 | # Get the parameters. |
2938 : | my ($filename) = @_; | ||
2939 : | parrello | 1.15 | Trace("Reading Sprout DBD from $filename.") if T(2); |
2940 : | parrello | 1.10 | # Slurp the XML file into a variable. Extensive use of options is used to insure we |
2941 : | # get the exact structure we want. | ||
2942 : | my $metadata = XML::Simple::XMLin($filename, | ||
2943 : | GroupTags => { Relationships => 'Relationship', | ||
2944 : | Entities => 'Entity', | ||
2945 : | Fields => 'Field', | ||
2946 : | Indexes => 'Index', | ||
2947 : | IndexFields => 'IndexField'}, | ||
2948 : | KeyAttr => { Relationship => 'name', | ||
2949 : | Entity => 'name', | ||
2950 : | Field => 'name'}, | ||
2951 : | ForceArray => ['Field', 'Index', 'IndexField'], | ||
2952 : | ForceContent => 1, | ||
2953 : | NormalizeSpace => 2 | ||
2954 : | ); | ||
2955 : | Trace("XML metadata loaded from file $filename.") if T(1); | ||
2956 : | # Before we go any farther, we need to validate the field and object names. If an error is found, | ||
2957 : | # the method below will fail. | ||
2958 : | _ValidateFieldNames($metadata); | ||
2959 : | # Next we need to create a hash table for finding relations. The entities and relationships are | ||
2960 : | # implemented as one or more database relations. | ||
2961 : | my %masterRelationTable = (); | ||
2962 : | # Loop through the entities. | ||
2963 : | my $entityList = $metadata->{Entities}; | ||
2964 : | for my $entityName (keys %{$entityList}) { | ||
2965 : | my $entityStructure = $entityList->{$entityName}; | ||
2966 : | # | ||
2967 : | parrello | 1.12 | # The first step is to create all the entity's default values. For C<Field> elements, |
2968 : | parrello | 1.10 | # the relation name must be added where it is not specified. For relationships, |
2969 : | # the B<from-link> and B<to-link> fields must be inserted, and for entities an B<id> | ||
2970 : | # field must be added to each relation. Finally, each field will have a C<PrettySort> attribute | ||
2971 : | # added that can be used to pull the implicit fields to the top when displaying the field | ||
2972 : | # documentation. The PrettySort values are 1-based and indicate in which pass through a | ||
2973 : | # relation's data the field should be displayed-- 1 for the first pass, 2 for the second, | ||
2974 : | # and so on. | ||
2975 : | # | ||
2976 : | # Fix up this entity. | ||
2977 : | _FixupFields($entityStructure, $entityName, 2, 3); | ||
2978 : | # Add the ID field. | ||
2979 : | _AddField($entityStructure, 'id', { type => $entityStructure->{keyType}, | ||
2980 : | relation => $entityName, | ||
2981 : | Notes => { content => "Unique identifier for this \[b\]$entityName\[/b\]." }, | ||
2982 : | PrettySort => 1}); | ||
2983 : | # | ||
2984 : | # The current field list enables us to quickly find the relation containing a particular field. | ||
2985 : | # We also need a list that tells us the fields in each relation. We do this by creating a | ||
2986 : | # Relations structure in the entity structure and collating the fields into it based on their | ||
2987 : | # C<relation> property. There is one tricky bit, which is that every relation has to have the | ||
2988 : | # C<id> field in it. Note also that the field list is put into a C<Fields> member of the | ||
2989 : | # relation's structure so that it looks more like the entity and relationship structures. | ||
2990 : | # | ||
2991 : | # First we need to create the relations list. | ||
2992 : | my $relationTable = { }; | ||
2993 : | # Loop through the fields. We use a list of field names to prevent a problem with | ||
2994 : | # the hash table cursor losing its place during the loop. | ||
2995 : | my $fieldList = $entityStructure->{Fields}; | ||
2996 : | my @fieldNames = keys %{$fieldList}; | ||
2997 : | for my $fieldName (@fieldNames) { | ||
2998 : | my $fieldData = $fieldList->{$fieldName}; | ||
2999 : | # Get the current field's relation name. | ||
3000 : | my $relationName = $fieldData->{relation}; | ||
3001 : | # Insure the relation exists. | ||
3002 : | if (!exists $relationTable->{$relationName}) { | ||
3003 : | $relationTable->{$relationName} = { Fields => { } }; | ||
3004 : | } | ||
3005 : | # Add the field to the relation's field structure. | ||
3006 : | $relationTable->{$relationName}->{Fields}->{$fieldName} = $fieldData; | ||
3007 : | } | ||
3008 : | # Now that we've organized all our fields by relation name we need to do some serious | ||
3009 : | # housekeeping. We must add the C<id> field to every relation and convert each relation | ||
3010 : | # to a list of fields. First, we need the ID field itself. | ||
3011 : | my $idField = $fieldList->{id}; | ||
3012 : | # Loop through the relations. | ||
3013 : | for my $relationName (keys %{$relationTable}) { | ||
3014 : | my $relation = $relationTable->{$relationName}; | ||
3015 : | # Get the relation's field list. | ||
3016 : | my $relationFieldList = $relation->{Fields}; | ||
3017 : | # Add the ID field to it. If the field's already there, it will not make any | ||
3018 : | # difference. | ||
3019 : | $relationFieldList->{id} = $idField; | ||
3020 : | # Convert the field set from a hash into a list using the pretty-sort number. | ||
3021 : | $relation->{Fields} = _ReOrderRelationTable($relationFieldList); | ||
3022 : | # Add the relation to the master table. | ||
3023 : | $masterRelationTable{$relationName} = $relation; | ||
3024 : | } | ||
3025 : | # The indexes come next. The primary relation will have a unique-keyed index based on the ID field. | ||
3026 : | # The other relations must have at least one index that begins with the ID field. In addition, the | ||
3027 : | # metadata may require alternate indexes. We do those alternate indexes first. To begin, we need to | ||
3028 : | # get the entity's field list and index list. | ||
3029 : | my $indexList = $entityStructure->{Indexes}; | ||
3030 : | # Loop through the indexes. | ||
3031 : | for my $indexData (@{$indexList}) { | ||
3032 : | # We need to find this index's fields. All of them should belong to the same relation. | ||
3033 : | # The ID field is an exception, since it's in all relations. | ||
3034 : | my $relationName = '0'; | ||
3035 : | for my $fieldDescriptor (@{$indexData->{IndexFields}}) { | ||
3036 : | # Get this field's name. | ||
3037 : | my $fieldName = $fieldDescriptor->{name}; | ||
3038 : | # Only proceed if it is NOT the ID field. | ||
3039 : | if ($fieldName ne 'id') { | ||
3040 : | # Find the relation containing the current index field. | ||
3041 : | my $thisName = $fieldList->{$fieldName}->{relation}; | ||
3042 : | if ($relationName eq '0') { | ||
3043 : | # Here we're looking at the first field, so we save its relation name. | ||
3044 : | $relationName = $thisName; | ||
3045 : | } elsif ($relationName ne $thisName) { | ||
3046 : | # Here we have a field mismatch. | ||
3047 : | Confess("Mixed index: field $fieldName does not belong to relation $relationName."); | ||
3048 : | } | ||
3049 : | } | ||
3050 : | } | ||
3051 : | # Now $relationName is the name of the relation that contains this index. Add the index structure | ||
3052 : | # to the relation. | ||
3053 : | push @{$relationTable->{$relationName}->{Indexes}}, $indexData; | ||
3054 : | } | ||
3055 : | # Now each index has been put in a relation. We need to add the primary index for the primary | ||
3056 : | # relation. | ||
3057 : | push @{$relationTable->{$entityName}->{Indexes}}, | ||
3058 : | { IndexFields => [ {name => 'id', order => 'ascending'} ], Unique => 'true', | ||
3059 : | Notes => { content => "Primary index for $entityName." } | ||
3060 : | }; | ||
3061 : | # The next step is to insure that each relation has at least one index that begins with the ID field. | ||
3062 : | # After that, we convert each relation's index list to an index table. We first need to loop through | ||
3063 : | # the relations. | ||
3064 : | for my $relationName (keys %{$relationTable}) { | ||
3065 : | my $relation = $relationTable->{$relationName}; | ||
3066 : | # Get the relation's index list. | ||
3067 : | my $indexList = $relation->{Indexes}; | ||
3068 : | # Insure this relation has an ID index. | ||
3069 : | my $found = 0; | ||
3070 : | for my $index (@{$indexList}) { | ||
3071 : | if ($index->{IndexFields}->[0]->{name} eq "id") { | ||
3072 : | $found = 1; | ||
3073 : | } | ||
3074 : | } | ||
3075 : | if ($found == 0) { | ||
3076 : | push @{$indexList}, { IndexFields => [ {name => 'id', order => 'ascending'} ] }; | ||
3077 : | } | ||
3078 : | # Now we need to convert the relation's index list to an index table. We begin by creating | ||
3079 : | # an empty table in the relation structure. | ||
3080 : | $relation->{Indexes} = { }; | ||
3081 : | # Loop through the indexes. | ||
3082 : | my $count = 0; | ||
3083 : | for my $index (@{$indexList}) { | ||
3084 : | # Add this index to the index table. | ||
3085 : | _AddIndex("idx$relationName$count", $relation, $index); | ||
3086 : | # Increment the counter so that the next index has a different name. | ||
3087 : | $count++; | ||
3088 : | } | ||
3089 : | } | ||
3090 : | # Finally, we add the relation structure to the entity. | ||
3091 : | $entityStructure->{Relations} = $relationTable; | ||
3092 : | } | ||
3093 : | # Loop through the relationships. Relationships actually turn out to be much simpler than entities. | ||
3094 : | # For one thing, there is only a single constituent relation. | ||
3095 : | my $relationshipList = $metadata->{Relationships}; | ||
3096 : | for my $relationshipName (keys %{$relationshipList}) { | ||
3097 : | my $relationshipStructure = $relationshipList->{$relationshipName}; | ||
3098 : | # Fix up this relationship. | ||
3099 : | _FixupFields($relationshipStructure, $relationshipName, 2, 3); | ||
3100 : | # Format a description for the FROM field. | ||
3101 : | my $fromEntity = $relationshipStructure->{from}; | ||
3102 : | my $fromComment = "<b>id</b> of the source <b><a href=\"#$fromEntity\">$fromEntity</a></b>."; | ||
3103 : | # Get the FROM entity's key type. | ||
3104 : | my $fromType = $entityList->{$fromEntity}->{keyType}; | ||
3105 : | # Add the FROM field. | ||
3106 : | _AddField($relationshipStructure, 'from-link', { type => $fromType, | ||
3107 : | relation => $relationshipName, | ||
3108 : | Notes => { content => $fromComment }, | ||
3109 : | PrettySort => 1}); | ||
3110 : | # Format a description for the TO field. | ||
3111 : | my $toEntity = $relationshipStructure->{to}; | ||
3112 : | my $toComment = "<b>id</b> of the target <b><a href=\"#$toEntity\">$toEntity</a></b>."; | ||
3113 : | # Get the TO entity's key type. | ||
3114 : | my $toType = $entityList->{$toEntity}->{keyType}; | ||
3115 : | # Add the TO field. | ||
3116 : | _AddField($relationshipStructure, 'to-link', { type=> $toType, | ||
3117 : | relation => $relationshipName, | ||
3118 : | Notes => { content => $toComment }, | ||
3119 : | PrettySort => 1}); | ||
3120 : | # Create an index-free relation from the fields. | ||
3121 : | my $thisRelation = { Fields => _ReOrderRelationTable($relationshipStructure->{Fields}), | ||
3122 : | Indexes => { } }; | ||
3123 : | $relationshipStructure->{Relations} = { $relationshipName => $thisRelation }; | ||
3124 : | # Create the FROM and TO indexes. | ||
3125 : | _CreateRelationshipIndex("From", $relationshipName, $relationshipStructure); | ||
3126 : | _CreateRelationshipIndex("To", $relationshipName, $relationshipStructure); | ||
3127 : | # Add the relation to the master table. | ||
3128 : | $masterRelationTable{$relationshipName} = $thisRelation; | ||
3129 : | } | ||
3130 : | # Now store the master relation table in the metadata structure. | ||
3131 : | $metadata->{RelationTable} = \%masterRelationTable; | ||
3132 : | # Our final task is to create the join table. The join table is a hash that describes all | ||
3133 : | # the join clauses for traveling through the relationships. The join clause is an equality | ||
3134 : | # condition that can be put into a WHERE clause in order to join two objects. Two relationships | ||
3135 : | # can be joined if they share an entity in common; and an entity can be joined to a relationship | ||
3136 : | # if the entity is at either end of the relationship. | ||
3137 : | my %joinTable = (); | ||
3138 : | # Loop through the entities. | ||
3139 : | for my $entityName (keys %{$entityList}) { | ||
3140 : | # Build three lists of the relationships connected to this entity. One will be | ||
3141 : | # for relationships from the entity, one for relationships to the entity, and | ||
3142 : | # one for recursive relationships. | ||
3143 : | my @fromList = (); | ||
3144 : | my @toList = (); | ||
3145 : | my @bothList = (); | ||
3146 : | parrello | 1.21 | Trace("Join table build for $entityName.") if T(metadata => 4); |
3147 : | parrello | 1.10 | for my $relationshipName (keys %{$relationshipList}) { |
3148 : | my $relationship = $relationshipList->{$relationshipName}; | ||
3149 : | # Determine if this relationship has our entity in one of its link fields. | ||
3150 : | my $fromEntity = $relationship->{from}; | ||
3151 : | my $toEntity = $relationship->{to}; | ||
3152 : | parrello | 1.41 | Trace("Join check for relationship $relationshipName from $fromEntity to $toEntity.") if T(Joins => 4); |
3153 : | parrello | 1.10 | if ($fromEntity eq $entityName) { |
3154 : | if ($toEntity eq $entityName) { | ||
3155 : | # Here the relationship is recursive. | ||
3156 : | push @bothList, $relationshipName; | ||
3157 : | parrello | 1.21 | Trace("Relationship $relationshipName put in both-list.") if T(metadata => 4); |
3158 : | parrello | 1.10 | } else { |
3159 : | # Here the relationship comes from the entity. | ||
3160 : | push @fromList, $relationshipName; | ||
3161 : | parrello | 1.21 | Trace("Relationship $relationshipName put in from-list.") if T(metadata => 4); |
3162 : | parrello | 1.10 | } |
3163 : | } elsif ($toEntity eq $entityName) { | ||
3164 : | # Here the relationship goes to the entity. | ||
3165 : | push @toList, $relationshipName; | ||
3166 : | parrello | 1.21 | Trace("Relationship $relationshipName put in to-list.") if T(metadata => 4); |
3167 : | parrello | 1.10 | } |
3168 : | } | ||
3169 : | # Create the nonrecursive joins. Note that we build two hashes for running | ||
3170 : | # through the nonrecursive relationships since we'll have an outer loop | ||
3171 : | # and an inner loop, and we can't do two "each" iterations on the same | ||
3172 : | # hash table at the same time. | ||
3173 : | my %directRelationships = ( from => \@fromList, to => \@toList ); | ||
3174 : | my %otherRelationships = ( from => \@fromList, to => \@toList ); | ||
3175 : | for my $linkType (keys %directRelationships) { | ||
3176 : | my $relationships = $directRelationships{$linkType}; | ||
3177 : | # Loop through all the relationships. | ||
3178 : | for my $relationshipName (@{$relationships}) { | ||
3179 : | # Create joins between the entity and this relationship. | ||
3180 : | my $linkField = "$relationshipName.${linkType}_link"; | ||
3181 : | my $joinClause = "$entityName.id = $linkField"; | ||
3182 : | parrello | 1.21 | Trace("Entity join clause is $joinClause for $entityName and $relationshipName.") if T(metadata => 4); |
3183 : | parrello | 1.10 | $joinTable{"$entityName/$relationshipName"} = $joinClause; |
3184 : | $joinTable{"$relationshipName/$entityName"} = $joinClause; | ||
3185 : | # Create joins between this relationship and the other relationships. | ||
3186 : | for my $otherType (keys %otherRelationships) { | ||
3187 : | my $otherships = $otherRelationships{$otherType}; | ||
3188 : | for my $otherName (@{$otherships}) { | ||
3189 : | # Get the key for this join. | ||
3190 : | my $joinKey = "$otherName/$relationshipName"; | ||
3191 : | # Check for a duplicate or a self-join. | ||
3192 : | if (exists $joinTable{$joinKey}) { | ||
3193 : | # Here we have a duplicate, which means that the join | ||
3194 : | # path is ambiguous. We delete the join from the join | ||
3195 : | # table to prevent it from being used. | ||
3196 : | delete $joinTable{$joinKey}; | ||
3197 : | Trace("Deleting ambiguous join $joinKey.") if T(4); | ||
3198 : | } elsif ($otherName ne $relationshipName) { | ||
3199 : | # Here we have a valid join. Note that joins between a | ||
3200 : | # relationship and itself are prohibited. | ||
3201 : | my $relJoinClause = "$otherName.${otherType}_link = $linkField"; | ||
3202 : | $joinTable{$joinKey} = $relJoinClause; | ||
3203 : | parrello | 1.21 | Trace("Relationship join clause is $relJoinClause for $joinKey.") if T(metadata => 4); |
3204 : | parrello | 1.10 | } |
3205 : | } | ||
3206 : | } | ||
3207 : | # Create joins between this relationship and the recursive relationships. | ||
3208 : | # We don't need to check for ambiguous joins here, because a recursive | ||
3209 : | # relationship can only be ambiguous with another recursive relationship, | ||
3210 : | # and the incoming relationship from the outer loop is never recursive. | ||
3211 : | for my $otherName (@bothList) { | ||
3212 : | parrello | 1.21 | Trace("Setting up relationship joins to recursive relationship $otherName with $relationshipName.") if T(metadata => 4); |
3213 : | parrello | 1.10 | # Join from the left. |
3214 : | $joinTable{"$relationshipName/$otherName"} = | ||
3215 : | "$linkField = $otherName.from_link"; | ||
3216 : | # Join from the right. | ||
3217 : | $joinTable{"$otherName/$relationshipName"} = | ||
3218 : | "$otherName.to_link = $linkField"; | ||
3219 : | } | ||
3220 : | } | ||
3221 : | } | ||
3222 : | # Create entity joins for the recursive relationships. Unlike the non-recursive | ||
3223 : | # joins, the direction makes a difference with the recursive joins. This can give | ||
3224 : | # rise to situations where we can't create the path we want; however, it is always | ||
3225 : | # possible to get the same effect using multiple queries. | ||
3226 : | for my $relationshipName (@bothList) { | ||
3227 : | parrello | 1.21 | Trace("Setting up entity joins to recursive relationship $relationshipName with $entityName.") if T(metadata => 4); |
3228 : | parrello | 1.10 | # Join to the entity from each direction. |
3229 : | $joinTable{"$entityName/$relationshipName"} = | ||
3230 : | "$entityName.id = $relationshipName.from_link"; | ||
3231 : | $joinTable{"$relationshipName/$entityName"} = | ||
3232 : | "$relationshipName.to_link = $entityName.id"; | ||
3233 : | } | ||
3234 : | } | ||
3235 : | # Add the join table to the structure. | ||
3236 : | $metadata->{Joins} = \%joinTable; | ||
3237 : | # Return the slurped and fixed-up structure. | ||
3238 : | return $metadata; | ||
3239 : | parrello | 1.1 | } |
3240 : | |||
3241 : | parrello | 1.42 | =head3 SortNeeded |
3242 : | |||
3243 : | parrello | 1.60 | C<< my $parms = $erdb->SortNeeded($relationName); >> |
3244 : | |||
3245 : | Return the pipe command for the sort that should be applied to the specified | ||
3246 : | relation when creating the load file. | ||
3247 : | |||
3248 : | For example, if the load file should be sorted ascending by the first | ||
3249 : | field with duplicates removed, this method would return | ||
3250 : | parrello | 1.42 | |
3251 : | parrello | 1.60 | sort -k 1 -u -t "\t" |
3252 : | |||
3253 : | If the first field is numeric and duplicates are okay, the method would | ||
3254 : | return | ||
3255 : | |||
3256 : | sort -k 1n -t "\t" | ||
3257 : | parrello | 1.42 | |
3258 : | =over 4 | ||
3259 : | |||
3260 : | =item relationName | ||
3261 : | |||
3262 : | Name of the relation to be examined. | ||
3263 : | |||
3264 : | parrello | 1.60 | =item |
3265 : | parrello | 1.42 | |
3266 : | parrello | 1.60 | Returns the sort command to use for sorting the relation, suitable for piping. |
3267 : | parrello | 1.42 | |
3268 : | =back | ||
3269 : | |||
3270 : | =cut | ||
3271 : | #: Return Type $; | ||
3272 : | sub SortNeeded { | ||
3273 : | # Get the parameters. | ||
3274 : | my ($self, $relationName) = @_; | ||
3275 : | parrello | 1.60 | # Declare a descriptor to hold the names of the key fields. |
3276 : | my @keyNames = (); | ||
3277 : | # Declare a flag for indicating uniqueness. | ||
3278 : | my $unique; | ||
3279 : | # Get the relation structure. | ||
3280 : | my $relationData = $self->_FindRelation($relationName); | ||
3281 : | # Find out if the relation is a primary entity relation, | ||
3282 : | # a relationship relation, or a secondary entity relation. | ||
3283 : | parrello | 1.43 | my $entityTable = $self->{_metaData}->{Entities}; |
3284 : | parrello | 1.60 | my $relationshipTable = $self->{_metaData}->{Relationships}; |
3285 : | parrello | 1.42 | if (exists $entityTable->{$relationName}) { |
3286 : | parrello | 1.60 | # Here we have a primary entity relation, so we have a unique sort on the |
3287 : | # ID field. | ||
3288 : | $unique = "-u "; | ||
3289 : | push @keyNames, "id"; | ||
3290 : | } elsif (exists $relationshipTable->{$relationName}) { | ||
3291 : | # Here we have a relationship. We sort using the FROM index. | ||
3292 : | $unique = ""; | ||
3293 : | my $relationshipData = $relationshipTable->{$relationName}; | ||
3294 : | my $index = $relationData->{Indexes}->{"idx${relationName}From"}; | ||
3295 : | push @keyNames, @{$index->{IndexFields}}; | ||
3296 : | } else { | ||
3297 : | # Here we have a secondary entity relation, so we have a non-unique sort on | ||
3298 : | # the ID field. | ||
3299 : | $unique = ""; | ||
3300 : | push @keyNames, "id"; | ||
3301 : | } | ||
3302 : | # Now we parse the key names into sort parameters. First, we prime the return | ||
3303 : | # string. | ||
3304 : | parrello | 1.64 | my $retVal = "sort -t \"\t\" $unique"; |
3305 : | parrello | 1.60 | # Get the relation's field list. |
3306 : | my @fields = @{$relationData->{Fields}}; | ||
3307 : | # Loop through the keys. | ||
3308 : | for my $keyData (@keyNames) { | ||
3309 : | # Get the key and the ordering. | ||
3310 : | my ($keyName, $ordering); | ||
3311 : | if ($keyData =~ /^([^ ]+) DESC/) { | ||
3312 : | ($keyName, $ordering) = ($1, "descending"); | ||
3313 : | } else { | ||
3314 : | ($keyName, $ordering) = ($keyData, "ascending"); | ||
3315 : | } | ||
3316 : | # Find the key's position and type. | ||
3317 : | my $fieldSpec; | ||
3318 : | for (my $i = 0; $i <= $#fields && ! $fieldSpec; $i++) { | ||
3319 : | my $thisField = $fields[$i]; | ||
3320 : | if ($thisField->{name} eq $keyName) { | ||
3321 : | # Get the sort modifier for this field type. The modifier | ||
3322 : | # decides whether we're using a character, numeric, or | ||
3323 : | # floating-point sort. | ||
3324 : | my $modifier = $TypeTable{$thisField->{type}}->{sort}; | ||
3325 : | # If the index is descending for this field, denote we want | ||
3326 : | # to reverse the sort order on this field. | ||
3327 : | if ($ordering eq 'descending') { | ||
3328 : | $modifier .= "r"; | ||
3329 : | } | ||
3330 : | # Store the position and modifier into the field spec, which | ||
3331 : | parrello | 1.62 | # will stop the inner loop. Note that the field number is |
3332 : | # 1-based in the sort command, so we have to increment the | ||
3333 : | # index. | ||
3334 : | $fieldSpec = ($i + 1) . $modifier; | ||
3335 : | parrello | 1.60 | } |
3336 : | } | ||
3337 : | # Add this field to the sort command. | ||
3338 : | $retVal .= " -k $fieldSpec"; | ||
3339 : | parrello | 1.42 | } |
3340 : | # Return the result. | ||
3341 : | return $retVal; | ||
3342 : | } | ||
3343 : | |||
3344 : | parrello | 1.1 | =head3 CreateRelationshipIndex |
3345 : | |||
3346 : | Create an index for a relationship's relation. | ||
3347 : | |||
3348 : | This is a static method. | ||
3349 : | |||
3350 : | =over 4 | ||
3351 : | |||
3352 : | =item indexKey | ||
3353 : | |||
3354 : | Type of index: either C<"From"> or C<"To">. | ||
3355 : | |||
3356 : | =item relationshipName | ||
3357 : | |||
3358 : | Name of the relationship. | ||
3359 : | |||
3360 : | =item relationshipStructure | ||
3361 : | |||
3362 : | Structure describing the relationship that the index will sort. | ||
3363 : | |||
3364 : | =back | ||
3365 : | |||
3366 : | =cut | ||
3367 : | |||
3368 : | sub _CreateRelationshipIndex { | ||
3369 : | parrello | 1.10 | # Get the parameters. |
3370 : | my ($indexKey, $relationshipName, $relationshipStructure) = @_; | ||
3371 : | # Get the target relation. | ||
3372 : | my $relationStructure = $relationshipStructure->{Relations}->{$relationshipName}; | ||
3373 : | # Create a descriptor for the link field that goes at the beginning of this index. | ||
3374 : | my $firstField = { name => lcfirst $indexKey . '-link', order => 'ascending' }; | ||
3375 : | # Get the target index descriptor. | ||
3376 : | my $newIndex = $relationshipStructure->{$indexKey . "Index"}; | ||
3377 : | # Add the first field to the index's field list. Due to the craziness of PERL, if the | ||
3378 : | # index descriptor does not exist, it will be created automatically so we can add | ||
3379 : | # the field to it. | ||
3380 : | unshift @{$newIndex->{IndexFields}}, $firstField; | ||
3381 : | parrello | 1.12 | # If this is a one-to-many relationship, the "To" index is unique. |
3382 : | if ($relationshipStructure->{arity} eq "1M" && $indexKey eq "To") { | ||
3383 : | $newIndex->{Unique} = 'true'; | ||
3384 : | } | ||
3385 : | parrello | 1.10 | # Add the index to the relation. |
3386 : | _AddIndex("idx$relationshipName$indexKey", $relationStructure, $newIndex); | ||
3387 : | parrello | 1.1 | } |
3388 : | |||
3389 : | =head3 AddIndex | ||
3390 : | |||
3391 : | Add an index to a relation structure. | ||
3392 : | |||
3393 : | This is a static method. | ||
3394 : | |||
3395 : | =over 4 | ||
3396 : | |||
3397 : | =item indexName | ||
3398 : | |||
3399 : | Name to give to the new index. | ||
3400 : | |||
3401 : | =item relationStructure | ||
3402 : | |||
3403 : | Relation structure to which the new index should be added. | ||
3404 : | |||
3405 : | =item newIndex | ||
3406 : | |||
3407 : | New index to add. | ||
3408 : | |||
3409 : | =back | ||
3410 : | |||
3411 : | =cut | ||
3412 : | |||
3413 : | sub _AddIndex { | ||
3414 : | parrello | 1.10 | # Get the parameters. |
3415 : | my ($indexName, $relationStructure, $newIndex) = @_; | ||
3416 : | # We want to re-do the index's field list. Instead of an object for each field, | ||
3417 : | # we want a string consisting of the field name optionally followed by the token DESC. | ||
3418 : | my @fieldList = ( ); | ||
3419 : | for my $field (@{$newIndex->{IndexFields}}) { | ||
3420 : | # Create a string containing the field name. | ||
3421 : | my $fieldString = $field->{name}; | ||
3422 : | # Add the ordering token if needed. | ||
3423 : | if ($field->{order} eq "descending") { | ||
3424 : | $fieldString .= " DESC"; | ||
3425 : | } | ||
3426 : | # Push the result onto the field list. | ||
3427 : | push @fieldList, $fieldString; | ||
3428 : | } | ||
3429 : | # Store the field list just created as the new index field list. | ||
3430 : | $newIndex->{IndexFields} = \@fieldList; | ||
3431 : | # Add the index to the relation's index list. | ||
3432 : | $relationStructure->{Indexes}->{$indexName} = $newIndex; | ||
3433 : | parrello | 1.1 | } |
3434 : | |||
3435 : | =head3 FixupFields | ||
3436 : | |||
3437 : | This method fixes the field list for an entity or relationship. It will add the caller-specified | ||
3438 : | relation name to fields that do not have a name and set the C<PrettySort> value as specified. | ||
3439 : | |||
3440 : | This is a static method. | ||
3441 : | |||
3442 : | =over 4 | ||
3443 : | |||
3444 : | =item structure | ||
3445 : | |||
3446 : | Entity or relationship structure to be fixed up. | ||
3447 : | |||
3448 : | =item defaultRelationName | ||
3449 : | |||
3450 : | Default relation name to be added to the fields. | ||
3451 : | |||
3452 : | =item prettySortValue | ||
3453 : | |||
3454 : | C<PrettySort> value for the relation's normal fields. | ||
3455 : | |||
3456 : | =item textPrettySortValue | ||
3457 : | |||
3458 : | C<PrettySort> value for the relation's text fields. This value can be set to one greater than the | ||
3459 : | normal pretty sort value so that text fields go at the end of each relation. | ||
3460 : | |||
3461 : | =back | ||
3462 : | |||
3463 : | =cut | ||
3464 : | |||
3465 : | sub _FixupFields { | ||
3466 : | parrello | 1.10 | # Get the parameters. |
3467 : | my ($structure, $defaultRelationName, $prettySortValue, $textPrettySortValue) = @_; | ||
3468 : | # Insure the structure has a field list. | ||
3469 : | if (!exists $structure->{Fields}) { | ||
3470 : | # Here it doesn't, so we create a new one. | ||
3471 : | $structure->{Fields} = { }; | ||
3472 : | } else { | ||
3473 : | # Here we have a field list. Loop through its fields. | ||
3474 : | my $fieldStructures = $structure->{Fields}; | ||
3475 : | for my $fieldName (keys %{$fieldStructures}) { | ||
3476 : | parrello | 1.8 | Trace("Processing field $fieldName of $defaultRelationName.") if T(4); |
3477 : | parrello | 1.10 | my $fieldData = $fieldStructures->{$fieldName}; |
3478 : | # Get the field type. | ||
3479 : | my $type = $fieldData->{type}; | ||
3480 : | # Plug in a relation name if it is needed. | ||
3481 : | Tracer::MergeOptions($fieldData, { relation => $defaultRelationName }); | ||
3482 : | # Plug in a data generator if we need one. | ||
3483 : | if (!exists $fieldData->{DataGen}) { | ||
3484 : | # The data generator will use the default for the field's type. | ||
3485 : | $fieldData->{DataGen} = { content => $TypeTable{$type}->{dataGen} }; | ||
3486 : | } | ||
3487 : | # Plug in the defaults for the optional data generation parameters. | ||
3488 : | Tracer::MergeOptions($fieldData->{DataGen}, { testCount => 1, pass => 0 }); | ||
3489 : | # Add the PrettySortValue. | ||
3490 : | $fieldData->{PrettySort} = (($type eq "text") ? $textPrettySortValue : $prettySortValue); | ||
3491 : | } | ||
3492 : | } | ||
3493 : | parrello | 1.1 | } |
3494 : | |||
3495 : | =head3 FixName | ||
3496 : | |||
3497 : | Fix the incoming field name so that it is a legal SQL column name. | ||
3498 : | |||
3499 : | This is a static method. | ||
3500 : | |||
3501 : | =over 4 | ||
3502 : | |||
3503 : | =item fieldName | ||
3504 : | |||
3505 : | Field name to fix. | ||
3506 : | |||
3507 : | =item RETURN | ||
3508 : | |||
3509 : | Returns the fixed-up field name. | ||
3510 : | |||
3511 : | =back | ||
3512 : | |||
3513 : | =cut | ||
3514 : | |||
3515 : | sub _FixName { | ||
3516 : | parrello | 1.10 | # Get the parameter. |
3517 : | my ($fieldName) = @_; | ||
3518 : | # Replace its minus signs with underscores. | ||
3519 : | $fieldName =~ s/-/_/g; | ||
3520 : | # Return the result. | ||
3521 : | return $fieldName; | ||
3522 : | parrello | 1.1 | } |
3523 : | |||
3524 : | =head3 FixNames | ||
3525 : | |||
3526 : | Fix all the field names in a list. | ||
3527 : | |||
3528 : | This is a static method. | ||
3529 : | |||
3530 : | =over 4 | ||
3531 : | |||
3532 : | =item field1, field2, field3, ... fieldn | ||
3533 : | |||
3534 : | List of field names to fix. | ||
3535 : | |||
3536 : | =item RETURN | ||
3537 : | |||
3538 : | Returns a list of fixed-up versions of the incoming field names. | ||
3539 : | |||
3540 : | =back | ||
3541 : | |||
3542 : | =cut | ||
3543 : | |||
3544 : | sub _FixNames { | ||
3545 : | parrello | 1.10 | # Create the result list. |
3546 : | my @result = ( ); | ||
3547 : | # Loop through the incoming parameters. | ||
3548 : | for my $field (@_) { | ||
3549 : | push @result, _FixName($field); | ||
3550 : | } | ||
3551 : | # Return the result. | ||
3552 : | return @result; | ||
3553 : | parrello | 1.1 | } |
3554 : | |||
3555 : | =head3 AddField | ||
3556 : | |||
3557 : | Add a field to a field list. | ||
3558 : | |||
3559 : | This is a static method. | ||
3560 : | |||
3561 : | =over 4 | ||
3562 : | |||
3563 : | =item structure | ||
3564 : | |||
3565 : | Structure (usually an entity or relationship) that is to contain the field. | ||
3566 : | |||
3567 : | =item fieldName | ||
3568 : | |||
3569 : | Name of the new field. | ||
3570 : | |||
3571 : | =item fieldData | ||
3572 : | |||
3573 : | Structure containing the data to put in the field. | ||
3574 : | |||
3575 : | =back | ||
3576 : | |||
3577 : | =cut | ||
3578 : | |||
3579 : | sub _AddField { | ||
3580 : | parrello | 1.10 | # Get the parameters. |
3581 : | my ($structure, $fieldName, $fieldData) = @_; | ||
3582 : | # Create the field structure by copying the incoming data. | ||
3583 : | my $fieldStructure = {%{$fieldData}}; | ||
3584 : | # Get a reference to the field list itself. | ||
3585 : | my $fieldList = $structure->{Fields}; | ||
3586 : | # Add the field to the field list. | ||
3587 : | $fieldList->{$fieldName} = $fieldStructure; | ||
3588 : | parrello | 1.1 | } |
3589 : | |||
3590 : | =head3 ReOrderRelationTable | ||
3591 : | |||
3592 : | This method will take a relation table and re-sort it according to the implicit ordering of the | ||
3593 : | C<PrettySort> property. Instead of a hash based on field names, it will return a list of fields. | ||
3594 : | This requires creating a new hash that contains the field name in the C<name> property but doesn't | ||
3595 : | have the C<PrettySort> property, and then inserting that new hash into the field list. | ||
3596 : | |||
3597 : | This is a static method. | ||
3598 : | |||
3599 : | =over 4 | ||
3600 : | |||
3601 : | =item relationTable | ||
3602 : | |||
3603 : | Relation hash to be reformatted into a list. | ||
3604 : | |||
3605 : | =item RETURN | ||
3606 : | |||
3607 : | A list of field hashes. | ||
3608 : | |||
3609 : | =back | ||
3610 : | |||
3611 : | =cut | ||
3612 : | |||
3613 : | sub _ReOrderRelationTable { | ||
3614 : | parrello | 1.10 | # Get the parameters. |
3615 : | my ($relationTable) = @_; | ||
3616 : | # Create the return list. | ||
3617 : | my @resultList; | ||
3618 : | # Rather than copy all the fields in a single pass, we make multiple passes and only copy | ||
3619 : | # fields whose PrettySort value matches the current pass number. This process continues | ||
3620 : | # until we process all the fields in the relation. | ||
3621 : | my $fieldsLeft = (values %{$relationTable}); | ||
3622 : | for (my $sortPass = 1; $fieldsLeft > 0; $sortPass++) { | ||
3623 : | # Loop through the fields. Note that we lexically sort the fields. This makes field name | ||
3624 : | # secondary to pretty-sort number in the final ordering. | ||
3625 : | for my $fieldName (sort keys %{$relationTable}) { | ||
3626 : | # Get this field's data. | ||
3627 : | my $fieldData = $relationTable->{$fieldName}; | ||
3628 : | # Verify the sort pass. | ||
3629 : | if ($fieldData->{PrettySort} == $sortPass) { | ||
3630 : | # Here we're in the correct pass. Denote we've found a field. | ||
3631 : | $fieldsLeft--; | ||
3632 : | # The next step is to create the field structure. This done by copying all | ||
3633 : | # of the field elements except PrettySort and adding the name. | ||
3634 : | my %thisField; | ||
3635 : | for my $property (keys %{$fieldData}) { | ||
3636 : | if ($property ne 'PrettySort') { | ||
3637 : | $thisField{$property} = $fieldData->{$property}; | ||
3638 : | } | ||
3639 : | } | ||
3640 : | $thisField{name} = $fieldName; | ||
3641 : | # Now we add this field to the end of the result list. | ||
3642 : | push @resultList, \%thisField; | ||
3643 : | } | ||
3644 : | } | ||
3645 : | } | ||
3646 : | # Return a reference to the result list. | ||
3647 : | return \@resultList; | ||
3648 : | parrello | 1.1 | |
3649 : | } | ||
3650 : | |||
3651 : | =head3 IsPrimary | ||
3652 : | |||
3653 : | Return TRUE if a specified relation is a primary relation, else FALSE. A relation is primary | ||
3654 : | if it has the same name as an entity or relationship. | ||
3655 : | |||
3656 : | This is an instance method. | ||
3657 : | |||
3658 : | =over 4 | ||
3659 : | |||
3660 : | =item relationName | ||
3661 : | |||
3662 : | Name of the relevant relation. | ||
3663 : | |||
3664 : | =item RETURN | ||
3665 : | |||
3666 : | Returns TRUE for a primary relation, else FALSE. | ||
3667 : | |||
3668 : | =back | ||
3669 : | |||
3670 : | =cut | ||
3671 : | |||
3672 : | sub _IsPrimary { | ||
3673 : | parrello | 1.10 | # Get the parameters. |
3674 : | my ($self, $relationName) = @_; | ||
3675 : | # Check for the relation in the entity table. | ||
3676 : | my $entityTable = $self->{_metaData}->{Entities}; | ||
3677 : | my $retVal = exists $entityTable->{$relationName}; | ||
3678 : | if (! $retVal) { | ||
3679 : | # Check for it in the relationship table. | ||
3680 : | my $relationshipTable = $self->{_metaData}->{Relationships}; | ||
3681 : | $retVal = exists $relationshipTable->{$relationName}; | ||
3682 : | } | ||
3683 : | # Return the determination indicator. | ||
3684 : | return $retVal; | ||
3685 : | parrello | 1.1 | } |
3686 : | |||
3687 : | =head3 FindRelation | ||
3688 : | |||
3689 : | Return the descriptor for the specified relation. | ||
3690 : | |||
3691 : | This is an instance method. | ||
3692 : | |||
3693 : | =over 4 | ||
3694 : | |||
3695 : | =item relationName | ||
3696 : | |||
3697 : | Name of the relation whose descriptor is to be returned. | ||
3698 : | |||
3699 : | =item RETURN | ||
3700 : | |||
3701 : | Returns the object that describes the relation's indexes and fields. | ||
3702 : | |||
3703 : | =back | ||
3704 : | |||
3705 : | =cut | ||
3706 : | sub _FindRelation { | ||
3707 : | parrello | 1.10 | # Get the parameters. |
3708 : | my ($self, $relationName) = @_; | ||
3709 : | # Get the relation's structure from the master relation table in the metadata structure. | ||
3710 : | my $metaData = $self->{_metaData}; | ||
3711 : | my $retVal = $metaData->{RelationTable}->{$relationName}; | ||
3712 : | # Return it to the caller. | ||
3713 : | return $retVal; | ||
3714 : | parrello | 1.1 | } |
3715 : | |||
3716 : | =head2 HTML Documentation Utility Methods | ||
3717 : | |||
3718 : | =head3 ComputeRelationshipSentence | ||
3719 : | |||
3720 : | The relationship sentence consists of the relationship name between the names of the | ||
3721 : | two related entities and an arity indicator. | ||
3722 : | |||
3723 : | This is a static method. | ||
3724 : | |||
3725 : | =over 4 | ||
3726 : | |||
3727 : | =item relationshipName | ||
3728 : | |||
3729 : | Name of the relationship. | ||
3730 : | |||
3731 : | =item relationshipStructure | ||
3732 : | |||
3733 : | Relationship structure containing the relationship's description and properties. | ||
3734 : | |||
3735 : | =item RETURN | ||
3736 : | |||
3737 : | Returns a string containing the entity names on either side of the relationship name and an | ||
3738 : | indicator of the arity. | ||
3739 : | |||
3740 : | =back | ||
3741 : | |||
3742 : | =cut | ||
3743 : | |||
3744 : | sub _ComputeRelationshipSentence { | ||
3745 : | parrello | 1.10 | # Get the parameters. |
3746 : | my ($relationshipName, $relationshipStructure) = @_; | ||
3747 : | # Format the relationship sentence. | ||
3748 : | my $result = "$relationshipStructure->{from} <b>$relationshipName</b> $relationshipStructure->{to}"; | ||
3749 : | # Compute the arity. | ||
3750 : | my $arityCode = $relationshipStructure->{arity}; | ||
3751 : | my $arity = $ArityTable{$arityCode}; | ||
3752 : | $result .= " ($arity)"; | ||
3753 : | return $result; | ||
3754 : | parrello | 1.1 | } |
3755 : | |||
3756 : | =head3 ComputeRelationshipHeading | ||
3757 : | |||
3758 : | The relationship heading is the L<relationship sentence|/ComputeRelationshipSentence> with the entity | ||
3759 : | names hyperlinked to the appropriate entity sections of the document. | ||
3760 : | |||
3761 : | This is a static method. | ||
3762 : | |||
3763 : | =over 4 | ||
3764 : | |||
3765 : | =item relationshipName | ||
3766 : | |||
3767 : | Name of the relationship. | ||
3768 : | |||
3769 : | =item relationshipStructure | ||
3770 : | |||
3771 : | Relationship structure containing the relationship's description and properties. | ||
3772 : | |||
3773 : | =item RETURN | ||
3774 : | |||
3775 : | Returns a string containing the entity names on either side of the relationship name with the entity | ||
3776 : | names hyperlinked. | ||
3777 : | |||
3778 : | =back | ||
3779 : | |||
3780 : | =cut | ||
3781 : | |||
3782 : | sub _ComputeRelationshipHeading { | ||
3783 : | parrello | 1.10 | # Get the parameters. |
3784 : | my ($relationshipName, $relationshipStructure) = @_; | ||
3785 : | # Get the FROM and TO entity names. | ||
3786 : | my $fromEntity = $relationshipStructure->{from}; | ||
3787 : | my $toEntity = $relationshipStructure->{to}; | ||
3788 : | # Format a relationship sentence with hyperlinks in it. | ||
3789 : | my $result = "<a href=\"#$fromEntity\">$fromEntity</a> $relationshipName <a href=\"#$toEntity\">$toEntity</a>"; | ||
3790 : | return $result; | ||
3791 : | parrello | 1.1 | } |
3792 : | |||
3793 : | =head3 ShowRelationTable | ||
3794 : | |||
3795 : | Generate the HTML string for a particular relation. The relation's data will be formatted as an HTML | ||
3796 : | table with three columns-- the field name, the field type, and the field description. | ||
3797 : | |||
3798 : | This is a static method. | ||
3799 : | |||
3800 : | =over 4 | ||
3801 : | |||
3802 : | =item relationName | ||
3803 : | |||
3804 : | Name of the relation being formatted. | ||
3805 : | |||
3806 : | =item relationData | ||
3807 : | |||
3808 : | Hash containing the relation's fields and indexes. | ||
3809 : | |||
3810 : | =item RETURN | ||
3811 : | |||
3812 : | Returns an HTML string that can be used to display the relation name and all of its fields. | ||
3813 : | |||
3814 : | =back | ||
3815 : | |||
3816 : | =cut | ||
3817 : | |||
3818 : | sub _ShowRelationTable { | ||
3819 : | parrello | 1.10 | # Get the parameters. |
3820 : | my ($relationName, $relationData) = @_; | ||
3821 : | # Start the relation's field table. | ||
3822 : | my $htmlString = _OpenFieldTable($relationName); | ||
3823 : | # Loop through the fields. | ||
3824 : | for my $field (@{$relationData->{Fields}}) { | ||
3825 : | $htmlString .= _ShowField($field); | ||
3826 : | } | ||
3827 : | # Close this relation's field table. | ||
3828 : | $htmlString .= &_CloseTable; | ||
3829 : | # Now we show the relation's indexes. | ||
3830 : | $htmlString .= "<ul>\n"; | ||
3831 : | my $indexTable = $relationData->{Indexes}; | ||
3832 : | for my $indexName (sort keys %{$indexTable}) { | ||
3833 : | my $indexData = $indexTable->{$indexName}; | ||
3834 : | # Determine whether or not the index is unique. | ||
3835 : | my $fullName = $indexName; | ||
3836 : | if (exists $indexData->{Unique} && $indexData->{Unique} eq "true") { | ||
3837 : | $fullName .= " (unique)"; | ||
3838 : | } | ||
3839 : | # Start an HTML list item for this index. | ||
3840 : | $htmlString .= "<li><b>Index $fullName</b>\n<ul>\n"; | ||
3841 : | # Add any note text. | ||
3842 : | if (my $note = $indexData->{Notes}) { | ||
3843 : | $htmlString .= "<li>" . _HTMLNote($note->{content}) . "</li>\n"; | ||
3844 : | } | ||
3845 : | # Add the fiield list. | ||
3846 : | $htmlString .= "<li><i>" . join(', ', @{$indexData->{IndexFields}}) . "</i></li>\n"; | ||
3847 : | # Close this entry. | ||
3848 : | $htmlString .= "</ul></li>\n"; | ||
3849 : | } | ||
3850 : | # Close off the index list. | ||
3851 : | $htmlString .= "</ul>\n"; | ||
3852 : | parrello | 1.1 | } |
3853 : | |||
3854 : | =head3 OpenFieldTable | ||
3855 : | |||
3856 : | This method creates the header string for the field table generated by L</ShowMetaData>. | ||
3857 : | |||
3858 : | This is a static method. | ||
3859 : | |||
3860 : | =over 4 | ||
3861 : | |||
3862 : | =item tablename | ||
3863 : | |||
3864 : | Name of the table whose fields will be displayed. | ||
3865 : | |||
3866 : | =item RETURN | ||
3867 : | |||
3868 : | Returns a string containing the HTML for a field table's header. | ||
3869 : | |||
3870 : | =back | ||
3871 : | |||
3872 : | =cut | ||
3873 : | |||
3874 : | sub _OpenFieldTable { | ||
3875 : | parrello | 1.10 | my ($tablename) = @_; |
3876 : | return _OpenTable($tablename, 'Field', 'Type', 'Description'); | ||
3877 : | parrello | 1.1 | } |
3878 : | |||
3879 : | =head3 OpenTable | ||
3880 : | |||
3881 : | This method creates the header string for an HTML table. | ||
3882 : | |||
3883 : | This is a static method. | ||
3884 : | |||
3885 : | =over 4 | ||
3886 : | |||
3887 : | =item tablename | ||
3888 : | |||
3889 : | Title of the table. | ||
3890 : | |||
3891 : | =item colName1, colName2, ..., colNameN | ||
3892 : | |||
3893 : | List of column names. | ||
3894 : | |||
3895 : | =item RETURN | ||
3896 : | |||
3897 : | Returns a string containing the HTML for the desired table's header. | ||
3898 : | |||
3899 : | =back | ||
3900 : | |||
3901 : | =cut | ||
3902 : | |||
3903 : | sub _OpenTable { | ||
3904 : | parrello | 1.10 | # Get the parameters. |
3905 : | my ($tablename, @colNames) = @_; | ||
3906 : | # Compute the number of columns. | ||
3907 : | my $colCount = @colNames; | ||
3908 : | # Generate the title row. | ||
3909 : | my $htmlString = "<p><table border=\"2\"><tr><td colspan=\"$colCount\" align=\"center\">$tablename</td></tr>\n"; | ||
3910 : | # Loop through the columns, adding the column header rows. | ||
3911 : | $htmlString .= "<tr>"; | ||
3912 : | for my $colName (@colNames) { | ||
3913 : | $htmlString .= "<th>$colName</th>"; | ||
3914 : | } | ||
3915 : | $htmlString .= "</tr>\n"; | ||
3916 : | return $htmlString; | ||
3917 : | parrello | 1.1 | } |
3918 : | |||
3919 : | =head3 CloseTable | ||
3920 : | |||
3921 : | This method returns the HTML for closing a table. | ||
3922 : | |||
3923 : | This is a static method. | ||
3924 : | |||
3925 : | =cut | ||
3926 : | |||
3927 : | sub _CloseTable { | ||
3928 : | parrello | 1.10 | return "</table></p>\n"; |
3929 : | parrello | 1.1 | } |
3930 : | |||
3931 : | =head3 ShowField | ||
3932 : | |||
3933 : | This method returns the HTML for displaying a row of field information in a field table. | ||
3934 : | |||
3935 : | This is a static method. | ||
3936 : | |||
3937 : | =over 4 | ||
3938 : | |||
3939 : | =item fieldData | ||
3940 : | |||
3941 : | Table of data about the field. | ||
3942 : | |||
3943 : | =item RETURN | ||
3944 : | |||
3945 : | Returns an HTML string for a table row that shows the field's name, type, and description. | ||
3946 : | |||
3947 : | =back | ||
3948 : | |||
3949 : | =cut | ||
3950 : | |||
3951 : | sub _ShowField { | ||
3952 : | parrello | 1.10 | # Get the parameters. |
3953 : | my ($fieldData) = @_; | ||
3954 : | # Create the HTML string. | ||
3955 : | my $htmlString = "<tr><th align=\"left\">$fieldData->{name}</th><td>$fieldData->{type}</td>"; | ||
3956 : | # If we have content, add it as a third column. | ||
3957 : | if (exists $fieldData->{Notes}) { | ||
3958 : | $htmlString .= "<td>" . _HTMLNote($fieldData->{Notes}->{content}) . "</td>"; | ||
3959 : | } | ||
3960 : | # Close off the row. | ||
3961 : | $htmlString .= "</tr>\n"; | ||
3962 : | # Return the result. | ||
3963 : | return $htmlString; | ||
3964 : | parrello | 1.1 | } |
3965 : | |||
3966 : | =head3 HTMLNote | ||
3967 : | |||
3968 : | Convert a note or comment to HTML by replacing some bulletin-board codes with HTML. The codes | ||
3969 : | supported are C<[b]> for B<bold>, C<[i]> for I<italics>, and C<[p]> for a new paragraph. | ||
3970 : | Except for C<[p]>, all the codes are closed by slash-codes. So, for | ||
3971 : | example, C<[b]Feature[/b]> displays the string C<Feature> in boldface. | ||
3972 : | |||
3973 : | This is a static method. | ||
3974 : | |||
3975 : | =over 4 | ||
3976 : | |||
3977 : | =item dataString | ||
3978 : | |||
3979 : | String to convert to HTML. | ||
3980 : | |||
3981 : | =item RETURN | ||
3982 : | |||
3983 : | An HTML string derived from the input string. | ||
3984 : | |||
3985 : | =back | ||
3986 : | |||
3987 : | =cut | ||
3988 : | |||
3989 : | sub _HTMLNote { | ||
3990 : | parrello | 1.10 | # Get the parameter. |
3991 : | my ($dataString) = @_; | ||
3992 : | # Substitute the codes. | ||
3993 : | $dataString =~ s!\[(/?[bi])\]!<$1>!g; | ||
3994 : | $dataString =~ s!\[p\]!</p><p>!g; | ||
3995 : | # Return the result. | ||
3996 : | return $dataString; | ||
3997 : | parrello | 1.1 | } |
3998 : | |||
3999 : | =head2 Data Generation Utilities | ||
4000 : | |||
4001 : | =head3 IntGen | ||
4002 : | |||
4003 : | C<< my $integer = IntGen($min, $max); >> | ||
4004 : | |||
4005 : | Returns a random number between the specified minimum and maximum (inclusive). | ||
4006 : | |||
4007 : | =over 4 | ||
4008 : | |||
4009 : | =item min | ||
4010 : | |||
4011 : | Minimum permissible return value. | ||
4012 : | |||
4013 : | =item max | ||
4014 : | |||
4015 : | Maximum permissible return value. | ||
4016 : | |||
4017 : | =item RETURN | ||
4018 : | |||
4019 : | Returns a value no lower than the minimum and no greater than the maximum. | ||
4020 : | |||
4021 : | =back | ||
4022 : | |||
4023 : | =cut | ||
4024 : | |||
4025 : | sub IntGen { | ||
4026 : | parrello | 1.10 | # Get the parameters. |
4027 : | my ($min, $max) = @_; | ||
4028 : | # Determine the range of possible values. Note we put some space well above the | ||
4029 : | # maximum value to give it a fighting chance of apppearing in the list. | ||
4030 : | my $span = $max + 0.99 - $min; | ||
4031 : | # Create an integer in the range. | ||
4032 : | my $retVal = $min + int(rand($span)); | ||
4033 : | # Return the result. | ||
4034 : | return $retVal; | ||
4035 : | parrello | 1.1 | } |
4036 : | |||
4037 : | =head3 RandChar | ||
4038 : | |||
4039 : | C<< my $char = RandChar($sourceString); >> | ||
4040 : | |||
4041 : | Select a random character from a string. | ||
4042 : | |||
4043 : | =over 4 | ||
4044 : | |||
4045 : | =item sourceString | ||
4046 : | |||
4047 : | String from which the random character should be selected. | ||
4048 : | |||
4049 : | =item RETURN | ||
4050 : | |||
4051 : | Returns a single character from the incoming string. | ||
4052 : | |||
4053 : | =back | ||
4054 : | |||
4055 : | =cut | ||
4056 : | |||
4057 : | sub RandChar { | ||
4058 : | parrello | 1.10 | # Get the parameter. |
4059 : | my ($sourceString) = @_; | ||
4060 : | # Select a random character. | ||
4061 : | my $retVal = IntGen(0, (length $sourceString) - 1); | ||
4062 : | # Return it. | ||
4063 : | return substr($sourceString, $retVal, 1); | ||
4064 : | parrello | 1.1 | } |
4065 : | |||
4066 : | =head3 RandChars | ||
4067 : | |||
4068 : | C<< my $string = RandChars($sourceString, $length); >> | ||
4069 : | |||
4070 : | Create a string from characters taken from a source string. | ||
4071 : | |||
4072 : | =over 4 | ||
4073 : | |||
4074 : | =item sourceString | ||
4075 : | |||
4076 : | String from which the random characters should be selected. | ||
4077 : | |||
4078 : | =item length | ||
4079 : | |||
4080 : | Number of characters to put in the output string. | ||
4081 : | |||
4082 : | =item RETURN | ||
4083 : | |||
4084 : | Returns a string of the specified length consisting of characters taken from the | ||
4085 : | source string. | ||
4086 : | |||
4087 : | =back | ||
4088 : | |||
4089 : | =cut | ||
4090 : | |||
4091 : | sub RandChars { | ||
4092 : | parrello | 1.10 | # Get the parameters. |
4093 : | my ($sourceString, $length) = @_; | ||
4094 : | # Call RandChar repeatedly to generate the string. | ||
4095 : | my $retVal = ""; | ||
4096 : | for (my $i = 0; $i < $length; $i++) { | ||
4097 : | $retVal .= RandChar($sourceString); | ||
4098 : | } | ||
4099 : | # Return the result. | ||
4100 : | return $retVal; | ||
4101 : | parrello | 1.1 | } |
4102 : | |||
4103 : | =head3 RandParam | ||
4104 : | |||
4105 : | C<< my $value = RandParam($parm1, $parm2, ... $parmN); >> | ||
4106 : | |||
4107 : | Return a randomly-selected value from the parameter list. | ||
4108 : | |||
4109 : | =over 4 | ||
4110 : | |||
4111 : | =item parm1, parm2, ... parmN | ||
4112 : | |||
4113 : | List of values of which one will be selected. | ||
4114 : | |||
4115 : | =item RETURN | ||
4116 : | |||
4117 : | Returns a randomly-chosen value from the specified list. | ||
4118 : | |||
4119 : | =back | ||
4120 : | |||
4121 : | =cut | ||
4122 : | |||
4123 : | sub RandParam { | ||
4124 : | parrello | 1.10 | # Get the parameter. |
4125 : | my @parms = @_; | ||
4126 : | # Choose a random parameter from the list. | ||
4127 : | my $chosenIndex = IntGen(0, $#parms); | ||
4128 : | return $parms[$chosenIndex]; | ||
4129 : | parrello | 1.1 | } |
4130 : | |||
4131 : | =head3 StringGen | ||
4132 : | |||
4133 : | C<< my $string = StringGen($pattern1, $pattern2, ... $patternN); >> | ||
4134 : | |||
4135 : | Returns a random string derived from a randomly-chosen format pattern. The pattern | ||
4136 : | can either be a number (indicating the number of characters desired, or the letter | ||
4137 : | C<P> followed by a picture. The picture should contain C<A> when a letter is desired, | ||
4138 : | C<9> when a digit is desired, C<V> when a vowel is desired, C<K> when a consonant is | ||
4139 : | desired, and C<X> when a letter or a digit is desired. Any other character will be | ||
4140 : | translated as a literal. | ||
4141 : | |||
4142 : | =over 4 | ||
4143 : | |||
4144 : | =item pattern1, pattern2, ... patternN | ||
4145 : | |||
4146 : | List of patterns to be used to generate string values. | ||
4147 : | |||
4148 : | =item RETURN | ||
4149 : | |||
4150 : | A single string generated from a pattern. | ||
4151 : | |||
4152 : | =back | ||
4153 : | |||
4154 : | =cut | ||
4155 : | |||
4156 : | sub StringGen { | ||
4157 : | parrello | 1.10 | # Get the parameters. |
4158 : | my @patterns = @_; | ||
4159 : | # Choose the appropriate pattern. | ||
4160 : | my $chosenPattern = RandParam(@patterns); | ||
4161 : | # Declare the return variable. | ||
4162 : | my $retVal = ""; | ||
4163 : | # Determine whether this is a count or a picture pattern. | ||
4164 : | if ($chosenPattern =~ m/^\d+/) { | ||
4165 : | # Here we have a count. Get the string of source characters. | ||
4166 : | my $letterString = $PictureTable{'X'}; | ||
4167 : | my $stringLen = length $letterString; | ||
4168 : | # Save the number of characters we have to generate. | ||
4169 : | my $charsLeft = $chosenPattern; | ||
4170 : | # Loop until the return variable is full. | ||
4171 : | while ($charsLeft > 0) { | ||
4172 : | # Generate a random position in the soruce string. | ||
4173 : | my $stringIndex = IntGen(0, $stringLen - 1); | ||
4174 : | # Compute the number of characters to pull out of the source string. | ||
4175 : | my $chunkSize = $stringLen - $stringIndex; | ||
4176 : | if ($chunkSize > $charsLeft) { $chunkSize = $charsLeft; } | ||
4177 : | # Stuff this chunk into the return value. | ||
4178 : | $retVal .= substr($letterString, $stringIndex, $chunkSize); | ||
4179 : | # Record the data moved. | ||
4180 : | $charsLeft -= $chunkSize; | ||
4181 : | } | ||
4182 : | } elsif ($chosenPattern =~ m/^P/) { | ||
4183 : | # Here we have a picture string. We will move through the picture one | ||
4184 : | # character at a time generating data. | ||
4185 : | for (my $i = 1; $i < length $chosenPattern; $i++) { | ||
4186 : | # Get this picture character. | ||
4187 : | my $chr = substr($chosenPattern, $i, 1); | ||
4188 : | # Check to see if the picture char is one we recognize. | ||
4189 : | if (exists $PictureTable{$chr}) { | ||
4190 : | # Choose a random character from the available values for this | ||
4191 : | # picture character. | ||
4192 : | $retVal .= RandChar($PictureTable{$chr}); | ||
4193 : | } else { | ||
4194 : | # Copy in the picture character as a literal. | ||
4195 : | $retVal .= $chr; | ||
4196 : | } | ||
4197 : | } | ||
4198 : | } else { | ||
4199 : | # Here we have neither a picture string or a letter count, so we treat | ||
4200 : | # the string as a literal. | ||
4201 : | $retVal = $chosenPattern; | ||
4202 : | } | ||
4203 : | # Return the string formed. | ||
4204 : | return $retVal; | ||
4205 : | parrello | 1.1 | } |
4206 : | |||
4207 : | =head3 DateGen | ||
4208 : | |||
4209 : | C<< my $date = DateGen($startDayOffset, $endDayOffset, $minutes); >> | ||
4210 : | |||
4211 : | Return a numeric timestamp within the specified range of days with the specified minute | ||
4212 : | value. The range of days is specified relevant to the current day. Thus, the call | ||
4213 : | |||
4214 : | C<< my $date = DateGen(-1, 5, 720); >> | ||
4215 : | |||
4216 : | will return a timestamp at noon (72 minutes past midnight) sometime during the week that | ||
4217 : | began on the preceding day. If you want a random minute of the day, simply combine with | ||
4218 : | a call to L</IntGen>, as follows. | ||
4219 : | |||
4220 : | C<< my $date = DateGen(-1, 5, IntGen(0, 1439)); >> | ||
4221 : | |||
4222 : | =over 4 | ||
4223 : | |||
4224 : | =item startDayOffset | ||
4225 : | |||
4226 : | The earliest day that can be returned, relative to the current day. | ||
4227 : | |||
4228 : | =item endDayOffset | ||
4229 : | |||
4230 : | The latest day that can be returned, related to the current day. | ||
4231 : | |||
4232 : | =item minutes | ||
4233 : | |||
4234 : | Number of minutes into the selected day that should be used. | ||
4235 : | |||
4236 : | =back | ||
4237 : | |||
4238 : | =cut | ||
4239 : | |||
4240 : | sub DateGen { | ||
4241 : | parrello | 1.10 | # Get the parameters. |
4242 : | my ($startDayOffset, $endDayOffset, $minutes) = @_; | ||
4243 : | # Get midnight of the current day. | ||
4244 : | my $now = time(); | ||
4245 : | my ($sec, $min, $hour) = localtime($now); | ||
4246 : | my $today = $now - (($hour * 60 + $min) * 60 + $sec); | ||
4247 : | # Compute the day we want. | ||
4248 : | my $newDay = IntGen($startDayOffset, $endDayOffset) * 86400 + $today; | ||
4249 : | # Add the minutes. | ||
4250 : | my $retVal = $newDay + $minutes * 60; | ||
4251 : | # Return the result. | ||
4252 : | return $retVal; | ||
4253 : | parrello | 1.1 | } |
4254 : | |||
4255 : | =head3 FloatGen | ||
4256 : | |||
4257 : | C<< my $number = FloatGen($min, $max); >> | ||
4258 : | |||
4259 : | Return a random floating-point number greater than or equal to the specified minimum and | ||
4260 : | less than the specified maximum. | ||
4261 : | |||
4262 : | =over 4 | ||
4263 : | |||
4264 : | =item min | ||
4265 : | |||
4266 : | Minimum permissible value for the number returned. | ||
4267 : | |||
4268 : | =item max | ||
4269 : | |||
4270 : | Maximum permissible value for the number returned. | ||
4271 : | |||
4272 : | =item RETURN | ||
4273 : | |||
4274 : | Returns a floating-point number anywhere in the specified range. | ||
4275 : | |||
4276 : | =back | ||
4277 : | |||
4278 : | =cut | ||
4279 : | |||
4280 : | sub FloatGen { | ||
4281 : | parrello | 1.10 | # Get the parameters. |
4282 : | my ($min, $max) = @_; | ||
4283 : | # Generate the result. | ||
4284 : | my $retVal = rand($max - $min) + $min; | ||
4285 : | return $retVal; | ||
4286 : | parrello | 1.1 | } |
4287 : | |||
4288 : | =head3 ListGen | ||
4289 : | |||
4290 : | C<< my @list = ListGen($pattern, $count); >> | ||
4291 : | |||
4292 : | Return a list containing a fixed number of randomly-generated strings. | ||
4293 : | |||
4294 : | =over 4 | ||
4295 : | |||
4296 : | =item pattern | ||
4297 : | |||
4298 : | A pattern (in the form expected by L</StringGen>) that should be used to generate the | ||
4299 : | strings in the list. | ||
4300 : | |||
4301 : | =item count | ||
4302 : | |||
4303 : | The number of list entries to generate. | ||
4304 : | |||
4305 : | =item RETURN | ||
4306 : | |||
4307 : | Returns a list consisting of the specified number of strings. | ||
4308 : | |||
4309 : | =back | ||
4310 : | |||
4311 : | =cut | ||
4312 : | |||
4313 : | sub ListGen { | ||
4314 : | parrello | 1.10 | # Get the parameters. |
4315 : | my ($pattern, $count) = @_; | ||
4316 : | # Generate the list. | ||
4317 : | my @retVal = (); | ||
4318 : | for (my $i = 0; $i < $count; $i++) { | ||
4319 : | push @retVal, StringGen($pattern); | ||
4320 : | } | ||
4321 : | # Return it. | ||
4322 : | return @retVal; | ||
4323 : | parrello | 1.1 | } |
4324 : | |||
4325 : | overbeek | 1.11 | 1; |
MCS Webmaster | ViewVC Help |
Powered by ViewVC 1.0.3 |