Parent Directory
|
Revision Log
Revision 1.9 - (view) (download) (as text)
1 : | parrello | 1.1 | package ERDB; |
2 : | |||
3 : | use strict; | ||
4 : | use Tracer; | ||
5 : | use DBKernel; | ||
6 : | use Data::Dumper; | ||
7 : | use XML::Simple; | ||
8 : | use DBQuery; | ||
9 : | use DBObject; | ||
10 : | use Stats; | ||
11 : | use Time::HiRes qw(gettimeofday); | ||
12 : | |||
13 : | =head1 Entity-Relationship Database Package | ||
14 : | |||
15 : | =head2 Introduction | ||
16 : | |||
17 : | The Entity-Relationship Database Package allows the client to create an easily-configurable | ||
18 : | database of Entities connected by Relationships. Each entity is represented by one or more | ||
19 : | relations in an underlying SQL database. Each relationship is represented by a single | ||
20 : | relation that connects two entities. | ||
21 : | |||
22 : | Although this package is designed for general use, all examples are derived from the | ||
23 : | Sprout database, which is the first database implemented using this package. | ||
24 : | |||
25 : | Each entity has at least one relation, the I<primary relation>, that has the same name as | ||
26 : | the entity. The primary relation contains a field named C<id> that contains the unique | ||
27 : | identifier of each entity instance. An entity may have additional relations that contain | ||
28 : | fields which are optional or can occur more than once. For example, the B<FEATURE> entity | ||
29 : | has a B<feature-type> attribute that occurs exactly once for each feature. This attribute | ||
30 : | is implemented by a C<feature_type> column in the primary relation C<Feature>. In addition, | ||
31 : | however, a feature may have zero or more aliases. These are implemented using a C<FeatureAlias> | ||
32 : | relation that contains two fields-- the feature ID (C<id>) and the alias name (C<alias>). | ||
33 : | The B<FEATURE> entity also contains an optional virulence number. This is implemented | ||
34 : | as a separate relation C<FeatureVirulence> which contains an ID (C<id>) and a virulence number | ||
35 : | parrello | 1.8 | (C<virulence>). If the virulence of a feature I<ABC> is known to be 6, there will be one row in |
36 : | the C<FeatureVirulence> relation possessing the value I<ABC> as its ID and 6 as its virulence | ||
37 : | number. If the virulence of I<ABC> is not known, there will not be any rows for it in | ||
38 : | C<FeatureVirulence>. | ||
39 : | parrello | 1.1 | |
40 : | Entities are connected by binary relationships implemented using single relations possessing the | ||
41 : | same name as the relationship itself and that has an I<arity> of 1-to-1 (C<11>), 1-to-many (C<1M>), | ||
42 : | or many-to-many (C<MM>). Each relationship's relation contains a C<from-link> field that contains the | ||
43 : | ID of the source entity and a C<to-link> field that contains the ID of the target entity. The name | ||
44 : | of the relationship is generally a verb phrase with the source entity as the subject and the | ||
45 : | target entity as the object. So, for example, the B<ComesFrom> relationship connects the B<GENOME> | ||
46 : | and B<SOURCE> entities, and indicates that a particular source organization participated in the | ||
47 : | mapping of the genome. A source organization frequently participates in the mapping | ||
48 : | of many genomes, and many source organizations can cooperate in the mapping of a single genome, so | ||
49 : | this relationship has an arity of many-to-many (C<MM>). The relation that implements the B<ComesFrom> | ||
50 : | relationship is called C<ComesFrom> and contains two fields-- C<from-link>, which contains a genome ID, | ||
51 : | and C<to-link>, which contains a source ID. | ||
52 : | |||
53 : | A relationship may itself have attributes. These attributes, known as I<intersection data attributes>, | ||
54 : | are implemented as additional fields in the relationship's relation. So, for example, the | ||
55 : | B<IsMadeUpOf> relationship connects the B<Contig> entity to the B<Sequence> entity, and is used | ||
56 : | to determine which sequences make up a contig. The relationship has as an attribute the | ||
57 : | B<start-position>, which indicates where in the contig that the sequence begins. This attribute | ||
58 : | is implemented as the C<start_position> field in the C<IsMadeUpOf> relation. | ||
59 : | |||
60 : | The database itself is described by an XML file using the F<ERDatabase.xsd> schema. In addition to | ||
61 : | all the data required to define the entities, relationships, and attributes, the schema provides | ||
62 : | space for notes describing the data and what it means. These notes are used by L</ShowMetaData> | ||
63 : | to generate documentation for the database. | ||
64 : | |||
65 : | Finally, every entity and relationship object has a flag indicating if it is new or old. The object | ||
66 : | is considered I<old> if it was loaded by the L</LoadTables> method. It is considered I<new> if it | ||
67 : | was inserted by the L</InsertObject> method. | ||
68 : | |||
69 : | To facilitate testing, the ERDB module supports automatic generation of test data. This process | ||
70 : | parrello | 1.5 | is described in the L</GenerateEntity> and L</GenerateConnection> methods, though it is not yet |
71 : | fully implemented. | ||
72 : | parrello | 1.1 | |
73 : | parrello | 1.8 | =head2 XML Database Description |
74 : | |||
75 : | =head3 Data Types | ||
76 : | |||
77 : | The ERDB system supports the following data types. Note that there are numerous string | ||
78 : | types depending on the maximum length. Some database packages limit the total number of | ||
79 : | characters you have in an index key; to insure the database works in all environments, | ||
80 : | the type of string should be the shortest one possible that supports all the known values. | ||
81 : | |||
82 : | =over 4 | ||
83 : | |||
84 : | =item char | ||
85 : | |||
86 : | single ASCII character | ||
87 : | |||
88 : | =item int | ||
89 : | |||
90 : | 32-bit signed integer | ||
91 : | |||
92 : | =item date | ||
93 : | |||
94 : | 64-bit unsigned integer, representing a PERL date/time value | ||
95 : | |||
96 : | =item text | ||
97 : | |||
98 : | long string; Text fields cannot be used in indexes or sorting and do not support the | ||
99 : | normal syntax of filter clauses, but can be up to a billion character in length | ||
100 : | |||
101 : | =item float | ||
102 : | |||
103 : | double-precision floating-point number | ||
104 : | |||
105 : | =item boolean | ||
106 : | |||
107 : | single-bit numeric value; The value is stored as a 16-bit signed integer (for | ||
108 : | compatability with certain database packages), but the only values supported are | ||
109 : | 0 and 1. | ||
110 : | |||
111 : | =item key-string | ||
112 : | |||
113 : | variable-length string, maximum 40 characters | ||
114 : | |||
115 : | =item name-string | ||
116 : | |||
117 : | variable-length string, maximum 80 characters | ||
118 : | |||
119 : | =item medium-string | ||
120 : | |||
121 : | variable-length string, maximum 160 characters | ||
122 : | |||
123 : | =item string | ||
124 : | |||
125 : | variable-length string, maximum 255 characters | ||
126 : | |||
127 : | =back | ||
128 : | |||
129 : | =head3 Global Tags | ||
130 : | |||
131 : | The entire database definition must be inside a B<Database> tag. The display name of | ||
132 : | the database is given by the text associated with the B<Title> tag. The display name | ||
133 : | is only used in the automated documentation. It has no other effect. The entities and | ||
134 : | relationships are listed inside the B<Entities> and B<Relationships> tags, | ||
135 : | respectively. None of these tags have attributes. | ||
136 : | |||
137 : | <Database> | ||
138 : | <Title>... display title here...</Title> | ||
139 : | <Entities> | ||
140 : | ... entity definitions here ... | ||
141 : | </Entities> | ||
142 : | <Relationships> | ||
143 : | ... relationship definitions here... | ||
144 : | </Relationships> | ||
145 : | </Database> | ||
146 : | |||
147 : | Entities, relationships, indexes, and fields all allow a text tag called B<Notes>. | ||
148 : | The text inside the B<Notes> tag contains comments that will appear when the database | ||
149 : | documentation is generated. Within a B<Notes> tag, you may use C<[i]> and C<[/i]> for | ||
150 : | italics, C<[b]> and C<[/b]> for bold, and C<[p]> for a new paragraph. | ||
151 : | |||
152 : | =head3 Fields | ||
153 : | |||
154 : | Both entities and relationships have fields described by B<Field> tags. A B<Field> | ||
155 : | tag can have B<Notes> associated with it. The complete set of B<Field> tags for an | ||
156 : | object mus be inside B<Fields> tags. | ||
157 : | |||
158 : | <Entity ... > | ||
159 : | <Fields> | ||
160 : | ... Field tags ... | ||
161 : | </Fields> | ||
162 : | </Entity> | ||
163 : | |||
164 : | The attributes for the B<Field> tag are as follows. | ||
165 : | |||
166 : | =over 4 | ||
167 : | |||
168 : | =item name | ||
169 : | |||
170 : | Name of the field. The field name should contain only letters, digits, and hyphens (C<->), | ||
171 : | and the first character should be a letter. Most underlying databases are case-insensitive | ||
172 : | with the respect to field names, so a best practice is to use lower-case letters only. | ||
173 : | |||
174 : | =item type | ||
175 : | |||
176 : | Data type of the field. The legal data types are given above. | ||
177 : | |||
178 : | =item relation | ||
179 : | |||
180 : | Name of the relation containing the field. This should only be specified for entity | ||
181 : | fields. The ERDB system does not support optional fields or multi-occurring fields | ||
182 : | in the primary relation of an entity. Instead, they are put into secondary relations. | ||
183 : | So, for example, in the C<Genome> entity, the C<group-name> field indicates a special | ||
184 : | grouping used to select a subset of the genomes. A given genome may not be in any | ||
185 : | groups or may be in multiple groups. Therefore, C<group-name> specifies a relation | ||
186 : | value. The relation name specified must be a valid table name. By convention, it is | ||
187 : | usually the entity name followed by a qualifying word (e.g. C<GenomeGroup>). In an | ||
188 : | entity, the fields without a relation attribute are said to belong to the | ||
189 : | I<primary relation>. This relation has the same name as the entity itself. | ||
190 : | |||
191 : | =back | ||
192 : | |||
193 : | =head3 Indexes | ||
194 : | |||
195 : | An entity can have multiple alternate indexes associated with it. The fields must | ||
196 : | be from the primary relation. The alternate indexes assist in ordering results | ||
197 : | from a query. A relationship can have up to two indexes-- a I<to-index> and a | ||
198 : | I<from-index>. These order the results when crossing the relationship. For | ||
199 : | example, in the relationship C<HasContig> from C<Genome> to C<Contig>, the | ||
200 : | from-index would order the contigs of a ganome, and the to-index would order | ||
201 : | the genomes of a contig. A relationship's index must specify only fields in | ||
202 : | the relationship. | ||
203 : | |||
204 : | The indexes for an entity must be listed inside the B<Indexes> tag. The from-index | ||
205 : | of a relationship is specified using the B<FromIndex> tag; the to-index is specified | ||
206 : | using the B<ToIndex> tag. | ||
207 : | |||
208 : | Each index can contain a B<Notes> tag. In addition, it will have an B<IndexFields> | ||
209 : | tag containing the B<IndexField> tags. These specify, in order, the fields used in | ||
210 : | the index. The attributes of an B<IndexField> tag are as follows. | ||
211 : | |||
212 : | =over 4 | ||
213 : | |||
214 : | =item name | ||
215 : | |||
216 : | Name of the field. | ||
217 : | |||
218 : | =item order | ||
219 : | |||
220 : | Sort order of the field-- C<ascending> or C<descending>. | ||
221 : | |||
222 : | =back | ||
223 : | |||
224 : | The B<Index>, B<FromIndex>, and B<ToIndex> tags themselves have no attributes. | ||
225 : | |||
226 : | =head3 Object and Field Names | ||
227 : | |||
228 : | By convention entity and relationship names use capital casing (e.g. C<Genome> or | ||
229 : | C<HasRegionsIn>. Most underlying databases, however, are aggressively case-insensitive | ||
230 : | with respect to relation names, converting them internally to all-upper case or | ||
231 : | all-lower case. | ||
232 : | |||
233 : | If syntax or parsing errors occur when you try to load or use an ERDB database, the | ||
234 : | most likely reason is that one of your objects has an SQL reserved word as its name. | ||
235 : | The list of SQL reserved words keeps increasing; however, most are unlikely to show | ||
236 : | up as a noun or declarative verb phrase. The exceptions are C<Group>, C<User>, | ||
237 : | C<Table>, C<Index>, C<Object>, C<Date>, C<Number>, C<Update>, C<Time>, C<Percent>, | ||
238 : | C<Memo>, C<Order>, and C<Sum>. This problem can crop up in field names as well. | ||
239 : | |||
240 : | Every entity has a field called C<id> that acts as its primary key. Every relationship | ||
241 : | has fields called C<from-link> and C<to-link> that contain copies of the relevant | ||
242 : | entity IDs. These are essentially ERDB's reserved words, and should not be used | ||
243 : | for user-defined field names. | ||
244 : | |||
245 : | =head3 Entities | ||
246 : | |||
247 : | An entity is described by the B<Entity> tag. The entity can contain B<Notes>, an | ||
248 : | B<Indexes> tag containing one or more secondary indexes, and a B<Fields> tag | ||
249 : | containing one or more fields. The attributes of the B<Entity> tag are as follows. | ||
250 : | |||
251 : | =over 4 | ||
252 : | |||
253 : | =item name | ||
254 : | |||
255 : | Name of the entity. The entity name, by convention, uses capital casing (e.g. C<Genome> | ||
256 : | or C<GroupBlock>) and should be a noun or noun phrase. | ||
257 : | |||
258 : | =item keyType | ||
259 : | |||
260 : | Data type of the primary key. The primary key is always named C<id>. | ||
261 : | |||
262 : | =back | ||
263 : | |||
264 : | =head3 Relationships | ||
265 : | |||
266 : | A relationship is described by the C<Relationship> tag. Within a relationship, | ||
267 : | there can be a C<Notes> tag, a C<Fields> tag containing the intersection data | ||
268 : | fields, a C<FromIndex> tag containing the from-index, and a C<ToIndex> tag containing | ||
269 : | the to-index. | ||
270 : | |||
271 : | The C<Relationship> tag has the following attributes. | ||
272 : | |||
273 : | =over 4 | ||
274 : | |||
275 : | =item name | ||
276 : | |||
277 : | Name of the relationship. The relationship name, by convention, uses capital casing | ||
278 : | (e.g. C<ContainsRegionIn> or C<HasContig>), and should be a declarative verb | ||
279 : | phrase, designed to fit between the from-entity and the to-entity (e.g. | ||
280 : | Block C<ContainsRegionIn> Genome). | ||
281 : | |||
282 : | =item from | ||
283 : | |||
284 : | Name of the entity from which the relationship starts. | ||
285 : | |||
286 : | =item to | ||
287 : | |||
288 : | Name of the entity to which the relationship proceeds. | ||
289 : | |||
290 : | =item arity | ||
291 : | |||
292 : | Relationship type: C<1M> for one-to-many and C<MM> for many-to-many. | ||
293 : | |||
294 : | =back | ||
295 : | |||
296 : | parrello | 1.1 | =cut |
297 : | |||
298 : | # GLOBALS | ||
299 : | |||
300 : | # Table of information about our datatypes. "sqlType" is the corresponding SQL datatype string. | ||
301 : | # "maxLen" is the maximum permissible length of the incoming string data used to populate a field | ||
302 : | # of the specified type. "dataGen" is PERL string that will be evaluated if no test data generation | ||
303 : | #string is specified in the field definition. | ||
304 : | my %TypeTable = ( char => { sqlType => 'CHAR(1)', maxLen => 1, dataGen => "StringGen('A')" }, | ||
305 : | int => { sqlType => 'INTEGER', maxLen => 20, dataGen => "IntGen(0, 99999999)" }, | ||
306 : | string => { sqlType => 'VARCHAR(255)', maxLen => 255, dataGen => "StringGen(IntGen(10,250))" }, | ||
307 : | text => { sqlType => 'TEXT', maxLen => 1000000000, dataGen => "StringGen(IntGen(80,1000))" }, | ||
308 : | date => { sqlType => 'BIGINT', maxLen => 80, dataGen => "DateGen(-7, 7, IntGen(0,1400))" }, | ||
309 : | float => { sqlType => 'DOUBLE PRECISION', maxLen => 40, dataGen => "FloatGen(0.0, 100.0)" }, | ||
310 : | boolean => { sqlType => 'SMALLINT', maxLen => 1, dataGen => "IntGen(0, 1)" }, | ||
311 : | 'key-string' => | ||
312 : | { sqlType => 'VARCHAR(40)', maxLen => 40, dataGen => "StringGen(IntGen(10,40))" }, | ||
313 : | 'name-string' => | ||
314 : | { sqlType => 'VARCHAR(80)', maxLen => 80, dataGen => "StringGen(IntGen(10,80))" }, | ||
315 : | 'medium-string' => | ||
316 : | { sqlType => 'VARCHAR(160)', maxLen => 160, dataGen => "StringGen(IntGen(10,160))" }, | ||
317 : | ); | ||
318 : | |||
319 : | # Table translating arities into natural language. | ||
320 : | my %ArityTable = ( '11' => 'one-to-one', | ||
321 : | '1M' => 'one-to-many', | ||
322 : | 'MM' => 'many-to-many' | ||
323 : | ); | ||
324 : | |||
325 : | # Table for interpreting string patterns. | ||
326 : | |||
327 : | my %PictureTable = ( 'A' => "abcdefghijklmnopqrstuvwxyz", | ||
328 : | '9' => "0123456789", | ||
329 : | 'X' => "abcdefghijklmnopqrstuvwxyz0123456789", | ||
330 : | 'V' => "aeiou", | ||
331 : | 'K' => "bcdfghjklmnoprstvwxyz" | ||
332 : | ); | ||
333 : | |||
334 : | =head2 Public Methods | ||
335 : | |||
336 : | =head3 new | ||
337 : | |||
338 : | parrello | 1.5 | C<< my $database = ERDB->new($dbh, $metaFileName); >> |
339 : | parrello | 1.1 | |
340 : | Create a new ERDB object. | ||
341 : | |||
342 : | =over 4 | ||
343 : | |||
344 : | =item dbh | ||
345 : | |||
346 : | DBKernel database object for the target database. | ||
347 : | |||
348 : | =item metaFileName | ||
349 : | |||
350 : | Name of the XML file containing the metadata. | ||
351 : | |||
352 : | =back | ||
353 : | |||
354 : | =cut | ||
355 : | |||
356 : | sub new { | ||
357 : | # Get the parameters. | ||
358 : | my ($class, $dbh, $metaFileName, $options) = @_; | ||
359 : | # Load the meta-data. | ||
360 : | my $metaData = _LoadMetaData($metaFileName); | ||
361 : | # Create the object. | ||
362 : | my $self = { _dbh => $dbh, | ||
363 : | parrello | 1.5 | _metaData => $metaData |
364 : | parrello | 1.1 | }; |
365 : | # Bless and return it. | ||
366 : | parrello | 1.6 | bless $self, $class; |
367 : | parrello | 1.1 | return $self; |
368 : | } | ||
369 : | |||
370 : | =head3 ShowMetaData | ||
371 : | |||
372 : | C<< $database->ShowMetaData($fileName); >> | ||
373 : | |||
374 : | This method outputs a description of the database. This description can be used to help users create | ||
375 : | the data to be loaded into the relations. | ||
376 : | |||
377 : | =over 4 | ||
378 : | |||
379 : | =item filename | ||
380 : | |||
381 : | The name of the output file. | ||
382 : | |||
383 : | =back | ||
384 : | |||
385 : | =cut | ||
386 : | |||
387 : | sub ShowMetaData { | ||
388 : | # Get the parameters. | ||
389 : | parrello | 1.4 | my ($self, $filename) = @_; |
390 : | parrello | 1.1 | # Get the metadata and the title string. |
391 : | my $metadata = $self->{_metaData}; | ||
392 : | # Get the title string. | ||
393 : | my $title = $metadata->{Title}; | ||
394 : | # Get the entity and relationship lists. | ||
395 : | my $entityList = $metadata->{Entities}; | ||
396 : | my $relationshipList = $metadata->{Relationships}; | ||
397 : | # Open the output file. | ||
398 : | open(HTMLOUT, ">$filename") || Confess("Could not open MetaData display file $filename: $!"); | ||
399 : | parrello | 1.5 | Trace("Building MetaData table of contents.") if T(4); |
400 : | parrello | 1.1 | # Write the HTML heading stuff. |
401 : | print HTMLOUT "<html>\n<head>\n<title>$title</title>\n"; | ||
402 : | print HTMLOUT "</head>\n<body>\n"; | ||
403 : | # Here we do the table of contents. It starts as an unordered list of section names. Each | ||
404 : | # section contains an ordered list of entity or relationship subsections. | ||
405 : | print HTMLOUT "<ul>\n<li><a href=\"#EntitiesSection\">Entities</a>\n<ol>\n"; | ||
406 : | # Loop through the Entities, displaying a list item for each. | ||
407 : | foreach my $key (sort keys %{$entityList}) { | ||
408 : | # Display this item. | ||
409 : | print HTMLOUT "<li><a href=\"#$key\">$key</a></li>\n"; | ||
410 : | } | ||
411 : | # Close off the entity section and start the relationship section. | ||
412 : | print HTMLOUT "</ol></li>\n<li><a href=\"#RelationshipsSection\">Relationships</a>\n<ol>\n"; | ||
413 : | # Loop through the Relationships. | ||
414 : | foreach my $key (sort keys %{$relationshipList}) { | ||
415 : | # Display this item. | ||
416 : | my $relationshipTitle = _ComputeRelationshipSentence($key, $relationshipList->{$key}); | ||
417 : | print HTMLOUT "<li><a href=\"#$key\">$relationshipTitle</a></li>\n"; | ||
418 : | } | ||
419 : | # Close off the relationship section and list the join table section. | ||
420 : | print HTMLOUT "</ol></li>\n<li><a href=\"#JoinTable\">Join Table</a></li>\n"; | ||
421 : | # Close off the table of contents itself. | ||
422 : | print HTMLOUT "</ul>\n"; | ||
423 : | # Now we start with the actual data. Denote we're starting the entity section. | ||
424 : | print HTMLOUT "<a name=\"EntitiesSection\"></a><h2>Entities</h2>\n"; | ||
425 : | # Loop through the entities. | ||
426 : | for my $key (sort keys %{$entityList}) { | ||
427 : | parrello | 1.5 | Trace("Building MetaData entry for $key entity.") if T(4); |
428 : | parrello | 1.1 | # Create the entity header. It contains a bookmark and the entity name. |
429 : | print HTMLOUT "<a name=\"$key\"></a><h3>$key</h3>\n"; | ||
430 : | # Get the entity data. | ||
431 : | my $entityData = $entityList->{$key}; | ||
432 : | # If there's descriptive text, display it. | ||
433 : | if (my $notes = $entityData->{Notes}) { | ||
434 : | print HTMLOUT "<p>" . _HTMLNote($notes->{content}) . "</p>\n"; | ||
435 : | } | ||
436 : | # Now we want a list of the entity's relationships. First, we set up the relationship subsection. | ||
437 : | print HTMLOUT "<h4>Relationships for <b>$key</b></h4>\n<ul>\n"; | ||
438 : | # Loop through the relationships. | ||
439 : | for my $relationship (sort keys %{$relationshipList}) { | ||
440 : | # Get the relationship data. | ||
441 : | my $relationshipStructure = $relationshipList->{$relationship}; | ||
442 : | # Only use the relationship if if has this entity in its FROM or TO fields. | ||
443 : | if ($relationshipStructure->{from} eq $key || $relationshipStructure->{to} eq $key) { | ||
444 : | # Get the relationship sentence and append the arity. | ||
445 : | my $relationshipDescription = _ComputeRelationshipSentence($relationship, $relationshipStructure); | ||
446 : | # Display the relationship data. | ||
447 : | print HTMLOUT "<li><a href=\"#$relationship\">$relationshipDescription</a></li>\n"; | ||
448 : | } | ||
449 : | } | ||
450 : | # Close off the relationship list. | ||
451 : | print HTMLOUT "</ul>\n"; | ||
452 : | # Get the entity's relations. | ||
453 : | my $relationList = $entityData->{Relations}; | ||
454 : | # Create a header for the relation subsection. | ||
455 : | print HTMLOUT "<h4>Relations for <b>$key</b></h4>\n"; | ||
456 : | # Loop through the relations, displaying them. | ||
457 : | for my $relation (sort keys %{$relationList}) { | ||
458 : | my $htmlString = _ShowRelationTable($relation, $relationList->{$relation}); | ||
459 : | print HTMLOUT $htmlString; | ||
460 : | } | ||
461 : | } | ||
462 : | # Denote we're starting the relationship section. | ||
463 : | print HTMLOUT "<a name=\"RelationshipsSection\"></a><h2>Relationships</h2>\n"; | ||
464 : | # Loop through the relationships. | ||
465 : | for my $key (sort keys %{$relationshipList}) { | ||
466 : | parrello | 1.5 | Trace("Building MetaData entry for $key relationship.") if T(4); |
467 : | parrello | 1.1 | # Get the relationship's structure. |
468 : | my $relationshipStructure = $relationshipList->{$key}; | ||
469 : | # Create the relationship header. | ||
470 : | my $headerText = _ComputeRelationshipHeading($key, $relationshipStructure); | ||
471 : | print HTMLOUT "<h3><a name=\"$key\"></a>$headerText</h3>\n"; | ||
472 : | # Get the entity names. | ||
473 : | my $fromEntity = $relationshipStructure->{from}; | ||
474 : | my $toEntity = $relationshipStructure->{to}; | ||
475 : | # Describe the relationship arity. Note there's a bit of trickiness involving recursive | ||
476 : | # many-to-many relationships. In a normal many-to-many we use two sentences to describe | ||
477 : | # the arity (one for each direction). This is a bad idea for a recursive relationship, | ||
478 : | # since both sentences will say the same thing. | ||
479 : | my $arity = $relationshipStructure->{arity}; | ||
480 : | if ($arity eq "11") { | ||
481 : | print HTMLOUT "<p>Each <b>$fromEntity</b> relates to at most one <b>$toEntity</b>.\n"; | ||
482 : | } else { | ||
483 : | print HTMLOUT "<p>Each <b>$fromEntity</b> relates to multiple <b>$toEntity</b>s.\n"; | ||
484 : | if ($arity eq "MM" && $fromEntity ne $toEntity) { | ||
485 : | print HTMLOUT "Each <b>$toEntity</b> relates to multiple <b>$fromEntity</b>s.\n"; | ||
486 : | } | ||
487 : | } | ||
488 : | print HTMLOUT "</p>\n"; | ||
489 : | # If there are notes on this relationship, display them. | ||
490 : | if (my $notes = $relationshipStructure->{Notes}) { | ||
491 : | print HTMLOUT "<p>" . _HTMLNote($notes->{content}) . "</p>\n"; | ||
492 : | } | ||
493 : | # Generate the relationship's relation table. | ||
494 : | my $htmlString = _ShowRelationTable($key, $relationshipStructure->{Relations}->{$key}); | ||
495 : | print HTMLOUT $htmlString; | ||
496 : | } | ||
497 : | parrello | 1.5 | Trace("Building MetaData join table.") if T(4); |
498 : | parrello | 1.1 | # Denote we're starting the join table. |
499 : | print HTMLOUT "<a name=\"JoinTable\"></a><h3>Join Table</h3>\n"; | ||
500 : | # Create a table header. | ||
501 : | print HTMLOUT _OpenTable("Join Table", "Source", "Target", "Join Condition"); | ||
502 : | # Loop through the joins. | ||
503 : | my $joinTable = $metadata->{Joins}; | ||
504 : | parrello | 1.6 | my @joinKeys = keys %{$joinTable}; |
505 : | for my $joinKey (sort @joinKeys) { | ||
506 : | parrello | 1.1 | # Separate out the source, the target, and the join clause. |
507 : | parrello | 1.6 | $joinKey =~ m!^([^/]+)/(.+)$!; |
508 : | my ($sourceRelation, $targetRelation) = ($1, $2); | ||
509 : | Trace("Join with key $joinKey is from $sourceRelation to $targetRelation.") if T(4); | ||
510 : | my $source = $self->ComputeObjectSentence($sourceRelation); | ||
511 : | my $target = $self->ComputeObjectSentence($targetRelation); | ||
512 : | my $clause = $joinTable->{$joinKey}; | ||
513 : | parrello | 1.1 | # Display them in a table row. |
514 : | print HTMLOUT "<tr><td>$source</td><td>$target</td><td>$clause</td></tr>\n"; | ||
515 : | } | ||
516 : | # Close the table. | ||
517 : | print HTMLOUT _CloseTable(); | ||
518 : | # Close the document. | ||
519 : | print HTMLOUT "</body>\n</html>\n"; | ||
520 : | # Close the file. | ||
521 : | close HTMLOUT; | ||
522 : | parrello | 1.5 | Trace("Built MetaData web page.") if T(3); |
523 : | parrello | 1.1 | } |
524 : | |||
525 : | =head3 DumpMetaData | ||
526 : | |||
527 : | C<< $database->DumpMetaData(); >> | ||
528 : | |||
529 : | Return a dump of the metadata structure. | ||
530 : | |||
531 : | =cut | ||
532 : | |||
533 : | sub DumpMetaData { | ||
534 : | # Get the parameters. | ||
535 : | parrello | 1.4 | my ($self) = @_; |
536 : | parrello | 1.1 | # Dump the meta-data. |
537 : | return Data::Dumper::Dumper($self->{_metaData}); | ||
538 : | } | ||
539 : | |||
540 : | =head3 CreateTables | ||
541 : | |||
542 : | C<< $datanase->CreateTables(); >> | ||
543 : | |||
544 : | This method creates the tables for the database from the metadata structure loaded by the | ||
545 : | constructor. It is expected this function will only be used on rare occasions, when the | ||
546 : | parrello | 1.2 | user needs to start with an empty database. Otherwise, the L</LoadTables> method can be |
547 : | parrello | 1.1 | used by itself with the truncate flag turned on. |
548 : | |||
549 : | =cut | ||
550 : | |||
551 : | sub CreateTables { | ||
552 : | # Get the parameters. | ||
553 : | parrello | 1.4 | my ($self) = @_; |
554 : | parrello | 1.1 | my $metadata = $self->{_metaData}; |
555 : | my $dbh = $self->{_dbh}; | ||
556 : | # Loop through the entities. | ||
557 : | parrello | 1.6 | my $entityHash = $metadata->{Entities}; |
558 : | for my $entityName (keys %{$entityHash}) { | ||
559 : | my $entityData = $entityHash->{$entityName}; | ||
560 : | parrello | 1.1 | # Tell the user what we're doing. |
561 : | Trace("Creating relations for entity $entityName.") if T(1); | ||
562 : | # Loop through the entity's relations. | ||
563 : | for my $relationName (keys %{$entityData->{Relations}}) { | ||
564 : | # Create a table for this relation. | ||
565 : | $self->CreateTable($relationName); | ||
566 : | Trace("Relation $relationName created.") if T(1); | ||
567 : | } | ||
568 : | } | ||
569 : | # Loop through the relationships. | ||
570 : | my $relationshipTable = $metadata->{Relationships}; | ||
571 : | for my $relationshipName (keys %{$metadata->{Relationships}}) { | ||
572 : | # Create a table for this relationship. | ||
573 : | Trace("Creating relationship $relationshipName.") if T(1); | ||
574 : | $self->CreateTable($relationshipName); | ||
575 : | } | ||
576 : | } | ||
577 : | |||
578 : | =head3 CreateTable | ||
579 : | |||
580 : | C<< $database->CreateTable($tableName, $indexFlag); >> | ||
581 : | |||
582 : | Create the table for a relation and optionally create its indexes. | ||
583 : | |||
584 : | =over 4 | ||
585 : | |||
586 : | =item relationName | ||
587 : | |||
588 : | Name of the relation (which will also be the table name). | ||
589 : | |||
590 : | =item $indexFlag | ||
591 : | |||
592 : | TRUE if the indexes for the relation should be created, else FALSE. If FALSE, | ||
593 : | L</CreateIndexes> must be called later to bring the indexes into existence. | ||
594 : | |||
595 : | =back | ||
596 : | |||
597 : | =cut | ||
598 : | |||
599 : | sub CreateTable { | ||
600 : | # Get the parameters. | ||
601 : | parrello | 1.4 | my ($self, $relationName, $indexFlag) = @_; |
602 : | parrello | 1.1 | # Get the database handle. |
603 : | my $dbh = $self->{_dbh}; | ||
604 : | # Get the relation data and determine whether or not the relation is primary. | ||
605 : | my $relationData = $self->_FindRelation($relationName); | ||
606 : | my $rootFlag = $self->_IsPrimary($relationName); | ||
607 : | # Create a list of the field data. | ||
608 : | my @fieldList; | ||
609 : | for my $fieldData (@{$relationData->{Fields}}) { | ||
610 : | # Assemble the field name and type. | ||
611 : | my $fieldName = _FixName($fieldData->{name}); | ||
612 : | my $fieldString = "$fieldName $TypeTable{$fieldData->{type}}->{sqlType} NOT NULL "; | ||
613 : | # Push the result into the field list. | ||
614 : | push @fieldList, $fieldString; | ||
615 : | } | ||
616 : | # If this is a root table, add the "new_record" flag. It defaults to 0, so | ||
617 : | if ($rootFlag) { | ||
618 : | push @fieldList, "new_record $TypeTable{boolean}->{sqlType} NOT NULL DEFAULT 0"; | ||
619 : | } | ||
620 : | # Convert the field list into a comma-delimited string. | ||
621 : | my $fieldThing = join(', ', @fieldList); | ||
622 : | # Insure the table is not already there. | ||
623 : | $dbh->drop_table(tbl => $relationName); | ||
624 : | Trace("Table $relationName dropped.") if T(2); | ||
625 : | # Create the table. | ||
626 : | Trace("Creating table $relationName: $fieldThing") if T(2); | ||
627 : | $dbh->create_table(tbl => $relationName, flds => $fieldThing); | ||
628 : | Trace("Relation $relationName created in database.") if T(2); | ||
629 : | # If we want to build the indexes, we do it here. | ||
630 : | if ($indexFlag) { | ||
631 : | $self->CreateIndex($relationName); | ||
632 : | } | ||
633 : | } | ||
634 : | |||
635 : | =head3 CreateIndex | ||
636 : | |||
637 : | C<< $database->CreateIndex($relationName); >> | ||
638 : | |||
639 : | Create the indexes for a relation. If a table is being loaded from a large source file (as | ||
640 : | is the case in L</LoadTable>), it is best to create the indexes after the load. If that is | ||
641 : | the case, then L</CreateTable> should be called with the index flag set to FALSE, and this | ||
642 : | method used after the load to create the indexes for the table. | ||
643 : | |||
644 : | =cut | ||
645 : | |||
646 : | sub CreateIndex { | ||
647 : | # Get the parameters. | ||
648 : | parrello | 1.4 | my ($self, $relationName) = @_; |
649 : | parrello | 1.1 | # Get the relation's descriptor. |
650 : | parrello | 1.2 | my $relationData = $self->_FindRelation($relationName); |
651 : | parrello | 1.1 | # Get the database handle. |
652 : | my $dbh = $self->{_dbh}; | ||
653 : | # Now we need to create this relation's indexes. We do this by looping through its index table. | ||
654 : | parrello | 1.6 | my $indexHash = $relationData->{Indexes}; |
655 : | for my $indexName (keys %{$indexHash}) { | ||
656 : | my $indexData = $indexHash->{$indexName}; | ||
657 : | parrello | 1.1 | # Get the index's field list. |
658 : | my @fieldList = _FixNames(@{$indexData->{IndexFields}}); | ||
659 : | my $flds = join(', ', @fieldList); | ||
660 : | # Get the index's uniqueness flag. | ||
661 : | my $unique = (exists $indexData->{Unique} ? $indexData->{Unique} : 'false'); | ||
662 : | # Create the index. | ||
663 : | $dbh->create_index(idx => $indexName, tbl => $relationName, flds => $flds, unique => $unique); | ||
664 : | Trace("Index created: $indexName for $relationName ($flds)") if T(1); | ||
665 : | } | ||
666 : | } | ||
667 : | |||
668 : | =head3 LoadTables | ||
669 : | |||
670 : | C<< my $stats = $database->LoadTables($directoryName, $rebuild); >> | ||
671 : | |||
672 : | This method will load the database tables from a directory. The tables must already have been created | ||
673 : | in the database. (This can be done by calling L</CreateTables>.) The caller passes in a directory name; | ||
674 : | all of the relations to be loaded must have a file in the directory with the same name as the relation | ||
675 : | (optionally with a suffix of C<.dtx>). Each file must be a tab-delimited table of field values. Each | ||
676 : | line of the file will be loaded as a row of the target relation table. The field values should be in | ||
677 : | the same order as the fields in the relation tables generated by L</ShowMetaData>. The old data is | ||
678 : | erased before the new data is loaded in. | ||
679 : | |||
680 : | A certain amount of translation automatically takes place. Ctrl-M characters are deleted, and | ||
681 : | tab and new-line characters inside a field are escaped as C<\t> and C<\n>, respectively. Dates must | ||
682 : | be entered as a Unix timestamp, that is, as an integer number of seconds since the base epoch. | ||
683 : | |||
684 : | =over 4 | ||
685 : | |||
686 : | =item directoryName | ||
687 : | |||
688 : | Name of the directory containing the relation files to be loaded. | ||
689 : | |||
690 : | =item rebuild | ||
691 : | |||
692 : | TRUE if the tables should be dropped and rebuilt, else FALSE. This is, unfortunately, the | ||
693 : | only way to erase existing data in the tables, since the TRUNCATE command is not supported | ||
694 : | by all of the DB engines we use. | ||
695 : | |||
696 : | =item RETURN | ||
697 : | |||
698 : | Returns a statistical object describing the number of records read and a list of the error messages. | ||
699 : | |||
700 : | =back | ||
701 : | |||
702 : | =cut | ||
703 : | |||
704 : | sub LoadTables { | ||
705 : | # Get the parameters. | ||
706 : | parrello | 1.4 | my ($self, $directoryName, $rebuild) = @_; |
707 : | parrello | 1.1 | # Start the timer. |
708 : | my $startTime = gettimeofday; | ||
709 : | # Clean any trailing slash from the directory name. | ||
710 : | $directoryName =~ s!/\\$!!; | ||
711 : | # Declare the return variable. | ||
712 : | my $retVal = Stats->new(); | ||
713 : | # Get the metadata structure. | ||
714 : | my $metaData = $self->{_metaData}; | ||
715 : | # Loop through the entities. | ||
716 : | for my $entity (values %{$metaData->{Entities}}) { | ||
717 : | # Loop through the entity's relations. | ||
718 : | for my $relationName (keys %{$entity->{Relations}}) { | ||
719 : | # Try to load this relation. | ||
720 : | my $result = $self->_LoadRelation($directoryName, $relationName, $rebuild); | ||
721 : | # Accumulate the statistics. | ||
722 : | $retVal->Accumulate($result); | ||
723 : | } | ||
724 : | } | ||
725 : | # Loop through the relationships. | ||
726 : | for my $relationshipName (keys %{$metaData->{Relationships}}) { | ||
727 : | # Try to load this relationship's relation. | ||
728 : | my $result = $self->_LoadRelation($directoryName, $relationshipName, $rebuild); | ||
729 : | # Accumulate the statistics. | ||
730 : | $retVal->Accumulate($result); | ||
731 : | } | ||
732 : | # Add the duration of the load to the statistical object. | ||
733 : | $retVal->Add('duration', gettimeofday - $startTime); | ||
734 : | # Return the accumulated statistics. | ||
735 : | return $retVal; | ||
736 : | } | ||
737 : | |||
738 : | =head3 GetTableNames | ||
739 : | |||
740 : | C<< my @names = $database->GetTableNames; >> | ||
741 : | |||
742 : | Return a list of the relations required to implement this database. | ||
743 : | |||
744 : | =cut | ||
745 : | |||
746 : | sub GetTableNames { | ||
747 : | # Get the parameters. | ||
748 : | parrello | 1.4 | my ($self) = @_; |
749 : | parrello | 1.1 | # Get the relation list from the metadata. |
750 : | my $relationTable = $self->{_metaData}->{RelationTable}; | ||
751 : | # Return the relation names. | ||
752 : | return keys %{$relationTable}; | ||
753 : | } | ||
754 : | |||
755 : | =head3 GetEntityTypes | ||
756 : | |||
757 : | C<< my @names = $database->GetEntityTypes; >> | ||
758 : | |||
759 : | Return a list of the entity type names. | ||
760 : | |||
761 : | =cut | ||
762 : | |||
763 : | sub GetEntityTypes { | ||
764 : | # Get the database object. | ||
765 : | parrello | 1.4 | my ($self) = @_; |
766 : | parrello | 1.1 | # Get the entity list from the metadata object. |
767 : | my $entityList = $self->{_metaData}->{Entities}; | ||
768 : | # Return the list of entity names in alphabetical order. | ||
769 : | return sort keys %{$entityList}; | ||
770 : | } | ||
771 : | |||
772 : | =head3 Get | ||
773 : | |||
774 : | C<< my $query = $database->Get(\@objectNames, $filterClause, $param1, $param2, ..., $paramN); >> | ||
775 : | |||
776 : | This method returns a query object for entities of a specified type using a specified filter. | ||
777 : | The filter is a standard WHERE/ORDER BY clause with question marks as parameter markers and each | ||
778 : | field name represented in the form B<I<objectName>(I<fieldName>)>. For example, the | ||
779 : | following call requests all B<Genome> objects for the genus specified in the variable | ||
780 : | $genus. | ||
781 : | |||
782 : | C<< $query = $sprout->Get(['Genome'], "Genome(genus) = ?", $genus); >> | ||
783 : | |||
784 : | The WHERE clause contains a single question mark, so there is a single additional | ||
785 : | parameter representing the parameter value. It would also be possible to code | ||
786 : | |||
787 : | C<< $query = $sprout->Get(['Genome'], "Genome(genus) = \'$genus\'"); >> | ||
788 : | |||
789 : | however, this version of the call would generate a syntax error if there were any quote | ||
790 : | characters inside the variable C<$genus>. | ||
791 : | |||
792 : | The use of the strange parenthesized notation for field names enables us to distinguish | ||
793 : | hyphens contained within field names from minus signs that participate in the computation | ||
794 : | of the WHERE clause. All of the methods that manipulate fields will use this same notation. | ||
795 : | |||
796 : | It is possible to specify multiple entity and relationship names in order to retrieve more than | ||
797 : | one object's data at the same time, which allows highly complex joined queries. For example, | ||
798 : | |||
799 : | C<< $query = $sprout->Get(['Genome', 'ComesFrom', 'Source'], "Genome(genus) = ?", $genus); >> | ||
800 : | |||
801 : | If multiple names are specified, then the query processor will automatically determine a | ||
802 : | join path between the entities and relationships. The algorithm used is very simplistic. | ||
803 : | In particular, you can't specify any entity or relationship more than once, and if a | ||
804 : | relationship is recursive, the path is determined by the order in which the entity | ||
805 : | and the relationship appear. For example, consider a recursive relationship B<IsParentOf> | ||
806 : | which relates B<People> objects to other B<People> objects. If the join path is | ||
807 : | coded as C<['People', 'IsParentOf']>, then the people returned will be parents. If, however, | ||
808 : | the join path is C<['IsParentOf', 'People']>, then the people returned will be children. | ||
809 : | |||
810 : | =over 4 | ||
811 : | |||
812 : | =item objectNames | ||
813 : | |||
814 : | List containing the names of the entity and relationship objects to be retrieved. | ||
815 : | |||
816 : | =item filterClause | ||
817 : | |||
818 : | WHERE clause (without the WHERE) to be used to filter and sort the query. The WHERE clause can | ||
819 : | be parameterized with parameter markers (C<?>). Each field used in the WHERE clause must be | ||
820 : | specified in the standard form B<I<objectName>(I<fieldName>)>. Any parameters specified | ||
821 : | in the filter clause should be added to the parameter list as additional parameters. The | ||
822 : | fields in a filter clause can come from primary entity relations, relationship relations, | ||
823 : | or secondary entity relations; however, all of the entities and relationships involved must | ||
824 : | be included in the list of object names. | ||
825 : | |||
826 : | The filter clause can also specify a sort order. To do this, simply follow the filter string | ||
827 : | with an ORDER BY clause. For example, the following filter string gets all genomes for a | ||
828 : | particular genus and sorts them by species name. | ||
829 : | |||
830 : | C<< "Genome(genus) = ? ORDER BY Genome(species)" >> | ||
831 : | |||
832 : | The rules for field references in a sort order are the same as those for field references in the | ||
833 : | filter clause in general; however, odd things may happen if a sort field is from a secondary | ||
834 : | relation. | ||
835 : | |||
836 : | =item param1, param2, ..., paramN | ||
837 : | |||
838 : | Parameter values to be substituted into the filter clause. | ||
839 : | |||
840 : | =item RETURN | ||
841 : | |||
842 : | Returns a B<DBQuery> that can be used to iterate through all of the results. | ||
843 : | |||
844 : | =back | ||
845 : | |||
846 : | =cut | ||
847 : | |||
848 : | sub Get { | ||
849 : | # Get the parameters. | ||
850 : | parrello | 1.4 | my ($self, $objectNames, $filterClause, @params) = @_; |
851 : | parrello | 1.1 | # Construct the SELECT statement. The general pattern is |
852 : | # | ||
853 : | # SELECT name1.*, name2.*, ... nameN.* FROM name1, name2, ... nameN | ||
854 : | # | ||
855 : | my $dbh = $self->{_dbh}; | ||
856 : | my $command = "SELECT DISTINCT " . join('.*, ', @{$objectNames}) . ".* FROM " . | ||
857 : | join(', ', @{$objectNames}); | ||
858 : | # Check for a filter clause. | ||
859 : | if ($filterClause) { | ||
860 : | # Here we have one, so we convert its field names and add it to the query. First, | ||
861 : | # We create a copy of the filter string we can work with. | ||
862 : | my $filterString = $filterClause; | ||
863 : | # Next, we sort the object names by length. This helps protect us from finding | ||
864 : | # object names inside other object names when we're doing our search and replace. | ||
865 : | my @sortedNames = sort { length($b) - length($a) } @{$objectNames}; | ||
866 : | # We will also keep a list of conditions to add to the WHERE clause in order to link | ||
867 : | # entities and relationships as well as primary relations to secondary ones. | ||
868 : | my @joinWhere = (); | ||
869 : | # The final preparatory step is to create a hash table of relation names. The | ||
870 : | # table begins with the relation names already in the SELECT command. | ||
871 : | my %fromNames = (); | ||
872 : | for my $objectName (@sortedNames) { | ||
873 : | $fromNames{$objectName} = 1; | ||
874 : | } | ||
875 : | # We are ready to begin. We loop through the object names, replacing each | ||
876 : | # object name's field references by the corresponding SQL field reference. | ||
877 : | # Along the way, if we find a secondary relation, we will need to add it | ||
878 : | # to the FROM clause. | ||
879 : | for my $objectName (@sortedNames) { | ||
880 : | # Get the length of the object name plus 2. This is the value we add to the | ||
881 : | # size of the field name to determine the size of the field reference as a | ||
882 : | # whole. | ||
883 : | my $nameLength = 2 + length $objectName; | ||
884 : | # Get the object's field list. | ||
885 : | my $fieldList = $self->_GetFieldTable($objectName); | ||
886 : | # Find the field references for this object. | ||
887 : | while ($filterString =~ m/$objectName\(([^)]*)\)/g) { | ||
888 : | # At this point, $1 contains the field name, and the current position | ||
889 : | # is set immediately after the final parenthesis. We pull out the name of | ||
890 : | # the field and the position and length of the field reference as a whole. | ||
891 : | my $fieldName = $1; | ||
892 : | my $len = $nameLength + length $fieldName; | ||
893 : | my $pos = pos($filterString) - $len; | ||
894 : | # Insure the field exists. | ||
895 : | if (!exists $fieldList->{$fieldName}) { | ||
896 : | Confess("Field $fieldName not found for object $objectName."); | ||
897 : | } else { | ||
898 : | # Get the field's relation. | ||
899 : | my $relationName = $fieldList->{$fieldName}->{relation}; | ||
900 : | # Insure the relation is in the FROM clause. | ||
901 : | if (!exists $fromNames{$relationName}) { | ||
902 : | # Add the relation to the FROM clause. | ||
903 : | $command .= ", $relationName"; | ||
904 : | # Create its join sub-clause. | ||
905 : | push @joinWhere, "$objectName.id = $relationName.id"; | ||
906 : | # Denote we have it available for future fields. | ||
907 : | $fromNames{$relationName} = 1; | ||
908 : | } | ||
909 : | # Form an SQL field reference from the relation name and the field name. | ||
910 : | my $sqlReference = "$relationName." . _FixName($fieldName); | ||
911 : | # Put it into the filter string in place of the old value. | ||
912 : | substr($filterString, $pos, $len) = $sqlReference; | ||
913 : | # Reposition the search. | ||
914 : | pos $filterString = $pos + length $sqlReference; | ||
915 : | } | ||
916 : | } | ||
917 : | } | ||
918 : | # The next step is to join the objects together. We only need to do this if there | ||
919 : | # is more than one object in the object list. We start with the first object and | ||
920 : | # run through the objects after it. Note also that we make a safety copy of the | ||
921 : | # list before running through it. | ||
922 : | my @objectList = @{$objectNames}; | ||
923 : | my $lastObject = shift @objectList; | ||
924 : | # Get the join table. | ||
925 : | my $joinTable = $self->{_metaData}->{Joins}; | ||
926 : | # Loop through the object list. | ||
927 : | for my $thisObject (@objectList) { | ||
928 : | # Look for a join. | ||
929 : | my $joinKey = "$lastObject/$thisObject"; | ||
930 : | if (!exists $joinTable->{$joinKey}) { | ||
931 : | # Here there's no join, so we throw an error. | ||
932 : | Confess("No join exists to connect from $lastObject to $thisObject."); | ||
933 : | } else { | ||
934 : | # Get the join clause and add it to the WHERE list. | ||
935 : | push @joinWhere, $joinTable->{$joinKey}; | ||
936 : | # Save this object as the last object for the next iteration. | ||
937 : | $lastObject = $thisObject; | ||
938 : | } | ||
939 : | } | ||
940 : | # Now we need to handle the whole ORDER BY thing. We'll put the order by clause | ||
941 : | # in the following variable. | ||
942 : | my $orderClause = ""; | ||
943 : | # Locate the ORDER BY verb (if any). | ||
944 : | if ($filterString =~ m/^(.*)ORDER BY/g) { | ||
945 : | # Here we have an ORDER BY verb. Split it off of the filter string. | ||
946 : | my $pos = pos $filterString; | ||
947 : | $orderClause = substr($filterString, $pos); | ||
948 : | $filterString = $1; | ||
949 : | } | ||
950 : | # Add the filter and the join clauses (if any) to the SELECT command. | ||
951 : | if ($filterString) { | ||
952 : | push @joinWhere, "($filterString)"; | ||
953 : | } | ||
954 : | if (@joinWhere) { | ||
955 : | $command .= " WHERE " . join(' AND ', @joinWhere); | ||
956 : | } | ||
957 : | # Add the sort clause (if any) to the SELECT command. | ||
958 : | if ($orderClause) { | ||
959 : | $command .= " ORDER BY $orderClause"; | ||
960 : | } | ||
961 : | } | ||
962 : | Trace("SQL query: $command") if T(2); | ||
963 : | Trace("PARMS: '" . (join "', '", @params) . "'") if (T(3) && (@params > 0)); | ||
964 : | my $sth = $dbh->prepare_command($command); | ||
965 : | # Execute it with the parameters bound in. | ||
966 : | $sth->execute(@params) || Confess("SELECT error" . $sth->errstr()); | ||
967 : | # Return the statement object. | ||
968 : | my $retVal = DBQuery::_new($self, $sth, @{$objectNames}); | ||
969 : | return $retVal; | ||
970 : | } | ||
971 : | |||
972 : | parrello | 1.6 | =head3 GetList |
973 : | |||
974 : | C<< my @dbObjects = $database->GetList(\@objectNames, $filterClause, $param1, $param2, ..., $paramN); >> | ||
975 : | |||
976 : | Return a list of object descriptors for the specified objects as determined by the | ||
977 : | specified filter clause. | ||
978 : | |||
979 : | This method is essentially the same as L</Get> except it returns a list of objects rather | ||
980 : | parrello | 1.7 | than a query object that can be used to get the results one record at a time. |
981 : | parrello | 1.6 | |
982 : | =over 4 | ||
983 : | |||
984 : | =item objectNames | ||
985 : | |||
986 : | List containing the names of the entity and relationship objects to be retrieved. | ||
987 : | |||
988 : | =item filterClause | ||
989 : | |||
990 : | WHERE clause (without the WHERE) to be used to filter and sort the query. The WHERE clause can | ||
991 : | be parameterized with parameter markers (C<?>). Each field used in the WHERE clause must be | ||
992 : | specified in the standard form B<I<objectName>(I<fieldName>)>. Any parameters specified | ||
993 : | in the filter clause should be added to the parameter list as additional parameters. The | ||
994 : | fields in a filter clause can come from primary entity relations, relationship relations, | ||
995 : | or secondary entity relations; however, all of the entities and relationships involved must | ||
996 : | be included in the list of object names. | ||
997 : | |||
998 : | The filter clause can also specify a sort order. To do this, simply follow the filter string | ||
999 : | with an ORDER BY clause. For example, the following filter string gets all genomes for a | ||
1000 : | particular genus and sorts them by species name. | ||
1001 : | |||
1002 : | C<< "Genome(genus) = ? ORDER BY Genome(species)" >> | ||
1003 : | |||
1004 : | The rules for field references in a sort order are the same as those for field references in the | ||
1005 : | filter clause in general; however, odd things may happen if a sort field is from a secondary | ||
1006 : | relation. | ||
1007 : | |||
1008 : | =item param1, param2, ..., paramN | ||
1009 : | |||
1010 : | Parameter values to be substituted into the filter clause. | ||
1011 : | |||
1012 : | =item RETURN | ||
1013 : | |||
1014 : | Returns a list of B<DBObject>s that satisfy the query conditions. | ||
1015 : | |||
1016 : | =back | ||
1017 : | |||
1018 : | =cut | ||
1019 : | #: Return Type @% | ||
1020 : | sub GetList { | ||
1021 : | # Get the parameters. | ||
1022 : | my ($self, $objectNames, $filterClause, @params) = @_; | ||
1023 : | # Declare the return variable. | ||
1024 : | my @retVal = (); | ||
1025 : | # Perform the query. | ||
1026 : | my $query = $self->Get($objectNames, $filterClause, @params); | ||
1027 : | # Loop through the results. | ||
1028 : | while (my $object = $query->Fetch) { | ||
1029 : | push @retVal, $object; | ||
1030 : | } | ||
1031 : | # Return the result. | ||
1032 : | return @retVal; | ||
1033 : | } | ||
1034 : | |||
1035 : | parrello | 1.1 | =head3 ComputeObjectSentence |
1036 : | |||
1037 : | C<< my $sentence = $database->ComputeObjectSentence($objectName); >> | ||
1038 : | |||
1039 : | Check an object name, and if it is a relationship convert it to a relationship sentence. | ||
1040 : | |||
1041 : | =over 4 | ||
1042 : | |||
1043 : | =item objectName | ||
1044 : | |||
1045 : | Name of the entity or relationship. | ||
1046 : | |||
1047 : | =item RETURN | ||
1048 : | |||
1049 : | Returns a string containing the entity name or a relationship sentence. | ||
1050 : | |||
1051 : | =back | ||
1052 : | |||
1053 : | =cut | ||
1054 : | |||
1055 : | sub ComputeObjectSentence { | ||
1056 : | # Get the parameters. | ||
1057 : | parrello | 1.4 | my ($self, $objectName) = @_; |
1058 : | parrello | 1.1 | # Set the default return value. |
1059 : | my $retVal = $objectName; | ||
1060 : | # Look for the object as a relationship. | ||
1061 : | my $relTable = $self->{_metaData}->{Relationships}; | ||
1062 : | if (exists $relTable->{$objectName}) { | ||
1063 : | # Get the relationship sentence. | ||
1064 : | $retVal = _ComputeRelationshipSentence($objectName, $relTable->{$objectName}); | ||
1065 : | } | ||
1066 : | # Return the result. | ||
1067 : | return $retVal; | ||
1068 : | } | ||
1069 : | |||
1070 : | =head3 DumpRelations | ||
1071 : | |||
1072 : | C<< $database->DumpRelations($outputDirectory); >> | ||
1073 : | |||
1074 : | Write the contents of all the relations to tab-delimited files in the specified directory. | ||
1075 : | Each file will have the same name as the relation dumped, with an extension of DTX. | ||
1076 : | |||
1077 : | =over 4 | ||
1078 : | |||
1079 : | =item outputDirectory | ||
1080 : | |||
1081 : | Name of the directory into which the relation files should be dumped. | ||
1082 : | |||
1083 : | =back | ||
1084 : | |||
1085 : | =cut | ||
1086 : | |||
1087 : | sub DumpRelations { | ||
1088 : | # Get the parameters. | ||
1089 : | parrello | 1.4 | my ($self, $outputDirectory) = @_; |
1090 : | parrello | 1.1 | # Now we need to run through all the relations. First, we loop through the entities. |
1091 : | my $metaData = $self->{_metaData}; | ||
1092 : | my $entities = $metaData->{Entities}; | ||
1093 : | parrello | 1.6 | for my $entityName (keys %{$entities}) { |
1094 : | my $entityStructure = $entities->{$entityName}; | ||
1095 : | parrello | 1.1 | # Get the entity's relations. |
1096 : | my $relationList = $entityStructure->{Relations}; | ||
1097 : | # Loop through the relations, dumping them. | ||
1098 : | parrello | 1.6 | for my $relationName (keys %{$relationList}) { |
1099 : | my $relation = $relationList->{$relationName}; | ||
1100 : | parrello | 1.1 | $self->_DumpRelation($outputDirectory, $relationName, $relation); |
1101 : | } | ||
1102 : | } | ||
1103 : | # Next, we loop through the relationships. | ||
1104 : | my $relationships = $metaData->{Relationships}; | ||
1105 : | parrello | 1.6 | for my $relationshipName (keys %{$relationships}) { |
1106 : | my $relationshipStructure = $relationships->{$relationshipName}; | ||
1107 : | parrello | 1.1 | # Dump this relationship's relation. |
1108 : | $self->_DumpRelation($outputDirectory, $relationshipName, $relationshipStructure->{Relations}->{$relationshipName}); | ||
1109 : | } | ||
1110 : | } | ||
1111 : | |||
1112 : | =head3 InsertObject | ||
1113 : | |||
1114 : | C<< my $ok = $database->InsertObject($objectType, \%fieldHash); >> | ||
1115 : | |||
1116 : | Insert an object into the database. The object is defined by a type name and then a hash | ||
1117 : | of field names to values. Field values in the primary relation are represented by scalars. | ||
1118 : | (Note that for relationships, the primary relation is the B<only> relation.) | ||
1119 : | Field values for the other relations comprising the entity are always list references. For | ||
1120 : | example, the following line inserts an inactive PEG feature named C<fig|188.1.peg.1> with aliases | ||
1121 : | C<ZP_00210270.1> and C<gi|46206278>. | ||
1122 : | |||
1123 : | C<< $database->InsertObject('Feature', { id => 'fig|188.1.peg.1', active => 0, feature-type => 'peg', alias => ['ZP_00210270.1', 'gi|46206278']}); >> | ||
1124 : | |||
1125 : | The next statement inserts a C<HasProperty> relationship between feature C<fig|158879.1.peg.1> and | ||
1126 : | property C<4> with an evidence URL of C<http://seedu.uchicago.edu/query.cgi?article_id=142>. | ||
1127 : | |||
1128 : | C<< $database->InsertObject('HasProperty', { 'from-link' => 'fig|158879.1.peg.1', 'to-link' => 4, evidence = 'http://seedu.uchicago.edu/query.cgi?article_id=142'}); >> | ||
1129 : | |||
1130 : | =over 4 | ||
1131 : | |||
1132 : | =item newObjectType | ||
1133 : | |||
1134 : | Type name of the object to insert. | ||
1135 : | |||
1136 : | =item fieldHash | ||
1137 : | |||
1138 : | Hash of field names to values. | ||
1139 : | |||
1140 : | =item RETURN | ||
1141 : | |||
1142 : | Returns 1 if successful, 0 if an error occurred. | ||
1143 : | |||
1144 : | =back | ||
1145 : | |||
1146 : | =cut | ||
1147 : | |||
1148 : | sub InsertObject { | ||
1149 : | # Get the parameters. | ||
1150 : | parrello | 1.4 | my ($self, $newObjectType, $fieldHash) = @_; |
1151 : | parrello | 1.1 | # Denote that so far we appear successful. |
1152 : | my $retVal = 1; | ||
1153 : | # Get the database handle. | ||
1154 : | my $dbh = $self->{_dbh}; | ||
1155 : | # Get the relation list. | ||
1156 : | my $relationTable = $self->_GetRelationTable($newObjectType); | ||
1157 : | # Loop through the relations. We'll build insert statements for each one. If a relation is | ||
1158 : | # secondary, we may end up generating multiple insert statements. If an error occurs, we | ||
1159 : | # stop the loop. | ||
1160 : | parrello | 1.6 | my @relationList = keys %{$relationTable}; |
1161 : | for (my $i = 0; $retVal && $i <= $#relationList; $i++) { | ||
1162 : | my $relationName = $relationList[$i]; | ||
1163 : | my $relationDefinition = $relationTable->{$relationName}; | ||
1164 : | parrello | 1.1 | # Get the relation's fields. For each field we will collect a value in the corresponding |
1165 : | # position of the @valueList array. If one of the fields is missing, we will add it to the | ||
1166 : | # @missing list. | ||
1167 : | my @fieldList = @{$relationDefinition->{Fields}}; | ||
1168 : | my @fieldNameList = (); | ||
1169 : | my @valueList = (); | ||
1170 : | my @missing = (); | ||
1171 : | my $recordCount = 1; | ||
1172 : | for my $fieldDescriptor (@fieldList) { | ||
1173 : | # Get the field name and save it. Note we need to fix it up so the hyphens | ||
1174 : | # are converted to underscores. | ||
1175 : | my $fieldName = $fieldDescriptor->{name}; | ||
1176 : | push @fieldNameList, _FixName($fieldName); | ||
1177 : | # Look for the named field in the incoming structure. Note that we are looking | ||
1178 : | # for the real field name, not the fixed-up one! | ||
1179 : | if (exists $fieldHash->{$fieldName}) { | ||
1180 : | # Here we found the field. Stash it in the value list. | ||
1181 : | my $value = $fieldHash->{$fieldName}; | ||
1182 : | push @valueList, $value; | ||
1183 : | # If the value is a list, we may need to increment the record count. | ||
1184 : | if (ref $value eq "ARRAY") { | ||
1185 : | my $thisCount = @{$value}; | ||
1186 : | if ($recordCount == 1) { | ||
1187 : | # Here we have our first list, so we save its count. | ||
1188 : | $recordCount = $thisCount; | ||
1189 : | } elsif ($recordCount != $thisCount) { | ||
1190 : | # Here we have a second list, so its length has to match the | ||
1191 : | # previous lists. | ||
1192 : | Trace("Field $value in new $newObjectType object has an invalid list length $thisCount. Expected $recordCount.") if T(0); | ||
1193 : | $retVal = 0; | ||
1194 : | } | ||
1195 : | } | ||
1196 : | } else { | ||
1197 : | # Here the field is not present. Flag it as missing. | ||
1198 : | push @missing, $fieldName; | ||
1199 : | } | ||
1200 : | } | ||
1201 : | # If we are the primary relation, add the new-record flag. | ||
1202 : | if ($relationName eq $newObjectType) { | ||
1203 : | push @valueList, 1; | ||
1204 : | push @fieldNameList, "new_record"; | ||
1205 : | } | ||
1206 : | # Only proceed if there are no missing fields. | ||
1207 : | if (@missing > 0) { | ||
1208 : | Trace("Relation $relationName for $newObjectType skipped due to missing fields: " . | ||
1209 : | join(' ', @missing)) if T(1); | ||
1210 : | } else { | ||
1211 : | # Build the INSERT statement. | ||
1212 : | my $statement = "INSERT INTO $relationName (" . join (', ', @fieldNameList) . | ||
1213 : | ") VALUES ("; | ||
1214 : | # Create a marker list of the proper size and put it in the statement. | ||
1215 : | my @markers = (); | ||
1216 : | while (@markers < @fieldNameList) { push @markers, '?'; } | ||
1217 : | $statement .= join(', ', @markers) . ")"; | ||
1218 : | # We have the insert statement, so prepare it. | ||
1219 : | my $sth = $dbh->prepare_command($statement); | ||
1220 : | Trace("Insert statement prepared: $statement") if T(3); | ||
1221 : | # Now we loop through the values. If a value is scalar, we use it unmodified. If it's | ||
1222 : | # a list, we use the current element. The values are stored in the @parameterList array. | ||
1223 : | my $done = 0; | ||
1224 : | for (my $i = 0; $i < $recordCount; $i++) { | ||
1225 : | # Clear the parameter list array. | ||
1226 : | my @parameterList = (); | ||
1227 : | # Loop through the values. | ||
1228 : | for my $value (@valueList) { | ||
1229 : | # Check to see if this is a scalar value. | ||
1230 : | if (ref $value eq "ARRAY") { | ||
1231 : | # Here we have a list value. Pull the current entry. | ||
1232 : | push @parameterList, $value->[$i]; | ||
1233 : | } else { | ||
1234 : | # Here we have a scalar value. Use it unmodified. | ||
1235 : | push @parameterList, $value; | ||
1236 : | } | ||
1237 : | } | ||
1238 : | # Execute the INSERT statement with the specified parameter list. | ||
1239 : | $retVal = $sth->execute(@parameterList); | ||
1240 : | if (!$retVal) { | ||
1241 : | my $errorString = $sth->errstr(); | ||
1242 : | Trace("Insert error: $errorString.") if T(0); | ||
1243 : | } | ||
1244 : | } | ||
1245 : | } | ||
1246 : | } | ||
1247 : | # Return the success indicator. | ||
1248 : | return $retVal; | ||
1249 : | } | ||
1250 : | |||
1251 : | =head3 LoadTable | ||
1252 : | |||
1253 : | C<< my %results = $database->LoadTable($fileName, $relationName, $truncateFlag); >> | ||
1254 : | |||
1255 : | parrello | 1.9 | Load data from a tab-delimited file into a specified table, optionally re-creating the table |
1256 : | first. | ||
1257 : | parrello | 1.1 | |
1258 : | =over 4 | ||
1259 : | |||
1260 : | =item fileName | ||
1261 : | |||
1262 : | Name of the file from which the table data should be loaded. | ||
1263 : | |||
1264 : | =item relationName | ||
1265 : | |||
1266 : | Name of the relation to be loaded. This is the same as the table name. | ||
1267 : | |||
1268 : | =item truncateFlag | ||
1269 : | |||
1270 : | TRUE if the table should be dropped and re-created, else FALSE | ||
1271 : | |||
1272 : | =item RETURN | ||
1273 : | |||
1274 : | parrello | 1.9 | Returns a statistical object containing the number of records read and a list of |
1275 : | the error messages. | ||
1276 : | parrello | 1.1 | |
1277 : | =back | ||
1278 : | |||
1279 : | =cut | ||
1280 : | sub LoadTable { | ||
1281 : | # Get the parameters. | ||
1282 : | parrello | 1.4 | my ($self, $fileName, $relationName, $truncateFlag) = @_; |
1283 : | parrello | 1.1 | # Create the statistical return object. |
1284 : | my $retVal = _GetLoadStats(); | ||
1285 : | # Trace the fact of the load. | ||
1286 : | Trace("Loading table $relationName from $fileName") if T(1); | ||
1287 : | # Get the database handle. | ||
1288 : | my $dbh = $self->{_dbh}; | ||
1289 : | # Get the relation data. | ||
1290 : | my $relation = $self->_FindRelation($relationName); | ||
1291 : | # Check the truncation flag. | ||
1292 : | if ($truncateFlag) { | ||
1293 : | Trace("Creating table $relationName") if T(1); | ||
1294 : | # Re-create the table without its index. | ||
1295 : | $self->CreateTable($relationName, 0); | ||
1296 : | } | ||
1297 : | # Determine whether or not this is a primary relation. Primary relations have an extra | ||
1298 : | # field indicating whether or not a given object is new or was loaded from the flat files. | ||
1299 : | my $primary = $self->_IsPrimary($relationName); | ||
1300 : | # Get the number of fields in this relation. | ||
1301 : | my @fieldList = @{$relation->{Fields}}; | ||
1302 : | my $fieldCount = @fieldList; | ||
1303 : | # Start a database transaction. | ||
1304 : | $dbh->begin_tran; | ||
1305 : | # Open the relation file. We need to create a cleaned-up copy before loading. | ||
1306 : | open TABLEIN, '<', $fileName; | ||
1307 : | my $tempName = "$fileName.tbl"; | ||
1308 : | open TABLEOUT, '>', $tempName; | ||
1309 : | parrello | 1.9 | my $inputCount = 0; |
1310 : | parrello | 1.1 | # Loop through the file. |
1311 : | while (<TABLEIN>) { | ||
1312 : | parrello | 1.9 | $inputCount++; |
1313 : | parrello | 1.1 | # Chop off the new-line character. |
1314 : | parrello | 1.9 | my $record = Tracer::Strip($_); |
1315 : | parrello | 1.2 | # Only proceed if the record is non-blank. |
1316 : | if ($record) { | ||
1317 : | # Escape all the backslashes found in the line. | ||
1318 : | $record =~ s/\\/\\\\/g; | ||
1319 : | parrello | 1.9 | # Insure the number of fields is correct. |
1320 : | my @fields = split /\t/, $record; | ||
1321 : | while (@fields > $fieldCount) { | ||
1322 : | my $extraField = $fields[$#fields]; | ||
1323 : | delete $fields[$#fields]; | ||
1324 : | if ($extraField) { | ||
1325 : | Trace("Nonblank extra field value \"$extraField\" deleted from record $inputCount of $fileName.") if T(1); | ||
1326 : | } | ||
1327 : | } | ||
1328 : | while (@fields < $fieldCount) { | ||
1329 : | push @fields, ""; | ||
1330 : | } | ||
1331 : | parrello | 1.2 | # If this is a primary relation, add a 0 for the new-record flag (indicating that |
1332 : | # this record is not new, but part of the original load). | ||
1333 : | if ($primary) { | ||
1334 : | parrello | 1.9 | push @fields, "0"; |
1335 : | parrello | 1.2 | } |
1336 : | # Write the record. | ||
1337 : | parrello | 1.9 | $record = join "\t", @fields; |
1338 : | parrello | 1.2 | print TABLEOUT "$record\n"; |
1339 : | parrello | 1.9 | # Count the record written. |
1340 : | parrello | 1.2 | my $count = $retVal->Add('records'); |
1341 : | parrello | 1.3 | my $len = length $record; |
1342 : | Trace("Record $count written with $len characters.") if T(4); | ||
1343 : | parrello | 1.9 | } else { |
1344 : | # Here we have a blank record. | ||
1345 : | $retVal->Add('skipped'); | ||
1346 : | } | ||
1347 : | parrello | 1.1 | } |
1348 : | # Close the files. | ||
1349 : | close TABLEIN; | ||
1350 : | close TABLEOUT; | ||
1351 : | parrello | 1.3 | Trace("Temporary file $tempName created.") if T(4); |
1352 : | # Load the table. | ||
1353 : | parrello | 1.1 | my $rv; |
1354 : | eval { | ||
1355 : | $rv = $dbh->load_table(file => $tempName, tbl => $relationName); | ||
1356 : | }; | ||
1357 : | if (!defined $rv) { | ||
1358 : | parrello | 1.3 | $retVal->AddMessage($@) if ($@); |
1359 : | $retVal->AddMessage("Table load failed for $relationName using $tempName."); | ||
1360 : | parrello | 1.1 | Trace("Table load failed for $relationName.") if T(1); |
1361 : | } else { | ||
1362 : | # Here we successfully loaded the table. Trace the number of records loaded. | ||
1363 : | Trace("$retVal->{records} records read for $relationName.") if T(1); | ||
1364 : | # If we're rebuilding, we need to create the table indexes. | ||
1365 : | if ($truncateFlag) { | ||
1366 : | eval { | ||
1367 : | $self->CreateIndex($relationName); | ||
1368 : | }; | ||
1369 : | if ($@) { | ||
1370 : | $retVal->AddMessage($@); | ||
1371 : | } | ||
1372 : | parrello | 1.2 | } |
1373 : | parrello | 1.1 | } |
1374 : | # Commit the database changes. | ||
1375 : | $dbh->commit_tran; | ||
1376 : | # Delete the temporary file. | ||
1377 : | unlink $tempName; | ||
1378 : | # Return the statistics. | ||
1379 : | return $retVal; | ||
1380 : | } | ||
1381 : | |||
1382 : | =head3 GenerateEntity | ||
1383 : | |||
1384 : | C<< my $fieldHash = $database->GenerateEntity($id, $type, \%values); >> | ||
1385 : | |||
1386 : | Generate the data for a new entity instance. This method creates a field hash suitable for | ||
1387 : | passing as a parameter to L</InsertObject>. The ID is specified by the callr, but the rest | ||
1388 : | of the fields are generated using information in the database schema. | ||
1389 : | |||
1390 : | Each data type has a default algorithm for generating random test data. This can be overridden | ||
1391 : | by including a B<DataGen> element in the field. If this happens, the content of the element is | ||
1392 : | executed as a PERL program in the context of this module. The element may make use of a C<$this> | ||
1393 : | variable which contains the field hash as it has been built up to the current point. If any | ||
1394 : | fields are dependent on other fields, the C<pass> attribute can be used to control the order | ||
1395 : | in which the fields are generated. A field with a high data pass number will be generated after | ||
1396 : | a field with a lower one. If any external values are needed, they should be passed in via the | ||
1397 : | optional third parameter, which will be available to the data generation script under the name | ||
1398 : | C<$value>. Several useful utility methods are provided for generating random values, including | ||
1399 : | L</IntGen>, L</StringGen>, L</FloatGen>, and L</DateGen>. Note that dates are stored and generated | ||
1400 : | in the form of a timestamp number rather than a string. | ||
1401 : | |||
1402 : | =over 4 | ||
1403 : | |||
1404 : | =item id | ||
1405 : | |||
1406 : | ID to assign to the new entity. | ||
1407 : | |||
1408 : | =item type | ||
1409 : | |||
1410 : | Type name for the new entity. | ||
1411 : | |||
1412 : | =item values | ||
1413 : | |||
1414 : | Hash containing additional values that might be needed by the data generation methods (optional). | ||
1415 : | |||
1416 : | =back | ||
1417 : | |||
1418 : | =cut | ||
1419 : | |||
1420 : | sub GenerateEntity { | ||
1421 : | # Get the parameters. | ||
1422 : | parrello | 1.4 | my ($self, $id, $type, $values) = @_; |
1423 : | parrello | 1.1 | # Create the return hash. |
1424 : | my $this = { id => $id }; | ||
1425 : | # Get the metadata structure. | ||
1426 : | my $metadata = $self->{_metaData}; | ||
1427 : | # Get this entity's list of fields. | ||
1428 : | if (!exists $metadata->{Entities}->{$type}) { | ||
1429 : | Confess("Unrecognized entity type $type in GenerateEntity."); | ||
1430 : | } else { | ||
1431 : | my $entity = $metadata->{Entities}->{$type}; | ||
1432 : | my $fields = $entity->{Fields}; | ||
1433 : | # Generate data from the fields. | ||
1434 : | _GenerateFields($this, $fields, $type, $values); | ||
1435 : | } | ||
1436 : | # Return the hash created. | ||
1437 : | return $this; | ||
1438 : | } | ||
1439 : | |||
1440 : | parrello | 1.6 | =head3 GetEntity |
1441 : | |||
1442 : | C<< my $entityObject = $sprout->GetEntity($entityType, $ID); >> | ||
1443 : | |||
1444 : | Return an object describing the entity instance with a specified ID. | ||
1445 : | |||
1446 : | =over 4 | ||
1447 : | |||
1448 : | =item entityType | ||
1449 : | |||
1450 : | Entity type name. | ||
1451 : | |||
1452 : | =item ID | ||
1453 : | |||
1454 : | ID of the desired entity. | ||
1455 : | |||
1456 : | =item RETURN | ||
1457 : | |||
1458 : | Returns a B<DBObject> representing the desired entity instance, or an undefined value if no | ||
1459 : | instance is found with the specified key. | ||
1460 : | |||
1461 : | =back | ||
1462 : | |||
1463 : | =cut | ||
1464 : | |||
1465 : | sub GetEntity { | ||
1466 : | # Get the parameters. | ||
1467 : | my ($self, $entityType, $ID) = @_; | ||
1468 : | # Create a query. | ||
1469 : | my $query = $self->Get([$entityType], "$entityType(id) = ?", $ID); | ||
1470 : | # Get the first (and only) object. | ||
1471 : | my $retVal = $query->Fetch(); | ||
1472 : | # Return the result. | ||
1473 : | return $retVal; | ||
1474 : | } | ||
1475 : | |||
1476 : | =head3 GetEntityValues | ||
1477 : | |||
1478 : | C<< my @values = GetEntityValues($entityType, $ID, \@fields); >> | ||
1479 : | |||
1480 : | Return a list of values from a specified entity instance. | ||
1481 : | |||
1482 : | =over 4 | ||
1483 : | |||
1484 : | =item entityType | ||
1485 : | |||
1486 : | Entity type name. | ||
1487 : | |||
1488 : | =item ID | ||
1489 : | |||
1490 : | ID of the desired entity. | ||
1491 : | |||
1492 : | =item fields | ||
1493 : | |||
1494 : | List of field names, each of the form I<objectName>C<(>I<fieldName>C<)>. | ||
1495 : | |||
1496 : | =item RETURN | ||
1497 : | |||
1498 : | Returns a flattened list of the values of the specified fields for the specified entity. | ||
1499 : | |||
1500 : | =back | ||
1501 : | |||
1502 : | =cut | ||
1503 : | |||
1504 : | sub GetEntityValues { | ||
1505 : | # Get the parameters. | ||
1506 : | my ($self, $entityType, $ID, $fields) = @_; | ||
1507 : | # Get the specified entity. | ||
1508 : | my $entity = $self->GetEntity($entityType, $ID); | ||
1509 : | # Declare the return list. | ||
1510 : | my @retVal = (); | ||
1511 : | # If we found the entity, push the values into the return list. | ||
1512 : | if ($entity) { | ||
1513 : | push @retVal, $entity->Values($fields); | ||
1514 : | } | ||
1515 : | # Return the result. | ||
1516 : | return @retVal; | ||
1517 : | } | ||
1518 : | parrello | 1.1 | |
1519 : | parrello | 1.7 | =head3 GetAll |
1520 : | |||
1521 : | C<< my @list = $sprout->GetAll(\@objectNames, $filterClause, \@parameters, \@fields, $count); >> | ||
1522 : | |||
1523 : | Return a list of values taken from the objects returned by a query. The first three | ||
1524 : | parameters correspond to the parameters of the L</Get> method. The final parameter is | ||
1525 : | a list of the fields desired from each record found by the query. The field name | ||
1526 : | syntax is the standard syntax used for fields in the B<ERDB> system-- | ||
1527 : | B<I<objectName>(I<fieldName>)>-- where I<objectName> is the name of the relevant entity | ||
1528 : | or relationship and I<fieldName> is the name of the field. | ||
1529 : | |||
1530 : | The list returned will be a list of lists. Each element of the list will contain | ||
1531 : | the values returned for the fields specified in the fourth parameter. If one of the | ||
1532 : | fields specified returns multiple values, they are flattened in with the rest. For | ||
1533 : | example, the following call will return a list of the features in a particular | ||
1534 : | spreadsheet cell, and each feature will be represented by a list containing the | ||
1535 : | feature ID followed by all of its aliases. | ||
1536 : | |||
1537 : | C<< $query = $sprout->Get(['ContainsFeature', 'Feature'], "ContainsFeature(from-link) = ?", [$ssCellID], ['Feature(id)', 'Feature(alias)']); >> | ||
1538 : | |||
1539 : | =over 4 | ||
1540 : | |||
1541 : | =item objectNames | ||
1542 : | |||
1543 : | List containing the names of the entity and relationship objects to be retrieved. | ||
1544 : | |||
1545 : | =item filterClause | ||
1546 : | |||
1547 : | WHERE/ORDER BY clause (without the WHERE) to be used to filter and sort the query. The WHERE clause can | ||
1548 : | be parameterized with parameter markers (C<?>). Each field used must be specified in the standard form | ||
1549 : | B<I<objectName>(I<fieldName>)>. Any parameters specified in the filter clause should be added to the | ||
1550 : | parameter list as additional parameters. The fields in a filter clause can come from primary | ||
1551 : | entity relations, relationship relations, or secondary entity relations; however, all of the | ||
1552 : | entities and relationships involved must be included in the list of object names. | ||
1553 : | |||
1554 : | =item parameterList | ||
1555 : | |||
1556 : | List of the parameters to be substituted in for the parameters marks in the filter clause. | ||
1557 : | |||
1558 : | =item fields | ||
1559 : | |||
1560 : | List of the fields to be returned in each element of the list returned. | ||
1561 : | |||
1562 : | =item count | ||
1563 : | |||
1564 : | Maximum number of records to return. If omitted or 0, all available records will be returned. | ||
1565 : | |||
1566 : | =item RETURN | ||
1567 : | |||
1568 : | Returns a list of list references. Each element of the return list contains the values for the | ||
1569 : | fields specified in the B<fields> parameter. | ||
1570 : | |||
1571 : | =back | ||
1572 : | |||
1573 : | =cut | ||
1574 : | #: Return Type @@; | ||
1575 : | sub GetAll { | ||
1576 : | # Get the parameters. | ||
1577 : | my ($self, $objectNames, $filterClause, $parameterList, $fields, $count) = @_; | ||
1578 : | # Translate the parameters from a list reference to a list. If the parameter | ||
1579 : | # list is a scalar we convert it into a singleton list. | ||
1580 : | my @parmList = (); | ||
1581 : | if (ref $parameterList eq "ARRAY") { | ||
1582 : | @parmList = @{$parameterList}; | ||
1583 : | } else { | ||
1584 : | push @parmList, $parameterList; | ||
1585 : | } | ||
1586 : | # Create the query. | ||
1587 : | my $query = $self->Get($objectNames, $filterClause, @parmList); | ||
1588 : | # Set up a counter of the number of records read. | ||
1589 : | my $fetched = 0; | ||
1590 : | # Insure the counter has a value. | ||
1591 : | if (!defined $count) { | ||
1592 : | $count = 0; | ||
1593 : | } | ||
1594 : | # Loop through the records returned, extracting the fields. Note that if the | ||
1595 : | # counter is non-zero, we stop when the number of records read hits the count. | ||
1596 : | my @retVal = (); | ||
1597 : | while (($count == 0 || $fetched < $count) && (my $row = $query->Fetch())) { | ||
1598 : | my @rowData = $row->Values($fields); | ||
1599 : | push @retVal, \@rowData; | ||
1600 : | $fetched++; | ||
1601 : | } | ||
1602 : | # Return the resulting list. | ||
1603 : | return @retVal; | ||
1604 : | } | ||
1605 : | |||
1606 : | parrello | 1.1 | =head2 Internal Utility Methods |
1607 : | |||
1608 : | =head3 GetLoadStats | ||
1609 : | |||
1610 : | Return a blank statistics object for use by the load methods. | ||
1611 : | |||
1612 : | This is a static method. | ||
1613 : | |||
1614 : | =cut | ||
1615 : | |||
1616 : | sub _GetLoadStats { | ||
1617 : | return Stats->new('records'); | ||
1618 : | } | ||
1619 : | |||
1620 : | =head3 GenerateFields | ||
1621 : | |||
1622 : | Generate field values from a field structure and store in a specified table. The field names | ||
1623 : | are first sorted by pass count, certain pre-defined fields are removed from the list, and | ||
1624 : | then we rip through them evaluation the data generation string. Fields in the primary relation | ||
1625 : | are stored as scalars; fields in secondary relations are stored as value lists. | ||
1626 : | |||
1627 : | This is a static method. | ||
1628 : | |||
1629 : | =over 4 | ||
1630 : | |||
1631 : | =item this | ||
1632 : | |||
1633 : | Hash table into which the field values should be placed. | ||
1634 : | |||
1635 : | =item fields | ||
1636 : | |||
1637 : | Field structure from which the field descriptors should be taken. | ||
1638 : | |||
1639 : | =item type | ||
1640 : | |||
1641 : | Type name of the object whose fields are being generated. | ||
1642 : | |||
1643 : | =item values (optional) | ||
1644 : | |||
1645 : | Reference to a value structure from which additional values can be taken. | ||
1646 : | |||
1647 : | =item from (optiona) | ||
1648 : | |||
1649 : | Reference to the source entity instance if relationship data is being generated. | ||
1650 : | |||
1651 : | =item to (optional) | ||
1652 : | |||
1653 : | Reference to the target entity instance if relationship data is being generated. | ||
1654 : | |||
1655 : | =back | ||
1656 : | |||
1657 : | =cut | ||
1658 : | |||
1659 : | sub _GenerateFields { | ||
1660 : | # Get the parameters. | ||
1661 : | my ($this, $fields, $type, $values, $from, $to) = @_; | ||
1662 : | # Sort the field names by pass number. | ||
1663 : | my @fieldNames = sort { $fields->{$a}->{DataGen}->{pass} <=> $fields->{$b}->{DataGen}->{pass} } keys %{$fields}; | ||
1664 : | # Loop through the field names, generating data. | ||
1665 : | for my $name (@fieldNames) { | ||
1666 : | # Only proceed if this field needs to be generated. | ||
1667 : | if (!exists $this->{$name}) { | ||
1668 : | # Get this field's data generation descriptor. | ||
1669 : | my $fieldDescriptor = $fields->{$name}; | ||
1670 : | my $data = $fieldDescriptor->{DataGen}; | ||
1671 : | # Get the code to generate the field value. | ||
1672 : | my $codeString = $data->{content}; | ||
1673 : | # Determine whether or not this field is in the primary relation. | ||
1674 : | if ($fieldDescriptor->{relation} eq $type) { | ||
1675 : | # Here we have a primary relation field. Store the field value as | ||
1676 : | # a scalar. | ||
1677 : | $this->{$name} = eval($codeString); | ||
1678 : | } else { | ||
1679 : | # Here we have a secondary relation field. Create a null list | ||
1680 : | # and push the desired number of field values onto it. | ||
1681 : | my @fieldValues = (); | ||
1682 : | my $count = IntGen(0,$data->{testCount}); | ||
1683 : | for (my $i = 0; $i < $count; $i++) { | ||
1684 : | my $newValue = eval($codeString); | ||
1685 : | push @fieldValues, $newValue; | ||
1686 : | } | ||
1687 : | # Store the value list in the main hash. | ||
1688 : | $this->{$name} = \@fieldValues; | ||
1689 : | } | ||
1690 : | } | ||
1691 : | } | ||
1692 : | } | ||
1693 : | |||
1694 : | =head3 DumpRelation | ||
1695 : | |||
1696 : | Dump the specified relation's to the specified output file in tab-delimited format. | ||
1697 : | |||
1698 : | This is an instance method. | ||
1699 : | |||
1700 : | =over 4 | ||
1701 : | |||
1702 : | =item outputDirectory | ||
1703 : | |||
1704 : | Directory to contain the output file. | ||
1705 : | |||
1706 : | =item relationName | ||
1707 : | |||
1708 : | Name of the relation to dump. | ||
1709 : | |||
1710 : | =item relation | ||
1711 : | |||
1712 : | Structure describing the relation to be dumped. | ||
1713 : | |||
1714 : | =back | ||
1715 : | |||
1716 : | =cut | ||
1717 : | |||
1718 : | sub _DumpRelation { | ||
1719 : | # Get the parameters. | ||
1720 : | parrello | 1.4 | my ($self, $outputDirectory, $relationName, $relation) = @_; |
1721 : | parrello | 1.1 | # Open the output file. |
1722 : | my $fileName = "$outputDirectory/$relationName.dtx"; | ||
1723 : | open(DTXOUT, ">$fileName") || Confess("Could not open dump file $fileName: $!"); | ||
1724 : | # Create a query for the specified relation. | ||
1725 : | my $dbh = $self->{_dbh}; | ||
1726 : | my $query = $dbh->prepare_command("SELECT * FROM $relationName"); | ||
1727 : | # Execute the query. | ||
1728 : | $query->execute() || Confess("SELECT error dumping $relationName."); | ||
1729 : | # Loop through the results. | ||
1730 : | while (my @row = $query->fetchrow) { | ||
1731 : | # Escape any tabs or new-lines in the row text. | ||
1732 : | for my $field (@row) { | ||
1733 : | $field =~ s/\n/\\n/g; | ||
1734 : | $field =~ s/\t/\\t/g; | ||
1735 : | } | ||
1736 : | # Tab-join the row and write it to the output file. | ||
1737 : | my $rowText = join("\t", @row); | ||
1738 : | print DTXOUT "$rowText\n"; | ||
1739 : | } | ||
1740 : | # Close the output file. | ||
1741 : | close DTXOUT; | ||
1742 : | } | ||
1743 : | |||
1744 : | =head3 GetStructure | ||
1745 : | |||
1746 : | Get the data structure for a specified entity or relationship. | ||
1747 : | |||
1748 : | This is an instance method. | ||
1749 : | |||
1750 : | =over 4 | ||
1751 : | |||
1752 : | =item objectName | ||
1753 : | |||
1754 : | Name of the desired entity or relationship. | ||
1755 : | |||
1756 : | =item RETURN | ||
1757 : | |||
1758 : | The descriptor for the specified object. | ||
1759 : | |||
1760 : | =back | ||
1761 : | |||
1762 : | =cut | ||
1763 : | |||
1764 : | sub _GetStructure { | ||
1765 : | # Get the parameters. | ||
1766 : | parrello | 1.4 | my ($self, $objectName) = @_; |
1767 : | parrello | 1.1 | # Get the metadata structure. |
1768 : | my $metadata = $self->{_metaData}; | ||
1769 : | # Declare the variable to receive the descriptor. | ||
1770 : | my $retVal; | ||
1771 : | # Get the descriptor from the metadata. | ||
1772 : | if (exists $metadata->{Entities}->{$objectName}) { | ||
1773 : | $retVal = $metadata->{Entities}->{$objectName}; | ||
1774 : | } elsif (exists $metadata->{Relationships}->{$objectName}) { | ||
1775 : | $retVal = $metadata->{Relationships}->{$objectName}; | ||
1776 : | } else { | ||
1777 : | Confess("Object $objectName not found in database."); | ||
1778 : | } | ||
1779 : | # Return the descriptor. | ||
1780 : | return $retVal; | ||
1781 : | } | ||
1782 : | |||
1783 : | =head3 GetRelationTable | ||
1784 : | |||
1785 : | Get the list of relations for a specified entity or relationship. | ||
1786 : | |||
1787 : | This is an instance method. | ||
1788 : | |||
1789 : | =over 4 | ||
1790 : | |||
1791 : | =item objectName | ||
1792 : | |||
1793 : | Name of the desired entity or relationship. | ||
1794 : | |||
1795 : | =item RETURN | ||
1796 : | |||
1797 : | A table containing the relations for the specified object. | ||
1798 : | |||
1799 : | =back | ||
1800 : | |||
1801 : | =cut | ||
1802 : | |||
1803 : | sub _GetRelationTable { | ||
1804 : | # Get the parameters. | ||
1805 : | parrello | 1.4 | my ($self, $objectName) = @_; |
1806 : | parrello | 1.1 | # Get the descriptor from the metadata. |
1807 : | my $objectData = $self->_GetStructure($objectName); | ||
1808 : | # Return the object's relation list. | ||
1809 : | return $objectData->{Relations}; | ||
1810 : | } | ||
1811 : | |||
1812 : | =head3 GetFieldTable | ||
1813 : | |||
1814 : | Get the field structure for a specified entity or relationship. | ||
1815 : | |||
1816 : | This is an instance method. | ||
1817 : | |||
1818 : | =over 4 | ||
1819 : | |||
1820 : | =item objectName | ||
1821 : | |||
1822 : | Name of the desired entity or relationship. | ||
1823 : | |||
1824 : | =item RETURN | ||
1825 : | |||
1826 : | The table containing the field descriptors for the specified object. | ||
1827 : | |||
1828 : | =back | ||
1829 : | |||
1830 : | =cut | ||
1831 : | |||
1832 : | sub _GetFieldTable { | ||
1833 : | # Get the parameters. | ||
1834 : | parrello | 1.4 | my ($self, $objectName) = @_; |
1835 : | parrello | 1.1 | # Get the descriptor from the metadata. |
1836 : | my $objectData = $self->_GetStructure($objectName); | ||
1837 : | # Return the object's field table. | ||
1838 : | return $objectData->{Fields}; | ||
1839 : | } | ||
1840 : | |||
1841 : | =head3 ValidateFieldNames | ||
1842 : | |||
1843 : | Determine whether or not the field names are valid. A description of the problems with the names | ||
1844 : | will be written to the standard error output. If there is an error, this method will abort. This is | ||
1845 : | a static method. | ||
1846 : | |||
1847 : | =over 4 | ||
1848 : | |||
1849 : | =item metadata | ||
1850 : | |||
1851 : | Metadata structure loaded from the XML data definition. | ||
1852 : | |||
1853 : | =back | ||
1854 : | |||
1855 : | =cut | ||
1856 : | |||
1857 : | sub _ValidateFieldNames { | ||
1858 : | # Get the object. | ||
1859 : | my ($metadata) = @_; | ||
1860 : | # Declare the return value. We assume success. | ||
1861 : | my $retVal = 1; | ||
1862 : | # Loop through the sections of the database definition. | ||
1863 : | for my $section ('Entities', 'Relationships') { | ||
1864 : | # Loop through the objects in this section. | ||
1865 : | for my $object (values %{$metadata->{$section}}) { | ||
1866 : | # Loop through the object's fields. | ||
1867 : | for my $fieldName (keys %{$object->{Fields}}) { | ||
1868 : | # Now we make some initial validations. | ||
1869 : | if ($fieldName =~ /--/) { | ||
1870 : | # Here we have a doubled minus sign. | ||
1871 : | print STDERR "Field name $fieldName has a doubled hyphen.\n"; | ||
1872 : | $retVal = 0; | ||
1873 : | } elsif ($fieldName !~ /^[A-Za-z]/) { | ||
1874 : | # Here the field name is missing the initial letter. | ||
1875 : | print STDERR "Field name $fieldName does not begin with a letter.\n"; | ||
1876 : | $retVal = 0; | ||
1877 : | } else { | ||
1878 : | # Strip out the minus signs. Everything remaining must be a letter | ||
1879 : | # or digit. | ||
1880 : | my $strippedName = $fieldName; | ||
1881 : | $strippedName =~ s/-//g; | ||
1882 : | if ($strippedName !~ /^[A-Za-z0-9]+$/) { | ||
1883 : | print STDERR "Field name $fieldName contains illegal characters.\n"; | ||
1884 : | $retVal = 0; | ||
1885 : | } | ||
1886 : | } | ||
1887 : | } | ||
1888 : | } | ||
1889 : | } | ||
1890 : | # If an error was found, fail. | ||
1891 : | if ($retVal == 0) { | ||
1892 : | Confess("Errors found in field names."); | ||
1893 : | } | ||
1894 : | } | ||
1895 : | |||
1896 : | =head3 LoadRelation | ||
1897 : | |||
1898 : | Load a relation from the data in a tab-delimited disk file. The load will only take place if a disk | ||
1899 : | file with the same name as the relation exists in the specified directory. | ||
1900 : | |||
1901 : | This is an instance method. | ||
1902 : | |||
1903 : | =over 4 | ||
1904 : | |||
1905 : | =item dbh | ||
1906 : | |||
1907 : | DBKernel object for accessing the database. | ||
1908 : | |||
1909 : | =item directoryName | ||
1910 : | |||
1911 : | Name of the directory containing the tab-delimited data files. | ||
1912 : | |||
1913 : | =item relationName | ||
1914 : | |||
1915 : | Name of the relation to load. | ||
1916 : | |||
1917 : | =item rebuild | ||
1918 : | |||
1919 : | TRUE if the table should be dropped and re-created before loading. | ||
1920 : | |||
1921 : | =item RETURN | ||
1922 : | |||
1923 : | Returns a statistical object describing the number of records read and a list of error messages. | ||
1924 : | |||
1925 : | =back | ||
1926 : | |||
1927 : | =cut | ||
1928 : | |||
1929 : | sub _LoadRelation { | ||
1930 : | # Get the parameters. | ||
1931 : | parrello | 1.4 | my ($self, $directoryName, $relationName, $rebuild) = @_; |
1932 : | parrello | 1.1 | # Create the file name. |
1933 : | my $fileName = "$directoryName/$relationName"; | ||
1934 : | # If the file doesn't exist, try adding the .dtx suffix. | ||
1935 : | if (! -e $fileName) { | ||
1936 : | $fileName .= ".dtx"; | ||
1937 : | if (! -e $fileName) { | ||
1938 : | $fileName = ""; | ||
1939 : | } | ||
1940 : | } | ||
1941 : | # Create the return object. | ||
1942 : | my $retVal = _GetLoadStats(); | ||
1943 : | # If a file exists to load the table, its name will be in $fileName. Otherwise, $fileName will | ||
1944 : | # be a null string. | ||
1945 : | if ($fileName ne "") { | ||
1946 : | # Load the relation from the file. | ||
1947 : | $retVal = $self->LoadTable($fileName, $relationName, $rebuild); | ||
1948 : | } elsif ($rebuild) { | ||
1949 : | # Here we are rebuilding, but no file exists, so we just re-create the table. | ||
1950 : | $self->CreateTable($relationName, 1); | ||
1951 : | } | ||
1952 : | # Return the statistics from the load. | ||
1953 : | return $retVal; | ||
1954 : | } | ||
1955 : | |||
1956 : | =head3 LoadMetaData | ||
1957 : | |||
1958 : | This method loads the data describing this database from an XML file into a metadata structure. | ||
1959 : | The resulting structure is a set of nested hash tables containing all the information needed to | ||
1960 : | load or use the database. The schema for the XML file is F<ERDatabase.xml>. | ||
1961 : | |||
1962 : | This is a static method. | ||
1963 : | |||
1964 : | =over 4 | ||
1965 : | |||
1966 : | =item filename | ||
1967 : | |||
1968 : | Name of the file containing the database definition. | ||
1969 : | |||
1970 : | =item RETURN | ||
1971 : | |||
1972 : | Returns a structure describing the database. | ||
1973 : | |||
1974 : | =back | ||
1975 : | |||
1976 : | =cut | ||
1977 : | |||
1978 : | sub _LoadMetaData { | ||
1979 : | # Get the parameters. | ||
1980 : | my ($filename) = @_; | ||
1981 : | # Slurp the XML file into a variable. Extensive use of options is used to insure we | ||
1982 : | # get the exact structure we want. | ||
1983 : | my $metadata = XML::Simple::XMLin($filename, | ||
1984 : | GroupTags => { Relationships => 'Relationship', | ||
1985 : | Entities => 'Entity', | ||
1986 : | Fields => 'Field', | ||
1987 : | Indexes => 'Index', | ||
1988 : | IndexFields => 'IndexField'}, | ||
1989 : | KeyAttr => { Relationship => 'name', | ||
1990 : | Entity => 'name', | ||
1991 : | Field => 'name'}, | ||
1992 : | ForceArray => ['Field', 'Index', 'IndexField'], | ||
1993 : | ForceContent => 1, | ||
1994 : | NormalizeSpace => 2 | ||
1995 : | ); | ||
1996 : | Trace("XML metadata loaded from file $filename.") if T(1); | ||
1997 : | # Before we go any farther, we need to validate the field and object names. If an error is found, | ||
1998 : | # the method below will fail. | ||
1999 : | _ValidateFieldNames($metadata); | ||
2000 : | # Next we need to create a hash table for finding relations. The entities and relationships are | ||
2001 : | # implemented as one or more database relations. | ||
2002 : | my %masterRelationTable = (); | ||
2003 : | # Loop through the entities. | ||
2004 : | my $entityList = $metadata->{Entities}; | ||
2005 : | parrello | 1.6 | for my $entityName (keys %{$entityList}) { |
2006 : | my $entityStructure = $entityList->{$entityName}; | ||
2007 : | parrello | 1.1 | # |
2008 : | # The first step is to run creating all the entity's default values. For C<Field> elements, | ||
2009 : | # the relation name must be added where it is not specified. For relationships, | ||
2010 : | # the B<from-link> and B<to-link> fields must be inserted, and for entities an B<id> | ||
2011 : | # field must be added to each relation. Finally, each field will have a C<PrettySort> attribute | ||
2012 : | # added that can be used to pull the implicit fields to the top when displaying the field | ||
2013 : | # documentation. The PrettySort values are 1-based and indicate in which pass through a | ||
2014 : | # relation's data the field should be displayed-- 1 for the first pass, 2 for the second, | ||
2015 : | # and so on. | ||
2016 : | # | ||
2017 : | # Fix up this entity. | ||
2018 : | _FixupFields($entityStructure, $entityName, 2, 3); | ||
2019 : | # Add the ID field. | ||
2020 : | _AddField($entityStructure, 'id', { type => $entityStructure->{keyType}, | ||
2021 : | relation => $entityName, | ||
2022 : | Notes => { content => "Unique identifier for this \[b\]$entityName\[/b\]." }, | ||
2023 : | PrettySort => 1}); | ||
2024 : | # | ||
2025 : | # The current field list enables us to quickly find the relation containing a particular field. | ||
2026 : | # We also need a list that tells us the fields in each relation. We do this by creating a | ||
2027 : | # Relations structure in the entity structure and collating the fields into it based on their | ||
2028 : | # C<relation> property. There is one tricky bit, which is that every relation has to have the | ||
2029 : | # C<id> field in it. Note also that the field list is put into a C<Fields> member of the | ||
2030 : | # relation's structure so that it looks more like the entity and relationship structures. | ||
2031 : | # | ||
2032 : | # First we need to create the relations list. | ||
2033 : | my $relationTable = { }; | ||
2034 : | # Loop through the fields. We use a list of field names to prevent a problem with | ||
2035 : | # the hash table cursor losing its place during the loop. | ||
2036 : | my $fieldList = $entityStructure->{Fields}; | ||
2037 : | my @fieldNames = keys %{$fieldList}; | ||
2038 : | for my $fieldName (@fieldNames) { | ||
2039 : | my $fieldData = $fieldList->{$fieldName}; | ||
2040 : | # Get the current field's relation name. | ||
2041 : | my $relationName = $fieldData->{relation}; | ||
2042 : | # Insure the relation exists. | ||
2043 : | if (!exists $relationTable->{$relationName}) { | ||
2044 : | $relationTable->{$relationName} = { Fields => { } }; | ||
2045 : | } | ||
2046 : | # Add the field to the relation's field structure. | ||
2047 : | $relationTable->{$relationName}->{Fields}->{$fieldName} = $fieldData; | ||
2048 : | } | ||
2049 : | # Now that we've organized all our fields by relation name we need to do some serious | ||
2050 : | # housekeeping. We must add the C<id> field to every relation and convert each relation | ||
2051 : | # to a list of fields. First, we need the ID field itself. | ||
2052 : | my $idField = $fieldList->{id}; | ||
2053 : | # Loop through the relations. | ||
2054 : | parrello | 1.6 | for my $relationName (keys %{$relationTable}) { |
2055 : | my $relation = $relationTable->{$relationName}; | ||
2056 : | parrello | 1.1 | # Get the relation's field list. |
2057 : | my $relationFieldList = $relation->{Fields}; | ||
2058 : | # Add the ID field to it. If the field's already there, it will not make any | ||
2059 : | # difference. | ||
2060 : | $relationFieldList->{id} = $idField; | ||
2061 : | # Convert the field set from a hash into a list using the pretty-sort number. | ||
2062 : | $relation->{Fields} = _ReOrderRelationTable($relationFieldList); | ||
2063 : | # Add the relation to the master table. | ||
2064 : | $masterRelationTable{$relationName} = $relation; | ||
2065 : | } | ||
2066 : | # The indexes come next. The primary relation will have a unique-keyed index based on the ID field. | ||
2067 : | # The other relations must have at least one index that begins with the ID field. In addition, the | ||
2068 : | # metadata may require alternate indexes. We do those alternate indexes first. To begin, we need to | ||
2069 : | # get the entity's field list and index list. | ||
2070 : | my $indexList = $entityStructure->{Indexes}; | ||
2071 : | # Loop through the indexes. | ||
2072 : | for my $indexData (@{$indexList}) { | ||
2073 : | # We need to find this index's fields. All of them should belong to the same relation. | ||
2074 : | # The ID field is an exception, since it's in all relations. | ||
2075 : | my $relationName = '0'; | ||
2076 : | for my $fieldDescriptor (@{$indexData->{IndexFields}}) { | ||
2077 : | # Get this field's name. | ||
2078 : | my $fieldName = $fieldDescriptor->{name}; | ||
2079 : | # Only proceed if it is NOT the ID field. | ||
2080 : | if ($fieldName ne 'id') { | ||
2081 : | # Find the relation containing the current index field. | ||
2082 : | my $thisName = $fieldList->{$fieldName}->{relation}; | ||
2083 : | if ($relationName eq '0') { | ||
2084 : | # Here we're looking at the first field, so we save its relation name. | ||
2085 : | $relationName = $thisName; | ||
2086 : | } elsif ($relationName ne $thisName) { | ||
2087 : | # Here we have a field mismatch. | ||
2088 : | Confess("Mixed index: field $fieldName does not belong to relation $relationName."); | ||
2089 : | } | ||
2090 : | } | ||
2091 : | } | ||
2092 : | # Now $relationName is the name of the relation that contains this index. Add the index structure | ||
2093 : | # to the relation. | ||
2094 : | push @{$relationTable->{$relationName}->{Indexes}}, $indexData; | ||
2095 : | } | ||
2096 : | # Now each index has been put in a relation. We need to add the primary index for the primary | ||
2097 : | # relation. | ||
2098 : | push @{$relationTable->{$entityName}->{Indexes}}, | ||
2099 : | { IndexFields => [ {name => 'id', order => 'ascending'} ], Unique => 'true', | ||
2100 : | Notes => { content => "Primary index for $entityName." } | ||
2101 : | }; | ||
2102 : | # The next step is to insure that each relation has at least one index that begins with the ID field. | ||
2103 : | # After that, we convert each relation's index list to an index table. We first need to loop through | ||
2104 : | # the relations. | ||
2105 : | parrello | 1.6 | for my $relationName (keys %{$relationTable}) { |
2106 : | my $relation = $relationTable->{$relationName}; | ||
2107 : | parrello | 1.1 | # Get the relation's index list. |
2108 : | my $indexList = $relation->{Indexes}; | ||
2109 : | # Insure this relation has an ID index. | ||
2110 : | my $found = 0; | ||
2111 : | for my $index (@{$indexList}) { | ||
2112 : | if ($index->{IndexFields}->[0]->{name} eq "id") { | ||
2113 : | $found = 1; | ||
2114 : | } | ||
2115 : | } | ||
2116 : | if ($found == 0) { | ||
2117 : | push @{$indexList}, { IndexFields => [ {name => 'id', order => 'ascending'} ] }; | ||
2118 : | } | ||
2119 : | # Now we need to convert the relation's index list to an index table. We begin by creating | ||
2120 : | # an empty table in the relation structure. | ||
2121 : | $relation->{Indexes} = { }; | ||
2122 : | # Loop through the indexes. | ||
2123 : | my $count = 0; | ||
2124 : | for my $index (@{$indexList}) { | ||
2125 : | # Add this index to the index table. | ||
2126 : | _AddIndex("idx$relationName$count", $relation, $index); | ||
2127 : | # Increment the counter so that the next index has a different name. | ||
2128 : | $count++; | ||
2129 : | } | ||
2130 : | } | ||
2131 : | # Finally, we add the relation structure to the entity. | ||
2132 : | $entityStructure->{Relations} = $relationTable; | ||
2133 : | } | ||
2134 : | # Loop through the relationships. Relationships actually turn out to be much simpler than entities. | ||
2135 : | # For one thing, there is only a single constituent relation. | ||
2136 : | my $relationshipList = $metadata->{Relationships}; | ||
2137 : | parrello | 1.6 | for my $relationshipName (keys %{$relationshipList}) { |
2138 : | my $relationshipStructure = $relationshipList->{$relationshipName}; | ||
2139 : | parrello | 1.1 | # Fix up this relationship. |
2140 : | _FixupFields($relationshipStructure, $relationshipName, 2, 3); | ||
2141 : | # Format a description for the FROM field. | ||
2142 : | my $fromEntity = $relationshipStructure->{from}; | ||
2143 : | my $fromComment = "<b>id</b> of the source <b><a href=\"#$fromEntity\">$fromEntity</a></b>."; | ||
2144 : | # Get the FROM entity's key type. | ||
2145 : | my $fromType = $entityList->{$fromEntity}->{keyType}; | ||
2146 : | # Add the FROM field. | ||
2147 : | _AddField($relationshipStructure, 'from-link', { type => $fromType, | ||
2148 : | relation => $relationshipName, | ||
2149 : | Notes => { content => $fromComment }, | ||
2150 : | PrettySort => 1}); | ||
2151 : | # Format a description for the TO field. | ||
2152 : | my $toEntity = $relationshipStructure->{to}; | ||
2153 : | my $toComment = "<b>id</b> of the target <b><a href=\"#$toEntity\">$toEntity</a></b>."; | ||
2154 : | # Get the TO entity's key type. | ||
2155 : | my $toType = $entityList->{$toEntity}->{keyType}; | ||
2156 : | # Add the TO field. | ||
2157 : | _AddField($relationshipStructure, 'to-link', { type=> $toType, | ||
2158 : | relation => $relationshipName, | ||
2159 : | Notes => { content => $toComment }, | ||
2160 : | PrettySort => 1}); | ||
2161 : | # Create an index-free relation from the fields. | ||
2162 : | my $thisRelation = { Fields => _ReOrderRelationTable($relationshipStructure->{Fields}), | ||
2163 : | Indexes => { } }; | ||
2164 : | $relationshipStructure->{Relations} = { $relationshipName => $thisRelation }; | ||
2165 : | # Create the FROM and TO indexes. | ||
2166 : | _CreateRelationshipIndex("From", $relationshipName, $relationshipStructure); | ||
2167 : | _CreateRelationshipIndex("To", $relationshipName, $relationshipStructure); | ||
2168 : | # Add the relation to the master table. | ||
2169 : | $masterRelationTable{$relationshipName} = $thisRelation; | ||
2170 : | } | ||
2171 : | # Now store the master relation table in the metadata structure. | ||
2172 : | $metadata->{RelationTable} = \%masterRelationTable; | ||
2173 : | # Our final task is to create the join table. The join table is a hash that describes all | ||
2174 : | # the join clauses for traveling through the relationships. The join clause is an equality | ||
2175 : | # condition that can be put into a WHERE clause in order to join two objects. Two relationships | ||
2176 : | # can be joined if they share an entity in common; and an entity can be joined to a relationship | ||
2177 : | # if the entity is at either end of the relationship. | ||
2178 : | my %joinTable = (); | ||
2179 : | # Loop through the entities. | ||
2180 : | for my $entityName (keys %{$entityList}) { | ||
2181 : | # Build three lists of the relationships connected to this entity. One will be | ||
2182 : | # for relationships from the entity, one for relationships to the entity, and | ||
2183 : | # one for recursive relationships. | ||
2184 : | my @fromList = (); | ||
2185 : | my @toList = (); | ||
2186 : | my @bothList = (); | ||
2187 : | parrello | 1.6 | Trace("Join table build for $entityName.") if T(3); |
2188 : | for my $relationshipName (keys %{$relationshipList}) { | ||
2189 : | my $relationship = $relationshipList->{$relationshipName}; | ||
2190 : | parrello | 1.1 | # Determine if this relationship has our entity in one of its link fields. |
2191 : | parrello | 1.6 | my $fromEntity = $relationship->{from}; |
2192 : | my $toEntity = $relationship->{to}; | ||
2193 : | Trace("Join check for relationship $relationshipName from $fromEntity to $toEntity.") if T(3); | ||
2194 : | if ($fromEntity eq $entityName) { | ||
2195 : | if ($toEntity eq $entityName) { | ||
2196 : | parrello | 1.1 | # Here the relationship is recursive. |
2197 : | push @bothList, $relationshipName; | ||
2198 : | parrello | 1.6 | Trace("Relationship $relationshipName put in both-list.") if T(3); |
2199 : | parrello | 1.1 | } else { |
2200 : | # Here the relationship comes from the entity. | ||
2201 : | push @fromList, $relationshipName; | ||
2202 : | parrello | 1.6 | Trace("Relationship $relationshipName put in from-list.") if T(3); |
2203 : | parrello | 1.1 | } |
2204 : | parrello | 1.6 | } elsif ($toEntity eq $entityName) { |
2205 : | parrello | 1.1 | # Here the relationship goes to the entity. |
2206 : | push @toList, $relationshipName; | ||
2207 : | parrello | 1.6 | Trace("Relationship $relationshipName put in to-list.") if T(3); |
2208 : | parrello | 1.1 | } |
2209 : | } | ||
2210 : | # Create the nonrecursive joins. Note that we build two hashes for running | ||
2211 : | # through the nonrecursive relationships since we'll have an outer loop | ||
2212 : | # and an inner loop, and we can't do two "each" iterations on the same | ||
2213 : | # hash table at the same time. | ||
2214 : | my %directRelationships = ( from => \@fromList, to => \@toList ); | ||
2215 : | my %otherRelationships = ( from => \@fromList, to => \@toList ); | ||
2216 : | parrello | 1.6 | for my $linkType (keys %directRelationships) { |
2217 : | my $relationships = $directRelationships{$linkType}; | ||
2218 : | parrello | 1.1 | # Loop through all the relationships. |
2219 : | for my $relationshipName (@{$relationships}) { | ||
2220 : | # Create joins between the entity and this relationship. | ||
2221 : | my $linkField = "$relationshipName.${linkType}_link"; | ||
2222 : | my $joinClause = "$entityName.id = $linkField"; | ||
2223 : | parrello | 1.6 | Trace("Entity join clause is $joinClause for $entityName and $relationshipName.") if T(4); |
2224 : | parrello | 1.1 | $joinTable{"$entityName/$relationshipName"} = $joinClause; |
2225 : | $joinTable{"$relationshipName/$entityName"} = $joinClause; | ||
2226 : | # Create joins between this relationship and the other relationships. | ||
2227 : | parrello | 1.6 | for my $otherType (keys %otherRelationships) { |
2228 : | my $otherships = $otherRelationships{$otherType}; | ||
2229 : | parrello | 1.1 | for my $otherName (@{$otherships}) { |
2230 : | # Get the key for this join. | ||
2231 : | my $joinKey = "$otherName/$relationshipName"; | ||
2232 : | # Check for a duplicate or a self-join. | ||
2233 : | if (exists $joinTable{$joinKey}) { | ||
2234 : | # Here we have a duplicate, which means that the join | ||
2235 : | # path is ambiguous. We delete the join from the join | ||
2236 : | # table to prevent it from being used. | ||
2237 : | delete $joinTable{$joinKey}; | ||
2238 : | parrello | 1.6 | Trace("Deleting ambiguous join $joinKey.") if T(4); |
2239 : | parrello | 1.1 | } elsif ($otherName ne $relationshipName) { |
2240 : | # Here we have a valid join. Note that joins between a | ||
2241 : | # relationship and itself are prohibited. | ||
2242 : | parrello | 1.6 | my $relJoinClause = "$otherName.${otherType}_link = $linkField"; |
2243 : | $joinTable{$joinKey} = $relJoinClause; | ||
2244 : | Trace("Relationship join clause is $relJoinClause for $joinKey.") if T(4); | ||
2245 : | parrello | 1.1 | } |
2246 : | } | ||
2247 : | } | ||
2248 : | # Create joins between this relationship and the recursive relationships. | ||
2249 : | # We don't need to check for ambiguous joins here, because a recursive | ||
2250 : | # relationship can only be ambiguous with another recursive relationship, | ||
2251 : | # and the incoming relationship from the outer loop is never recursive. | ||
2252 : | for my $otherName (@bothList) { | ||
2253 : | parrello | 1.6 | Trace("Setting up relationship joins to recursive relationship $otherName with $relationshipName.") if T(3); |
2254 : | parrello | 1.1 | # Join from the left. |
2255 : | $joinTable{"$relationshipName/$otherName"} = | ||
2256 : | "$linkField = $otherName.from_link"; | ||
2257 : | # Join from the right. | ||
2258 : | $joinTable{"$otherName/$relationshipName"} = | ||
2259 : | "$otherName.to_link = $linkField"; | ||
2260 : | } | ||
2261 : | } | ||
2262 : | } | ||
2263 : | # Create entity joins for the recursive relationships. Unlike the non-recursive | ||
2264 : | # joins, the direction makes a difference with the recursive joins. This can give | ||
2265 : | # rise to situations where we can't create the path we want; however, it is always | ||
2266 : | # possible to get the same effect using multiple queries. | ||
2267 : | for my $relationshipName (@bothList) { | ||
2268 : | parrello | 1.6 | Trace("Setting up entity joins to recursive relationship $relationshipName with $entityName.") if T(3); |
2269 : | parrello | 1.1 | # Join to the entity from each direction. |
2270 : | $joinTable{"$entityName/$relationshipName"} = | ||
2271 : | "$entityName.id = $relationshipName.from_link"; | ||
2272 : | $joinTable{"$relationshipName/$entityName"} = | ||
2273 : | "$relationshipName.to_link = $entityName.id"; | ||
2274 : | } | ||
2275 : | } | ||
2276 : | # Add the join table to the structure. | ||
2277 : | $metadata->{Joins} = \%joinTable; | ||
2278 : | # Return the slurped and fixed-up structure. | ||
2279 : | return $metadata; | ||
2280 : | } | ||
2281 : | |||
2282 : | =head3 CreateRelationshipIndex | ||
2283 : | |||
2284 : | Create an index for a relationship's relation. | ||
2285 : | |||
2286 : | This is a static method. | ||
2287 : | |||
2288 : | =over 4 | ||
2289 : | |||
2290 : | =item indexKey | ||
2291 : | |||
2292 : | Type of index: either C<"From"> or C<"To">. | ||
2293 : | |||
2294 : | =item relationshipName | ||
2295 : | |||
2296 : | Name of the relationship. | ||
2297 : | |||
2298 : | =item relationshipStructure | ||
2299 : | |||
2300 : | Structure describing the relationship that the index will sort. | ||
2301 : | |||
2302 : | =back | ||
2303 : | |||
2304 : | =cut | ||
2305 : | |||
2306 : | sub _CreateRelationshipIndex { | ||
2307 : | # Get the parameters. | ||
2308 : | my ($indexKey, $relationshipName, $relationshipStructure) = @_; | ||
2309 : | # Get the target relation. | ||
2310 : | my $relationStructure = $relationshipStructure->{Relations}->{$relationshipName}; | ||
2311 : | # Create a descriptor for the link field that goes at the beginning of this index. | ||
2312 : | my $firstField = { name => lcfirst $indexKey . '-link', order => 'ascending' }; | ||
2313 : | # Get the target index descriptor. | ||
2314 : | my $newIndex = $relationshipStructure->{$indexKey . "Index"}; | ||
2315 : | # Add the first field to the index's field list. Due to the craziness of PERL, if the | ||
2316 : | # index descriptor does not exist, it will be created automatically so we can add | ||
2317 : | # the field to it. | ||
2318 : | unshift @{$newIndex->{IndexFields}}, $firstField; | ||
2319 : | # Add the index to the relation. | ||
2320 : | _AddIndex("idx$relationshipName$indexKey", $relationStructure, $newIndex); | ||
2321 : | } | ||
2322 : | |||
2323 : | =head3 AddIndex | ||
2324 : | |||
2325 : | Add an index to a relation structure. | ||
2326 : | |||
2327 : | This is a static method. | ||
2328 : | |||
2329 : | =over 4 | ||
2330 : | |||
2331 : | =item indexName | ||
2332 : | |||
2333 : | Name to give to the new index. | ||
2334 : | |||
2335 : | =item relationStructure | ||
2336 : | |||
2337 : | Relation structure to which the new index should be added. | ||
2338 : | |||
2339 : | =item newIndex | ||
2340 : | |||
2341 : | New index to add. | ||
2342 : | |||
2343 : | =back | ||
2344 : | |||
2345 : | =cut | ||
2346 : | |||
2347 : | sub _AddIndex { | ||
2348 : | # Get the parameters. | ||
2349 : | my ($indexName, $relationStructure, $newIndex) = @_; | ||
2350 : | # We want to re-do the index's field list. Instead of an object for each field, | ||
2351 : | # we want a string consisting of the field name optionally followed by the token DESC. | ||
2352 : | my @fieldList = ( ); | ||
2353 : | for my $field (@{$newIndex->{IndexFields}}) { | ||
2354 : | # Create a string containing the field name. | ||
2355 : | my $fieldString = $field->{name}; | ||
2356 : | # Add the ordering token if needed. | ||
2357 : | if ($field->{order} eq "descending") { | ||
2358 : | $fieldString .= " DESC"; | ||
2359 : | } | ||
2360 : | # Push the result onto the field list. | ||
2361 : | push @fieldList, $fieldString; | ||
2362 : | } | ||
2363 : | # Store the field list just created as the new index field list. | ||
2364 : | $newIndex->{IndexFields} = \@fieldList; | ||
2365 : | # Add the index to the relation's index list. | ||
2366 : | $relationStructure->{Indexes}->{$indexName} = $newIndex; | ||
2367 : | } | ||
2368 : | |||
2369 : | =head3 FixupFields | ||
2370 : | |||
2371 : | This method fixes the field list for an entity or relationship. It will add the caller-specified | ||
2372 : | relation name to fields that do not have a name and set the C<PrettySort> value as specified. | ||
2373 : | |||
2374 : | This is a static method. | ||
2375 : | |||
2376 : | =over 4 | ||
2377 : | |||
2378 : | =item structure | ||
2379 : | |||
2380 : | Entity or relationship structure to be fixed up. | ||
2381 : | |||
2382 : | =item defaultRelationName | ||
2383 : | |||
2384 : | Default relation name to be added to the fields. | ||
2385 : | |||
2386 : | =item prettySortValue | ||
2387 : | |||
2388 : | C<PrettySort> value for the relation's normal fields. | ||
2389 : | |||
2390 : | =item textPrettySortValue | ||
2391 : | |||
2392 : | C<PrettySort> value for the relation's text fields. This value can be set to one greater than the | ||
2393 : | normal pretty sort value so that text fields go at the end of each relation. | ||
2394 : | |||
2395 : | =back | ||
2396 : | |||
2397 : | =cut | ||
2398 : | |||
2399 : | sub _FixupFields { | ||
2400 : | # Get the parameters. | ||
2401 : | my ($structure, $defaultRelationName, $prettySortValue, $textPrettySortValue) = @_; | ||
2402 : | # Insure the structure has a field list. | ||
2403 : | if (!exists $structure->{Fields}) { | ||
2404 : | # Here it doesn't, so we create a new one. | ||
2405 : | $structure->{Fields} = { }; | ||
2406 : | } else { | ||
2407 : | # Here we have a field list. Loop through its fields. | ||
2408 : | parrello | 1.6 | my $fieldStructures = $structure->{Fields}; |
2409 : | for my $fieldName (keys %{$fieldStructures}) { | ||
2410 : | parrello | 1.8 | Trace("Processing field $fieldName of $defaultRelationName.") if T(4); |
2411 : | parrello | 1.6 | my $fieldData = $fieldStructures->{$fieldName}; |
2412 : | parrello | 1.1 | # Get the field type. |
2413 : | my $type = $fieldData->{type}; | ||
2414 : | # Plug in a relation name if it is needed. | ||
2415 : | Tracer::MergeOptions($fieldData, { relation => $defaultRelationName }); | ||
2416 : | # Plug in a data generator if we need one. | ||
2417 : | if (!exists $fieldData->{DataGen}) { | ||
2418 : | # The data generator will use the default for the field's type. | ||
2419 : | $fieldData->{DataGen} = { content => $TypeTable{$type}->{dataGen} }; | ||
2420 : | } | ||
2421 : | # Plug in the defaults for the optional data generation parameters. | ||
2422 : | Tracer::MergeOptions($fieldData->{DataGen}, { testCount => 1, pass => 0 }); | ||
2423 : | # Add the PrettySortValue. | ||
2424 : | $fieldData->{PrettySort} = (($type eq "text") ? $textPrettySortValue : $prettySortValue); | ||
2425 : | } | ||
2426 : | } | ||
2427 : | } | ||
2428 : | |||
2429 : | =head3 FixName | ||
2430 : | |||
2431 : | Fix the incoming field name so that it is a legal SQL column name. | ||
2432 : | |||
2433 : | This is a static method. | ||
2434 : | |||
2435 : | =over 4 | ||
2436 : | |||
2437 : | =item fieldName | ||
2438 : | |||
2439 : | Field name to fix. | ||
2440 : | |||
2441 : | =item RETURN | ||
2442 : | |||
2443 : | Returns the fixed-up field name. | ||
2444 : | |||
2445 : | =back | ||
2446 : | |||
2447 : | =cut | ||
2448 : | |||
2449 : | sub _FixName { | ||
2450 : | # Get the parameter. | ||
2451 : | my ($fieldName) = @_; | ||
2452 : | # Replace its minus signs with underscores. | ||
2453 : | $fieldName =~ s/-/_/g; | ||
2454 : | # Return the result. | ||
2455 : | return $fieldName; | ||
2456 : | } | ||
2457 : | |||
2458 : | =head3 FixNames | ||
2459 : | |||
2460 : | Fix all the field names in a list. | ||
2461 : | |||
2462 : | This is a static method. | ||
2463 : | |||
2464 : | =over 4 | ||
2465 : | |||
2466 : | =item field1, field2, field3, ... fieldn | ||
2467 : | |||
2468 : | List of field names to fix. | ||
2469 : | |||
2470 : | =item RETURN | ||
2471 : | |||
2472 : | Returns a list of fixed-up versions of the incoming field names. | ||
2473 : | |||
2474 : | =back | ||
2475 : | |||
2476 : | =cut | ||
2477 : | |||
2478 : | sub _FixNames { | ||
2479 : | # Create the result list. | ||
2480 : | my @result = ( ); | ||
2481 : | # Loop through the incoming parameters. | ||
2482 : | for my $field (@_) { | ||
2483 : | push @result, _FixName($field); | ||
2484 : | } | ||
2485 : | # Return the result. | ||
2486 : | return @result; | ||
2487 : | } | ||
2488 : | |||
2489 : | =head3 AddField | ||
2490 : | |||
2491 : | Add a field to a field list. | ||
2492 : | |||
2493 : | This is a static method. | ||
2494 : | |||
2495 : | =over 4 | ||
2496 : | |||
2497 : | =item structure | ||
2498 : | |||
2499 : | Structure (usually an entity or relationship) that is to contain the field. | ||
2500 : | |||
2501 : | =item fieldName | ||
2502 : | |||
2503 : | Name of the new field. | ||
2504 : | |||
2505 : | =item fieldData | ||
2506 : | |||
2507 : | Structure containing the data to put in the field. | ||
2508 : | |||
2509 : | =back | ||
2510 : | |||
2511 : | =cut | ||
2512 : | |||
2513 : | sub _AddField { | ||
2514 : | # Get the parameters. | ||
2515 : | my ($structure, $fieldName, $fieldData) = @_; | ||
2516 : | # Create the field structure by copying the incoming data. | ||
2517 : | my $fieldStructure = {%{$fieldData}}; | ||
2518 : | # Get a reference to the field list itself. | ||
2519 : | my $fieldList = $structure->{Fields}; | ||
2520 : | # Add the field to the field list. | ||
2521 : | $fieldList->{$fieldName} = $fieldStructure; | ||
2522 : | } | ||
2523 : | |||
2524 : | =head3 ReOrderRelationTable | ||
2525 : | |||
2526 : | This method will take a relation table and re-sort it according to the implicit ordering of the | ||
2527 : | C<PrettySort> property. Instead of a hash based on field names, it will return a list of fields. | ||
2528 : | This requires creating a new hash that contains the field name in the C<name> property but doesn't | ||
2529 : | have the C<PrettySort> property, and then inserting that new hash into the field list. | ||
2530 : | |||
2531 : | This is a static method. | ||
2532 : | |||
2533 : | =over 4 | ||
2534 : | |||
2535 : | =item relationTable | ||
2536 : | |||
2537 : | Relation hash to be reformatted into a list. | ||
2538 : | |||
2539 : | =item RETURN | ||
2540 : | |||
2541 : | A list of field hashes. | ||
2542 : | |||
2543 : | =back | ||
2544 : | |||
2545 : | =cut | ||
2546 : | |||
2547 : | sub _ReOrderRelationTable { | ||
2548 : | # Get the parameters. | ||
2549 : | my ($relationTable) = @_; | ||
2550 : | # Create the return list. | ||
2551 : | my @resultList; | ||
2552 : | # Rather than copy all the fields in a single pass, we make multiple passes and only copy | ||
2553 : | # fields whose PrettySort value matches the current pass number. This process continues | ||
2554 : | # until we process all the fields in the relation. | ||
2555 : | my $fieldsLeft = (values %{$relationTable}); | ||
2556 : | for (my $sortPass = 1; $fieldsLeft > 0; $sortPass++) { | ||
2557 : | # Loop through the fields. Note that we lexically sort the fields. This makes field name | ||
2558 : | # secondary to pretty-sort number in the final ordering. | ||
2559 : | for my $fieldName (sort keys %{$relationTable}) { | ||
2560 : | # Get this field's data. | ||
2561 : | my $fieldData = $relationTable->{$fieldName}; | ||
2562 : | # Verify the sort pass. | ||
2563 : | if ($fieldData->{PrettySort} == $sortPass) { | ||
2564 : | # Here we're in the correct pass. Denote we've found a field. | ||
2565 : | $fieldsLeft--; | ||
2566 : | # The next step is to create the field structure. This done by copying all | ||
2567 : | # of the field elements except PrettySort and adding the name. | ||
2568 : | my %thisField; | ||
2569 : | for my $property (keys %{$fieldData}) { | ||
2570 : | if ($property ne 'PrettySort') { | ||
2571 : | $thisField{$property} = $fieldData->{$property}; | ||
2572 : | } | ||
2573 : | } | ||
2574 : | $thisField{name} = $fieldName; | ||
2575 : | # Now we add this field to the end of the result list. | ||
2576 : | push @resultList, \%thisField; | ||
2577 : | } | ||
2578 : | } | ||
2579 : | } | ||
2580 : | # Return a reference to the result list. | ||
2581 : | return \@resultList; | ||
2582 : | |||
2583 : | } | ||
2584 : | |||
2585 : | =head3 IsPrimary | ||
2586 : | |||
2587 : | Return TRUE if a specified relation is a primary relation, else FALSE. A relation is primary | ||
2588 : | if it has the same name as an entity or relationship. | ||
2589 : | |||
2590 : | This is an instance method. | ||
2591 : | |||
2592 : | =over 4 | ||
2593 : | |||
2594 : | =item relationName | ||
2595 : | |||
2596 : | Name of the relevant relation. | ||
2597 : | |||
2598 : | =item RETURN | ||
2599 : | |||
2600 : | Returns TRUE for a primary relation, else FALSE. | ||
2601 : | |||
2602 : | =back | ||
2603 : | |||
2604 : | =cut | ||
2605 : | |||
2606 : | sub _IsPrimary { | ||
2607 : | # Get the parameters. | ||
2608 : | parrello | 1.4 | my ($self, $relationName) = @_; |
2609 : | parrello | 1.1 | # Check for the relation in the entity table. |
2610 : | my $entityTable = $self->{_metaData}->{Entities}; | ||
2611 : | my $retVal = exists $entityTable->{$relationName}; | ||
2612 : | if (! $retVal) { | ||
2613 : | # Check for it in the relationship table. | ||
2614 : | my $relationshipTable = $self->{_metaData}->{Relationships}; | ||
2615 : | $retVal = exists $relationshipTable->{$relationName}; | ||
2616 : | } | ||
2617 : | # Return the determination indicator. | ||
2618 : | return $retVal; | ||
2619 : | } | ||
2620 : | |||
2621 : | =head3 FindRelation | ||
2622 : | |||
2623 : | Return the descriptor for the specified relation. | ||
2624 : | |||
2625 : | This is an instance method. | ||
2626 : | |||
2627 : | =over 4 | ||
2628 : | |||
2629 : | =item relationName | ||
2630 : | |||
2631 : | Name of the relation whose descriptor is to be returned. | ||
2632 : | |||
2633 : | =item RETURN | ||
2634 : | |||
2635 : | Returns the object that describes the relation's indexes and fields. | ||
2636 : | |||
2637 : | =back | ||
2638 : | |||
2639 : | =cut | ||
2640 : | sub _FindRelation { | ||
2641 : | # Get the parameters. | ||
2642 : | parrello | 1.4 | my ($self, $relationName) = @_; |
2643 : | parrello | 1.1 | # Get the relation's structure from the master relation table in the metadata structure. |
2644 : | my $metaData = $self->{_metaData}; | ||
2645 : | my $retVal = $metaData->{RelationTable}->{$relationName}; | ||
2646 : | # Return it to the caller. | ||
2647 : | return $retVal; | ||
2648 : | } | ||
2649 : | |||
2650 : | =head2 HTML Documentation Utility Methods | ||
2651 : | |||
2652 : | =head3 ComputeRelationshipSentence | ||
2653 : | |||
2654 : | The relationship sentence consists of the relationship name between the names of the | ||
2655 : | two related entities and an arity indicator. | ||
2656 : | |||
2657 : | This is a static method. | ||
2658 : | |||
2659 : | =over 4 | ||
2660 : | |||
2661 : | =item relationshipName | ||
2662 : | |||
2663 : | Name of the relationship. | ||
2664 : | |||
2665 : | =item relationshipStructure | ||
2666 : | |||
2667 : | Relationship structure containing the relationship's description and properties. | ||
2668 : | |||
2669 : | =item RETURN | ||
2670 : | |||
2671 : | Returns a string containing the entity names on either side of the relationship name and an | ||
2672 : | indicator of the arity. | ||
2673 : | |||
2674 : | =back | ||
2675 : | |||
2676 : | =cut | ||
2677 : | |||
2678 : | sub _ComputeRelationshipSentence { | ||
2679 : | # Get the parameters. | ||
2680 : | my ($relationshipName, $relationshipStructure) = @_; | ||
2681 : | # Format the relationship sentence. | ||
2682 : | my $result = "$relationshipStructure->{from} <b>$relationshipName</b> $relationshipStructure->{to}"; | ||
2683 : | # Compute the arity. | ||
2684 : | my $arityCode = $relationshipStructure->{arity}; | ||
2685 : | my $arity = $ArityTable{$arityCode}; | ||
2686 : | $result .= " ($arity)"; | ||
2687 : | return $result; | ||
2688 : | } | ||
2689 : | |||
2690 : | =head3 ComputeRelationshipHeading | ||
2691 : | |||
2692 : | The relationship heading is the L<relationship sentence|/ComputeRelationshipSentence> with the entity | ||
2693 : | names hyperlinked to the appropriate entity sections of the document. | ||
2694 : | |||
2695 : | This is a static method. | ||
2696 : | |||
2697 : | =over 4 | ||
2698 : | |||
2699 : | =item relationshipName | ||
2700 : | |||
2701 : | Name of the relationship. | ||
2702 : | |||
2703 : | =item relationshipStructure | ||
2704 : | |||
2705 : | Relationship structure containing the relationship's description and properties. | ||
2706 : | |||
2707 : | =item RETURN | ||
2708 : | |||
2709 : | Returns a string containing the entity names on either side of the relationship name with the entity | ||
2710 : | names hyperlinked. | ||
2711 : | |||
2712 : | =back | ||
2713 : | |||
2714 : | =cut | ||
2715 : | |||
2716 : | sub _ComputeRelationshipHeading { | ||
2717 : | # Get the parameters. | ||
2718 : | my ($relationshipName, $relationshipStructure) = @_; | ||
2719 : | # Get the FROM and TO entity names. | ||
2720 : | my $fromEntity = $relationshipStructure->{from}; | ||
2721 : | my $toEntity = $relationshipStructure->{to}; | ||
2722 : | # Format a relationship sentence with hyperlinks in it. | ||
2723 : | my $result = "<a href=\"#$fromEntity\">$fromEntity</a> $relationshipName <a href=\"#$toEntity\">$toEntity</a>"; | ||
2724 : | return $result; | ||
2725 : | } | ||
2726 : | |||
2727 : | =head3 ShowRelationTable | ||
2728 : | |||
2729 : | Generate the HTML string for a particular relation. The relation's data will be formatted as an HTML | ||
2730 : | table with three columns-- the field name, the field type, and the field description. | ||
2731 : | |||
2732 : | This is a static method. | ||
2733 : | |||
2734 : | =over 4 | ||
2735 : | |||
2736 : | =item relationName | ||
2737 : | |||
2738 : | Name of the relation being formatted. | ||
2739 : | |||
2740 : | =item relationData | ||
2741 : | |||
2742 : | Hash containing the relation's fields and indexes. | ||
2743 : | |||
2744 : | =item RETURN | ||
2745 : | |||
2746 : | Returns an HTML string that can be used to display the relation name and all of its fields. | ||
2747 : | |||
2748 : | =back | ||
2749 : | |||
2750 : | =cut | ||
2751 : | |||
2752 : | sub _ShowRelationTable { | ||
2753 : | # Get the parameters. | ||
2754 : | my ($relationName, $relationData) = @_; | ||
2755 : | # Start the relation's field table. | ||
2756 : | my $htmlString = _OpenFieldTable($relationName); | ||
2757 : | # Loop through the fields. | ||
2758 : | for my $field (@{$relationData->{Fields}}) { | ||
2759 : | $htmlString .= _ShowField($field); | ||
2760 : | } | ||
2761 : | # Close this relation's field table. | ||
2762 : | $htmlString .= &_CloseTable; | ||
2763 : | # Now we show the relation's indexes. | ||
2764 : | $htmlString .= "<ul>\n"; | ||
2765 : | my $indexTable = $relationData->{Indexes}; | ||
2766 : | for my $indexName (sort keys %{$indexTable}) { | ||
2767 : | my $indexData = $indexTable->{$indexName}; | ||
2768 : | # Determine whether or not the index is unique. | ||
2769 : | my $fullName = $indexName; | ||
2770 : | parrello | 1.5 | if (exists $indexData->{Unique} && $indexData->{Unique} eq "true") { |
2771 : | parrello | 1.1 | $fullName .= " (unique)"; |
2772 : | } | ||
2773 : | # Start an HTML list item for this index. | ||
2774 : | $htmlString .= "<li><b>Index $fullName</b>\n<ul>\n"; | ||
2775 : | # Add any note text. | ||
2776 : | if (my $note = $indexData->{Notes}) { | ||
2777 : | $htmlString .= "<li>" . _HTMLNote($note->{content}) . "</li>\n"; | ||
2778 : | } | ||
2779 : | # Add the fiield list. | ||
2780 : | $htmlString .= "<li><i>" . join(', ', @{$indexData->{IndexFields}}) . "</i></li>\n"; | ||
2781 : | # Close this entry. | ||
2782 : | $htmlString .= "</ul></li>\n"; | ||
2783 : | } | ||
2784 : | # Close off the index list. | ||
2785 : | $htmlString .= "</ul>\n"; | ||
2786 : | } | ||
2787 : | |||
2788 : | =head3 OpenFieldTable | ||
2789 : | |||
2790 : | This method creates the header string for the field table generated by L</ShowMetaData>. | ||
2791 : | |||
2792 : | This is a static method. | ||
2793 : | |||
2794 : | =over 4 | ||
2795 : | |||
2796 : | =item tablename | ||
2797 : | |||
2798 : | Name of the table whose fields will be displayed. | ||
2799 : | |||
2800 : | =item RETURN | ||
2801 : | |||
2802 : | Returns a string containing the HTML for a field table's header. | ||
2803 : | |||
2804 : | =back | ||
2805 : | |||
2806 : | =cut | ||
2807 : | |||
2808 : | sub _OpenFieldTable { | ||
2809 : | my ($tablename) = @_; | ||
2810 : | return _OpenTable($tablename, 'Field', 'Type', 'Description'); | ||
2811 : | } | ||
2812 : | |||
2813 : | =head3 OpenTable | ||
2814 : | |||
2815 : | This method creates the header string for an HTML table. | ||
2816 : | |||
2817 : | This is a static method. | ||
2818 : | |||
2819 : | =over 4 | ||
2820 : | |||
2821 : | =item tablename | ||
2822 : | |||
2823 : | Title of the table. | ||
2824 : | |||
2825 : | =item colName1, colName2, ..., colNameN | ||
2826 : | |||
2827 : | List of column names. | ||
2828 : | |||
2829 : | =item RETURN | ||
2830 : | |||
2831 : | Returns a string containing the HTML for the desired table's header. | ||
2832 : | |||
2833 : | =back | ||
2834 : | |||
2835 : | =cut | ||
2836 : | |||
2837 : | sub _OpenTable { | ||
2838 : | # Get the parameters. | ||
2839 : | my ($tablename, @colNames) = @_; | ||
2840 : | # Compute the number of columns. | ||
2841 : | my $colCount = @colNames; | ||
2842 : | # Generate the title row. | ||
2843 : | my $htmlString = "<p><table border=\"2\"><tr><td colspan=\"$colCount\" align=\"center\">$tablename</td></tr>\n"; | ||
2844 : | # Loop through the columns, adding the column header rows. | ||
2845 : | $htmlString .= "<tr>"; | ||
2846 : | for my $colName (@colNames) { | ||
2847 : | $htmlString .= "<th>$colName</th>"; | ||
2848 : | } | ||
2849 : | $htmlString .= "</tr>\n"; | ||
2850 : | return $htmlString; | ||
2851 : | } | ||
2852 : | |||
2853 : | =head3 CloseTable | ||
2854 : | |||
2855 : | This method returns the HTML for closing a table. | ||
2856 : | |||
2857 : | This is a static method. | ||
2858 : | |||
2859 : | =cut | ||
2860 : | |||
2861 : | sub _CloseTable { | ||
2862 : | return "</table></p>\n"; | ||
2863 : | } | ||
2864 : | |||
2865 : | =head3 ShowField | ||
2866 : | |||
2867 : | This method returns the HTML for displaying a row of field information in a field table. | ||
2868 : | |||
2869 : | This is a static method. | ||
2870 : | |||
2871 : | =over 4 | ||
2872 : | |||
2873 : | =item fieldData | ||
2874 : | |||
2875 : | Table of data about the field. | ||
2876 : | |||
2877 : | =item RETURN | ||
2878 : | |||
2879 : | Returns an HTML string for a table row that shows the field's name, type, and description. | ||
2880 : | |||
2881 : | =back | ||
2882 : | |||
2883 : | =cut | ||
2884 : | |||
2885 : | sub _ShowField { | ||
2886 : | # Get the parameters. | ||
2887 : | my ($fieldData) = @_; | ||
2888 : | # Create the HTML string. | ||
2889 : | my $htmlString = "<tr><th align=\"left\">$fieldData->{name}</th><td>$fieldData->{type}</td>"; | ||
2890 : | # If we have content, add it as a third column. | ||
2891 : | if (exists $fieldData->{Notes}) { | ||
2892 : | $htmlString .= "<td>" . _HTMLNote($fieldData->{Notes}->{content}) . "</td>"; | ||
2893 : | } | ||
2894 : | # Close off the row. | ||
2895 : | $htmlString .= "</tr>\n"; | ||
2896 : | # Return the result. | ||
2897 : | return $htmlString; | ||
2898 : | } | ||
2899 : | |||
2900 : | =head3 HTMLNote | ||
2901 : | |||
2902 : | Convert a note or comment to HTML by replacing some bulletin-board codes with HTML. The codes | ||
2903 : | supported are C<[b]> for B<bold>, C<[i]> for I<italics>, and C<[p]> for a new paragraph. | ||
2904 : | Except for C<[p]>, all the codes are closed by slash-codes. So, for | ||
2905 : | example, C<[b]Feature[/b]> displays the string C<Feature> in boldface. | ||
2906 : | |||
2907 : | This is a static method. | ||
2908 : | |||
2909 : | =over 4 | ||
2910 : | |||
2911 : | =item dataString | ||
2912 : | |||
2913 : | String to convert to HTML. | ||
2914 : | |||
2915 : | =item RETURN | ||
2916 : | |||
2917 : | An HTML string derived from the input string. | ||
2918 : | |||
2919 : | =back | ||
2920 : | |||
2921 : | =cut | ||
2922 : | |||
2923 : | sub _HTMLNote { | ||
2924 : | # Get the parameter. | ||
2925 : | my ($dataString) = @_; | ||
2926 : | # Substitute the codes. | ||
2927 : | $dataString =~ s!\[(/?[bi])\]!<$1>!g; | ||
2928 : | $dataString =~ s!\[p\]!</p><p>!g; | ||
2929 : | # Return the result. | ||
2930 : | return $dataString; | ||
2931 : | } | ||
2932 : | |||
2933 : | =head2 Data Generation Utilities | ||
2934 : | |||
2935 : | =head3 IntGen | ||
2936 : | |||
2937 : | C<< my $integer = IntGen($min, $max); >> | ||
2938 : | |||
2939 : | Returns a random number between the specified minimum and maximum (inclusive). | ||
2940 : | |||
2941 : | =over 4 | ||
2942 : | |||
2943 : | =item min | ||
2944 : | |||
2945 : | Minimum permissible return value. | ||
2946 : | |||
2947 : | =item max | ||
2948 : | |||
2949 : | Maximum permissible return value. | ||
2950 : | |||
2951 : | =item RETURN | ||
2952 : | |||
2953 : | Returns a value no lower than the minimum and no greater than the maximum. | ||
2954 : | |||
2955 : | =back | ||
2956 : | |||
2957 : | =cut | ||
2958 : | |||
2959 : | sub IntGen { | ||
2960 : | # Get the parameters. | ||
2961 : | my ($min, $max) = @_; | ||
2962 : | # Determine the range of possible values. Note we put some space well above the | ||
2963 : | # maximum value to give it a fighting chance of apppearing in the list. | ||
2964 : | my $span = $max + 0.99 - $min; | ||
2965 : | # Create an integer in the range. | ||
2966 : | my $retVal = $min + int(rand($span)); | ||
2967 : | # Return the result. | ||
2968 : | return $retVal; | ||
2969 : | } | ||
2970 : | |||
2971 : | =head3 RandChar | ||
2972 : | |||
2973 : | C<< my $char = RandChar($sourceString); >> | ||
2974 : | |||
2975 : | Select a random character from a string. | ||
2976 : | |||
2977 : | =over 4 | ||
2978 : | |||
2979 : | =item sourceString | ||
2980 : | |||
2981 : | String from which the random character should be selected. | ||
2982 : | |||
2983 : | =item RETURN | ||
2984 : | |||
2985 : | Returns a single character from the incoming string. | ||
2986 : | |||
2987 : | =back | ||
2988 : | |||
2989 : | =cut | ||
2990 : | |||
2991 : | sub RandChar { | ||
2992 : | # Get the parameter. | ||
2993 : | my ($sourceString) = @_; | ||
2994 : | # Select a random character. | ||
2995 : | my $retVal = IntGen(0, (length $sourceString) - 1); | ||
2996 : | # Return it. | ||
2997 : | return substr($sourceString, $retVal, 1); | ||
2998 : | } | ||
2999 : | |||
3000 : | =head3 RandChars | ||
3001 : | |||
3002 : | C<< my $string = RandChars($sourceString, $length); >> | ||
3003 : | |||
3004 : | Create a string from characters taken from a source string. | ||
3005 : | |||
3006 : | =over 4 | ||
3007 : | |||
3008 : | =item sourceString | ||
3009 : | |||
3010 : | String from which the random characters should be selected. | ||
3011 : | |||
3012 : | =item length | ||
3013 : | |||
3014 : | Number of characters to put in the output string. | ||
3015 : | |||
3016 : | =item RETURN | ||
3017 : | |||
3018 : | Returns a string of the specified length consisting of characters taken from the | ||
3019 : | source string. | ||
3020 : | |||
3021 : | =back | ||
3022 : | |||
3023 : | =cut | ||
3024 : | |||
3025 : | sub RandChars { | ||
3026 : | # Get the parameters. | ||
3027 : | my ($sourceString, $length) = @_; | ||
3028 : | # Call RandChar repeatedly to generate the string. | ||
3029 : | my $retVal = ""; | ||
3030 : | for (my $i = 0; $i < $length; $i++) { | ||
3031 : | $retVal .= RandChar($sourceString); | ||
3032 : | } | ||
3033 : | # Return the result. | ||
3034 : | return $retVal; | ||
3035 : | } | ||
3036 : | |||
3037 : | =head3 RandParam | ||
3038 : | |||
3039 : | C<< my $value = RandParam($parm1, $parm2, ... $parmN); >> | ||
3040 : | |||
3041 : | Return a randomly-selected value from the parameter list. | ||
3042 : | |||
3043 : | =over 4 | ||
3044 : | |||
3045 : | =item parm1, parm2, ... parmN | ||
3046 : | |||
3047 : | List of values of which one will be selected. | ||
3048 : | |||
3049 : | =item RETURN | ||
3050 : | |||
3051 : | Returns a randomly-chosen value from the specified list. | ||
3052 : | |||
3053 : | =back | ||
3054 : | |||
3055 : | =cut | ||
3056 : | |||
3057 : | sub RandParam { | ||
3058 : | # Get the parameter. | ||
3059 : | my @parms = @_; | ||
3060 : | # Choose a random parameter from the list. | ||
3061 : | my $chosenIndex = IntGen(0, $#parms); | ||
3062 : | return $parms[$chosenIndex]; | ||
3063 : | } | ||
3064 : | |||
3065 : | =head3 StringGen | ||
3066 : | |||
3067 : | C<< my $string = StringGen($pattern1, $pattern2, ... $patternN); >> | ||
3068 : | |||
3069 : | Returns a random string derived from a randomly-chosen format pattern. The pattern | ||
3070 : | can either be a number (indicating the number of characters desired, or the letter | ||
3071 : | C<P> followed by a picture. The picture should contain C<A> when a letter is desired, | ||
3072 : | C<9> when a digit is desired, C<V> when a vowel is desired, C<K> when a consonant is | ||
3073 : | desired, and C<X> when a letter or a digit is desired. Any other character will be | ||
3074 : | translated as a literal. | ||
3075 : | |||
3076 : | =over 4 | ||
3077 : | |||
3078 : | =item pattern1, pattern2, ... patternN | ||
3079 : | |||
3080 : | List of patterns to be used to generate string values. | ||
3081 : | |||
3082 : | =item RETURN | ||
3083 : | |||
3084 : | A single string generated from a pattern. | ||
3085 : | |||
3086 : | =back | ||
3087 : | |||
3088 : | =cut | ||
3089 : | |||
3090 : | sub StringGen { | ||
3091 : | # Get the parameters. | ||
3092 : | my @patterns = @_; | ||
3093 : | # Choose the appropriate pattern. | ||
3094 : | my $chosenPattern = RandParam(@patterns); | ||
3095 : | # Declare the return variable. | ||
3096 : | my $retVal = ""; | ||
3097 : | # Determine whether this is a count or a picture pattern. | ||
3098 : | if ($chosenPattern =~ m/^\d+/) { | ||
3099 : | # Here we have a count. Get the string of source characters. | ||
3100 : | my $letterString = $PictureTable{'X'}; | ||
3101 : | my $stringLen = length $letterString; | ||
3102 : | # Save the number of characters we have to generate. | ||
3103 : | my $charsLeft = $chosenPattern; | ||
3104 : | # Loop until the return variable is full. | ||
3105 : | while ($charsLeft > 0) { | ||
3106 : | # Generate a random position in the soruce string. | ||
3107 : | my $stringIndex = IntGen(0, $stringLen - 1); | ||
3108 : | # Compute the number of characters to pull out of the source string. | ||
3109 : | my $chunkSize = $stringLen - $stringIndex; | ||
3110 : | if ($chunkSize > $charsLeft) { $chunkSize = $charsLeft; } | ||
3111 : | # Stuff this chunk into the return value. | ||
3112 : | $retVal .= substr($letterString, $stringIndex, $chunkSize); | ||
3113 : | # Record the data moved. | ||
3114 : | $charsLeft -= $chunkSize; | ||
3115 : | } | ||
3116 : | } elsif ($chosenPattern =~ m/^P/) { | ||
3117 : | # Here we have a picture string. We will move through the picture one | ||
3118 : | # character at a time generating data. | ||
3119 : | for (my $i = 1; $i < length $chosenPattern; $i++) { | ||
3120 : | # Get this picture character. | ||
3121 : | my $chr = substr($chosenPattern, $i, 1); | ||
3122 : | # Check to see if the picture char is one we recognize. | ||
3123 : | if (exists $PictureTable{$chr}) { | ||
3124 : | # Choose a random character from the available values for this | ||
3125 : | # picture character. | ||
3126 : | $retVal .= RandChar($PictureTable{$chr}); | ||
3127 : | } else { | ||
3128 : | # Copy in the picture character as a literal. | ||
3129 : | $retVal .= $chr; | ||
3130 : | } | ||
3131 : | } | ||
3132 : | } else { | ||
3133 : | # Here we have neither a picture string or a letter count, so we treat | ||
3134 : | # the string as a literal. | ||
3135 : | $retVal = $chosenPattern; | ||
3136 : | } | ||
3137 : | # Return the string formed. | ||
3138 : | return $retVal; | ||
3139 : | } | ||
3140 : | |||
3141 : | =head3 DateGen | ||
3142 : | |||
3143 : | C<< my $date = DateGen($startDayOffset, $endDayOffset, $minutes); >> | ||
3144 : | |||
3145 : | Return a numeric timestamp within the specified range of days with the specified minute | ||
3146 : | value. The range of days is specified relevant to the current day. Thus, the call | ||
3147 : | |||
3148 : | C<< my $date = DateGen(-1, 5, 720); >> | ||
3149 : | |||
3150 : | will return a timestamp at noon (72 minutes past midnight) sometime during the week that | ||
3151 : | began on the preceding day. If you want a random minute of the day, simply combine with | ||
3152 : | a call to L</IntGen>, as follows. | ||
3153 : | |||
3154 : | C<< my $date = DateGen(-1, 5, IntGen(0, 1439)); >> | ||
3155 : | |||
3156 : | =over 4 | ||
3157 : | |||
3158 : | =item startDayOffset | ||
3159 : | |||
3160 : | The earliest day that can be returned, relative to the current day. | ||
3161 : | |||
3162 : | =item endDayOffset | ||
3163 : | |||
3164 : | The latest day that can be returned, related to the current day. | ||
3165 : | |||
3166 : | =item minutes | ||
3167 : | |||
3168 : | Number of minutes into the selected day that should be used. | ||
3169 : | |||
3170 : | =back | ||
3171 : | |||
3172 : | =cut | ||
3173 : | |||
3174 : | sub DateGen { | ||
3175 : | # Get the parameters. | ||
3176 : | my ($startDayOffset, $endDayOffset, $minutes) = @_; | ||
3177 : | # Get midnight of the current day. | ||
3178 : | my $now = time(); | ||
3179 : | my ($sec, $min, $hour) = localtime($now); | ||
3180 : | my $today = $now - (($hour * 60 + $min) * 60 + $sec); | ||
3181 : | # Compute the day we want. | ||
3182 : | my $newDay = IntGen($startDayOffset, $endDayOffset) * 86400 + $today; | ||
3183 : | # Add the minutes. | ||
3184 : | my $retVal = $newDay + $minutes * 60; | ||
3185 : | # Return the result. | ||
3186 : | return $retVal; | ||
3187 : | } | ||
3188 : | |||
3189 : | =head3 FloatGen | ||
3190 : | |||
3191 : | C<< my $number = FloatGen($min, $max); >> | ||
3192 : | |||
3193 : | Return a random floating-point number greater than or equal to the specified minimum and | ||
3194 : | less than the specified maximum. | ||
3195 : | |||
3196 : | =over 4 | ||
3197 : | |||
3198 : | =item min | ||
3199 : | |||
3200 : | Minimum permissible value for the number returned. | ||
3201 : | |||
3202 : | =item max | ||
3203 : | |||
3204 : | Maximum permissible value for the number returned. | ||
3205 : | |||
3206 : | =item RETURN | ||
3207 : | |||
3208 : | Returns a floating-point number anywhere in the specified range. | ||
3209 : | |||
3210 : | =back | ||
3211 : | |||
3212 : | =cut | ||
3213 : | |||
3214 : | sub FloatGen { | ||
3215 : | # Get the parameters. | ||
3216 : | my ($min, $max) = @_; | ||
3217 : | # Generate the result. | ||
3218 : | my $retVal = rand($max - $min) + $min; | ||
3219 : | return $retVal; | ||
3220 : | } | ||
3221 : | |||
3222 : | =head3 ListGen | ||
3223 : | |||
3224 : | C<< my @list = ListGen($pattern, $count); >> | ||
3225 : | |||
3226 : | Return a list containing a fixed number of randomly-generated strings. | ||
3227 : | |||
3228 : | =over 4 | ||
3229 : | |||
3230 : | =item pattern | ||
3231 : | |||
3232 : | A pattern (in the form expected by L</StringGen>) that should be used to generate the | ||
3233 : | strings in the list. | ||
3234 : | |||
3235 : | =item count | ||
3236 : | |||
3237 : | The number of list entries to generate. | ||
3238 : | |||
3239 : | =item RETURN | ||
3240 : | |||
3241 : | Returns a list consisting of the specified number of strings. | ||
3242 : | |||
3243 : | =back | ||
3244 : | |||
3245 : | =cut | ||
3246 : | |||
3247 : | sub ListGen { | ||
3248 : | # Get the parameters. | ||
3249 : | my ($pattern, $count) = @_; | ||
3250 : | # Generate the list. | ||
3251 : | my @retVal = (); | ||
3252 : | for (my $i = 0; $i < $count; $i++) { | ||
3253 : | push @retVal, StringGen($pattern); | ||
3254 : | } | ||
3255 : | # Return it. | ||
3256 : | return @retVal; | ||
3257 : | } | ||
3258 : | |||
3259 : | 1; |
MCS Webmaster | ViewVC Help |
Powered by ViewVC 1.0.3 |