Parent Directory
|
Revision Log
|
Patch
revision 1.16, Tue Jun 28 23:51:06 2005 UTC | revision 1.104, Mon Sep 22 20:33:30 2008 UTC | |
---|---|---|
# | Line 6 | Line 6 |
6 | use Data::Dumper; | use Data::Dumper; |
7 | use XML::Simple; | use XML::Simple; |
8 | use DBQuery; | use DBQuery; |
9 | use DBObject; | use ERDBObject; |
10 | use Stats; | use Stats; |
11 | use Time::HiRes qw(gettimeofday); | use Time::HiRes qw(gettimeofday); |
12 | use Digest::MD5 qw(md5_base64); | |
13 | use CGI; | |
14 | use WikiTools; | |
15 | ||
16 | =head1 Entity-Relationship Database Package | =head1 Entity-Relationship Database Package |
17 | ||
# | Line 57 | Line 60 |
60 | B<start-position>, which indicates where in the contig that the sequence begins. This attribute | B<start-position>, which indicates where in the contig that the sequence begins. This attribute |
61 | is implemented as the C<start_position> field in the C<IsMadeUpOf> relation. | is implemented as the C<start_position> field in the C<IsMadeUpOf> relation. |
62 | ||
63 | The database itself is described by an XML file using the F<ERDatabase.xsd> schema. In addition to | The database itself is described by an XML file. In addition to all the data required to define |
64 | all the data required to define the entities, relationships, and attributes, the schema provides | the entities, relationships, and attributes, the schema provides space for notes describing |
65 | space for notes describing the data and what it means. These notes are used by L</ShowMetaData> | the data and what it means. These notes are used by L</ShowMetaData> to generate documentation |
66 | to generate documentation for the database. | for the database. |
67 | ||
68 | Special support is provided for text searching. An entity field can be marked as <em>searchable</em>, | |
69 | in which case it will be used to generate a text search index in which the user searches for words | |
70 | in the field instead of a particular field value. | |
71 | ||
72 | Finally, every entity and relationship object has a flag indicating if it is new or old. The object | Finally, every entity and relationship object has a flag indicating if it is new or old. The object |
73 | is considered I<old> if it was loaded by the L</LoadTables> method. It is considered I<new> if it | is considered I<old> if it was loaded by the L</LoadTables> method. It is considered I<new> if it |
74 | was inserted by the L</InsertObject> method. | was inserted by the L</InsertObject> method. |
75 | ||
To facilitate testing, the ERDB module supports automatic generation of test data. This process | ||
is described in the L</GenerateEntity> and L</GenerateConnection> methods, though it is not yet | ||
fully implemented. | ||
76 | =head2 XML Database Description | =head2 XML Database Description |
77 | ||
78 | =head3 Data Types | =head3 Data Types |
# | Line 89 | Line 92 |
92 | ||
93 | 32-bit signed integer | 32-bit signed integer |
94 | ||
95 | =item counter | |
96 | ||
97 | 32-bit unsigned integer | |
98 | ||
99 | =item date | =item date |
100 | ||
101 | 64-bit unsigned integer, representing a PERL date/time value | 64-bit unsigned integer, representing a PERL date/time value |
102 | ||
103 | =item text | =item text |
104 | ||
105 | long string; Text fields cannot be used in indexes or sorting and do not support the | long string; Text fields do not support the normal syntax of filter clauses, |
106 | normal syntax of filter clauses, but can be up to a billion character in length | but can be up to a billion character in length |
107 | ||
108 | =item dna | |
109 | ||
110 | long string, used to store DNA and protein sequences | |
111 | ||
112 | =item image | |
113 | ||
114 | long string, used to store encoded image data | |
115 | ||
116 | =item float | =item float |
117 | ||
# | Line 108 | Line 123 |
123 | compatability with certain database packages), but the only values supported are | compatability with certain database packages), but the only values supported are |
124 | 0 and 1. | 0 and 1. |
125 | ||
126 | =item id-string | |
127 | ||
128 | variable-length string, maximum 25 characters | |
129 | ||
130 | =item key-string | =item key-string |
131 | ||
132 | variable-length string, maximum 40 characters | variable-length string, maximum 40 characters |
# | Line 124 | Line 143 |
143 | ||
144 | variable-length string, maximum 255 characters | variable-length string, maximum 255 characters |
145 | ||
146 | =item hash-string | |
147 | ||
148 | variable-length string, maximum 22 characters | |
149 | ||
150 | =back | =back |
151 | ||
152 | The hash-string data type has a special meaning. The actual key passed into the loader will | |
153 | be a string, but it will be digested into a 22-character MD5 code to save space. Although the | |
154 | MD5 algorithm is not perfect, it is extremely unlikely two strings will have the same | |
155 | digest. Therefore, it is presumed the keys will be unique. When the database is actually | |
156 | in use, the hashed keys will be presented rather than the original values. For this reason, | |
157 | they should not be used for entities where the key is meaningful. | |
158 | ||
159 | =head3 Global Tags | =head3 Global Tags |
160 | ||
161 | The entire database definition must be inside a B<Database> tag. The display name of | The entire database definition must be inside a B<Database> tag. The display name of |
# | Line 169 | Line 199 |
199 | ||
200 | Name of the field. The field name should contain only letters, digits, and hyphens (C<->), | Name of the field. The field name should contain only letters, digits, and hyphens (C<->), |
201 | and the first character should be a letter. Most underlying databases are case-insensitive | and the first character should be a letter. Most underlying databases are case-insensitive |
202 | with the respect to field names, so a best practice is to use lower-case letters only. | with the respect to field names, so a best practice is to use lower-case letters only. Finally, |
203 | the name C<search-relevance> has special meaning for full-text searches and should not be | |
204 | used as a field name. | |
205 | ||
206 | =item type | =item type |
207 | ||
# | Line 188 | Line 220 |
220 | entity, the fields without a relation attribute are said to belong to the | entity, the fields without a relation attribute are said to belong to the |
221 | I<primary relation>. This relation has the same name as the entity itself. | I<primary relation>. This relation has the same name as the entity itself. |
222 | ||
223 | =item searchable | |
224 | ||
225 | If specified, then the field is a candidate for full-text searching. A single full-text | |
226 | index will be created for each relation with at least one searchable field in it. | |
227 | For best results, this option should only be used for string or text fields. | |
228 | ||
229 | =item special | |
230 | ||
231 | This attribute allows the subclass to assign special meaning for certain fields. | |
232 | The interpretation is up to the subclass itself. Currently, only entity fields | |
233 | can have this attribute. | |
234 | ||
235 | =back | =back |
236 | ||
237 | =head3 Indexes | =head3 Indexes |
238 | ||
239 | An entity can have multiple alternate indexes associated with it. The fields must | An entity can have multiple alternate indexes associated with it. The fields in an |
240 | be from the primary relation. The alternate indexes assist in ordering results | index must all be from the same relation. The alternate indexes assist in searching |
241 | from a query. A relationship can have up to two indexes-- a I<to-index> and a | on fields other than the entity ID. A relationship has at least two indexes-- a I<to-index> and a |
242 | I<from-index>. These order the results when crossing the relationship. For | I<from-index> that order the results when crossing the relationship. For |
243 | example, in the relationship C<HasContig> from C<Genome> to C<Contig>, the | example, in the relationship C<HasContig> from C<Genome> to C<Contig>, the |
244 | from-index would order the contigs of a ganome, and the to-index would order | from-index would order the contigs of a ganome, and the to-index would order |
245 | the genomes of a contig. A relationship's index must specify only fields in | the genomes of a contig. In addition, it can have zero or more alternate |
246 | indexes. A relationship's index must specify only fields in | |
247 | the relationship. | the relationship. |
248 | ||
249 | The indexes for an entity must be listed inside the B<Indexes> tag. The from-index | The alternate indexes for an entity or relationship must be listed inside the B<Indexes> tag. |
250 | of a relationship is specified using the B<FromIndex> tag; the to-index is specified | The from-index of a relationship is specified using the B<FromIndex> tag; the to-index is |
251 | using the B<ToIndex> tag. | specified using the B<ToIndex> tag. |
252 | ||
253 | Each index can contain a B<Notes> tag. In addition, it will have an B<IndexFields> | Each index can contain a B<Notes> tag. In addition, it will have an B<IndexFields> |
254 | tag containing the B<IndexField> tags. These specify, in order, the fields used in | tag containing the B<IndexField> tags. These specify, in order, the fields used in |
# | Line 221 | Line 266 |
266 | ||
267 | =back | =back |
268 | ||
269 | The B<Index>, B<FromIndex>, and B<ToIndex> tags themselves have no attributes. | The B<FromIndex>, and B<ToIndex> tags have no attributes. The B<Index> tag can |
270 | have a B<Unique> attribute. If specified, the index will be generated as a unique | |
271 | index. | |
272 | ||
273 | =head3 Object and Field Names | =head3 Object and Field Names |
274 | ||
# | Line 265 | Line 312 |
312 | ||
313 | A relationship is described by the C<Relationship> tag. Within a relationship, | A relationship is described by the C<Relationship> tag. Within a relationship, |
314 | there can be a C<Notes> tag, a C<Fields> tag containing the intersection data | there can be a C<Notes> tag, a C<Fields> tag containing the intersection data |
315 | fields, a C<FromIndex> tag containing the from-index, and a C<ToIndex> tag containing | fields, a C<FromIndex> tag containing the from-index, a C<ToIndex> tag containing |
316 | the to-index. | the to-index, and an C<Indexes> tag containing the alternate indexes. |
317 | ||
318 | The C<Relationship> tag has the following attributes. | The C<Relationship> tag has the following attributes. |
319 | ||
# | Line 299 | Line 346 |
346 | ||
347 | # Table of information about our datatypes. "sqlType" is the corresponding SQL datatype string. | # Table of information about our datatypes. "sqlType" is the corresponding SQL datatype string. |
348 | # "maxLen" is the maximum permissible length of the incoming string data used to populate a field | # "maxLen" is the maximum permissible length of the incoming string data used to populate a field |
349 | # of the specified type. "dataGen" is PERL string that will be evaluated if no test data generation | # of the specified type. "avgLen" is the average byte length for estimating |
350 | #string is specified in the field definition. | # record sizes. "sort" is the key modifier for the sort command, "notes" is a type description, |
351 | my %TypeTable = ( char => { sqlType => 'CHAR(1)', maxLen => 1, dataGen => "StringGen('A')" }, | # and "indexMod", if non-zero, is the number of characters to use when the field is specified in an |
352 | int => { sqlType => 'INTEGER', maxLen => 20, dataGen => "IntGen(0, 99999999)" }, | # index |
353 | string => { sqlType => 'VARCHAR(255)', maxLen => 255, dataGen => "StringGen(IntGen(10,250))" }, | my %TypeTable = ( char => { sqlType => 'CHAR(1)', maxLen => 1, avgLen => 1, sort => "", |
354 | text => { sqlType => 'TEXT', maxLen => 1000000000, dataGen => "StringGen(IntGen(80,1000))" }, | indexMod => 0, notes => "single ASCII character"}, |
355 | date => { sqlType => 'BIGINT', maxLen => 80, dataGen => "DateGen(-7, 7, IntGen(0,1400))" }, | int => { sqlType => 'INTEGER', maxLen => 20, avgLen => 4, sort => "n", |
356 | float => { sqlType => 'DOUBLE PRECISION', maxLen => 40, dataGen => "FloatGen(0.0, 100.0)" }, | indexMod => 0, notes => "signed 32-bit integer"}, |
357 | boolean => { sqlType => 'SMALLINT', maxLen => 1, dataGen => "IntGen(0, 1)" }, | counter => { sqlType => 'INTEGER UNSIGNED', maxLen => 20, avgLen => 4, sort => "n", |
358 | indexMod => 0, notes => "unsigned 32-bit integer"}, | |
359 | image => { sqlType => 'TEXT', maxLen => 1000000000, avgLen => 100000, sort => "", | |
360 | indexMod => 255, notes => "UUencoded image, suitable for import into GD, should never be indexed"}, | |
361 | dna => { sqlType => 'TEXT', maxLen => 1000000000, avgLen => 100000, sort => "", | |
362 | indexMod => 255, notes => "DNA or protein sequence, should never be indexed"}, | |
363 | string => { sqlType => 'VARCHAR(255)', maxLen => 255, avgLen => 100, sort => "", | |
364 | indexMod => 0, notes => "character string, 0 to 255 characters"}, | |
365 | text => { sqlType => 'TEXT', maxLen => 1000000000, avgLen => 500, sort => "", | |
366 | indexMod => 255, notes => "character string, nearly unlimited length, only first 255 characters are indexed"}, | |
367 | date => { sqlType => 'BIGINT', maxLen => 80, avgLen => 8, sort => "n", | |
368 | indexMod => 0, notes => "signed, 64-bit integer"}, | |
369 | float => { sqlType => 'DOUBLE PRECISION', maxLen => 40, avgLen => 8, sort => "g", | |
370 | indexMod => 0, notes => "64-bit double precision floating-point number"}, | |
371 | boolean => { sqlType => 'SMALLINT', maxLen => 1, avgLen => 1, sort => "n", | |
372 | indexMod => 0, notes => "boolean value: 0 if false, 1 if true"}, | |
373 | 'hash-string' => | |
374 | { sqlType => 'VARCHAR(22)', maxLen => 22, avgLen => 22, sort => "", | |
375 | indexMod => 0, notes => "string stored in digested form, used for certain types of key fields"}, | |
376 | 'id-string' => | |
377 | { sqlType => 'VARCHAR(25)', maxLen => 25, avgLen => 25, sort => "", | |
378 | indexMod => 0, notes => "character string, 0 to 25 characters"}, | |
379 | 'key-string' => | 'key-string' => |
380 | { sqlType => 'VARCHAR(40)', maxLen => 40, dataGen => "StringGen(IntGen(10,40))" }, | { sqlType => 'VARCHAR(40)', maxLen => 40, avgLen => 10, sort => "", |
381 | indexMod => 0, notes => "character string, 0 to 40 characters"}, | |
382 | 'name-string' => | 'name-string' => |
383 | { sqlType => 'VARCHAR(80)', maxLen => 80, dataGen => "StringGen(IntGen(10,80))" }, | { sqlType => 'VARCHAR(80)', maxLen => 80, avgLen => 40, sort => "", |
384 | indexMod => 0, notes => "character string, 0 to 80 characters"}, | |
385 | 'medium-string' => | 'medium-string' => |
386 | { sqlType => 'VARCHAR(160)', maxLen => 160, dataGen => "StringGen(IntGen(10,160))" }, | { sqlType => 'VARCHAR(160)', maxLen => 160, avgLen => 40, sort => "", |
387 | indexMod => 0, notes => "character string, 0 to 160 characters"}, | |
388 | 'long-string' => | |
389 | { sqlType => 'VARCHAR(500)', maxLen => 500, avglen => 255, sort => "", | |
390 | indexMod => 0, notes => "character string, 0 to 500 characters"}, | |
391 | ); | ); |
392 | ||
393 | # Table translating arities into natural language. | # Table translating arities into natural language. |
# | Line 322 | Line 396 |
396 | 'MM' => 'many-to-many' | 'MM' => 'many-to-many' |
397 | ); | ); |
398 | ||
399 | # Table for interpreting string patterns. | # Options for XML input and output. |
400 | ||
401 | my %PictureTable = ( 'A' => "abcdefghijklmnopqrstuvwxyz", | my %XmlOptions = (GroupTags => { Relationships => 'Relationship', |
402 | '9' => "0123456789", | Entities => 'Entity', |
403 | 'X' => "abcdefghijklmnopqrstuvwxyz0123456789", | Fields => 'Field', |
404 | 'V' => "aeiou", | Indexes => 'Index', |
405 | 'K' => "bcdfghjklmnoprstvwxyz" | IndexFields => 'IndexField', |
406 | Issues => 'Issue', | |
407 | Shapes => 'Shape' | |
408 | }, | |
409 | KeyAttr => { Relationship => 'name', | |
410 | Entity => 'name', | |
411 | Field => 'name', | |
412 | Shape => 'name' | |
413 | }, | |
414 | SuppressEmpty => 1, | |
415 | ); | |
416 | ||
417 | my %XmlInOpts = ( | |
418 | ForceArray => [qw(Field Index IndexField Relationship Entity Shape)], | |
419 | ForceContent => 1, | |
420 | NormalizeSpace => 2, | |
421 | ); | |
422 | my %XmlOutOpts = ( | |
423 | RootName => 'Database', | |
424 | XMLDecl => 1, | |
425 | ); | ); |
426 | ||
427 | =head2 Public Methods | =head2 Public Methods |
428 | ||
429 | =head3 new | =head3 new |
430 | ||
431 | C<< my $database = ERDB->new($dbh, $metaFileName); >> | my $database = ERDB->new($dbh, $metaFileName); |
432 | ||
433 | Create a new ERDB object. | Create a new ERDB object. |
434 | ||
# | Line 355 | Line 448 |
448 | ||
449 | sub new { | sub new { |
450 | # Get the parameters. | # Get the parameters. |
451 | my ($class, $dbh, $metaFileName, $options) = @_; | my ($class, $dbh, $metaFileName, %options) = @_; |
452 | # Load the meta-data. | # Load the meta-data. |
453 | my $metaData = _LoadMetaData($metaFileName); | my $metaData = _LoadMetaData($metaFileName); |
454 | # Create the object. | # Create the object. |
# | Line 369 | Line 462 |
462 | ||
463 | =head3 ShowMetaData | =head3 ShowMetaData |
464 | ||
465 | C<< $database->ShowMetaData($fileName); >> | $erdb->ShowMetaData($fileName); |
466 | ||
467 | This method outputs a description of the database. This description can be used to help users create | This method outputs a description of the database. This description can be used to help users create |
468 | the data to be loaded into the relations. | the data to be loaded into the relations. |
# | Line 400 | Line 493 |
493 | # Write the HTML heading stuff. | # Write the HTML heading stuff. |
494 | print HTMLOUT "<html>\n<head>\n<title>$title</title>\n"; | print HTMLOUT "<html>\n<head>\n<title>$title</title>\n"; |
495 | print HTMLOUT "</head>\n<body>\n"; | print HTMLOUT "</head>\n<body>\n"; |
496 | # Write the documentation. | |
497 | print HTMLOUT $self->DisplayMetaData(); | |
498 | # Close the document. | |
499 | print HTMLOUT "</body>\n</html>\n"; | |
500 | # Close the file. | |
501 | close HTMLOUT; | |
502 | } | |
503 | ||
504 | =head3 DisplayMetaData | |
505 | ||
506 | my $html = $erdb->DisplayMetaData(); | |
507 | ||
508 | Return an HTML description of the database. This description can be used to help users create | |
509 | the data to be loaded into the relations and form queries. The output is raw includable HTML | |
510 | without any HEAD or BODY tags. | |
511 | ||
512 | =over 4 | |
513 | ||
514 | =item filename | |
515 | ||
516 | The name of the output file. | |
517 | ||
518 | =back | |
519 | ||
520 | =cut | |
521 | ||
522 | sub DisplayMetaData { | |
523 | # Get the parameters. | |
524 | my ($self) = @_; | |
525 | # Get the metadata and the title string. | |
526 | my $metadata = $self->{_metaData}; | |
527 | # Get the title string. | |
528 | my $title = $metadata->{Title}; | |
529 | # Get the entity and relationship lists. | |
530 | my $entityList = $metadata->{Entities}; | |
531 | my $relationshipList = $metadata->{Relationships}; | |
532 | # Declare the return variable. | |
533 | my $retVal = ""; | |
534 | # Open the output file. | |
535 | Trace("Building MetaData table of contents.") if T(4); | |
536 | # Here we do the table of contents. It starts as an unordered list of section names. Each | # Here we do the table of contents. It starts as an unordered list of section names. Each |
537 | # section contains an ordered list of entity or relationship subsections. | # section contains an ordered list of entity or relationship subsections. |
538 | print HTMLOUT "<ul>\n<li><a href=\"#EntitiesSection\">Entities</a>\n<ol>\n"; | $retVal .= "<ul>\n<li><a href=\"#EntitiesSection\">Entities</a>\n<ol>\n"; |
539 | # Loop through the Entities, displaying a list item for each. | # Loop through the Entities, displaying a list item for each. |
540 | foreach my $key (sort keys %{$entityList}) { | foreach my $key (sort keys %{$entityList}) { |
541 | # Display this item. | # Display this item. |
542 | print HTMLOUT "<li><a href=\"#$key\">$key</a></li>\n"; | $retVal .= "<li><a href=\"#$key\">$key</a></li>\n"; |
543 | } | } |
544 | # Close off the entity section and start the relationship section. | # Close off the entity section and start the relationship section. |
545 | print HTMLOUT "</ol></li>\n<li><a href=\"#RelationshipsSection\">Relationships</a>\n<ol>\n"; | $retVal .= "</ol></li>\n<li><a href=\"#RelationshipsSection\">Relationships</a>\n<ol>\n"; |
546 | # Loop through the Relationships. | # Loop through the Relationships. |
547 | foreach my $key (sort keys %{$relationshipList}) { | foreach my $key (sort keys %{$relationshipList}) { |
548 | # Display this item. | # Display this item. |
549 | my $relationshipTitle = _ComputeRelationshipSentence($key, $relationshipList->{$key}); | my $relationshipTitle = _ComputeRelationshipSentence($key, $relationshipList->{$key}); |
550 | print HTMLOUT "<li><a href=\"#$key\">$relationshipTitle</a></li>\n"; | $retVal .= "<li><a href=\"#$key\">$relationshipTitle</a></li>\n"; |
551 | } | } |
552 | # Close off the relationship section and list the join table section. | # Close off the relationship section and list the join table section. |
553 | print HTMLOUT "</ol></li>\n<li><a href=\"#JoinTable\">Join Table</a></li>\n"; | $retVal .= "</ol></li>\n<li><a href=\"#JoinTable\">Join Table</a></li>\n"; |
554 | # Close off the table of contents itself. | # Close off the table of contents itself. |
555 | print HTMLOUT "</ul>\n"; | $retVal .= "</ul>\n"; |
556 | # Now we start with the actual data. Denote we're starting the entity section. | # Now we start with the actual data. Denote we're starting the entity section. |
557 | print HTMLOUT "<a name=\"EntitiesSection\"></a><h2>Entities</h2>\n"; | $retVal .= "<a name=\"EntitiesSection\"></a><h2>Entities</h2>\n"; |
558 | # Loop through the entities. | # Loop through the entities. |
559 | for my $key (sort keys %{$entityList}) { | for my $key (sort keys %{$entityList}) { |
560 | Trace("Building MetaData entry for $key entity.") if T(4); | Trace("Building MetaData entry for $key entity.") if T(4); |
561 | # Create the entity header. It contains a bookmark and the entity name. | # Create the entity header. It contains a bookmark and the entity name. |
562 | print HTMLOUT "<a name=\"$key\"></a><h3>$key</h3>\n"; | $retVal .= "<a name=\"$key\"></a><h3>$key</h3>\n"; |
563 | # Get the entity data. | # Get the entity data. |
564 | my $entityData = $entityList->{$key}; | my $entityData = $entityList->{$key}; |
565 | # If there's descriptive text, display it. | # If there's descriptive text, display it. |
566 | if (my $notes = $entityData->{Notes}) { | if (my $notes = $entityData->{Notes}) { |
567 | print HTMLOUT "<p>" . _HTMLNote($notes->{content}) . "</p>\n"; | $retVal .= "<p>" . HTMLNote($notes->{content}) . "</p>\n"; |
568 | } | } |
569 | # Now we want a list of the entity's relationships. First, we set up the relationship subsection. | # See if we need a list of the entity's relationships. |
570 | print HTMLOUT "<h4>Relationships for <b>$key</b></h4>\n<ul>\n"; | my $relCount = keys %{$relationshipList}; |
571 | if ($relCount > 0) { | |
572 | # First, we set up the relationship subsection. | |
573 | $retVal .= "<h4>Relationships for <b>$key</b></h4>\n<ul>\n"; | |
574 | # Loop through the relationships. | # Loop through the relationships. |
575 | for my $relationship (sort keys %{$relationshipList}) { | for my $relationship (sort keys %{$relationshipList}) { |
576 | # Get the relationship data. | # Get the relationship data. |
# | Line 444 | Line 580 |
580 | # Get the relationship sentence and append the arity. | # Get the relationship sentence and append the arity. |
581 | my $relationshipDescription = _ComputeRelationshipSentence($relationship, $relationshipStructure); | my $relationshipDescription = _ComputeRelationshipSentence($relationship, $relationshipStructure); |
582 | # Display the relationship data. | # Display the relationship data. |
583 | print HTMLOUT "<li><a href=\"#$relationship\">$relationshipDescription</a></li>\n"; | $retVal .= "<li><a href=\"#$relationship\">$relationshipDescription</a></li>\n"; |
584 | } | } |
585 | } | } |
586 | # Close off the relationship list. | # Close off the relationship list. |
587 | print HTMLOUT "</ul>\n"; | $retVal .= "</ul>\n"; |
588 | } | |
589 | # Get the entity's relations. | # Get the entity's relations. |
590 | my $relationList = $entityData->{Relations}; | my $relationList = $entityData->{Relations}; |
591 | # Create a header for the relation subsection. | # Create a header for the relation subsection. |
592 | print HTMLOUT "<h4>Relations for <b>$key</b></h4>\n"; | $retVal .= "<h4>Relations for <b>$key</b></h4>\n"; |
593 | # Loop through the relations, displaying them. | # Loop through the relations, displaying them. |
594 | for my $relation (sort keys %{$relationList}) { | for my $relation (sort keys %{$relationList}) { |
595 | my $htmlString = _ShowRelationTable($relation, $relationList->{$relation}); | my $htmlString = _ShowRelationTable($relation, $relationList->{$relation}); |
596 | print HTMLOUT $htmlString; | $retVal .= $htmlString; |
597 | } | } |
598 | } | } |
599 | # Denote we're starting the relationship section. | # Denote we're starting the relationship section. |
600 | print HTMLOUT "<a name=\"RelationshipsSection\"></a><h2>Relationships</h2>\n"; | $retVal .= "<a name=\"RelationshipsSection\"></a><h2>Relationships</h2>\n"; |
601 | # Loop through the relationships. | # Loop through the relationships. |
602 | for my $key (sort keys %{$relationshipList}) { | for my $key (sort keys %{$relationshipList}) { |
603 | Trace("Building MetaData entry for $key relationship.") if T(4); | Trace("Building MetaData entry for $key relationship.") if T(4); |
# | Line 468 | Line 605 |
605 | my $relationshipStructure = $relationshipList->{$key}; | my $relationshipStructure = $relationshipList->{$key}; |
606 | # Create the relationship header. | # Create the relationship header. |
607 | my $headerText = _ComputeRelationshipHeading($key, $relationshipStructure); | my $headerText = _ComputeRelationshipHeading($key, $relationshipStructure); |
608 | print HTMLOUT "<h3><a name=\"$key\"></a>$headerText</h3>\n"; | $retVal .= "<h3><a name=\"$key\"></a>$headerText</h3>\n"; |
609 | # Get the entity names. | # Get the entity names. |
610 | my $fromEntity = $relationshipStructure->{from}; | my $fromEntity = $relationshipStructure->{from}; |
611 | my $toEntity = $relationshipStructure->{to}; | my $toEntity = $relationshipStructure->{to}; |
# | Line 478 | Line 615 |
615 | # since both sentences will say the same thing. | # since both sentences will say the same thing. |
616 | my $arity = $relationshipStructure->{arity}; | my $arity = $relationshipStructure->{arity}; |
617 | if ($arity eq "11") { | if ($arity eq "11") { |
618 | print HTMLOUT "<p>Each <b>$fromEntity</b> relates to at most one <b>$toEntity</b>.\n"; | $retVal .= "<p>Each <b>$fromEntity</b> relates to at most one <b>$toEntity</b>.\n"; |
619 | } else { | } else { |
620 | print HTMLOUT "<p>Each <b>$fromEntity</b> relates to multiple <b>$toEntity</b>s.\n"; | $retVal .= "<p>Each <b>$fromEntity</b> relates to multiple <b>$toEntity</b>s.\n"; |
621 | if ($arity eq "MM" && $fromEntity ne $toEntity) { | if ($arity eq "MM" && $fromEntity ne $toEntity) { |
622 | print HTMLOUT "Each <b>$toEntity</b> relates to multiple <b>$fromEntity</b>s.\n"; | $retVal .= "Each <b>$toEntity</b> relates to multiple <b>$fromEntity</b>s.\n"; |
623 | } | } |
624 | } | } |
625 | print HTMLOUT "</p>\n"; | $retVal .= "</p>\n"; |
626 | # If there are notes on this relationship, display them. | # If there are notes on this relationship, display them. |
627 | if (my $notes = $relationshipStructure->{Notes}) { | if (my $notes = $relationshipStructure->{Notes}) { |
628 | print HTMLOUT "<p>" . _HTMLNote($notes->{content}) . "</p>\n"; | $retVal .= "<p>" . HTMLNote($notes->{content}) . "</p>\n"; |
629 | } | } |
630 | # Generate the relationship's relation table. | # Generate the relationship's relation table. |
631 | my $htmlString = _ShowRelationTable($key, $relationshipStructure->{Relations}->{$key}); | my $htmlString = _ShowRelationTable($key, $relationshipStructure->{Relations}->{$key}); |
632 | print HTMLOUT $htmlString; | $retVal .= $htmlString; |
633 | } | } |
634 | Trace("Building MetaData join table.") if T(4); | Trace("Building MetaData join table.") if T(4); |
635 | # Denote we're starting the join table. | # Denote we're starting the join table. |
636 | print HTMLOUT "<a name=\"JoinTable\"></a><h3>Join Table</h3>\n"; | $retVal .= "<a name=\"JoinTable\"></a><h3>Join Table</h3>\n"; |
637 | # Create a table header. | # Create a table header. |
638 | print HTMLOUT _OpenTable("Join Table", "Source", "Target", "Join Condition"); | $retVal .= _OpenTable("Join Table", "Source", "Target", "Join Condition"); |
639 | # Loop through the joins. | # Loop through the joins. |
640 | my $joinTable = $metadata->{Joins}; | my $joinTable = $metadata->{Joins}; |
641 | my @joinKeys = keys %{$joinTable}; | my @joinKeys = keys %{$joinTable}; |
# | Line 506 | Line 643 |
643 | # Separate out the source, the target, and the join clause. | # Separate out the source, the target, and the join clause. |
644 | $joinKey =~ m!^([^/]+)/(.+)$!; | $joinKey =~ m!^([^/]+)/(.+)$!; |
645 | my ($sourceRelation, $targetRelation) = ($1, $2); | my ($sourceRelation, $targetRelation) = ($1, $2); |
646 | Trace("Join with key $joinKey is from $sourceRelation to $targetRelation.") if T(4); | Trace("Join with key $joinKey is from $sourceRelation to $targetRelation.") if T(Joins => 4); |
647 | my $source = $self->ComputeObjectSentence($sourceRelation); | my $source = $self->ComputeObjectSentence($sourceRelation); |
648 | my $target = $self->ComputeObjectSentence($targetRelation); | my $target = $self->ComputeObjectSentence($targetRelation); |
649 | my $clause = $joinTable->{$joinKey}; | my $clause = $joinTable->{$joinKey}; |
650 | # Display them in a table row. | # Display them in a table row. |
651 | print HTMLOUT "<tr><td>$source</td><td>$target</td><td>$clause</td></tr>\n"; | $retVal .= "<tr><td>$source</td><td>$target</td><td>$clause</td></tr>\n"; |
652 | } | } |
653 | # Close the table. | # Close the table. |
654 | print HTMLOUT _CloseTable(); | $retVal .= _CloseTable(); |
655 | # Close the document. | Trace("Built MetaData HTML.") if T(3); |
656 | print HTMLOUT "</body>\n</html>\n"; | # Return the HTML. |
657 | # Close the file. | return $retVal; |
close HTMLOUT; | ||
Trace("Built MetaData web page.") if T(3); | ||
658 | } | } |
659 | ||
660 | =head3 DumpMetaData | =head3 DumpMetaData |
661 | ||
662 | C<< $database->DumpMetaData(); >> | $erdb->DumpMetaData(); |
663 | ||
664 | Return a dump of the metadata structure. | Return a dump of the metadata structure. |
665 | ||
# | Line 537 | Line 672 |
672 | return Data::Dumper::Dumper($self->{_metaData}); | return Data::Dumper::Dumper($self->{_metaData}); |
673 | } | } |
674 | ||
675 | =head3 GenerateWikiData | |
676 | ||
677 | my @wikiLines = $erdb->GenerateWikiData(); | |
678 | ||
679 | Build a description of the database for the wiki. The database will be | |
680 | organized into a single page, with sections for each entity and relationship. | |
681 | The return value is a list of text lines. | |
682 | ||
683 | =cut | |
684 | ||
685 | sub GenerateWikiData { | |
686 | # Get the parameters. | |
687 | my ($self) = @_; | |
688 | # We'll build the wiki text in here. | |
689 | my @retVal = (); | |
690 | # Get the metadata object. | |
691 | my $metadata = $self->{_metaData}; | |
692 | # Get the title string. This will become the page name. | |
693 | my $title = $metadata->{Title}->{content}; | |
694 | # Get the entity and relationship lists. | |
695 | my $entityList = $metadata->{Entities}; | |
696 | my $relationshipList = $metadata->{Relationships}; | |
697 | my $shapeList = $metadata->{Shapes}; | |
698 | # Start with the introductory text. | |
699 | push @retVal, WikiTools::Heading(2, "Introduction"); | |
700 | if (my $notes = $metadata->{Notes}) { | |
701 | push @retVal, WikiNote($notes->{content}); | |
702 | } | |
703 | # Generate issue list. | |
704 | if (my $issues = $metadata->{Issues}) { | |
705 | push @retVal, WikiTools::Heading(3, 'Issues'); | |
706 | push @retVal, WikiTools::List(map { $_->{content} } @{$issues}); | |
707 | } | |
708 | # Start the entity section. | |
709 | push @retVal, WikiTools::Heading(2, "Entities"); | |
710 | # Loop through the entities. Note that unlike the situation with HTML, we | |
711 | # don't need to generate the table of contents manually, just the data | |
712 | # itself. | |
713 | for my $key (sort keys %$entityList) { | |
714 | # Create a header for this entity. | |
715 | push @retVal, "", WikiTools::Heading(3, $key); | |
716 | # Get the entity data. | |
717 | my $entityData = $entityList->{$key}; | |
718 | # Plant the notes here, if there are any. | |
719 | push @retVal, _ObjectNotes($entityData); | |
720 | # Now we list the entity's relationships (if any). First, we build a list | |
721 | # of the relationships relevant to this entity. | |
722 | my @rels = (); | |
723 | for my $rel (sort keys %$relationshipList) { | |
724 | my $relStructure = $relationshipList->{$rel}; | |
725 | if ($relStructure->{from} eq $key || $relStructure->{to} eq $key) { | |
726 | # Get the relationship sentence. | |
727 | my $relSentence = _ComputeRelationshipSentence($rel, $relStructure); | |
728 | # Linkify it. | |
729 | my $linkedRel = WikiTools::LinkMarkup("#$rel", $rel); | |
730 | $relSentence =~ s/$rel/$linkedRel/; | |
731 | push @rels, $relSentence; | |
732 | } | |
733 | } | |
734 | # Add the relationships as a Wiki list. | |
735 | push @retVal, WikiTools::List(@rels); | |
736 | # Get the entity's relations. | |
737 | my $relationList = $entityData->{Relations}; | |
738 | # Loop through the relations, displaying them. | |
739 | for my $relation (sort keys %{$relationList}) { | |
740 | my $wikiString = _WikiRelationTable($relation, $relationList->{$relation}); | |
741 | push @retVal, $wikiString; | |
742 | } | |
743 | } | |
744 | # Now the entities are documented. Next we do the relationships. | |
745 | push @retVal, WikiTools::Heading(2, "Relationships"); | |
746 | for my $key (sort keys %$relationshipList) { | |
747 | my $relationshipData = $relationshipList->{$key}; | |
748 | # Create the relationship heading. | |
749 | push @retVal, WikiTools::Heading(3, $key); | |
750 | # Describe the relationship arity. Note there's a bit of trickiness involving recursive | |
751 | # many-to-many relationships. In a normal many-to-many we use two sentences to describe | |
752 | # the arity (one for each direction). This is a bad idea for a recursive relationship, | |
753 | # since both sentences will say the same thing. | |
754 | my $arity = $relationshipData->{arity}; | |
755 | my $fromEntity = $relationshipData->{from}; | |
756 | my $toEntity = $relationshipData->{to}; | |
757 | my @listElements = (); | |
758 | my $boldCode = WikiTools::BoldCode(); | |
759 | if ($arity eq "11") { | |
760 | push @listElements, "Each $boldCode$fromEntity$boldCode relates to at most one $boldCode$toEntity$boldCode."; | |
761 | } else { | |
762 | push @listElements, "Each $boldCode$fromEntity$boldCode relates to multiple $boldCode${toEntity}s$boldCode."; | |
763 | if ($arity eq "MM" && $fromEntity ne $toEntity) { | |
764 | push @listElements, "Each $boldCode$toEntity$boldCode relates to multiple $boldCode${fromEntity}s$boldCode."; | |
765 | } | |
766 | } | |
767 | push @retVal, WikiTools::List(@listElements); | |
768 | # Plant the notes here, if there are any. | |
769 | push @retVal, _ObjectNotes($relationshipData); | |
770 | # Finally, the relationship table. | |
771 | my $wikiString = _WikiRelationTable($key, $relationshipData->{Relations}->{$key}); | |
772 | push @retVal, $wikiString; | |
773 | } | |
774 | # Now loop through the miscellaneous shapes. | |
775 | if ($shapeList) { | |
776 | push @retVal, WikiTools::Heading(2, "Miscellaneous"); | |
777 | for my $shape (sort keys %$shapeList) { | |
778 | push @retVal, WikiTools::Heading(3, $shape); | |
779 | my $shapeData = $shapeList->{$shape}; | |
780 | push @retVal, _ObjectNotes($shapeData); | |
781 | } | |
782 | } | |
783 | # All done. Return the lines. | |
784 | return @retVal; | |
785 | } | |
786 | ||
787 | ||
788 | =head3 CreatePPO | |
789 | ||
790 | ERDB::CreatePPO($erdbXMLFile, $ppoXMLFile); | |
791 | ||
792 | Create a PPO XML file from an ERDB data definition XML file. At the | |
793 | current time, the PPO XML file can be used to create a database with | |
794 | similar functionality. Eventually, the PPO will be able to use the | |
795 | created XML to access the live ERDB database. | |
796 | ||
797 | =over 4 | |
798 | ||
799 | =item erdbXMLFile | |
800 | ||
801 | Name of the XML data definition file for the ERDB database. This | |
802 | file must exist. | |
803 | ||
804 | =item ppoXMLFile | |
805 | ||
806 | Output file for the PPO XML definition. If this file exists, it | |
807 | will be overwritten. | |
808 | ||
809 | =back | |
810 | ||
811 | =cut | |
812 | ||
813 | sub CreatePPO { | |
814 | # Get the parameters. | |
815 | my ($erdbXMLFile, $ppoXMLFile) = @_; | |
816 | # First, we want to slurp in the ERDB XML file in its raw form. | |
817 | my $xml = ReadMetaXML($erdbXMLFile); | |
818 | # Create a variable to hold all of the objects in the PPO project. | |
819 | my @objects = (); | |
820 | # Get the relationship hash. | |
821 | my $relationships = $xml->{Relationships}; | |
822 | # Loop through the entities. | |
823 | my $entities = $xml->{Entities}; | |
824 | for my $entityName (keys %{$entities}) { | |
825 | # Get the entity's data structures. | |
826 | my $entityObject = $entities->{$entityName}; | |
827 | # We put the object's fields in here, according to their type. | |
828 | my (@object_refs, @scalars, @indexes, @arrays); | |
829 | # Create the ID field for the entity. We get the key type from the | |
830 | # entity object and compute the corresponding SQL type. | |
831 | my $type = $TypeTable{$entityObject->{keyType}}->{sqlType}; | |
832 | push @scalars, { label => 'id', type => $type }; | |
833 | # Loop through the entity fields. | |
834 | for my $fieldName ( keys %{$entityObject->{Fields}} ) { | |
835 | # Get the field object. | |
836 | my $fieldObject = $entityObject->{Fields}->{$fieldName}; | |
837 | # Convert it to a scalar tag. | |
838 | my $scalar = _CreatePPOField($fieldName, $fieldObject); | |
839 | # If we have a relation, this field is stored in an array. | |
840 | # otherwise, it is a scalar. The array tag has scalars | |
841 | # stored as an XML array. In ERDB, there is only ever one, | |
842 | # but PPO can have more. | |
843 | my $relation = $fieldObject->{relation}; | |
844 | if ($relation) { | |
845 | push @arrays, { scalar => [$scalar] }; | |
846 | } else { | |
847 | push @scalars, $scalar; | |
848 | } | |
849 | } | |
850 | # Loop through the relationships. If this entity is the to-entity | |
851 | # on a relationship of 1M arity, then it is implemented as a PPO | |
852 | # object reference. | |
853 | for my $relationshipName (keys %{$relationships}) { | |
854 | # Get the relationship data. | |
855 | my $relationshipData = $relationships->{$relationshipName}; | |
856 | # If we have a from for this entity and an arity of 1M, we | |
857 | # have an object reference. | |
858 | if ($relationshipData->{to} eq $entityName && | |
859 | $relationshipData->{arity} eq '1M') { | |
860 | # Build the object reference tag. | |
861 | push @object_refs, { label => $relationshipName, | |
862 | type => $relationshipData->{from} }; | |
863 | } | |
864 | } | |
865 | # Create the indexes. | |
866 | my $indexList = $entityObject->{Indexes}; | |
867 | push @indexes, map { _CreatePPOIndex($_) } @{$indexList}; | |
868 | # Build the object XML tree. | |
869 | my $object = { label => $entityName, | |
870 | object_ref => \@object_refs, | |
871 | scalar => \@scalars, | |
872 | index => \@indexes, | |
873 | array => \@arrays | |
874 | }; | |
875 | # Push the object onto the objects list. | |
876 | push @objects, $object; | |
877 | } | |
878 | # Loop through the relationships, searching for MMs. The 1Ms were | |
879 | # already handled by the entity search above. | |
880 | for my $relationshipName (keys %{$relationships}) { | |
881 | # Get this relationship's object. | |
882 | my $relationshipObject = $relationships->{$relationshipName}; | |
883 | # Only proceed if it's many-to-many. | |
884 | if ($relationshipObject->{arity} eq 'MM') { | |
885 | # Create the tag lists for the relationship object. | |
886 | my (@object_refs, @scalars, @indexes); | |
887 | # The relationship will be created as an object with object | |
888 | # references for its links to the participating entities. | |
889 | my %links = ( from_link => $relationshipObject->{from}, | |
890 | to_link => $relationshipObject->{to} ); | |
891 | for my $link (keys %links) { | |
892 | # Create an object_ref tag for this piece of the | |
893 | # relationship (from or to). | |
894 | my $object_ref = { label => $link, | |
895 | type => $links{$link} }; | |
896 | push @object_refs, $object_ref; | |
897 | } | |
898 | # Loop through the intersection data fields, creating scalar tags. | |
899 | # There are no fancy array tags in a relationship. | |
900 | for my $fieldName (keys %{$relationshipObject->{Fields}}) { | |
901 | my $fieldObject = $relationshipObject->{Fields}->{$fieldName}; | |
902 | push @scalars, _CreatePPOField($fieldName, $fieldObject); | |
903 | } | |
904 | # Finally, the indexes: currently we cannot support the to-index and | |
905 | # from-index in PPO, so we just process the alternate indexes. | |
906 | my $indexList = $relationshipObject->{Indexes}; | |
907 | push @indexes, map { _CreatePPOIndex($_) } @{$indexList}; | |
908 | # Wrap up all the stuff about this relationship. | |
909 | my $object = { label => $relationshipName, | |
910 | scalar => \@scalars, | |
911 | object_ref => \@object_refs, | |
912 | index => \@indexes | |
913 | }; | |
914 | # Push it into the object list. | |
915 | push @objects, $object; | |
916 | } | |
917 | } | |
918 | # Compute a title. | |
919 | my $title; | |
920 | if ($erdbXMLFile =~ /(\/|^)([^\/]+)DBD\.xml/) { | |
921 | # Here we have a standard file name we can use for a title. | |
922 | $title = $2; | |
923 | } else { | |
924 | # Here the file name is non-standard, so we carve up the | |
925 | # database title. | |
926 | $title = $xml->{Title}->{content}; | |
927 | $title =~ s/\s\.,//g; | |
928 | } | |
929 | # Wrap up the XML as a project. | |
930 | my $ppoXML = { project => { label => $title, | |
931 | object => \@objects }}; | |
932 | # Write out the results. | |
933 | my $ppoString = XML::Simple::XMLout($ppoXML, | |
934 | AttrIndent => 1, | |
935 | KeepRoot => 1); | |
936 | Tracer::PutFile($ppoXMLFile, [ $ppoString ]); | |
937 | } | |
938 | ||
939 | =head3 FindIndexForEntity | |
940 | ||
941 | my $indexFound = ERDB::FindIndexForEntity($xml, $entityName, $attributeName); | |
942 | ||
943 | This method locates the entry in an entity's index list that begins with the | |
944 | specified attribute name. If the entity has no index list, one will be | |
945 | created. This method works on raw XML, not a live ERDB object. | |
946 | ||
947 | =over 4 | |
948 | ||
949 | =item xml | |
950 | ||
951 | The raw XML structure defining the database. | |
952 | ||
953 | =item entityName | |
954 | ||
955 | The name of the relevant entity. | |
956 | ||
957 | =item attributeName | |
958 | ||
959 | The name of the attribute relevant to the search. | |
960 | ||
961 | =item RETURN | |
962 | ||
963 | The numerical index in the index list of the index entry for the specified entity and | |
964 | attribute, or C<undef> if no such index exists. | |
965 | ||
966 | =back | |
967 | ||
968 | =cut | |
969 | ||
970 | sub FindIndexForEntity { | |
971 | # Get the parameters. | |
972 | my ($xml, $entityName, $attributeName) = @_; | |
973 | # Declare the return variable. | |
974 | my $retVal; | |
975 | # Get the named entity. | |
976 | my $entityData = $xml->{Entities}->{$entityName}; | |
977 | if (! $entityData) { | |
978 | Confess("Entity $entityName not found in DBD structure."); | |
979 | } else { | |
980 | # Insure it has an index list. | |
981 | if (! exists $entityData->{Indexes}) { | |
982 | $entityData->{Indexes} = []; | |
983 | } else { | |
984 | # Search for the desired index. | |
985 | my $indexList = $entityData->{Indexes}; | |
986 | my $n = scalar @{$indexList}; | |
987 | Trace("Searching $n indexes in index list for $entityName.") if T(2); | |
988 | # We use an indexed FOR here because we're returning an | |
989 | # index number instead of an object. We do THAT so we can | |
990 | # delete the index from the list if needed. | |
991 | for (my $i = 0; $i < $n && !defined($retVal); $i++) { | |
992 | my $index = $indexList->[$i]; | |
993 | my $fields = $index->{IndexFields}; | |
994 | # Technically this IF should be safe (that is, we are guaranteed | |
995 | # the existence of a "$fields->[0]"), because when we load the XML | |
996 | # we have SuppressEmpty specified. | |
997 | if ($fields->[0]->{name} eq $attributeName) { | |
998 | $retVal = $i; | |
999 | } | |
1000 | } | |
1001 | } | |
1002 | } | |
1003 | Trace("Index for $attributeName of $entityName found at position $retVal.") if defined($retVal) && T(3); | |
1004 | Trace("Index for $attributeName not found in $entityName.") if !defined($retVal) && T(3); | |
1005 | # Return the result. | |
1006 | return $retVal; | |
1007 | } | |
1008 | ||
1009 | =head3 CreateTables | =head3 CreateTables |
1010 | ||
1011 | C<< $datanase->CreateTables(); >> | $erdb->CreateTables(); |
1012 | ||
1013 | This method creates the tables for the database from the metadata structure loaded by the | This method creates the tables for the database from the metadata structure loaded by the |
1014 | constructor. It is expected this function will only be used on rare occasions, when the | constructor. It is expected this function will only be used on rare occasions, when the |
# | Line 551 | Line 1020 |
1020 | sub CreateTables { | sub CreateTables { |
1021 | # Get the parameters. | # Get the parameters. |
1022 | my ($self) = @_; | my ($self) = @_; |
1023 | my $metadata = $self->{_metaData}; | # Get the relation names. |
1024 | my $dbh = $self->{_dbh}; | my @relNames = $self->GetTableNames(); |
1025 | # Loop through the entities. | # Loop through the relations. |
1026 | my $entityHash = $metadata->{Entities}; | for my $relationName (@relNames) { |
for my $entityName (keys %{$entityHash}) { | ||
my $entityData = $entityHash->{$entityName}; | ||
# Tell the user what we're doing. | ||
Trace("Creating relations for entity $entityName.") if T(1); | ||
# Loop through the entity's relations. | ||
for my $relationName (keys %{$entityData->{Relations}}) { | ||
1027 | # Create a table for this relation. | # Create a table for this relation. |
1028 | $self->CreateTable($relationName); | $self->CreateTable($relationName, 1); |
1029 | Trace("Relation $relationName created.") if T(1); | Trace("Relation $relationName created.") if T(2); |
} | ||
} | ||
# Loop through the relationships. | ||
my $relationshipTable = $metadata->{Relationships}; | ||
for my $relationshipName (keys %{$metadata->{Relationships}}) { | ||
# Create a table for this relationship. | ||
Trace("Creating relationship $relationshipName.") if T(1); | ||
$self->CreateTable($relationshipName); | ||
1030 | } | } |
1031 | } | } |
1032 | ||
1033 | =head3 CreateTable | =head3 CreateTable |
1034 | ||
1035 | C<< $database->CreateTable($tableName, $indexFlag); >> | $erdb->CreateTable($tableName, $indexFlag, $estimatedRows); |
1036 | ||
1037 | Create the table for a relation and optionally create its indexes. | Create the table for a relation and optionally create its indexes. |
1038 | ||
# | Line 587 | Line 1042 |
1042 | ||
1043 | Name of the relation (which will also be the table name). | Name of the relation (which will also be the table name). |
1044 | ||
1045 | =item $indexFlag | =item indexFlag |
1046 | ||
1047 | TRUE if the indexes for the relation should be created, else FALSE. If FALSE, | TRUE if the indexes for the relation should be created, else FALSE. If FALSE, |
1048 | L</CreateIndexes> must be called later to bring the indexes into existence. | L</CreateIndexes> must be called later to bring the indexes into existence. |
1049 | ||
1050 | =item estimatedRows (optional) | |
1051 | ||
1052 | If specified, the estimated maximum number of rows for the relation. This | |
1053 | information allows the creation of tables using storage engines that are | |
1054 | faster but require size estimates, such as MyISAM. | |
1055 | ||
1056 | =back | =back |
1057 | ||
1058 | =cut | =cut |
1059 | ||
1060 | sub CreateTable { | sub CreateTable { |
1061 | # Get the parameters. | # Get the parameters. |
1062 | my ($self, $relationName, $indexFlag) = @_; | my ($self, $relationName, $indexFlag, $estimatedRows) = @_; |
1063 | # Get the database handle. | # Get the database handle. |
1064 | my $dbh = $self->{_dbh}; | my $dbh = $self->{_dbh}; |
1065 | # Get the relation data and determine whether or not the relation is primary. | # Get the relation data and determine whether or not the relation is primary. |
# | Line 613 | Line 1074 |
1074 | # Push the result into the field list. | # Push the result into the field list. |
1075 | push @fieldList, $fieldString; | push @fieldList, $fieldString; |
1076 | } | } |
# If this is a root table, add the "new_record" flag. It defaults to 0, so | ||
if ($rootFlag) { | ||
push @fieldList, "new_record $TypeTable{boolean}->{sqlType} NOT NULL DEFAULT 0"; | ||
} | ||
1077 | # Convert the field list into a comma-delimited string. | # Convert the field list into a comma-delimited string. |
1078 | my $fieldThing = join(', ', @fieldList); | my $fieldThing = join(', ', @fieldList); |
1079 | # Insure the table is not already there. | # Insure the table is not already there. |
1080 | $dbh->drop_table(tbl => $relationName); | $dbh->drop_table(tbl => $relationName); |
1081 | Trace("Table $relationName dropped.") if T(2); | Trace("Table $relationName dropped.") if T(2); |
1082 | # If there are estimated rows, create an estimate so we can take advantage of | |
1083 | # faster DB technologies. | |
1084 | my $estimation = undef; | |
1085 | if ($estimatedRows) { | |
1086 | $estimation = [$self->EstimateRowSize($relationName), $estimatedRows]; | |
1087 | Trace("$estimation->[1] rows of $estimation->[0] bytes each.") if T(3); | |
1088 | } | |
1089 | # Create the table. | # Create the table. |
1090 | Trace("Creating table $relationName: $fieldThing") if T(2); | Trace("Creating table $relationName: $fieldThing") if T(2); |
1091 | $dbh->create_table(tbl => $relationName, flds => $fieldThing); | $dbh->create_table(tbl => $relationName, flds => $fieldThing, estimates => $estimation); |
1092 | Trace("Relation $relationName created in database.") if T(2); | Trace("Relation $relationName created in database.") if T(2); |
1093 | # If we want to build the indexes, we do it here. | # If we want to build the indexes, we do it here. Note that the full-text search |
1094 | # index will not be built until the table has been loaded. | |
1095 | if ($indexFlag) { | if ($indexFlag) { |
1096 | $self->CreateIndex($relationName); | $self->CreateIndex($relationName); |
1097 | } | } |
1098 | } | } |
1099 | ||
1100 | =head3 VerifyFields | |
1101 | ||
1102 | my $count = $erdb->VerifyFields($relName, \@fieldList); | |
1103 | ||
1104 | Run through the list of proposed field values, insuring that all the character fields are | |
1105 | below the maximum length. If any fields are too long, they will be truncated in place. | |
1106 | ||
1107 | =over 4 | |
1108 | ||
1109 | =item relName | |
1110 | ||
1111 | Name of the relation for which the specified fields are destined. | |
1112 | ||
1113 | =item fieldList | |
1114 | ||
1115 | Reference to a list, in order, of the fields to be put into the relation. | |
1116 | ||
1117 | =item RETURN | |
1118 | ||
1119 | Returns the number of fields truncated. | |
1120 | ||
1121 | =back | |
1122 | ||
1123 | =cut | |
1124 | ||
1125 | sub VerifyFields { | |
1126 | # Get the parameters. | |
1127 | my ($self, $relName, $fieldList) = @_; | |
1128 | # Initialize the return value. | |
1129 | my $retVal = 0; | |
1130 | # Get the relation definition. | |
1131 | my $relData = $self->_FindRelation($relName); | |
1132 | # Get the list of field descriptors. | |
1133 | my $fieldTypes = $relData->{Fields}; | |
1134 | my $fieldCount = scalar @{$fieldTypes}; | |
1135 | # Loop through the two lists. | |
1136 | for (my $i = 0; $i < $fieldCount; $i++) { | |
1137 | # Get the type of the current field. | |
1138 | my $fieldType = $fieldTypes->[$i]->{type}; | |
1139 | # If it's a character field, verify the length. | |
1140 | if ($fieldType =~ /string/) { | |
1141 | my $maxLen = $TypeTable{$fieldType}->{maxLen}; | |
1142 | my $oldString = $fieldList->[$i]; | |
1143 | if (length($oldString) > $maxLen) { | |
1144 | # Here it's too big, so we truncate it. | |
1145 | Trace("Truncating field $i ($fieldTypes->[$i]->{name}) in relation $relName to $maxLen characters from \"$oldString\".") if T(1); | |
1146 | $fieldList->[$i] = substr $oldString, 0, $maxLen; | |
1147 | $retVal++; | |
1148 | } | |
1149 | } | |
1150 | } | |
1151 | # Return the truncation count. | |
1152 | return $retVal; | |
1153 | } | |
1154 | ||
1155 | =head3 DigestFields | |
1156 | ||
1157 | $erdb->DigestFields($relName, $fieldList); | |
1158 | ||
1159 | Digest the strings in the field list that correspond to data type C<hash-string> in the | |
1160 | specified relation. | |
1161 | ||
1162 | =over 4 | |
1163 | ||
1164 | =item relName | |
1165 | ||
1166 | Name of the relation to which the fields belong. | |
1167 | ||
1168 | =item fieldList | |
1169 | ||
1170 | List of field contents to be loaded into the relation. | |
1171 | ||
1172 | =back | |
1173 | ||
1174 | =cut | |
1175 | #: Return Type ; | |
1176 | sub DigestFields { | |
1177 | # Get the parameters. | |
1178 | my ($self, $relName, $fieldList) = @_; | |
1179 | # Get the relation definition. | |
1180 | my $relData = $self->_FindRelation($relName); | |
1181 | # Get the list of field descriptors. | |
1182 | my $fieldTypes = $relData->{Fields}; | |
1183 | my $fieldCount = scalar @{$fieldTypes}; | |
1184 | # Loop through the two lists. | |
1185 | for (my $i = 0; $i < $fieldCount; $i++) { | |
1186 | # Get the type of the current field. | |
1187 | my $fieldType = $fieldTypes->[$i]->{type}; | |
1188 | # If it's a hash string, digest it in place. | |
1189 | if ($fieldType eq 'hash-string') { | |
1190 | $fieldList->[$i] = $self->DigestKey($fieldList->[$i]); | |
1191 | } | |
1192 | } | |
1193 | } | |
1194 | ||
1195 | =head3 DigestKey | |
1196 | ||
1197 | my $digested = $erdb->DigestKey($keyValue); | |
1198 | ||
1199 | Return the digested value of a symbolic key. The digested value can then be plugged into a | |
1200 | key-based search into a table with key-type hash-string. | |
1201 | ||
1202 | Currently the digesting process is independent of the database structure, but that may not | |
1203 | always be the case, so this is an instance method instead of a static method. | |
1204 | ||
1205 | =over 4 | |
1206 | ||
1207 | =item keyValue | |
1208 | ||
1209 | Key value to digest. | |
1210 | ||
1211 | =item RETURN | |
1212 | ||
1213 | Digested value of the key. | |
1214 | ||
1215 | =back | |
1216 | ||
1217 | =cut | |
1218 | ||
1219 | sub DigestKey { | |
1220 | # Get the parameters. | |
1221 | my ($self, $keyValue) = @_; | |
1222 | # Compute the digest. | |
1223 | my $retVal = md5_base64($keyValue); | |
1224 | # Return the result. | |
1225 | return $retVal; | |
1226 | } | |
1227 | ||
1228 | =head3 CreateIndex | =head3 CreateIndex |
1229 | ||
1230 | C<< $database->CreateIndex($relationName); >> | $erdb->CreateIndex($relationName); |
1231 | ||
1232 | Create the indexes for a relation. If a table is being loaded from a large source file (as | Create the indexes for a relation. If a table is being loaded from a large source file (as |
1233 | is the case in L</LoadTable>), it is sometimes best to create the indexes after the load. | is the case in L</LoadTable>), it is sometimes best to create the indexes after the load. |
# | Line 655 | Line 1248 |
1248 | for my $indexName (keys %{$indexHash}) { | for my $indexName (keys %{$indexHash}) { |
1249 | my $indexData = $indexHash->{$indexName}; | my $indexData = $indexHash->{$indexName}; |
1250 | # Get the index's field list. | # Get the index's field list. |
1251 | my @fieldList = _FixNames(@{$indexData->{IndexFields}}); | my @rawFields = @{$indexData->{IndexFields}}; |
1252 | # Get a hash of the relation's field types. | |
1253 | my %types = map { $_->{name} => $_->{type} } @{$relationData->{Fields}}; | |
1254 | # We need to check for text fields so we can append a length limitation for them. To do | |
1255 | # that, we need the relation's field list. | |
1256 | my $relFields = $relationData->{Fields}; | |
1257 | for (my $i = 0; $i <= $#rawFields; $i++) { | |
1258 | # Get the field type. | |
1259 | my $field = $rawFields[$i]; | |
1260 | my $type = $types{$field}; | |
1261 | # Ask if it requires using prefix notation for the index. | |
1262 | my $mod = $TypeTable{$type}->{indexMod}; | |
1263 | Trace("Field $field ($i) in $relationName has type $type and indexMod $mod.") if T(3); | |
1264 | if ($mod) { | |
1265 | # Append the prefix length to the field name, | |
1266 | $rawFields[$i] .= "($mod)"; | |
1267 | } | |
1268 | } | |
1269 | my @fieldList = _FixNames(@rawFields); | |
1270 | my $flds = join(', ', @fieldList); | my $flds = join(', ', @fieldList); |
1271 | # Get the index's uniqueness flag. | # Get the index's uniqueness flag. |
1272 | my $unique = (exists $indexData->{Unique} ? $indexData->{Unique} : 'false'); | my $unique = (exists $indexData->{Unique} ? 'unique' : undef); |
1273 | # Create the index. | # Create the index. |
1274 | $dbh->create_index(idx => $indexName, tbl => $relationName, flds => $flds, unique => $unique); | my $rv = $dbh->create_index(idx => $indexName, tbl => $relationName, |
1275 | flds => $flds, kind => $unique); | |
1276 | if ($rv) { | |
1277 | Trace("Index created: $indexName for $relationName ($flds)") if T(1); | Trace("Index created: $indexName for $relationName ($flds)") if T(1); |
1278 | } else { | |
1279 | Confess("Error creating index $indexName for $relationName using ($flds): " . $dbh->error_message()); | |
1280 | } | |
1281 | } | } |
1282 | } | } |
1283 | ||
1284 | =head3 LoadTables | =head3 GetSecondaryFields |
C<< my $stats = $database->LoadTables($directoryName, $rebuild); >> | ||
1285 | ||
1286 | This method will load the database tables from a directory. The tables must already have been created | my %fieldTuples = $erdb->GetSecondaryFields($entityName); |
in the database. (This can be done by calling L</CreateTables>.) The caller passes in a directory name; | ||
all of the relations to be loaded must have a file in the directory with the same name as the relation | ||
(optionally with a suffix of C<.dtx>). Each file must be a tab-delimited table of field values. Each | ||
line of the file will be loaded as a row of the target relation table. The field values should be in | ||
the same order as the fields in the relation tables generated by L</ShowMetaData>. The old data is | ||
erased before the new data is loaded in. | ||
1287 | ||
1288 | A certain amount of translation automatically takes place. Ctrl-M characters are deleted, and | This method will return a list of the name and type of each of the secondary |
1289 | tab and new-line characters inside a field are escaped as C<\t> and C<\n>, respectively. Dates must | fields for a specified entity. Secondary fields are stored in two-column tables |
1290 | be entered as a Unix timestamp, that is, as an integer number of seconds since the base epoch. | in addition to the primary entity table. This enables the field to have no value |
1291 | or to have multiple values. | |
1292 | ||
1293 | =over 4 | =over 4 |
1294 | ||
1295 | =item directoryName | =item entityName |
1296 | ||
1297 | Name of the directory containing the relation files to be loaded. | Name of the entity whose secondary fields are desired. |
1298 | ||
1299 | =item rebuild | =item RETURN |
1300 | ||
1301 | Returns a hash mapping the field names to their field types. | |
1302 | ||
1303 | =back | |
1304 | ||
1305 | =cut | |
1306 | ||
1307 | sub GetSecondaryFields { | |
1308 | # Get the parameters. | |
1309 | my ($self, $entityName) = @_; | |
1310 | # Declare the return variable. | |
1311 | my %retVal = (); | |
1312 | # Look for the entity. | |
1313 | my $table = $self->GetFieldTable($entityName); | |
1314 | # Loop through the fields, pulling out the secondaries. | |
1315 | for my $field (sort keys %{$table}) { | |
1316 | if ($table->{$field}->{relation} ne $entityName) { | |
1317 | # Here we have a secondary field. | |
1318 | $retVal{$field} = $table->{$field}->{type}; | |
1319 | } | |
1320 | } | |
1321 | # Return the result. | |
1322 | return %retVal; | |
1323 | } | |
1324 | ||
1325 | =head3 GetFieldRelationName | |
1326 | ||
1327 | my $name = $erdb->GetFieldRelationName($objectName, $fieldName); | |
1328 | ||
1329 | Return the name of the relation containing a specified field. | |
1330 | ||
1331 | =over 4 | |
1332 | ||
1333 | =item objectName | |
1334 | ||
1335 | Name of the entity or relationship containing the field. | |
1336 | ||
1337 | =item fieldName | |
1338 | ||
1339 | Name of the relevant field in that entity or relationship. | |
1340 | ||
1341 | =item RETURN | |
1342 | ||
1343 | Returns the name of the database relation containing the field, or C<undef> if | |
1344 | the field does not exist. | |
1345 | ||
1346 | =back | |
1347 | ||
1348 | =cut | |
1349 | ||
1350 | sub GetFieldRelationName { | |
1351 | # Get the parameters. | |
1352 | my ($self, $objectName, $fieldName) = @_; | |
1353 | # Declare the return variable. | |
1354 | my $retVal; | |
1355 | # Get the object field table. | |
1356 | my $table = $self->GetFieldTable($objectName); | |
1357 | # Only proceed if the field exists. | |
1358 | if (exists $table->{$fieldName}) { | |
1359 | # Determine the name of the relation that contains this field. | |
1360 | $retVal = $table->{$fieldName}->{relation}; | |
1361 | } | |
1362 | # Return the result. | |
1363 | return $retVal; | |
1364 | } | |
1365 | ||
1366 | =head3 DeleteValue | |
1367 | ||
1368 | my $numDeleted = $erdb->DeleteValue($entityName, $id, $fieldName, $fieldValue); | |
1369 | ||
1370 | Delete secondary field values from the database. This method can be used to delete all | |
1371 | values of a specified field for a particular entity instance, or only a single value. | |
1372 | ||
1373 | Secondary fields are stored in two-column relations separate from an entity's primary | |
1374 | table, and as a result a secondary field can legitimately have no value or multiple | |
1375 | values. Therefore, it makes sense to talk about deleting secondary fields where it | |
1376 | would not make sense for primary fields. | |
1377 | ||
1378 | =over 4 | |
1379 | ||
1380 | =item entityName | |
1381 | ||
1382 | Name of the entity from which the fields are to be deleted. | |
1383 | ||
1384 | =item id | |
1385 | ||
1386 | ID of the entity instance to be processed. If the instance is not found, this | |
1387 | method will have no effect. If C<undef> is specified, all values for all of | |
1388 | the entity instances will be deleted. | |
1389 | ||
1390 | =item fieldName | |
1391 | ||
1392 | Name of the field whose values are to be deleted. | |
1393 | ||
1394 | =item fieldValue (optional) | |
1395 | ||
1396 | Value to be deleted. If not specified, then all values of the specified field | |
1397 | will be deleted for the entity instance. If specified, then only the values which | |
1398 | match this parameter will be deleted. | |
1399 | ||
1400 | =item RETURN | |
1401 | ||
1402 | Returns the number of rows deleted. | |
1403 | ||
1404 | =back | |
1405 | ||
1406 | =cut | |
1407 | ||
1408 | sub DeleteValue { | |
1409 | # Get the parameters. | |
1410 | my ($self, $entityName, $id, $fieldName, $fieldValue) = @_; | |
1411 | # Declare the return value. | |
1412 | my $retVal = 0; | |
1413 | # We need to set up an SQL command to do the deletion. First, we | |
1414 | # find the name of the field's relation. | |
1415 | my $table = $self->GetFieldTable($entityName); | |
1416 | my $field = $table->{$fieldName}; | |
1417 | my $relation = $field->{relation}; | |
1418 | # Make sure this is a secondary field. | |
1419 | if ($relation eq $entityName) { | |
1420 | Confess("Cannot delete values of $fieldName for $entityName."); | |
1421 | } else { | |
1422 | # Set up the SQL command to delete all values. | |
1423 | my $sql = "DELETE FROM $relation"; | |
1424 | # Build the filter. | |
1425 | my @filters = (); | |
1426 | my @parms = (); | |
1427 | # Check for a filter by ID. | |
1428 | if (defined $id) { | |
1429 | push @filters, "id = ?"; | |
1430 | push @parms, $id; | |
1431 | } | |
1432 | # Check for a filter by value. | |
1433 | if (defined $fieldValue) { | |
1434 | push @filters, "$fieldName = ?"; | |
1435 | push @parms, $fieldValue; | |
1436 | } | |
1437 | # Append the filters to the command. | |
1438 | if (@filters) { | |
1439 | $sql .= " WHERE " . join(" AND ", @filters); | |
1440 | } | |
1441 | # Execute the command. | |
1442 | my $dbh = $self->{_dbh}; | |
1443 | $retVal = $dbh->SQL($sql, 0, @parms); | |
1444 | } | |
1445 | # Return the result. | |
1446 | return $retVal; | |
1447 | } | |
1448 | ||
1449 | =head3 LoadTables | |
1450 | ||
1451 | my $stats = $erdb->LoadTables($directoryName, $rebuild); | |
1452 | ||
1453 | This method will load the database tables from a directory. The tables must already have been created | |
1454 | in the database. (This can be done by calling L</CreateTables>.) The caller passes in a directory name; | |
1455 | all of the relations to be loaded must have a file in the directory with the same name as the relation | |
1456 | (optionally with a suffix of C<.dtx>). Each file must be a tab-delimited table of field values. Each | |
1457 | line of the file will be loaded as a row of the target relation table. The field values should be in | |
1458 | the same order as the fields in the relation tables generated by L</ShowMetaData>. The old data is | |
1459 | erased before the new data is loaded in. | |
1460 | ||
1461 | A certain amount of translation automatically takes place. Ctrl-M characters are deleted, and | |
1462 | tab and new-line characters inside a field are escaped as C<\t> and C<\n>, respectively. Dates must | |
1463 | be entered as a Unix timestamp, that is, as an integer number of seconds since the base epoch. | |
1464 | ||
1465 | =over 4 | |
1466 | ||
1467 | =item directoryName | |
1468 | ||
1469 | Name of the directory containing the relation files to be loaded. | |
1470 | ||
1471 | =item rebuild | |
1472 | ||
1473 | TRUE if the tables should be dropped and rebuilt, else FALSE. This is, unfortunately, the | TRUE if the tables should be dropped and rebuilt, else FALSE. This is, unfortunately, the |
1474 | only way to erase existing data in the tables, since the TRUNCATE command is not supported | only way to erase existing data in the tables, since the TRUNCATE command is not supported |
# | Line 710 | Line 1491 |
1491 | $directoryName =~ s!/\\$!!; | $directoryName =~ s!/\\$!!; |
1492 | # Declare the return variable. | # Declare the return variable. |
1493 | my $retVal = Stats->new(); | my $retVal = Stats->new(); |
1494 | # Get the metadata structure. | # Get the relation names. |
1495 | my $metaData = $self->{_metaData}; | my @relNames = $self->GetTableNames(); |
1496 | # Loop through the entities. | for my $relationName (@relNames) { |
for my $entity (values %{$metaData->{Entities}}) { | ||
# Loop through the entity's relations. | ||
for my $relationName (keys %{$entity->{Relations}}) { | ||
1497 | # Try to load this relation. | # Try to load this relation. |
1498 | my $result = $self->_LoadRelation($directoryName, $relationName, $rebuild); | my $result = $self->_LoadRelation($directoryName, $relationName, $rebuild); |
1499 | # Accumulate the statistics. | # Accumulate the statistics. |
1500 | $retVal->Accumulate($result); | $retVal->Accumulate($result); |
1501 | } | } |
} | ||
# Loop through the relationships. | ||
for my $relationshipName (keys %{$metaData->{Relationships}}) { | ||
# Try to load this relationship's relation. | ||
my $result = $self->_LoadRelation($directoryName, $relationshipName, $rebuild); | ||
# Accumulate the statistics. | ||
$retVal->Accumulate($result); | ||
} | ||
1502 | # Add the duration of the load to the statistical object. | # Add the duration of the load to the statistical object. |
1503 | $retVal->Add('duration', gettimeofday - $startTime); | $retVal->Add('duration', gettimeofday - $startTime); |
1504 | # Return the accumulated statistics. | # Return the accumulated statistics. |
1505 | return $retVal; | return $retVal; |
1506 | } | } |
1507 | ||
1508 | ||
1509 | =head3 GetTableNames | =head3 GetTableNames |
1510 | ||
1511 | C<< my @names = $database->GetTableNames; >> | my @names = $erdb->GetTableNames; |
1512 | ||
1513 | Return a list of the relations required to implement this database. | Return a list of the relations required to implement this database. |
1514 | ||
# | Line 754 | Line 1525 |
1525 | ||
1526 | =head3 GetEntityTypes | =head3 GetEntityTypes |
1527 | ||
1528 | C<< my @names = $database->GetEntityTypes; >> | my @names = $erdb->GetEntityTypes; |
1529 | ||
1530 | Return a list of the entity type names. | Return a list of the entity type names. |
1531 | ||
# | Line 769 | Line 1540 |
1540 | return sort keys %{$entityList}; | return sort keys %{$entityList}; |
1541 | } | } |
1542 | ||
1543 | ||
1544 | =head3 GetConnectingRelationships | |
1545 | ||
1546 | my @list = $erdb->GetConnectingRelationships($entityName); | |
1547 | ||
1548 | Return a list of the relationships connected to the specified entity. | |
1549 | ||
1550 | =over 4 | |
1551 | ||
1552 | =item entityName | |
1553 | ||
1554 | Entity whose connected relationships are desired. | |
1555 | ||
1556 | =item RETURN | |
1557 | ||
1558 | Returns a list of the relationships that originate from the entity. | |
1559 | If the entity is on the from end, it will return the relationship | |
1560 | name. If the entity is on the to end it will return the converse of | |
1561 | the relationship name. | |
1562 | ||
1563 | =back | |
1564 | ||
1565 | =cut | |
1566 | ||
1567 | sub GetConnectingRelationships { | |
1568 | # Get the parameters. | |
1569 | my ($self, $entityName) = @_; | |
1570 | # Declare the return variable. | |
1571 | my @retVal; | |
1572 | # Get the relationship list. | |
1573 | my $relationships = $self->{_metaData}->{Relationships}; | |
1574 | # Find the entity. | |
1575 | my $entity = $self->{_metaData}->{Entities}->{$entityName}; | |
1576 | # Only proceed if the entity exists. | |
1577 | if (! defined $entity) { | |
1578 | Trace("Entity $entityName not found.") if T(3); | |
1579 | } else { | |
1580 | # Loop through the relationships. | |
1581 | my @rels = keys %$relationships; | |
1582 | Trace(scalar(@rels) . " relationships found in connection search.") if T(3); | |
1583 | for my $relationshipName (@rels) { | |
1584 | my $relationship = $relationships->{$relationshipName}; | |
1585 | if ($relationship->{from} eq $entityName) { | |
1586 | # Here we have a forward relationship. | |
1587 | push @retVal, $relationshipName; | |
1588 | } elsif ($relationship->{to} eq $entityName) { | |
1589 | # Here we have a backward relationship. In this case, the | |
1590 | # converse relationship name is preferred if it exists. | |
1591 | my $converse = $relationship->{converse} || $relationshipName; | |
1592 | push @retVal, $converse; | |
1593 | } | |
1594 | } | |
1595 | } | |
1596 | # Return the result. | |
1597 | return @retVal; | |
1598 | } | |
1599 | ||
1600 | ||
1601 | =head3 GetDataTypes | |
1602 | ||
1603 | my %types = ERDB::GetDataTypes(); | |
1604 | ||
1605 | Return a table of ERDB data types. The table returned is a hash of hashes. | |
1606 | The keys of the big hash are the datatypes. Each smaller hash has several | |
1607 | values used to manage the data. The most interesting is the SQL type (key | |
1608 | C<sqlType>) and the descriptive node (key C<notes>). | |
1609 | ||
1610 | Note that changing the values in the smaller hashes will seriously break | |
1611 | things, so this data should be treated as read-only. | |
1612 | ||
1613 | =cut | |
1614 | ||
1615 | sub GetDataTypes { | |
1616 | return %TypeTable; | |
1617 | } | |
1618 | ||
1619 | ||
1620 | =head3 IsEntity | |
1621 | ||
1622 | my $flag = $erdb->IsEntity($entityName); | |
1623 | ||
1624 | Return TRUE if the parameter is an entity name, else FALSE. | |
1625 | ||
1626 | =over 4 | |
1627 | ||
1628 | =item entityName | |
1629 | ||
1630 | Object name to be tested. | |
1631 | ||
1632 | =item RETURN | |
1633 | ||
1634 | Returns TRUE if the specified string is an entity name, else FALSE. | |
1635 | ||
1636 | =back | |
1637 | ||
1638 | =cut | |
1639 | ||
1640 | sub IsEntity { | |
1641 | # Get the parameters. | |
1642 | my ($self, $entityName) = @_; | |
1643 | # Test to see if it's an entity. | |
1644 | return exists $self->{_metaData}->{Entities}->{$entityName}; | |
1645 | } | |
1646 | ||
1647 | =head3 Get | =head3 Get |
1648 | ||
1649 | C<< my $query = $database->Get(\@objectNames, $filterClause, $param1, $param2, ..., $paramN); >> | my $query = $erdb->Get(\@objectNames, $filterClause, \@params); |
1650 | ||
1651 | This method returns a query object for entities of a specified type using a specified filter. | This method returns a query object for entities of a specified type using a specified filter. |
1652 | The filter is a standard WHERE/ORDER BY clause with question marks as parameter markers and each | The filter is a standard WHERE/ORDER BY clause with question marks as parameter markers and each |
# | Line 779 | Line 1654 |
1654 | following call requests all B<Genome> objects for the genus specified in the variable | following call requests all B<Genome> objects for the genus specified in the variable |
1655 | $genus. | $genus. |
1656 | ||
1657 | C<< $query = $sprout->Get(['Genome'], "Genome(genus) = ?", $genus); >> | $query = $erdb->Get(['Genome'], "Genome(genus) = ?", [$genus]); |
1658 | ||
1659 | The WHERE clause contains a single question mark, so there is a single additional | The WHERE clause contains a single question mark, so there is a single additional |
1660 | parameter representing the parameter value. It would also be possible to code | parameter representing the parameter value. It would also be possible to code |
1661 | ||
1662 | C<< $query = $sprout->Get(['Genome'], "Genome(genus) = \'$genus\'"); >> | $query = $erdb->Get(['Genome'], "Genome(genus) = \'$genus\'"); |
1663 | ||
1664 | however, this version of the call would generate a syntax error if there were any quote | however, this version of the call would generate a syntax error if there were any quote |
1665 | characters inside the variable C<$genus>. | characters inside the variable C<$genus>. |
# | Line 796 | Line 1671 |
1671 | It is possible to specify multiple entity and relationship names in order to retrieve more than | It is possible to specify multiple entity and relationship names in order to retrieve more than |
1672 | one object's data at the same time, which allows highly complex joined queries. For example, | one object's data at the same time, which allows highly complex joined queries. For example, |
1673 | ||
1674 | C<< $query = $sprout->Get(['Genome', 'ComesFrom', 'Source'], "Genome(genus) = ?", $genus); >> | $query = $erdb->Get(['Genome', 'ComesFrom', 'Source'], "Genome(genus) = ?", [$genus]); |
1675 | ||
1676 | If multiple names are specified, then the query processor will automatically determine a | If multiple names are specified, then the query processor will automatically determine a |
1677 | join path between the entities and relationships. The algorithm used is very simplistic. | join path between the entities and relationships. The algorithm used is very simplistic. |
1678 | In particular, you can't specify any entity or relationship more than once, and if a | In particular, if a relationship is recursive, the path is determined by the order in which |
1679 | relationship is recursive, the path is determined by the order in which the entity | the entity and the relationship appear. For example, consider a recursive relationship |
1680 | and the relationship appear. For example, consider a recursive relationship B<IsParentOf> | B<IsParentOf> which relates B<People> objects to other B<People> objects. If the join path is |
which relates B<People> objects to other B<People> objects. If the join path is | ||
1681 | coded as C<['People', 'IsParentOf']>, then the people returned will be parents. If, however, | coded as C<['People', 'IsParentOf']>, then the people returned will be parents. If, however, |
1682 | the join path is C<['IsParentOf', 'People']>, then the people returned will be children. | the join path is C<['IsParentOf', 'People']>, then the people returned will be children. |
1683 | ||
1684 | If an entity or relationship is mentioned twice, the name for the second occurrence will | |
1685 | be suffixed with C<2>, the third occurrence will be suffixed with C<3>, and so forth. So, | |
1686 | for example, if we have C<['Feature', 'HasContig', 'Contig', 'HasContig']>, then the | |
1687 | B<to-link> field of the first B<HasContig> is specified as C<HasContig(to-link)>, while | |
1688 | the B<to-link> field of the second B<HasContig> is specified as C<HasContig2(to-link)>. | |
1689 | ||
1690 | =over 4 | =over 4 |
1691 | ||
1692 | =item objectNames | =item objectNames |
# | Line 827 | Line 1707 |
1707 | with an ORDER BY clause. For example, the following filter string gets all genomes for a | with an ORDER BY clause. For example, the following filter string gets all genomes for a |
1708 | particular genus and sorts them by species name. | particular genus and sorts them by species name. |
1709 | ||
1710 | C<< "Genome(genus) = ? ORDER BY Genome(species)" >> | "Genome(genus) = ? ORDER BY Genome(species)" |
1711 | ||
1712 | Note that the case is important. Only an uppercase "ORDER BY" with a single space will | |
1713 | be processed. The idea is to make it less likely to find the verb by accident. | |
1714 | ||
1715 | The rules for field references in a sort order are the same as those for field references in the | The rules for field references in a sort order are the same as those for field references in the |
1716 | filter clause in general; however, odd things may happen if a sort field is from a secondary | filter clause in general; however, odd things may happen if a sort field is from a secondary |
1717 | relation. | relation. |
1718 | ||
1719 | =item param1, param2, ..., paramN | Finally, you can limit the number of rows returned by adding a LIMIT clause. The LIMIT must |
1720 | be the last thing in the filter clause, and it contains only the word "LIMIT" followed by | |
1721 | a positive number. So, for example | |
1722 | ||
1723 | "Genome(genus) = ? ORDER BY Genome(species) LIMIT 10" | |
1724 | ||
1725 | will only return the first ten genomes for the specified genus. The ORDER BY clause is not | |
1726 | required. For example, to just get the first 10 genomes in the B<Genome> table, you could | |
1727 | use | |
1728 | ||
1729 | "LIMIT 10" | |
1730 | ||
1731 | Parameter values to be substituted into the filter clause. | =item params |
1732 | ||
1733 | Reference to a list of parameter values to be substituted into the filter clause. | |
1734 | ||
1735 | =item RETURN | =item RETURN |
1736 | ||
# | Line 847 | Line 1742 |
1742 | ||
1743 | sub Get { | sub Get { |
1744 | # Get the parameters. | # Get the parameters. |
1745 | my ($self, $objectNames, $filterClause, @params) = @_; | my ($self, $objectNames, $filterClause, $params) = @_; |
1746 | # Construct the SELECT statement. The general pattern is | # Process the SQL stuff. |
1747 | # | my ($suffix, $mappedNameListRef, $mappedNameHashRef) = |
1748 | # SELECT name1.*, name2.*, ... nameN.* FROM name1, name2, ... nameN | $self->_SetupSQL($objectNames, $filterClause); |
1749 | # | # Create the query. |
1750 | my $dbh = $self->{_dbh}; | my $command = "SELECT " . join(".*, ", @{$mappedNameListRef}) . |
1751 | my $command = "SELECT DISTINCT " . join('.*, ', @{$objectNames}) . ".* FROM " . | ".* $suffix"; |
1752 | join(', ', @{$objectNames}); | my $sth = $self->_GetStatementHandle($command, $params); |
1753 | # Check for a filter clause. | # Now we create the relation map, which enables DBQuery to determine the order, name |
1754 | if ($filterClause) { | # and mapped name for each object in the query. |
1755 | # Here we have one, so we convert its field names and add it to the query. First, | my @relationMap = (); |
1756 | # We create a copy of the filter string we can work with. | for my $mappedName (@{$mappedNameListRef}) { |
1757 | my $filterString = $filterClause; | push @relationMap, [$mappedName, $mappedNameHashRef->{$mappedName}]; |
# Next, we sort the object names by length. This helps protect us from finding | ||
# object names inside other object names when we're doing our search and replace. | ||
my @sortedNames = sort { length($b) - length($a) } @{$objectNames}; | ||
# We will also keep a list of conditions to add to the WHERE clause in order to link | ||
# entities and relationships as well as primary relations to secondary ones. | ||
my @joinWhere = (); | ||
# The final preparatory step is to create a hash table of relation names. The | ||
# table begins with the relation names already in the SELECT command. | ||
my %fromNames = (); | ||
for my $objectName (@sortedNames) { | ||
$fromNames{$objectName} = 1; | ||
} | ||
# We are ready to begin. We loop through the object names, replacing each | ||
# object name's field references by the corresponding SQL field reference. | ||
# Along the way, if we find a secondary relation, we will need to add it | ||
# to the FROM clause. | ||
for my $objectName (@sortedNames) { | ||
# Get the length of the object name plus 2. This is the value we add to the | ||
# size of the field name to determine the size of the field reference as a | ||
# whole. | ||
my $nameLength = 2 + length $objectName; | ||
# Get the object's field list. | ||
my $fieldList = $self->_GetFieldTable($objectName); | ||
# Find the field references for this object. | ||
while ($filterString =~ m/$objectName\(([^)]*)\)/g) { | ||
# At this point, $1 contains the field name, and the current position | ||
# is set immediately after the final parenthesis. We pull out the name of | ||
# the field and the position and length of the field reference as a whole. | ||
my $fieldName = $1; | ||
my $len = $nameLength + length $fieldName; | ||
my $pos = pos($filterString) - $len; | ||
# Insure the field exists. | ||
if (!exists $fieldList->{$fieldName}) { | ||
Confess("Field $fieldName not found for object $objectName."); | ||
} else { | ||
# Get the field's relation. | ||
my $relationName = $fieldList->{$fieldName}->{relation}; | ||
# Insure the relation is in the FROM clause. | ||
if (!exists $fromNames{$relationName}) { | ||
# Add the relation to the FROM clause. | ||
$command .= ", $relationName"; | ||
# Create its join sub-clause. | ||
push @joinWhere, "$objectName.id = $relationName.id"; | ||
# Denote we have it available for future fields. | ||
$fromNames{$relationName} = 1; | ||
} | ||
# Form an SQL field reference from the relation name and the field name. | ||
my $sqlReference = "$relationName." . _FixName($fieldName); | ||
# Put it into the filter string in place of the old value. | ||
substr($filterString, $pos, $len) = $sqlReference; | ||
# Reposition the search. | ||
pos $filterString = $pos + length $sqlReference; | ||
} | ||
} | ||
} | ||
# The next step is to join the objects together. We only need to do this if there | ||
# is more than one object in the object list. We start with the first object and | ||
# run through the objects after it. Note also that we make a safety copy of the | ||
# list before running through it. | ||
my @objectList = @{$objectNames}; | ||
my $lastObject = shift @objectList; | ||
# Get the join table. | ||
my $joinTable = $self->{_metaData}->{Joins}; | ||
# Loop through the object list. | ||
for my $thisObject (@objectList) { | ||
# Look for a join. | ||
my $joinKey = "$lastObject/$thisObject"; | ||
if (!exists $joinTable->{$joinKey}) { | ||
# Here there's no join, so we throw an error. | ||
Confess("No join exists to connect from $lastObject to $thisObject."); | ||
} else { | ||
# Get the join clause and add it to the WHERE list. | ||
push @joinWhere, $joinTable->{$joinKey}; | ||
# Save this object as the last object for the next iteration. | ||
$lastObject = $thisObject; | ||
} | ||
} | ||
# Now we need to handle the whole ORDER BY thing. We'll put the order by clause | ||
# in the following variable. | ||
my $orderClause = ""; | ||
# Locate the ORDER BY verb (if any). | ||
if ($filterString =~ m/^(.*)ORDER BY/g) { | ||
# Here we have an ORDER BY verb. Split it off of the filter string. | ||
my $pos = pos $filterString; | ||
$orderClause = substr($filterString, $pos); | ||
$filterString = $1; | ||
} | ||
# Add the filter and the join clauses (if any) to the SELECT command. | ||
if ($filterString) { | ||
push @joinWhere, "($filterString)"; | ||
} | ||
if (@joinWhere) { | ||
$command .= " WHERE " . join(' AND ', @joinWhere); | ||
} | ||
# Add the sort clause (if any) to the SELECT command. | ||
if ($orderClause) { | ||
$command .= " ORDER BY $orderClause"; | ||
} | ||
1758 | } | } |
Trace("SQL query: $command") if T(2); | ||
Trace("PARMS: '" . (join "', '", @params) . "'") if (T(3) && (@params > 0)); | ||
my $sth = $dbh->prepare_command($command); | ||
# Execute it with the parameters bound in. | ||
$sth->execute(@params) || Confess("SELECT error" . $sth->errstr()); | ||
1759 | # Return the statement object. | # Return the statement object. |
1760 | my $retVal = DBQuery::_new($self, $sth, @{$objectNames}); | my $retVal = DBQuery::_new($self, $sth, \@relationMap); |
1761 | return $retVal; | return $retVal; |
1762 | } | } |
1763 | ||
=head3 GetList | ||
1764 | ||
C<< my @dbObjects = $database->GetList(\@objectNames, $filterClause, $param1, $param2, ..., $paramN); >> | ||
1765 | ||
1766 | Return a list of object descriptors for the specified objects as determined by the | =head3 Search |
specified filter clause. | ||
1767 | ||
1768 | This method is essentially the same as L</Get> except it returns a list of objects rather | my $query = $erdb->Search($searchExpression, $idx, \@objectNames, $filterClause, \@params); |
1769 | than a query object that can be used to get the results one record at a time. | |
1770 | Perform a full text search with filtering. The search will be against a specified object | |
1771 | in the object name list. That object will get an extra field containing the search | |
1772 | relevance. Note that except for the search expression, the parameters of this method are | |
1773 | the same as those for L</Get> and follow the same rules. | |
1774 | ||
1775 | =over 4 | =over 4 |
1776 | ||
1777 | =item searchExpression | |
1778 | ||
1779 | Boolean search expression for the text fields of the target object. The default mode for | |
1780 | a Boolean search expression is OR, but we want the default to be AND, so we will | |
1781 | add a C<+> operator to each word with no other operator before it. | |
1782 | ||
1783 | =item idx | |
1784 | ||
1785 | Index in the I<$objectNames> list of the table to be searched in full-text mode. | |
1786 | ||
1787 | =item objectNames | =item objectNames |
1788 | ||
1789 | List containing the names of the entity and relationship objects to be retrieved. | List containing the names of the entity and relationship objects to be retrieved. |
# | Line 995 | Line 1798 |
1798 | or secondary entity relations; however, all of the entities and relationships involved must | or secondary entity relations; however, all of the entities and relationships involved must |
1799 | be included in the list of object names. | be included in the list of object names. |
1800 | ||
1801 | The filter clause can also specify a sort order. To do this, simply follow the filter string | =item params |
with an ORDER BY clause. For example, the following filter string gets all genomes for a | ||
particular genus and sorts them by species name. | ||
C<< "Genome(genus) = ? ORDER BY Genome(species)" >> | ||
The rules for field references in a sort order are the same as those for field references in the | ||
filter clause in general; however, odd things may happen if a sort field is from a secondary | ||
relation. | ||
=item param1, param2, ..., paramN | ||
1802 | ||
1803 | Parameter values to be substituted into the filter clause. | Reference to a list of parameter values to be substituted into the filter clause. |
1804 | ||
1805 | =item RETURN | =item RETURN |
1806 | ||
1807 | Returns a list of B<DBObject>s that satisfy the query conditions. | Returns a query object for the specified search. |
1808 | ||
1809 | =back | =back |
1810 | ||
1811 | =cut | =cut |
1812 | #: Return Type @% | |
1813 | sub GetList { | sub Search { |
1814 | # Get the parameters. | # Get the parameters. |
1815 | my ($self, $objectNames, $filterClause, @params) = @_; | my ($self, $searchExpression, $idx, $objectNames, $filterClause, $params) = @_; |
1816 | # Declare the return variable. | # Declare the return variable. |
1817 | my @retVal = (); | my $retVal; |
1818 | # Perform the query. | # Create a safety copy of the parameter list. Note we have to be careful to insure |
1819 | my $query = $self->Get($objectNames, $filterClause, @params); | # a parameter list exists before we copy it. |
1820 | # Loop through the results. | my @myParams = (); |
1821 | while (my $object = $query->Fetch) { | if (defined $params) { |
1822 | push @retVal, $object; | @myParams = @{$params}; |
1823 | } | |
1824 | # Get the first object's structure so we have access to the searchable fields. | |
1825 | my $object1Name = $objectNames->[$idx]; | |
1826 | my $object1Structure = $self->_GetStructure($object1Name); | |
1827 | # Get the field list. | |
1828 | if (! exists $object1Structure->{searchFields}) { | |
1829 | Confess("No searchable index for $object1Name."); | |
1830 | } else { | |
1831 | # Get the field list. | |
1832 | my @fields = @{$object1Structure->{searchFields}}; | |
1833 | # Clean the search expression. | |
1834 | my $actualKeywords = $self->CleanKeywords($searchExpression); | |
1835 | # Prefix a "+" to each uncontrolled word. This converts the default | |
1836 | # search mode from OR to AND. | |
1837 | $actualKeywords =~ s/(^|\s)(\w|")/$1\+$2/g; | |
1838 | Trace("Actual keywords for search are\n$actualKeywords") if T(3); | |
1839 | # We need two match expressions, one for the filter clause and one in the | |
1840 | # query itself. Both will use a parameter mark, so we need to push the | |
1841 | # search expression onto the front of the parameter list twice. | |
1842 | unshift @myParams, $actualKeywords, $actualKeywords; | |
1843 | # Build the match expression. | |
1844 | my @matchFilterFields = map { "$object1Name." . _FixName($_) } @fields; | |
1845 | my $matchClause = "MATCH (" . join(", ", @matchFilterFields) . ") AGAINST (? IN BOOLEAN MODE)"; | |
1846 | # Process the SQL stuff. | |
1847 | my ($suffix, $mappedNameListRef, $mappedNameHashRef) = | |
1848 | $self->_SetupSQL($objectNames, $filterClause, $matchClause); | |
1849 | # Create the query. Note that the match clause is inserted at the front of | |
1850 | # the select fields. | |
1851 | my $command = "SELECT $matchClause, " . join(".*, ", @{$mappedNameListRef}) . | |
1852 | ".* $suffix"; | |
1853 | my $sth = $self->_GetStatementHandle($command, \@myParams); | |
1854 | # Now we create the relation map, which enables DBQuery to determine the order, name | |
1855 | # and mapped name for each object in the query. | |
1856 | my @relationMap = _RelationMap($mappedNameHashRef, $mappedNameListRef); | |
1857 | # Return the statement object. | |
1858 | $retVal = DBQuery::_new($self, $sth, \@relationMap, $object1Name); | |
1859 | } | } |
1860 | # Return the result. | return $retVal; |
return @retVal; | ||
1861 | } | } |
1862 | ||
1863 | =head3 ComputeObjectSentence | =head3 GetFlat |
1864 | ||
1865 | C<< my $sentence = $database->ComputeObjectSentence($objectName); >> | my @list = $erdb->GetFlat(\@objectNames, $filterClause, \@parameterList, $field); |
1866 | ||
1867 | Check an object name, and if it is a relationship convert it to a relationship sentence. | This is a variation of L</GetAll> that asks for only a single field per record and |
1868 | returns a single flattened list. | |
1869 | ||
1870 | =over 4 | =over 4 |
1871 | ||
1872 | =item objectName | =item objectNames |
1873 | ||
1874 | Name of the entity or relationship. | List containing the names of the entity and relationship objects to be retrieved. |
1875 | ||
1876 | =item filterClause | |
1877 | ||
1878 | WHERE/ORDER BY clause (without the WHERE) to be used to filter and sort the query. The WHERE clause can | |
1879 | be parameterized with parameter markers (C<?>). Each field used must be specified in the standard form | |
1880 | B<I<objectName>(I<fieldName>)>. Any parameters specified in the filter clause should be added to the | |
1881 | parameter list as additional parameters. The fields in a filter clause can come from primary | |
1882 | entity relations, relationship relations, or secondary entity relations; however, all of the | |
1883 | entities and relationships involved must be included in the list of object names. | |
1884 | ||
1885 | =item parameterList | |
1886 | ||
1887 | List of the parameters to be substituted in for the parameters marks in the filter clause. | |
1888 | ||
1889 | =item field | |
1890 | ||
1891 | Name of the field to be used to get the elements of the list returned. | |
1892 | ||
1893 | =item RETURN | =item RETURN |
1894 | ||
1895 | Returns a string containing the entity name or a relationship sentence. | Returns a list of values. |
1896 | ||
1897 | =back | =back |
1898 | ||
1899 | =cut | =cut |
1900 | #: Return Type @; | |
1901 | sub ComputeObjectSentence { | sub GetFlat { |
1902 | # Get the parameters. | # Get the parameters. |
1903 | my ($self, $objectName) = @_; | my ($self, $objectNames, $filterClause, $parameterList, $field) = @_; |
1904 | # Set the default return value. | # Construct the query. |
1905 | my $retVal = $objectName; | my $query = $self->Get($objectNames, $filterClause, $parameterList); |
1906 | # Look for the object as a relationship. | # Create the result list. |
1907 | my $relTable = $self->{_metaData}->{Relationships}; | my @retVal = (); |
1908 | if (exists $relTable->{$objectName}) { | # Loop through the records, adding the field values found to the result list. |
1909 | # Get the relationship sentence. | while (my $row = $query->Fetch()) { |
1910 | $retVal = _ComputeRelationshipSentence($objectName, $relTable->{$objectName}); | push @retVal, $row->Value($field); |
1911 | } | } |
1912 | # Return the result. | # Return the list created. |
1913 | return $retVal; | return @retVal; |
1914 | } | } |
1915 | ||
1916 | =head3 DumpRelations | =head3 SpecialFields |
1917 | ||
1918 | C<< $database->DumpRelations($outputDirectory); >> | my %specials = $erdb->SpecialFields($entityName); |
1919 | ||
1920 | Write the contents of all the relations to tab-delimited files in the specified directory. | Return a hash mapping special fields in the specified entity to the value of their |
1921 | Each file will have the same name as the relation dumped, with an extension of DTX. | C<special> attribute. This enables the subclass to get access to the special field |
1922 | attributes without needed to plumb the internal ERDB data structures. | |
1923 | ||
1924 | =over 4 | =over 4 |
1925 | ||
1926 | =item outputDirectory | =item entityName |
1927 | ||
1928 | Name of the directory into which the relation files should be dumped. | Name of the entity whose special fields are desired. |
1929 | ||
1930 | =item RETURN | |
1931 | ||
1932 | Returns a hash. The keys of the hash are the special field names, and the values | |
1933 | are the values from each special field's C<special> attribute. | |
1934 | ||
1935 | =back | =back |
1936 | ||
1937 | =cut | =cut |
1938 | ||
1939 | sub DumpRelations { | sub SpecialFields { |
1940 | # Get the parameters. | # Get the parameters. |
1941 | my ($self, $outputDirectory) = @_; | my ($self, $entityName) = @_; |
1942 | # Now we need to run through all the relations. First, we loop through the entities. | # Declare the return variable. |
1943 | my $metaData = $self->{_metaData}; | my %retVal = (); |
1944 | my $entities = $metaData->{Entities}; | # Find the entity's data structure. |
1945 | for my $entityName (keys %{$entities}) { | my $entityData = $self->{_metaData}->{Entities}->{$entityName}; |
1946 | my $entityStructure = $entities->{$entityName}; | # Loop through its fields, adding each special field to the return hash. |
1947 | # Get the entity's relations. | my $fieldHash = $entityData->{Fields}; |
1948 | my $relationList = $entityStructure->{Relations}; | for my $fieldName (keys %{$fieldHash}) { |
1949 | # Loop through the relations, dumping them. | my $fieldData = $fieldHash->{$fieldName}; |
1950 | for my $relationName (keys %{$relationList}) { | if (exists $fieldData->{special}) { |
1951 | my $relation = $relationList->{$relationName}; | $retVal{$fieldName} = $fieldData->{special}; |
$self->_DumpRelation($outputDirectory, $relationName, $relation); | ||
} | ||
1952 | } | } |
# Next, we loop through the relationships. | ||
my $relationships = $metaData->{Relationships}; | ||
for my $relationshipName (keys %{$relationships}) { | ||
my $relationshipStructure = $relationships->{$relationshipName}; | ||
# Dump this relationship's relation. | ||
$self->_DumpRelation($outputDirectory, $relationshipName, $relationshipStructure->{Relations}->{$relationshipName}); | ||
1953 | } | } |
1954 | # Return the result. | |
1955 | return %retVal; | |
1956 | } | } |
1957 | ||
1958 | =head3 InsertObject | =head3 Delete |
1959 | ||
1960 | C<< my $ok = $database->InsertObject($objectType, \%fieldHash); >> | my $stats = $erdb->Delete($entityName, $objectID, %options); |
1961 | ||
1962 | Insert an object into the database. The object is defined by a type name and then a hash | Delete an entity instance from the database. The instance is deleted along with all entity and |
1963 | of field names to values. Field values in the primary relation are represented by scalars. | relationship instances dependent on it. The definition of I<dependence> is recursive. |
(Note that for relationships, the primary relation is the B<only> relation.) | ||
Field values for the other relations comprising the entity are always list references. For | ||
example, the following line inserts an inactive PEG feature named C<fig|188.1.peg.1> with aliases | ||
C<ZP_00210270.1> and C<gi|46206278>. | ||
1964 | ||
1965 | C<< $database->InsertObject('Feature', { id => 'fig|188.1.peg.1', active => 0, feature-type => 'peg', alias => ['ZP_00210270.1', 'gi|46206278']}); >> | An object is always dependent on itself. An object is dependent if it is a 1-to-many or many-to-many |
1966 | relationship connected to a dependent entity or if it is the "to" entity connected to a 1-to-many | |
1967 | dependent relationship. | |
1968 | ||
1969 | The next statement inserts a C<HasProperty> relationship between feature C<fig|158879.1.peg.1> and | =over 4 |
property C<4> with an evidence URL of C<http://seedu.uchicago.edu/query.cgi?article_id=142>. | ||
1970 | ||
1971 | C<< $database->InsertObject('HasProperty', { 'from-link' => 'fig|158879.1.peg.1', 'to-link' => 4, evidence = 'http://seedu.uchicago.edu/query.cgi?article_id=142'}); >> | =item entityName |
1972 | ||
1973 | =over 4 | Name of the entity type for the instance being deleted. |
1974 | ||
1975 | =item newObjectType | =item objectID |
1976 | ||
1977 | Type name of the object to insert. | ID of the entity instance to be deleted. If the ID contains a wild card character (C<%>), |
1978 | then it is presumed to by a LIKE pattern. | |
1979 | ||
1980 | =item fieldHash | =item options |
1981 | ||
1982 | Hash of field names to values. | A hash detailing the options for this delete operation. |
1983 | ||
1984 | =item RETURN | =item RETURN |
1985 | ||
1986 | Returns 1 if successful, 0 if an error occurred. | Returns a statistics object indicating how many records of each particular table were |
1987 | deleted. | |
1988 | ||
1989 | =back | =back |
1990 | ||
1991 | =cut | The permissible options for this method are as follows. |
1992 | ||
1993 | sub InsertObject { | =over 4 |
1994 | ||
1995 | =item testMode | |
1996 | ||
1997 | If TRUE, then the delete statements will be traced, but no changes will be made to the database. | |
1998 | ||
1999 | =item keepRoot | |
2000 | ||
2001 | If TRUE, then the entity instances will not be deleted, only the dependent records. | |
2002 | ||
2003 | =back | |
2004 | ||
2005 | =cut | |
2006 | #: Return Type $%; | |
2007 | sub Delete { | |
2008 | # Get the parameters. | # Get the parameters. |
2009 | my ($self, $newObjectType, $fieldHash) = @_; | my ($self, $entityName, $objectID, %options) = @_; |
2010 | # Denote that so far we appear successful. | # Declare the return variable. |
2011 | my $retVal = 1; | my $retVal = Stats->new(); |
2012 | # Get the database handle. | # Get the DBKernel object. |
2013 | my $dbh = $self->{_dbh}; | my $db = $self->{_dbh}; |
2014 | # Get the relation list. | # We're going to generate all the paths branching out from the starting entity. One of |
2015 | my $relationTable = $self->_GetRelationTable($newObjectType); | # the things we have to be careful about is preventing loops. We'll use a hash to |
2016 | # Loop through the relations. We'll build insert statements for each one. If a relation is | # determine if we've hit a loop. |
2017 | # secondary, we may end up generating multiple insert statements. If an error occurs, we | my %alreadyFound = (); |
2018 | # stop the loop. | # These next lists will serve as our result stack. We start by pushing object lists onto |
2019 | my @relationList = keys %{$relationTable}; | # the stack, and then popping them off to do the deletes. This means the deletes will |
2020 | for (my $i = 0; $retVal && $i <= $#relationList; $i++) { | # start with the longer paths before getting to the shorter ones. That, in turn, makes |
2021 | my $relationName = $relationList[$i]; | # sure we don't delete records that might be needed to forge relationships back to the |
2022 | my $relationDefinition = $relationTable->{$relationName}; | # original item. We have two lists-- one for TO-relationships, and one for |
2023 | # Get the relation's fields. For each field we will collect a value in the corresponding | # FROM-relationships and entities. |
2024 | # position of the @valueList array. If one of the fields is missing, we will add it to the | my @fromPathList = (); |
2025 | # @missing list. | my @toPathList = (); |
2026 | my @fieldList = @{$relationDefinition->{Fields}}; | # This final list is used to remember what work still needs to be done. We push paths |
2027 | my @fieldNameList = (); | # onto the list, then pop them off to extend the paths. We prime it with the starting |
2028 | my @valueList = (); | # point. Note that we will work hard to insure that the last item on a path in the |
2029 | my @missing = (); | # to-do list is always an entity. |
2030 | my $recordCount = 1; | my @todoList = ([$entityName]); |
2031 | for my $fieldDescriptor (@fieldList) { | while (@todoList) { |
2032 | # Get the field name and save it. Note we need to fix it up so the hyphens | # Get the current path. |
2033 | # are converted to underscores. | my $current = pop @todoList; |
2034 | my $fieldName = $fieldDescriptor->{name}; | # Copy it into a list. |
2035 | push @fieldNameList, _FixName($fieldName); | my @stackedPath = @{$current}; |
2036 | # Look for the named field in the incoming structure. Note that we are looking | # Pull off the last item on the path. It will always be an entity. |
2037 | # for the real field name, not the fixed-up one! | my $myEntityName = pop @stackedPath; |
2038 | if (exists $fieldHash->{$fieldName}) { | # Add it to the alreadyFound list. |
2039 | # Here we found the field. Stash it in the value list. | $alreadyFound{$myEntityName} = 1; |
2040 | my $value = $fieldHash->{$fieldName}; | # Figure out if we need to delete this entity. |
2041 | push @valueList, $value; | if ($myEntityName ne $entityName || ! $options{keepRoot}) { |
2042 | # If the value is a list, we may need to increment the record count. | # Get the entity data. |
2043 | if (ref $value eq "ARRAY") { | my $entityData = $self->_GetStructure($myEntityName); |
2044 | my $thisCount = @{$value}; | # Loop through the entity's relations. A DELETE command will be needed for each of them. |
2045 | if ($recordCount == 1) { | my $relations = $entityData->{Relations}; |
2046 | # Here we have our first list, so we save its count. | for my $relation (keys %{$relations}) { |
2047 | $recordCount = $thisCount; | my @augmentedList = (@stackedPath, $relation); |
2048 | } elsif ($recordCount != $thisCount) { | push @fromPathList, \@augmentedList; |
# Here we have a second list, so its length has to match the | ||
# previous lists. | ||
Trace("Field $value in new $newObjectType object has an invalid list length $thisCount. Expected $recordCount.") if T(0); | ||
$retVal = 0; | ||
2049 | } | } |
2050 | } | } |
2051 | # Now we need to look for relationships connected to this entity. | |
2052 | my $relationshipList = $self->{_metaData}->{Relationships}; | |
2053 | for my $relationshipName (keys %{$relationshipList}) { | |
2054 | my $relationship = $relationshipList->{$relationshipName}; | |
2055 | # Check the FROM field. We're only interested if it's us. | |
2056 | if ($relationship->{from} eq $myEntityName) { | |
2057 | # Add the path to this relationship. | |
2058 | my @augmentedList = (@stackedPath, $myEntityName, $relationshipName); | |
2059 | push @fromPathList, \@augmentedList; | |
2060 | # Check the arity. If it's MM we're done. If it's 1M | |
2061 | # and the target hasn't been seen yet, we want to | |
2062 | # stack the entity for future processing. | |
2063 | if ($relationship->{arity} eq '1M') { | |
2064 | my $toEntity = $relationship->{to}; | |
2065 | if (! exists $alreadyFound{$toEntity}) { | |
2066 | # Here we have a new entity that's dependent on | |
2067 | # the current entity, so we need to stack it. | |
2068 | my @stackList = (@augmentedList, $toEntity); | |
2069 | push @fromPathList, \@stackList; | |
2070 | } else { | } else { |
2071 | # Here the field is not present. Flag it as missing. | Trace("$toEntity ignored because it occurred previously.") if T(4); |
push @missing, $fieldName; | ||
2072 | } | } |
2073 | } | } |
# If we are the primary relation, add the new-record flag. | ||
if ($relationName eq $newObjectType) { | ||
push @valueList, 1; | ||
push @fieldNameList, "new_record"; | ||
2074 | } | } |
2075 | # Only proceed if there are no missing fields. | # Now check the TO field. In this case only the relationship needs |
2076 | if (@missing > 0) { | # deletion. |
2077 | Trace("Relation $relationName for $newObjectType skipped due to missing fields: " . | if ($relationship->{to} eq $myEntityName) { |
2078 | join(' ', @missing)) if T(1); | my @augmentedList = (@stackedPath, $myEntityName, $relationshipName); |
2079 | } else { | push @toPathList, \@augmentedList; |
# Build the INSERT statement. | ||
my $statement = "INSERT INTO $relationName (" . join (', ', @fieldNameList) . | ||
") VALUES ("; | ||
# Create a marker list of the proper size and put it in the statement. | ||
my @markers = (); | ||
while (@markers < @fieldNameList) { push @markers, '?'; } | ||
$statement .= join(', ', @markers) . ")"; | ||
# We have the insert statement, so prepare it. | ||
my $sth = $dbh->prepare_command($statement); | ||
Trace("Insert statement prepared: $statement") if T(3); | ||
# Now we loop through the values. If a value is scalar, we use it unmodified. If it's | ||
# a list, we use the current element. The values are stored in the @parameterList array. | ||
my $done = 0; | ||
for (my $i = 0; $i < $recordCount; $i++) { | ||
# Clear the parameter list array. | ||
my @parameterList = (); | ||
# Loop through the values. | ||
for my $value (@valueList) { | ||
# Check to see if this is a scalar value. | ||
if (ref $value eq "ARRAY") { | ||
# Here we have a list value. Pull the current entry. | ||
push @parameterList, $value->[$i]; | ||
} else { | ||
# Here we have a scalar value. Use it unmodified. | ||
push @parameterList, $value; | ||
2080 | } | } |
2081 | } | } |
# Execute the INSERT statement with the specified parameter list. | ||
$retVal = $sth->execute(@parameterList); | ||
if (!$retVal) { | ||
my $errorString = $sth->errstr(); | ||
Trace("Insert error: $errorString.") if T(0); | ||
2082 | } | } |
2083 | # Create the first qualifier for the WHERE clause. This selects the | |
2084 | # keys of the primary entity records to be deleted. When we're deleting | |
2085 | # from a dependent table, we construct a join path from the first qualifier | |
2086 | # to the table containing the dependent records to delete. | |
2087 | my $qualifier = ($objectID =~ /%/ ? "LIKE ?" : "= ?"); | |
2088 | # We need to make two passes. The first is through the to-list, and | |
2089 | # the second through the from-list. The from-list is second because | |
2090 | # the to-list may need to pass through some of the entities the | |
2091 | # from-list would delete. | |
2092 | my %stackList = ( from_link => \@fromPathList, to_link => \@toPathList ); | |
2093 | # Now it's time to do the deletes. We do it in two passes. | |
2094 | for my $keyName ('to_link', 'from_link') { | |
2095 | # Get the list for this key. | |
2096 | my @pathList = @{$stackList{$keyName}}; | |
2097 | Trace(scalar(@pathList) . " entries in path list for $keyName.") if T(3); | |
2098 | # Loop through this list. | |
2099 | while (my $path = pop @pathList) { | |
2100 | # Get the table whose rows are to be deleted. | |
2101 | my @pathTables = @{$path}; | |
2102 | # Start the DELETE statement. We need to call DBKernel because the | |
2103 | # syntax of a DELETE-USING varies among DBMSs. | |
2104 | my $target = $pathTables[$#pathTables]; | |
2105 | my $stmt = $db->SetUsing(@pathTables); | |
2106 | # Now start the WHERE. The first thing is the ID field from the starting table. That | |
2107 | # starting table will either be the entity relation or one of the entity's | |
2108 | # sub-relations. | |
2109 | $stmt .= " WHERE $pathTables[0].id $qualifier"; | |
2110 | # Now we run through the remaining entities in the path, connecting them up. | |
2111 | for (my $i = 1; $i <= $#pathTables; $i += 2) { | |
2112 | # Connect the current relationship to the preceding entity. | |
2113 | my ($entity, $rel) = @pathTables[$i-1,$i]; | |
2114 | # The style of connection depends on the direction of the relationship. | |
2115 | $stmt .= " AND $entity.id = $rel.$keyName"; | |
2116 | if ($i + 1 <= $#pathTables) { | |
2117 | # Here there's a next entity, so connect that to the relationship's | |
2118 | # to-link. | |
2119 | my $entity2 = $pathTables[$i+1]; | |
2120 | $stmt .= " AND $rel.to_link = $entity2.id"; | |
2121 | } | |
2122 | } | |
2123 | # Now we have our desired DELETE statement. | |
2124 | if ($options{testMode}) { | |
2125 | # Here the user wants to trace without executing. | |
2126 | Trace($stmt) if T(0); | |
2127 | } else { | |
2128 | # Here we can delete. Note that the SQL method dies with a confession | |
2129 | # if an error occurs, so we just go ahead and do it. | |
2130 | Trace("Executing delete from $target using '$objectID'.") if T(3); | |
2131 | my $rv = $db->SQL($stmt, 0, $objectID); | |
2132 | # Accumulate the statistics for this delete. The only rows deleted | |
2133 | # are from the target table, so we use its name to record the | |
2134 | # statistic. | |
2135 | $retVal->Add($target, $rv); | |
2136 | } | } |
2137 | } | } |
2138 | } | } |
2139 | # Return the success indicator. | # Return the result. |
2140 | return $retVal; | return $retVal; |
2141 | } | } |
2142 | ||
2143 | =head3 LoadTable | =head3 Disconnect |
2144 | ||
2145 | C<< my %results = $database->LoadTable($fileName, $relationName, $truncateFlag); >> | $erdb->Disconnect($relationshipName, $originEntityName, $originEntityID); |
2146 | ||
2147 | Load data from a tab-delimited file into a specified table, optionally re-creating the table | Disconnect an entity instance from all the objects to which it is related. This |
2148 | first. | will delete each relationship instance that connects to the specified entity. |
2149 | ||
2150 | =over 4 | =over 4 |
2151 | ||
2152 | =item fileName | =item relationshipName |
Name of the file from which the table data should be loaded. | ||
=item relationName | ||
2153 | ||
2154 | Name of the relation to be loaded. This is the same as the table name. | Name of the relationship whose instances are to be deleted. |
2155 | ||
2156 | =item truncateFlag | =item originEntityName |
2157 | ||
2158 | TRUE if the table should be dropped and re-created, else FALSE | Name of the entity that is to be disconnected. |
2159 | ||
2160 | =item RETURN | =item originEntityID |
2161 | ||
2162 | Returns a statistical object containing the number of records read and a list of | ID of the entity that is to be disconnected. |
the error messages. | ||
2163 | ||
2164 | =back | =back |
2165 | ||
2166 | =cut | =cut |
2167 | sub LoadTable { | |
2168 | sub Disconnect { | |
2169 | # Get the parameters. | # Get the parameters. |
2170 | my ($self, $fileName, $relationName, $truncateFlag) = @_; | my ($self, $relationshipName, $originEntityName, $originEntityID) = @_; |
2171 | # Create the statistical return object. | # Get the relationship descriptor. |
2172 | my $retVal = _GetLoadStats(); | my $structure = $self->_GetStructure($relationshipName); |
2173 | # Trace the fact of the load. | # Insure we have a relationship. |
2174 | Trace("Loading table $relationName from $fileName") if T(2); | if (! exists $structure->{from}) { |
2175 | Confess("$relationshipName is not a relationship in the database."); | |
2176 | } else { | |
2177 | # Get the database handle. | # Get the database handle. |
2178 | my $dbh = $self->{_dbh}; | my $dbh = $self->{_dbh}; |
2179 | # Get the relation data. | # We'll set this value to 1 if we find our entity. |
2180 | my $relation = $self->_FindRelation($relationName); | my $found = 0; |
2181 | # Check the truncation flag. | # Loop through the ends of the relationship. |
2182 | if ($truncateFlag) { | for my $dir ('from', 'to') { |
2183 | Trace("Creating table $relationName") if T(2); | if ($structure->{$dir} eq $originEntityName) { |
2184 | # Re-create the table without its index. | $found = 1; |
2185 | $self->CreateTable($relationName, 0); | # Here we want to delete all relationship instances on this side of the |
2186 | # If this is a pre-index DBMS, create the index here. | # entity instance. |
2187 | if ($dbh->{_preIndex}) { | Trace("Disconnecting in $dir direction with ID \"$originEntityID\"."); |
2188 | eval { | # We do this delete in batches to keep it from dragging down the |
2189 | $self->CreateIndex($relationName); | # server. |
2190 | }; | my $limitClause = ($FIG_Config::delete_limit ? "LIMIT $FIG_Config::delete_limit" : ""); |
2191 | if ($@) { | my $done = 0; |
2192 | $retVal->AddMessage($@); | while (! $done) { |
2193 | } | # Do the delete. |
2194 | my $rows = $dbh->SQL("DELETE FROM $relationshipName WHERE ${dir}_link = ? $limitClause", 0, $originEntityID); | |
2195 | # See if we're done. We're done if no rows were found or the delete is unlimited. | |
2196 | $done = ($rows == 0 || ! $limitClause); | |
2197 | } | } |
2198 | } | } |
# Determine whether or not this is a primary relation. Primary relations have an extra | ||
# field indicating whether or not a given object is new or was loaded from the flat files. | ||
my $primary = $self->_IsPrimary($relationName); | ||
# Get the number of fields in this relation. | ||
my @fieldList = @{$relation->{Fields}}; | ||
my $fieldCount = @fieldList; | ||
# Start a database transaction. | ||
$dbh->begin_tran; | ||
# Open the relation file. We need to create a cleaned-up copy before loading. | ||
open TABLEIN, '<', $fileName; | ||
my $tempName = "$fileName.tbl"; | ||
open TABLEOUT, '>', $tempName; | ||
my $inputCount = 0; | ||
# Loop through the file. | ||
while (<TABLEIN>) { | ||
$inputCount++; | ||
# Chop off the new-line character. | ||
my $record = Tracer::Strip($_); | ||
# Only proceed if the record is non-blank. | ||
if ($record) { | ||
# Escape all the backslashes found in the line. | ||
$record =~ s/\\/\\\\/g; | ||
# Insure the number of fields is correct. | ||
my @fields = split /\t/, $record; | ||
while (@fields > $fieldCount) { | ||
my $extraField = $fields[$#fields]; | ||
delete $fields[$#fields]; | ||
if ($extraField) { | ||
Trace("Nonblank extra field value \"$extraField\" deleted from record $inputCount of $fileName.") if T(1); | ||
} | ||
} | ||
while (@fields < $fieldCount) { | ||
push @fields, ""; | ||
} | ||
# If this is a primary relation, add a 0 for the new-record flag (indicating that | ||
# this record is not new, but part of the original load). | ||
if ($primary) { | ||
push @fields, "0"; | ||
} | ||
# Write the record. | ||
$record = join "\t", @fields; | ||
print TABLEOUT "$record\n"; | ||
# Count the record written. | ||
my $count = $retVal->Add('records'); | ||
my $len = length $record; | ||
Trace("Record $count written with $len characters.") if T(4); | ||
} else { | ||
# Here we have a blank record. | ||
$retVal->Add('skipped'); | ||
} | ||
} | ||
# Close the files. | ||
close TABLEIN; | ||
close TABLEOUT; | ||
Trace("Temporary file $tempName created.") if T(2); | ||
# Load the table. | ||
my $rv; | ||
eval { | ||
$rv = $dbh->load_table(file => $tempName, tbl => $relationName); | ||
}; | ||
if (!defined $rv) { | ||
$retVal->AddMessage($@) if ($@); | ||
$retVal->AddMessage("Table load failed for $relationName using $tempName."); | ||
Trace("Table load failed for $relationName.") if T(1); | ||
} else { | ||
# Here we successfully loaded the table. Trace the number of records loaded. | ||
Trace("$retVal->{records} records read for $relationName.") if T(2); | ||
# If we're rebuilding, we need to create the table indexes. | ||
if ($truncateFlag && ! $dbh->{_preIndex}) { | ||
eval { | ||
$self->CreateIndex($relationName); | ||
}; | ||
if ($@) { | ||
$retVal->AddMessage($@); | ||
2199 | } | } |
2200 | # Insure we found the entity on at least one end. | |
2201 | if (! $found) { | |
2202 | Confess("Entity \"$originEntityName\" does not use $relationshipName."); | |
2203 | } | } |
# Analyze the table to help optimize tables. | ||
2204 | } | } |
# Commit the database changes. | ||
$dbh->commit_tran; | ||
$dbh->vacuum_it($relationName); | ||
# Delete the temporary file. | ||
unlink $tempName; | ||
# Return the statistics. | ||
return $retVal; | ||
2205 | } | } |
2206 | ||
2207 | =head3 GenerateEntity | =head3 DeleteRow |
2208 | ||
2209 | C<< my $fieldHash = $database->GenerateEntity($id, $type, \%values); >> | $erdb->DeleteRow($relationshipName, $fromLink, $toLink, \%values); |
2210 | ||
2211 | Generate the data for a new entity instance. This method creates a field hash suitable for | Delete a row from a relationship. In most cases, only the from-link and to-link are |
2212 | passing as a parameter to L</InsertObject>. The ID is specified by the callr, but the rest | needed; however, for relationships with intersection data values can be specified |
2213 | of the fields are generated using information in the database schema. | for the other fields using a hash. |
Each data type has a default algorithm for generating random test data. This can be overridden | ||
by including a B<DataGen> element in the field. If this happens, the content of the element is | ||
executed as a PERL program in the context of this module. The element may make use of a C<$this> | ||
variable which contains the field hash as it has been built up to the current point. If any | ||
fields are dependent on other fields, the C<pass> attribute can be used to control the order | ||
in which the fields are generated. A field with a high data pass number will be generated after | ||
a field with a lower one. If any external values are needed, they should be passed in via the | ||
optional third parameter, which will be available to the data generation script under the name | ||
C<$value>. Several useful utility methods are provided for generating random values, including | ||
L</IntGen>, L</StringGen>, L</FloatGen>, and L</DateGen>. Note that dates are stored and generated | ||
in the form of a timestamp number rather than a string. | ||
2214 | ||
2215 | =over 4 | =over 4 |
2216 | ||
2217 | =item id | =item relationshipName |
2218 | ||
2219 | ID to assign to the new entity. | Name of the relationship from which the row is to be deleted. |
2220 | ||
2221 | =item type | =item fromLink |
2222 | ||
2223 | ID of the entity instance in the From direction. | |
2224 | ||
2225 | =item toLink | |
2226 | ||
2227 | Type name for the new entity. | ID of the entity instance in the To direction. |
2228 | ||
2229 | =item values | =item values |
2230 | ||
2231 | Hash containing additional values that might be needed by the data generation methods (optional). | Reference to a hash of other values to be used for filtering the delete. |
2232 | ||
2233 | =back | =back |
2234 | ||
2235 | =cut | =cut |
2236 | ||
2237 | sub GenerateEntity { | sub DeleteRow { |
2238 | # Get the parameters. | # Get the parameters. |
2239 | my ($self, $id, $type, $values) = @_; | my ($self, $relationshipName, $fromLink, $toLink, $values) = @_; |
2240 | # Create the return hash. | # Create a hash of all the filter information. |
2241 | my $this = { id => $id }; | my %filter = ('from-link' => $fromLink, 'to-link' => $toLink); |
2242 | # Get the metadata structure. | if (defined $values) { |
2243 | my $metadata = $self->{_metaData}; | for my $key (keys %{$values}) { |
2244 | # Get this entity's list of fields. | $filter{$key} = $values->{$key}; |
2245 | if (!exists $metadata->{Entities}->{$type}) { | } |
2246 | Confess("Unrecognized entity type $type in GenerateEntity."); | } |
2247 | } else { | # Build an SQL statement out of the hash. |
2248 | my $entity = $metadata->{Entities}->{$type}; | my @filters = (); |
2249 | my $fields = $entity->{Fields}; | my @parms = (); |
2250 | # Generate data from the fields. | for my $key (keys %filter) { |
2251 | _GenerateFields($this, $fields, $type, $values); | push @filters, _FixName($key) . " = ?"; |
2252 | push @parms, $filter{$key}; | |
2253 | } | } |
2254 | # Return the hash created. | Trace("Parms for delete row are " . join(", ", map { "\"$_\"" } @parms) . ".") if T(SQL => 4); |
2255 | return $this; | my $command = "DELETE FROM $relationshipName WHERE " . |
2256 | join(" AND ", @filters); | |
2257 | # Execute it. | |
2258 | my $dbh = $self->{_dbh}; | |
2259 | $dbh->SQL($command, undef, @parms); | |
2260 | } | } |
2261 | ||
2262 | =head3 GetEntity | =head3 DeleteLike |
2263 | ||
2264 | C<< my $entityObject = $sprout->GetEntity($entityType, $ID); >> | my $deleteCount = $erdb->DeleteLike($relName, $filter, \@parms); |
2265 | ||
2266 | Return an object describing the entity instance with a specified ID. | Delete all the relationship rows that satisfy a particular filter condition. Unlike a normal |
2267 | filter, only fields from the relationship itself can be used. | |
2268 | ||
2269 | =over 4 | =over 4 |
2270 | ||
2271 | =item entityType | =item relName |
2272 | ||
2273 | Name of the relationship whose records are to be deleted. | |
2274 | ||
2275 | =item filter | |
2276 | ||
2277 | A filter clause (L</Get>-style) for the delete query. | |
2278 | ||
2279 | =item parms | |
2280 | ||
2281 | Reference to a list of parameters for the filter clause. | |
2282 | ||
2283 | =item RETURN | |
2284 | ||
2285 | Returns a count of the number of rows deleted. | |
2286 | ||
2287 | =back | |
2288 | ||
2289 | =cut | |
2290 | ||
2291 | sub DeleteLike { | |
2292 | # Get the parameters. | |
2293 | my ($self, $objectName, $filter, $parms) = @_; | |
2294 | # Declare the return variable. | |
2295 | my $retVal; | |
2296 | # Insure the parms argument is an array reference if the caller left it off. | |
2297 | if (! defined($parms)) { | |
2298 | $parms = []; | |
2299 | } | |
2300 | # Insure we have a relationship. The main reason for this is if we delete an entity | |
2301 | # instance we have to yank out a bunch of other stuff with it. | |
2302 | if ($self->IsEntity($objectName)) { | |
2303 | Confess("Cannot use DeleteLike on $objectName, because it is not a relationship."); | |
2304 | } else { | |
2305 | # Create the SQL command suffix to get the desierd records. | |
2306 | my ($suffix) = $self->_SetupSQL([$objectName], $filter); | |
2307 | # Convert it to a DELETE command. | |
2308 | my $command = "DELETE $suffix"; | |
2309 | # Execute the command. | |
2310 | my $dbh = $self->{_dbh}; | |
2311 | my $result = $dbh->SQL($command, 0, @{$parms}); | |
2312 | # Check the results. Note we convert the "0D0" result to a real zero. | |
2313 | # A failure causes an abnormal termination, so the caller isn't going to | |
2314 | # worry about it. | |
2315 | if (! defined $result) { | |
2316 | Confess("Error deleting from $objectName: " . $dbh->errstr()); | |
2317 | } elsif ($result == 0) { | |
2318 | $retVal = 0; | |
2319 | } else { | |
2320 | $retVal = $result; | |
2321 | } | |
2322 | } | |
2323 | # Return the result count. | |
2324 | return $retVal; | |
2325 | } | |
2326 | ||
2327 | =head3 SortNeeded | |
2328 | ||
2329 | my $parms = $erdb->SortNeeded($relationName); | |
2330 | ||
2331 | Return the pipe command for the sort that should be applied to the specified | |
2332 | relation when creating the load file. | |
2333 | ||
2334 | For example, if the load file should be sorted ascending by the first | |
2335 | field, this method would return | |
2336 | ||
2337 | sort -k1 -t"\t" | |
2338 | ||
2339 | If the first field is numeric, the method would return | |
2340 | ||
2341 | sort -k1n -t"\t" | |
2342 | ||
2343 | Unfortunately, due to a bug in the C<sort> command, we cannot eliminate duplicate | |
2344 | keys using a sort. | |
2345 | ||
2346 | =over 4 | |
2347 | ||
2348 | =item relationName | |
2349 | ||
2350 | Name of the relation to be examined. | |
2351 | ||
2352 | =item | |
2353 | ||
2354 | Returns the sort command to use for sorting the relation, suitable for piping. | |
2355 | ||
2356 | =back | |
2357 | ||
2358 | =cut | |
2359 | #: Return Type $; | |
2360 | sub SortNeeded { | |
2361 | # Get the parameters. | |
2362 | my ($self, $relationName) = @_; | |
2363 | # Declare a descriptor to hold the names of the key fields. | |
2364 | my @keyNames = (); | |
2365 | # Get the relation structure. | |
2366 | my $relationData = $self->_FindRelation($relationName); | |
2367 | # Find out if the relation is a primary entity relation, | |
2368 | # a relationship relation, or a secondary entity relation. | |
2369 | my $entityTable = $self->{_metaData}->{Entities}; | |
2370 | my $relationshipTable = $self->{_metaData}->{Relationships}; | |
2371 | if (exists $entityTable->{$relationName}) { | |
2372 | # Here we have a primary entity relation. | |
2373 | push @keyNames, "id"; | |
2374 | } elsif (exists $relationshipTable->{$relationName}) { | |
2375 | # Here we have a relationship. We sort using the FROM index. | |
2376 | my $relationshipData = $relationshipTable->{$relationName}; | |
2377 | my $index = $relationData->{Indexes}->{idxFrom}; | |
2378 | push @keyNames, @{$index->{IndexFields}}; | |
2379 | } else { | |
2380 | # Here we have a secondary entity relation, so we have a sort on the ID field. | |
2381 | push @keyNames, "id"; | |
2382 | } | |
2383 | # Now we parse the key names into sort parameters. First, we prime the return | |
2384 | # string. | |
2385 | my $retVal = "sort -S 1G -T\"$FIG_Config::temp\" -t\"\t\" "; | |
2386 | # Get the relation's field list. | |
2387 | my @fields = @{$relationData->{Fields}}; | |
2388 | # Loop through the keys. | |
2389 | for my $keyData (@keyNames) { | |
2390 | # Get the key and the ordering. | |
2391 | my ($keyName, $ordering); | |
2392 | if ($keyData =~ /^([^ ]+) DESC/) { | |
2393 | ($keyName, $ordering) = ($1, "descending"); | |
2394 | } else { | |
2395 | ($keyName, $ordering) = ($keyData, "ascending"); | |
2396 | } | |
2397 | # Find the key's position and type. | |
2398 | my $fieldSpec; | |
2399 | for (my $i = 0; $i <= $#fields && ! $fieldSpec; $i++) { | |
2400 | my $thisField = $fields[$i]; | |
2401 | if ($thisField->{name} eq $keyName) { | |
2402 | # Get the sort modifier for this field type. The modifier | |
2403 | # decides whether we're using a character, numeric, or | |
2404 | # floating-point sort. | |
2405 | my $modifier = $TypeTable{$thisField->{type}}->{sort}; | |
2406 | # If the index is descending for this field, denote we want | |
2407 | # to reverse the sort order on this field. | |
2408 | if ($ordering eq 'descending') { | |
2409 | $modifier .= "r"; | |
2410 | } | |
2411 | # Store the position and modifier into the field spec, which | |
2412 | # will stop the inner loop. Note that the field number is | |
2413 | # 1-based in the sort command, so we have to increment the | |
2414 | # index. | |
2415 | my $realI = $i + 1; | |
2416 | $fieldSpec = "$realI,$realI$modifier"; | |
2417 | } | |
2418 | } | |
2419 | # Add this field to the sort command. | |
2420 | $retVal .= " -k$fieldSpec"; | |
2421 | } | |
2422 | # Return the result. | |
2423 | return $retVal; | |
2424 | } | |
2425 | ||
2426 | =head3 GetList | |
2427 | ||
2428 | my @dbObjects = $erdb->GetList(\@objectNames, $filterClause, \@params); | |
2429 | ||
2430 | Return a list of object descriptors for the specified objects as determined by the | |
2431 | specified filter clause. | |
2432 | ||
2433 | This method is essentially the same as L</Get> except it returns a list of objects rather | |
2434 | than a query object that can be used to get the results one record at a time. | |
2435 | ||
2436 | =over 4 | |
2437 | ||
2438 | =item objectNames | |
2439 | ||
2440 | List containing the names of the entity and relationship objects to be retrieved. | |
2441 | ||
2442 | =item filterClause | |
2443 | ||
2444 | WHERE clause (without the WHERE) to be used to filter and sort the query. The WHERE clause can | |
2445 | be parameterized with parameter markers (C<?>). Each field used in the WHERE clause must be | |
2446 | specified in the standard form B<I<objectName>(I<fieldName>)>. Any parameters specified | |
2447 | in the filter clause should be added to the parameter list as additional parameters. The | |
2448 | fields in a filter clause can come from primary entity relations, relationship relations, | |
2449 | or secondary entity relations; however, all of the entities and relationships involved must | |
2450 | be included in the list of object names. | |
2451 | ||
2452 | The filter clause can also specify a sort order. To do this, simply follow the filter string | |
2453 | with an ORDER BY clause. For example, the following filter string gets all genomes for a | |
2454 | particular genus and sorts them by species name. | |
2455 | ||
2456 | "Genome(genus) = ? ORDER BY Genome(species)" | |
2457 | ||
2458 | The rules for field references in a sort order are the same as those for field references in the | |
2459 | filter clause in general; however, odd things may happen if a sort field is from a secondary | |
2460 | relation. | |
2461 | ||
2462 | =item params | |
2463 | ||
2464 | Reference to a list of parameter values to be substituted into the filter clause. | |
2465 | ||
2466 | =item RETURN | |
2467 | ||
2468 | Returns a list of B<ERDBObject>s that satisfy the query conditions. | |
2469 | ||
2470 | =back | |
2471 | ||
2472 | =cut | |
2473 | #: Return Type @% | |
2474 | sub GetList { | |
2475 | # Get the parameters. | |
2476 | my ($self, $objectNames, $filterClause, $params) = @_; | |
2477 | # Declare the return variable. | |
2478 | my @retVal = (); | |
2479 | # Perform the query. | |
2480 | my $query = $self->Get($objectNames, $filterClause, $params); | |
2481 | # Loop through the results. | |
2482 | while (my $object = $query->Fetch) { | |
2483 | push @retVal, $object; | |
2484 | } | |
2485 | # Return the result. | |
2486 | return @retVal; | |
2487 | } | |
2488 | ||
2489 | =head3 GetCount | |
2490 | ||
2491 | my $count = $erdb->GetCount(\@objectNames, $filter, \@params); | |
2492 | ||
2493 | Return the number of rows found by a specified query. This method would | |
2494 | normally be used to count the records in a single table. For example, in a | |
2495 | genetics database | |
2496 | ||
2497 | my $count = $erdb->GetCount(['Genome'], 'Genome(genus-species) LIKE ?', ['homo %']); | |
2498 | ||
2499 | would return the number of genomes for the genus I<homo>. It is conceivable, however, | |
2500 | to use it to return records based on a join. For example, | |
2501 | ||
2502 | my $count = $erdb->GetCount(['HasFeature', 'Genome'], 'Genome(genus-species) LIKE ?', | |
2503 | ['homo %']); | |
2504 | ||
2505 | would return the number of features for genomes in the genus I<homo>. Note that | |
2506 | only the rows from the first table are counted. If the above command were | |
2507 | ||
2508 | my $count = $erdb->GetCount(['Genome', 'Feature'], 'Genome(genus-species) LIKE ?', | |
2509 | ['homo %']); | |
2510 | ||
2511 | it would return the number of genomes, not the number of genome/feature pairs. | |
2512 | ||
2513 | =over 4 | |
2514 | ||
2515 | =item objectNames | |
2516 | ||
2517 | Reference to a list of the objects (entities and relationships) included in the | |
2518 | query. | |
2519 | ||
2520 | =item filter | |
2521 | ||
2522 | A filter clause for restricting the query. The rules are the same as for the L</Get> | |
2523 | method. | |
2524 | ||
2525 | =item params | |
2526 | ||
2527 | Reference to a list of the parameter values to be substituted for the parameter marks | |
2528 | in the filter. | |
2529 | ||
2530 | =item RETURN | |
2531 | ||
2532 | Returns a count of the number of records in the first table that would satisfy | |
2533 | the query. | |
2534 | ||
2535 | =back | |
2536 | ||
2537 | =cut | |
2538 | ||
2539 | sub GetCount { | |
2540 | # Get the parameters. | |
2541 | my ($self, $objectNames, $filter, $params) = @_; | |
2542 | # Insure the params argument is an array reference if the caller left it off. | |
2543 | if (! defined($params)) { | |
2544 | $params = []; | |
2545 | } | |
2546 | # Declare the return variable. | |
2547 | my $retVal; | |
2548 | # Find out if we're counting an entity or a relationship. | |
2549 | my $countedField; | |
2550 | if ($self->IsEntity($objectNames->[0])) { | |
2551 | $countedField = "id"; | |
2552 | } else { | |
2553 | # For a relationship we count the to-link because it's usually more | |
2554 | # numerous. Note we're automatically converting to the SQL form | |
2555 | # of the field name (to_link vs. to-link). | |
2556 | $countedField = "to_link"; | |
2557 | } | |
2558 | # Create the SQL command suffix to get the desired records. | |
2559 | my ($suffix, $mappedNameListRef, $mappedNameHashRef) = $self->_SetupSQL($objectNames, | |
2560 | $filter); | |
2561 | # Prefix it with text telling it we want a record count. | |
2562 | my $firstObject = $mappedNameListRef->[0]; | |
2563 | my $command = "SELECT COUNT($firstObject.$countedField) $suffix"; | |
2564 | # Prepare and execute the command. | |
2565 | my $sth = $self->_GetStatementHandle($command, $params); | |
2566 | # Get the count value. | |
2567 | ($retVal) = $sth->fetchrow_array(); | |
2568 | # Check for a problem. | |
2569 | if (! defined($retVal)) { | |
2570 | if ($sth->err) { | |
2571 | # Here we had an SQL error. | |
2572 | Confess("Error retrieving row count: " . $sth->errstr()); | |
2573 | } else { | |
2574 | # Here we have no result. | |
2575 | Confess("No result attempting to retrieve row count."); | |
2576 | } | |
2577 | } | |
2578 | # Return the result. | |
2579 | return $retVal; | |
2580 | } | |
2581 | ||
2582 | =head3 ComputeObjectSentence | |
2583 | ||
2584 | my $sentence = $erdb->ComputeObjectSentence($objectName); | |
2585 | ||
2586 | Check an object name, and if it is a relationship convert it to a relationship sentence. | |
2587 | ||
2588 | =over 4 | |
2589 | ||
2590 | =item objectName | |
2591 | ||
2592 | Name of the entity or relationship. | |
2593 | ||
2594 | =item RETURN | |
2595 | ||
2596 | Returns a string containing the entity name or a relationship sentence. | |
2597 | ||
2598 | =back | |
2599 | ||
2600 | =cut | |
2601 | ||
2602 | sub ComputeObjectSentence { | |
2603 | # Get the parameters. | |
2604 | my ($self, $objectName) = @_; | |
2605 | # Set the default return value. | |
2606 | my $retVal = $objectName; | |
2607 | # Look for the object as a relationship. | |
2608 | my $relTable = $self->{_metaData}->{Relationships}; | |
2609 | if (exists $relTable->{$objectName}) { | |
2610 | # Get the relationship sentence. | |
2611 | $retVal = _ComputeRelationshipSentence($objectName, $relTable->{$objectName}); | |
2612 | } | |
2613 | # Return the result. | |
2614 | return $retVal; | |
2615 | } | |
2616 | ||
2617 | =head3 DumpRelations | |
2618 | ||
2619 | $erdb->DumpRelations($outputDirectory); | |
2620 | ||
2621 | Write the contents of all the relations to tab-delimited files in the specified directory. | |
2622 | Each file will have the same name as the relation dumped, with an extension of DTX. | |
2623 | ||
2624 | =over 4 | |
2625 | ||
2626 | =item outputDirectory | |
2627 | ||
2628 | Name of the directory into which the relation files should be dumped. | |
2629 | ||
2630 | =back | |
2631 | ||
2632 | =cut | |
2633 | ||
2634 | sub DumpRelations { | |
2635 | # Get the parameters. | |
2636 | my ($self, $outputDirectory) = @_; | |
2637 | # Now we need to run through all the relations. First, we loop through the entities. | |
2638 | my $metaData = $self->{_metaData}; | |
2639 | my $entities = $metaData->{Entities}; | |
2640 | for my $entityName (keys %{$entities}) { | |
2641 | my $entityStructure = $entities->{$entityName}; | |
2642 | # Get the entity's relations. | |
2643 | my $relationList = $entityStructure->{Relations}; | |
2644 | # Loop through the relations, dumping them. | |
2645 | for my $relationName (keys %{$relationList}) { | |
2646 | my $relation = $relationList->{$relationName}; | |
2647 | $self->_DumpRelation($outputDirectory, $relationName, $relation); | |
2648 | } | |
2649 | } | |
2650 | # Next, we loop through the relationships. | |
2651 | my $relationships = $metaData->{Relationships}; | |
2652 | for my $relationshipName (keys %{$relationships}) { | |
2653 | my $relationshipStructure = $relationships->{$relationshipName}; | |
2654 | # Dump this relationship's relation. | |
2655 | $self->_DumpRelation($outputDirectory, $relationshipName, $relationshipStructure->{Relations}->{$relationshipName}); | |
2656 | } | |
2657 | } | |
2658 | ||
2659 | =head3 InsertValue | |
2660 | ||
2661 | $erdb->InsertValue($entityID, $fieldName, $value); | |
2662 | ||
2663 | This method will insert a new value into the database. The value must be one | |
2664 | associated with a secondary relation, since primary values cannot be inserted: | |
2665 | they occur exactly once. Secondary values, on the other hand, can be missing | |
2666 | or multiply-occurring. | |
2667 | ||
2668 | =over 4 | |
2669 | ||
2670 | =item entityID | |
2671 | ||
2672 | ID of the object that is to receive the new value. | |
2673 | ||
2674 | =item fieldName | |
2675 | ||
2676 | Field name for the new value-- this includes the entity name, since | |
2677 | field names are of the format I<objectName>C<(>I<fieldName>C<)>. | |
2678 | ||
2679 | =item value | |
2680 | ||
2681 | New value to be put in the field. | |
2682 | ||
2683 | =back | |
2684 | ||
2685 | =cut | |
2686 | ||
2687 | sub InsertValue { | |
2688 | # Get the parameters. | |
2689 | my ($self, $entityID, $fieldName, $value) = @_; | |
2690 | # Parse the entity name and the real field name. | |
2691 | if ($fieldName =~ /^([^(]+)\(([^)]+)\)/) { | |
2692 | my $entityName = $1; | |
2693 | my $fieldTitle = $2; | |
2694 | # Get its descriptor. | |
2695 | if (!$self->IsEntity($entityName)) { | |
2696 | Confess("$entityName is not a valid entity."); | |
2697 | } else { | |
2698 | my $entityData = $self->{_metaData}->{Entities}->{$entityName}; | |
2699 | # Find the relation containing this field. | |
2700 | my $fieldHash = $entityData->{Fields}; | |
2701 | if (! exists $fieldHash->{$fieldTitle}) { | |
2702 | Confess("$fieldTitle not found in $entityName."); | |
2703 | } else { | |
2704 | my $relation = $fieldHash->{$fieldTitle}->{relation}; | |
2705 | if ($relation eq $entityName) { | |
2706 | Confess("Cannot do InsertValue on primary field $fieldTitle of $entityName."); | |
2707 | } else { | |
2708 | # Now we can create an INSERT statement. | |
2709 | my $dbh = $self->{_dbh}; | |
2710 | my $fixedName = _FixName($fieldTitle); | |
2711 | my $statement = "INSERT INTO $relation (id, $fixedName) VALUES(?, ?)"; | |
2712 | # Execute the command. | |
2713 | $dbh->SQL($statement, 0, $entityID, $value); | |
2714 | } | |
2715 | } | |
2716 | } | |
2717 | } else { | |
2718 | Confess("$fieldName is not a valid field name."); | |
2719 | } | |
2720 | } | |
2721 | ||
2722 | =head3 InsertObject | |
2723 | ||
2724 | $erdb->InsertObject($objectType, \%fieldHash); | |
2725 | ||
2726 | Insert an object into the database. The object is defined by a type name and then a hash | |
2727 | of field names to values. Field values in the primary relation are represented by scalars. | |
2728 | (Note that for relationships, the primary relation is the B<only> relation.) | |
2729 | Field values for the other relations comprising the entity are always list references. For | |
2730 | example, the following line inserts an inactive PEG feature named C<fig|188.1.peg.1> with aliases | |
2731 | C<ZP_00210270.1> and C<gi|46206278>. | |
2732 | ||
2733 | $erdb->InsertObject('Feature', { id => 'fig|188.1.peg.1', active => 0, feature-type => 'peg', alias => ['ZP_00210270.1', 'gi|46206278']}); | |
2734 | ||
2735 | The next statement inserts a C<HasProperty> relationship between feature C<fig|158879.1.peg.1> and | |
2736 | property C<4> with an evidence URL of C<http://seedu.uchicago.edu/query.cgi?article_id=142>. | |
2737 | ||
2738 | $erdb->InsertObject('HasProperty', { 'from-link' => 'fig|158879.1.peg.1', 'to-link' => 4, evidence => 'http://seedu.uchicago.edu/query.cgi?article_id=142'}); | |
2739 | ||
2740 | =over 4 | |
2741 | ||
2742 | =item newObjectType | |
2743 | ||
2744 | Type name of the object to insert. | |
2745 | ||
2746 | =item fieldHash | |
2747 | ||
2748 | Hash of field names to values. | |
2749 | ||
2750 | =back | |
2751 | ||
2752 | =cut | |
2753 | ||
2754 | sub InsertObject { | |
2755 | # Get the parameters. | |
2756 | my ($self, $newObjectType, $fieldHash) = @_; | |
2757 | # Denote that so far we appear successful. | |
2758 | my $retVal = 1; | |
2759 | # Get the database handle. | |
2760 | my $dbh = $self->{_dbh}; | |
2761 | # Get the relation list. | |
2762 | my $relationTable = $self->_GetRelationTable($newObjectType); | |
2763 | # Loop through the relations. We'll build insert statements for each one. If a relation is | |
2764 | # secondary, we may end up generating multiple insert statements. If an error occurs, we | |
2765 | # stop the loop. | |
2766 | my @relationList = keys %{$relationTable}; | |
2767 | for (my $i = 0; $retVal && $i <= $#relationList; $i++) { | |
2768 | my $relationName = $relationList[$i]; | |
2769 | my $relationDefinition = $relationTable->{$relationName}; | |
2770 | # Get the relation's fields. For each field we will collect a value in the corresponding | |
2771 | # position of the @valueList array. If one of the fields is missing, we will add it to the | |
2772 | # @missing list. | |
2773 | my @fieldList = @{$relationDefinition->{Fields}}; | |
2774 | my @fieldNameList = (); | |
2775 | my @valueList = (); | |
2776 | my @missing = (); | |
2777 | my $recordCount = 1; | |
2778 | for my $fieldDescriptor (@fieldList) { | |
2779 | # Get the field name and save it. Note we need to fix it up so the hyphens | |
2780 | # are converted to underscores. | |
2781 | my $fieldName = $fieldDescriptor->{name}; | |
2782 | push @fieldNameList, _FixName($fieldName); | |
2783 | # Look for the named field in the incoming structure. Note that we are looking | |
2784 | # for the real field name, not the fixed-up one! | |
2785 | if (exists $fieldHash->{$fieldName}) { | |
2786 | # Here we found the field. Stash it in the value list. | |
2787 | my $value = $fieldHash->{$fieldName}; | |
2788 | push @valueList, $value; | |
2789 | # If the value is a list, we may need to increment the record count. | |
2790 | if (ref $value eq "ARRAY") { | |
2791 | my $thisCount = @{$value}; | |
2792 | if ($recordCount == 1) { | |
2793 | # Here we have our first list, so we save its count. | |
2794 | $recordCount = $thisCount; | |
2795 | } elsif ($recordCount != $thisCount) { | |
2796 | # Here we have a second list, so its length has to match the | |
2797 | # previous lists. | |
2798 | Trace("Field $value in new $newObjectType object has an invalid list length $thisCount. Expected $recordCount.") if T(0); | |
2799 | $retVal = 0; | |
2800 | } | |
2801 | } | |
2802 | } else { | |
2803 | # Here the field is not present. Flag it as missing. | |
2804 | push @missing, $fieldName; | |
2805 | } | |
2806 | } | |
2807 | # Only proceed if there are no missing fields. | |
2808 | if (@missing > 0) { | |
2809 | Trace("Relation $relationName for $newObjectType skipped due to missing fields: " . | |
2810 | join(' ', @missing)) if T(1); | |
2811 | } else { | |
2812 | # Build the INSERT statement. | |
2813 | my $statement = "INSERT INTO $relationName (" . join (', ', @fieldNameList) . | |
2814 | ") VALUES ("; | |
2815 | # Create a marker list of the proper size and put it in the statement. | |
2816 | my @markers = (); | |
2817 | while (@markers < @fieldNameList) { push @markers, '?'; } | |
2818 | $statement .= join(', ', @markers) . ")"; | |
2819 | # We have the insert statement, so prepare it. | |
2820 | my $sth = $dbh->prepare_command($statement); | |
2821 | Trace("Insert statement prepared: $statement") if T(3); | |
2822 | # Now we loop through the values. If a value is scalar, we use it unmodified. If it's | |
2823 | # a list, we use the current element. The values are stored in the @parameterList array. | |
2824 | my $done = 0; | |
2825 | for (my $i = 0; $i < $recordCount; $i++) { | |
2826 | # Clear the parameter list array. | |
2827 | my @parameterList = (); | |
2828 | # Loop through the values. | |
2829 | for my $value (@valueList) { | |
2830 | # Check to see if this is a scalar value. | |
2831 | if (ref $value eq "ARRAY") { | |
2832 | # Here we have a list value. Pull the current entry. | |
2833 | push @parameterList, $value->[$i]; | |
2834 | } else { | |
2835 | # Here we have a scalar value. Use it unmodified. | |
2836 | push @parameterList, $value; | |
2837 | } | |
2838 | } | |
2839 | # Execute the INSERT statement with the specified parameter list. | |
2840 | $retVal = $sth->execute(@parameterList); | |
2841 | if (!$retVal) { | |
2842 | my $errorString = $sth->errstr(); | |
2843 | Confess("Error inserting into $relationName: $errorString"); | |
2844 | } else { | |
2845 | Trace("Insert successful using $parameterList[0].") if T(3); | |
2846 | } | |
2847 | } | |
2848 | } | |
2849 | } | |
2850 | # Return a 1 for backward compatability. | |
2851 | return 1; | |
2852 | } | |
2853 | ||
2854 | =head3 UpdateEntity | |
2855 | ||
2856 | $erdb->UpdateEntity($entityName, $id, \%fields); | |
2857 | ||
2858 | Update the values of an entity. This is an unprotected update, so it should only be | |
2859 | done if the database resides on a database server. | |
2860 | ||
2861 | =over 4 | |
2862 | ||
2863 | =item entityName | |
2864 | ||
2865 | Name of the entity to update. (This is the entity type.) | |
2866 | ||
2867 | =item id | |
2868 | ||
2869 | ID of the entity to update. If no entity exists with this ID, an error will be thrown. | |
2870 | ||
2871 | =item fields | |
2872 | ||
2873 | Reference to a hash mapping field names to their new values. All of the fields named | |
2874 | must be in the entity's primary relation, and they cannot any of them be the ID field. | |
2875 | ||
2876 | =back | |
2877 | ||
2878 | =cut | |
2879 | ||
2880 | sub UpdateEntity { | |
2881 | # Get the parameters. | |
2882 | my ($self, $entityName, $id, $fields) = @_; | |
2883 | # Get a list of the field names being updated. | |
2884 | my @fieldList = keys %{$fields}; | |
2885 | # Verify that the fields exist. | |
2886 | my $checker = $self->GetFieldTable($entityName); | |
2887 | for my $field (@fieldList) { | |
2888 | if ($field eq 'id') { | |
2889 | Confess("Cannot update the ID field for entity $entityName."); | |
2890 | } elsif ($checker->{$field}->{relation} ne $entityName) { | |
2891 | Confess("Cannot find $field in primary relation of $entityName."); | |
2892 | } | |
2893 | } | |
2894 | # Build the SQL statement. | |
2895 | my @sets = (); | |
2896 | my @valueList = (); | |
2897 | for my $field (@fieldList) { | |
2898 | push @sets, _FixName($field) . " = ?"; | |
2899 | push @valueList, $fields->{$field}; | |
2900 | } | |
2901 | my $command = "UPDATE $entityName SET " . join(", ", @sets) . " WHERE id = ?"; | |
2902 | # Add the ID to the list of binding values. | |
2903 | push @valueList, $id; | |
2904 | # Call SQL to do the work. | |
2905 | my $rows = $self->{_dbh}->SQL($command, 0, @valueList); | |
2906 | # Check for errors. | |
2907 | if ($rows == 0) { | |
2908 | Confess("Entity $id of type $entityName not found."); | |
2909 | } | |
2910 | } | |
2911 | ||
2912 | =head3 LoadTable | |
2913 | ||
2914 | my $results = $erdb->LoadTable($fileName, $relationName, %options); | |
2915 | ||
2916 | Load data from a tab-delimited file into a specified table, optionally re-creating the table | |
2917 | first. | |
2918 | ||
2919 | =over 4 | |
2920 | ||
2921 | =item fileName | |
2922 | ||
2923 | Name of the file from which the table data should be loaded. | |
2924 | ||
2925 | =item relationName | |
2926 | ||
2927 | Name of the relation to be loaded. This is the same as the table name. | |
2928 | ||
2929 | =item options | |
2930 | ||
2931 | A hash of load options. | |
2932 | ||
2933 | =item RETURN | |
2934 | ||
2935 | Returns a statistical object containing a list of the error messages. | |
2936 | ||
2937 | =back | |
2938 | ||
2939 | The permissible options are as follows. | |
2940 | ||
2941 | =over 4 | |
2942 | ||
2943 | =item truncate | |
2944 | ||
2945 | If TRUE, then the table will be erased before loading. | |
2946 | ||
2947 | =item mode | |
2948 | ||
2949 | Mode in which the load should operate, either C<low_priority> or C<concurrent>. | |
2950 | This option is only applicable to a MySQL database. | |
2951 | ||
2952 | =item partial | |
2953 | ||
2954 | If TRUE, then it is assumed that this is a partial load, and the table will not | |
2955 | be analyzed and compacted at the end. | |
2956 | ||
2957 | =back | |
2958 | ||
2959 | =cut | |
2960 | sub LoadTable { | |
2961 | # Get the parameters. | |
2962 | my ($self, $fileName, $relationName, %options) = @_; | |
2963 | # Create the statistical return object. | |
2964 | my $retVal = _GetLoadStats(); | |
2965 | # Trace the fact of the load. | |
2966 | Trace("Loading table $relationName from $fileName") if T(2); | |
2967 | # Get the database handle. | |
2968 | my $dbh = $self->{_dbh}; | |
2969 | # Get the input file size. | |
2970 | my $fileSize = -s $fileName; | |
2971 | # Get the relation data. | |
2972 | my $relation = $self->_FindRelation($relationName); | |
2973 | # Check the truncation flag. | |
2974 | if ($options{truncate}) { | |
2975 | Trace("Creating table $relationName") if T(2); | |
2976 | # Compute the row count estimate. We take the size of the load file, | |
2977 | # divide it by the estimated row size, and then multiply by 2 to | |
2978 | # leave extra room. We postulate a minimum row count of 1000 to | |
2979 | # prevent problems with incoming empty load files. | |
2980 | my $rowSize = $self->EstimateRowSize($relationName); | |
2981 | my $estimate = $fileSize * 8 / $rowSize; | |
2982 | if ($estimate < 1000) { | |
2983 | $estimate = 1000; | |
2984 | } | |
2985 | # Re-create the table without its index. | |
2986 | $self->CreateTable($relationName, 0, $estimate); | |
2987 | # If this is a pre-index DBMS, create the index here. | |
2988 | if ($dbh->{_preIndex}) { | |
2989 | eval { | |
2990 | $self->CreateIndex($relationName); | |
2991 | }; | |
2992 | if ($@) { | |
2993 | $retVal->AddMessage($@); | |
2994 | } | |
2995 | } | |
2996 | } | |
2997 | # Load the table. | |
2998 | my $rv; | |
2999 | eval { | |
3000 | $rv = $dbh->load_table(file => $fileName, tbl => $relationName, style => $options{mode}); | |
3001 | }; | |
3002 | if (!defined $rv) { | |
3003 | $retVal->AddMessage($@) if ($@); | |
3004 | $retVal->AddMessage("Table load failed for $relationName using $fileName: " . $dbh->error_message); | |
3005 | Trace("Table load failed for $relationName.") if T(1); | |
3006 | } else { | |
3007 | # Here we successfully loaded the table. | |
3008 | $retVal->Add("tables"); | |
3009 | my $size = -s $fileName; | |
3010 | Trace("$size bytes loaded into $relationName.") if T(2); | |
3011 | $retVal->Add("bytes", $size); | |
3012 | # If we're rebuilding, we need to create the table indexes. | |
3013 | if ($options{truncate}) { | |
3014 | # Indexes are created here for PostGres. For PostGres, indexes are | |
3015 | # best built at the end. For MySQL, the reverse is true. | |
3016 | if (! $dbh->{_preIndex}) { | |
3017 | eval { | |
3018 | $self->CreateIndex($relationName); | |
3019 | }; | |
3020 | if ($@) { | |
3021 | $retVal->AddMessage($@); | |
3022 | } | |
3023 | } | |
3024 | # The full-text index (if any) is always built last, even for MySQL. | |
3025 | # First we need to see if this table has a full-text index. Only | |
3026 | # primary relations are allowed that privilege. | |
3027 | Trace("Checking for full-text index on $relationName.") if T(2); | |
3028 | if ($self->_IsPrimary($relationName)) { | |
3029 | $self->CreateSearchIndex($relationName); | |
3030 | } | |
3031 | } | |
3032 | } | |
3033 | # Analyze the table to improve performance. | |
3034 | if (! $options{partial}) { | |
3035 | Trace("Analyzing and compacting $relationName.") if T(3); | |
3036 | $self->Analyze($relationName); | |
3037 | } | |
3038 | Trace("$relationName load completed.") if T(3); | |
3039 | # Return the statistics. | |
3040 | return $retVal; | |
3041 | } | |
3042 | ||
3043 | =head3 Analyze | |
3044 | ||
3045 | $erdb->Analyze($tableName); | |
3046 | ||
3047 | Analyze and compact a table in the database. This is useful after a load | |
3048 | to improve the performance of the indexes. | |
3049 | ||
3050 | =over 4 | |
3051 | ||
3052 | =item tableName | |
3053 | ||
3054 | Name of the table to be analyzed and compacted. | |
3055 | ||
3056 | =back | |
3057 | ||
3058 | =cut | |
3059 | ||
3060 | sub Analyze { | |
3061 | # Get the parameters. | |
3062 | my ($self, $tableName) = @_; | |
3063 | # Analyze the table. | |
3064 | $self->{_dbh}->vacuum_it($tableName); | |
3065 | } | |
3066 | ||
3067 | =head3 TruncateTable | |
3068 | ||
3069 | $erdb->TruncateTable($table); | |
3070 | ||
3071 | Delete all rows from a table quickly. This uses the built-in SQL | |
3072 | C<TRUNCATE> statement, which effectively drops and re-creates a table | |
3073 | with all its settings intact. | |
3074 | ||
3075 | =over 4 | |
3076 | ||
3077 | =item table | |
3078 | ||
3079 | Name of the table to be cleared. | |
3080 | ||
3081 | =back | |
3082 | ||
3083 | =cut | |
3084 | ||
3085 | sub TruncateTable { | |
3086 | # Get the parameters. | |
3087 | my ($self, $table) = @_; | |
3088 | # Get the database handle. | |
3089 | my $dbh = $self->{_dbh}; | |
3090 | # Execute a truncation comment. | |
3091 | $dbh->SQL("TRUNCATE TABLE $table"); | |
3092 | } | |
3093 | ||
3094 | ||
3095 | =head3 CreateSearchIndex | |
3096 | ||
3097 | $erdb->CreateSearchIndex($objectName); | |
3098 | ||
3099 | Check for a full-text search index on the specified entity or relationship object, and | |
3100 | if one is required, rebuild it. | |
3101 | ||
3102 | =over 4 | |
3103 | ||
3104 | =item objectName | |
3105 | ||
3106 | Name of the entity or relationship to be indexed. | |
3107 | ||
3108 | =back | |
3109 | ||
3110 | =cut | |
3111 | ||
3112 | sub CreateSearchIndex { | |
3113 | # Get the parameters. | |
3114 | my ($self, $objectName) = @_; | |
3115 | # Get the relation's entity/relationship structure. | |
3116 | my $structure = $self->_GetStructure($objectName); | |
3117 | # Get the database handle. | |
3118 | my $dbh = $self->{_dbh}; | |
3119 | Trace("Checking for search fields in $objectName.") if T(3); | |
3120 | # Check for a searchable fields list. | |
3121 | if (exists $structure->{searchFields}) { | |
3122 | # Here we know that we need to create a full-text search index. | |
3123 | # Get an SQL-formatted field name list. | |
3124 | my $fields = join(", ", _FixNames(@{$structure->{searchFields}})); | |
3125 | # Create the index. If it already exists, it will be dropped. | |
3126 | $dbh->create_index(tbl => $objectName, idx => "search_idx", | |
3127 | flds => $fields, kind => 'fulltext'); | |
3128 | Trace("Index created for $fields in $objectName.") if T(2); | |
3129 | } | |
3130 | } | |
3131 | ||
3132 | =head3 DropRelation | |
3133 | ||
3134 | $erdb->DropRelation($relationName); | |
3135 | ||
3136 | Physically drop a relation from the database. | |
3137 | ||
3138 | =over 4 | |
3139 | ||
3140 | =item relationName | |
3141 | ||
3142 | Name of the relation to drop. If it does not exist, this method will have | |
3143 | no effect. | |
3144 | ||
3145 | =back | |
3146 | ||
3147 | =cut | |
3148 | ||
3149 | sub DropRelation { | |
3150 | # Get the parameters. | |
3151 | my ($self, $relationName) = @_; | |
3152 | # Get the database handle. | |
3153 | my $dbh = $self->{_dbh}; | |
3154 | # Drop the relation. The method used here has no effect if the relation | |
3155 | # does not exist. | |
3156 | Trace("Invoking DB Kernel to drop $relationName.") if T(3); | |
3157 | $dbh->drop_table(tbl => $relationName); | |
3158 | } | |
3159 | ||
3160 | =head3 MatchSqlPattern | |
3161 | ||
3162 | my $matched = ERDB::MatchSqlPattern($value, $pattern); | |
3163 | ||
3164 | Determine whether or not a specified value matches an SQL pattern. An SQL | |
3165 | pattern has two wild card characters: C<%> that matches multiple characters, | |
3166 | and C<_> that matches a single character. These can be escaped using a | |
3167 | backslash (C<\>). We pull this off by converting the SQL pattern to a | |
3168 | PERL regular expression. As per SQL rules, the match is case-insensitive. | |
3169 | ||
3170 | =over 4 | |
3171 | ||
3172 | =item value | |
3173 | ||
3174 | Value to be matched against the pattern. Note that an undefined or empty | |
3175 | value will not match anything. | |
3176 | ||
3177 | =item pattern | |
3178 | ||
3179 | SQL pattern against which to match the value. An undefined or empty pattern will | |
3180 | match everything. | |
3181 | ||
3182 | =item RETURN | |
3183 | ||
3184 | Returns TRUE if the value and pattern match, else FALSE. | |
3185 | ||
3186 | =back | |
3187 | ||
3188 | =cut | |
3189 | ||
3190 | sub MatchSqlPattern { | |
3191 | # Get the parameters. | |
3192 | my ($value, $pattern) = @_; | |
3193 | # Declare the return variable. | |
3194 | my $retVal; | |
3195 | # Insure we have a pattern. | |
3196 | if (! defined($pattern) || $pattern eq "") { | |
3197 | $retVal = 1; | |
3198 | } else { | |
3199 | # Break the pattern into pieces around the wildcard characters. Because we | |
3200 | # use parentheses in the split function's delimiter expression, we'll get | |
3201 | # list elements for the delimiters as well as the rest of the string. | |
3202 | my @pieces = split /([_%]|\\[_%])/, $pattern; | |
3203 | # Check some fast special cases. | |
3204 | if ($pattern eq '%') { | |
3205 | # A null pattern matches everything. | |
3206 | $retVal = 1; | |
3207 | } elsif (@pieces == 1) { | |
3208 | # No wildcards, so we have a literal comparison. Note we're case-insensitive. | |
3209 | $retVal = (lc($value) eq lc($pattern)); | |
3210 | } elsif (@pieces == 2 && $pieces[1] eq '%') { | |
3211 | # A wildcard at the end, so we have a substring match. This is also case-insensitive. | |
3212 | $retVal = (lc(substr($value, 0, length($pieces[0]))) eq lc($pieces[0])); | |
3213 | } else { | |
3214 | # Okay, we have to do it the hard way. Convert each piece to a PERL pattern. | |
3215 | my $realPattern = ""; | |
3216 | for my $piece (@pieces) { | |
3217 | # Determine the type of piece. | |
3218 | if ($piece eq "") { | |
3219 | # Empty pieces are ignored. | |
3220 | } elsif ($piece eq "%") { | |
3221 | # Here we have a multi-character wildcard. Note that it can match | |
3222 | # zero or more characters. | |
3223 | $realPattern .= ".*" | |
3224 | } elsif ($piece eq "_") { | |
3225 | # Here we have a single-character wildcard. | |
3226 | $realPattern .= "."; | |
3227 | } elsif ($piece eq "\\%" || $piece eq "\\_") { | |
3228 | # This is an escape sequence (which is a rare thing, actually). | |
3229 | $realPattern .= substr($piece, 1, 1); | |
3230 | } else { | |
3231 | # Here we have raw text. | |
3232 | $realPattern .= quotemeta($piece); | |
3233 | } | |
3234 | } | |
3235 | # Do the match. | |
3236 | $retVal = ($value =~ /^$realPattern$/i ? 1 : 0); | |
3237 | } | |
3238 | } | |
3239 | # Return the result. | |
3240 | return $retVal; | |
3241 | } | |
3242 | ||
3243 | =head3 GetEntity | |
3244 | ||
3245 | my $entityObject = $erdb->GetEntity($entityType, $ID); | |
3246 | ||
3247 | Return an object describing the entity instance with a specified ID. | |
3248 | ||
3249 | =over 4 | |
3250 | ||
3251 | =item entityType | |
3252 | ||
3253 | Entity type name. | |
3254 | ||
3255 | =item ID | |
3256 | ||
3257 | ID of the desired entity. | |
3258 | ||
3259 | =item RETURN | |
3260 | ||
3261 | Returns a B<ERDBObject> representing the desired entity instance, or an undefined value if no | |
3262 | instance is found with the specified key. | |
3263 | ||
3264 | =back | |
3265 | ||
3266 | =cut | |
3267 | ||
3268 | sub GetEntity { | |
3269 | # Get the parameters. | |
3270 | my ($self, $entityType, $ID) = @_; | |
3271 | # Create a query. | |
3272 | my $query = $self->Get([$entityType], "$entityType(id) = ?", [$ID]); | |
3273 | # Get the first (and only) object. | |
3274 | my $retVal = $query->Fetch(); | |
3275 | if (T(3)) { | |
3276 | if ($retVal) { | |
3277 | Trace("Entity $entityType \"$ID\" found."); | |
3278 | } else { | |
3279 | Trace("Entity $entityType \"$ID\" not found."); | |
3280 | } | |
3281 | } | |
3282 | # Return the result. | |
3283 | return $retVal; | |
3284 | } | |
3285 | ||
3286 | =head3 GetChoices | |
3287 | ||
3288 | my @values = $erdb->GetChoices($entityName, $fieldName); | |
3289 | ||
3290 | Return a list of all the values for the specified field that are represented in the | |
3291 | specified entity. | |
3292 | ||
3293 | Note that if the field is not indexed, then this will be a very slow operation. | |
3294 | ||
3295 | =over 4 | |
3296 | ||
3297 | =item entityName | |
3298 | ||
3299 | Name of an entity in the database. | |
3300 | ||
3301 | =item fieldName | |
3302 | ||
3303 | Name of a field belonging to the entity. This is a raw field name without | |
3304 | the standard parenthesized notation used in most calls. | |
3305 | ||
3306 | =item RETURN | |
3307 | ||
3308 | Returns a list of the distinct values for the specified field in the database. | |
3309 | ||
3310 | =back | |
3311 | ||
3312 | =cut | |
3313 | ||
3314 | sub GetChoices { | |
3315 | # Get the parameters. | |
3316 | my ($self, $entityName, $fieldName) = @_; | |
3317 | # Declare the return variable. | |
3318 | my @retVal; | |
3319 | # Get the entity data structure. | |
3320 | my $entityData = $self->_GetStructure($entityName); | |
3321 | # Get the field. | |
3322 | my $fieldHash = $entityData->{Fields}; | |
3323 | if (! exists $fieldHash->{$fieldName}) { | |
3324 | Confess("$fieldName not found in $entityName."); | |
3325 | } else { | |
3326 | # Get the name of the relation containing the field. | |
3327 | my $relation = $fieldHash->{$fieldName}->{relation}; | |
3328 | # Fix up the field name. | |
3329 | my $realName = _FixName($fieldName); | |
3330 | # Get the database handle. | |
3331 | my $dbh = $self->{_dbh}; | |
3332 | # Query the database. | |
3333 | my $results = $dbh->SQL("SELECT DISTINCT $realName FROM $relation"); | |
3334 | # Clean the results. They are stored as a list of lists, and we just want the one list. | |
3335 | @retVal = sort map { $_->[0] } @{$results}; | |
3336 | } | |
3337 | # Return the result. | |
3338 | return @retVal; | |
3339 | } | |
3340 | ||
3341 | =head3 GetEntityValues | |
3342 | ||
3343 | my @values = $erdb->GetEntityValues($entityType, $ID, \@fields); | |
3344 | ||
3345 | Return a list of values from a specified entity instance. If the entity instance | |
3346 | does not exist, an empty list is returned. | |
3347 | ||
3348 | =over 4 | |
3349 | ||
3350 | =item entityType | |
3351 | ||
3352 | Entity type name. | Entity type name. |
3353 | ||
3354 | =item ID | =item ID |
3355 | ||
3356 | ID of the desired entity. | |
3357 | ||
3358 | =item fields | |
3359 | ||
3360 | List of field names, each of the form I<objectName>C<(>I<fieldName>C<)>. | |
3361 | ||
3362 | =item RETURN | |
3363 | ||
3364 | Returns a flattened list of the values of the specified fields for the specified entity. | |
3365 | ||
3366 | =back | |
3367 | ||
3368 | =cut | |
3369 | ||
3370 | sub GetEntityValues { | |
3371 | # Get the parameters. | |
3372 | my ($self, $entityType, $ID, $fields) = @_; | |
3373 | # Get the specified entity. | |
3374 | my $entity = $self->GetEntity($entityType, $ID); | |
3375 | # Declare the return list. | |
3376 | my @retVal = (); | |
3377 | # If we found the entity, push the values into the return list. | |
3378 | if ($entity) { | |
3379 | push @retVal, $entity->Values($fields); | |
3380 | } | |
3381 | # Return the result. | |
3382 | return @retVal; | |
3383 | } | |
3384 | ||
3385 | =head3 GetAll | |
3386 | ||
3387 | my @list = $erdb->GetAll(\@objectNames, $filterClause, \@parameters, \@fields, $count); | |
3388 | ||
3389 | Return a list of values taken from the objects returned by a query. The first three | |
3390 | parameters correspond to the parameters of the L</Get> method. The final parameter is | |
3391 | a list of the fields desired from each record found by the query. The field name | |
3392 | syntax is the standard syntax used for fields in the B<ERDB> system-- | |
3393 | B<I<objectName>(I<fieldName>)>-- where I<objectName> is the name of the relevant entity | |
3394 | or relationship and I<fieldName> is the name of the field. | |
3395 | ||
3396 | The list returned will be a list of lists. Each element of the list will contain | |
3397 | the values returned for the fields specified in the fourth parameter. If one of the | |
3398 | fields specified returns multiple values, they are flattened in with the rest. For | |
3399 | example, the following call will return a list of the features in a particular | |
3400 | spreadsheet cell, and each feature will be represented by a list containing the | |
3401 | feature ID followed by all of its essentiality determinations. | |
3402 | ||
3403 | @query = $erdb->Get(['ContainsFeature', 'Feature'], "ContainsFeature(from-link) = ?", [$ssCellID], ['Feature(id)', 'Feature(essential)']); | |
3404 | ||
3405 | =over 4 | |
3406 | ||
3407 | =item objectNames | |
3408 | ||
3409 | List containing the names of the entity and relationship objects to be retrieved. | |
3410 | ||
3411 | =item filterClause | |
3412 | ||
3413 | WHERE/ORDER BY clause (without the WHERE) to be used to filter and sort the query. The WHERE clause can | |
3414 | be parameterized with parameter markers (C<?>). Each field used must be specified in the standard form | |
3415 | B<I<objectName>(I<fieldName>)>. Any parameters specified in the filter clause should be added to the | |
3416 | parameter list as additional parameters. The fields in a filter clause can come from primary | |
3417 | entity relations, relationship relations, or secondary entity relations; however, all of the | |
3418 | entities and relationships involved must be included in the list of object names. | |
3419 | ||
3420 | =item parameterList | |
3421 | ||
3422 | List of the parameters to be substituted in for the parameters marks in the filter clause. | |
3423 | ||
3424 | =item fields | |
3425 | ||
3426 | List of the fields to be returned in each element of the list returned. | |
3427 | ||
3428 | =item count | |
3429 | ||
3430 | Maximum number of records to return. If omitted or 0, all available records will be returned. | |
3431 | ||
3432 | =item RETURN | |
3433 | ||
3434 | Returns a list of list references. Each element of the return list contains the values for the | |
3435 | fields specified in the B<fields> parameter. | |
3436 | ||
3437 | =back | |
3438 | ||
3439 | =cut | |
3440 | #: Return Type @@; | |
3441 | sub GetAll { | |
3442 | # Get the parameters. | |
3443 | my ($self, $objectNames, $filterClause, $parameterList, $fields, $count) = @_; | |
3444 | # Translate the parameters from a list reference to a list. If the parameter | |
3445 | # list is a scalar we convert it into a singleton list. | |
3446 | my @parmList = (); | |
3447 | if (ref $parameterList eq "ARRAY") { | |
3448 | Trace("GetAll parm list is an array.") if T(4); | |
3449 | @parmList = @{$parameterList}; | |
3450 | } else { | |
3451 | Trace("GetAll parm list is a scalar: $parameterList.") if T(4); | |
3452 | push @parmList, $parameterList; | |
3453 | } | |
3454 | # Insure the counter has a value. | |
3455 | if (!defined $count) { | |
3456 | $count = 0; | |
3457 | } | |
3458 | # Add the row limit to the filter clause. | |
3459 | if ($count > 0) { | |
3460 | $filterClause .= " LIMIT $count"; | |
3461 | } | |
3462 | # Create the query. | |
3463 | my $query = $self->Get($objectNames, $filterClause, \@parmList); | |
3464 | # Set up a counter of the number of records read. | |
3465 | my $fetched = 0; | |
3466 | # Loop through the records returned, extracting the fields. Note that if the | |
3467 | # counter is non-zero, we stop when the number of records read hits the count. | |
3468 | my @retVal = (); | |
3469 | while (($count == 0 || $fetched < $count) && (my $row = $query->Fetch())) { | |
3470 | my @rowData = $row->Values($fields); | |
3471 | push @retVal, \@rowData; | |
3472 | $fetched++; | |
3473 | } | |
3474 | # Return the resulting list. | |
3475 | return @retVal; | |
3476 | } | |
3477 | ||
3478 | =head3 Exists | |
3479 | ||
3480 | my $found = $sprout->Exists($entityName, $entityID); | |
3481 | ||
3482 | Return TRUE if an entity exists, else FALSE. | |
3483 | ||
3484 | =over 4 | |
3485 | ||
3486 | =item entityName | |
3487 | ||
3488 | Name of the entity type (e.g. C<Feature>) relevant to the existence check. | |
3489 | ||
3490 | =item entityID | |
3491 | ||
3492 | ID of the entity instance whose existence is to be checked. | |
3493 | ||
3494 | =item RETURN | |
3495 | ||
3496 | Returns TRUE if the entity instance exists, else FALSE. | |
3497 | ||
3498 | =back | |
3499 | ||
3500 | =cut | |
3501 | #: Return Type $; | |
3502 | sub Exists { | |
3503 | # Get the parameters. | |
3504 | my ($self, $entityName, $entityID) = @_; | |
3505 | # Check for the entity instance. | |
3506 | Trace("Checking existence of $entityName with ID=$entityID.") if T(4); | |
3507 | my $testInstance = $self->GetEntity($entityName, $entityID); | |
3508 | # Return an existence indicator. | |
3509 | my $retVal = ($testInstance ? 1 : 0); | |
3510 | return $retVal; | |
3511 | } | |
3512 | ||
3513 | =head3 EstimateRowSize | |
3514 | ||
3515 | my $rowSize = $erdb->EstimateRowSize($relName); | |
3516 | ||
3517 | Estimate the row size of the specified relation. The estimated row size is computed by adding | |
3518 | up the average length for each data type. | |
3519 | ||
3520 | =over 4 | |
3521 | ||
3522 | =item relName | |
3523 | ||
3524 | Name of the relation whose estimated row size is desired. | |
3525 | ||
3526 | =item RETURN | |
3527 | ||
3528 | Returns an estimate of the row size for the specified relation. | |
3529 | ||
3530 | =back | |
3531 | ||
3532 | =cut | |
3533 | #: Return Type $; | |
3534 | sub EstimateRowSize { | |
3535 | # Get the parameters. | |
3536 | my ($self, $relName) = @_; | |
3537 | # Declare the return variable. | |
3538 | my $retVal = 0; | |
3539 | # Find the relation descriptor. | |
3540 | my $relation = $self->_FindRelation($relName); | |
3541 | # Get the list of fields. | |
3542 | for my $fieldData (@{$relation->{Fields}}) { | |
3543 | # Get the field type and add its length. | |
3544 | my $fieldLen = $TypeTable{$fieldData->{type}}->{avgLen}; | |
3545 | $retVal += $fieldLen; | |
3546 | } | |
3547 | # Return the result. | |
3548 | return $retVal; | |
3549 | } | |
3550 | ||
3551 | =head3 GetFieldTable | |
3552 | ||
3553 | my $fieldHash = $self->GetFieldTable($objectnName); | |
3554 | ||
3555 | Get the field structure for a specified entity or relationship. | |
3556 | ||
3557 | =over 4 | |
3558 | ||
3559 | =item objectName | |
3560 | ||
3561 | Name of the desired entity or relationship. | |
3562 | ||
3563 | =item RETURN | |
3564 | ||
3565 | The table containing the field descriptors for the specified object. | |
3566 | ||
3567 | =back | |
3568 | ||
3569 | =cut | |
3570 | ||
3571 | sub GetFieldTable { | |
3572 | # Get the parameters. | |
3573 | my ($self, $objectName) = @_; | |
3574 | # Get the descriptor from the metadata. | |
3575 | my $objectData = $self->_GetStructure($objectName); | |
3576 | # Return the object's field table. | |
3577 | return $objectData->{Fields}; | |
3578 | } | |
3579 | ||
3580 | =head3 SplitKeywords | |
3581 | ||
3582 | my @keywords = ERDB::SplitKeywords($keywordString); | |
3583 | ||
3584 | This method returns a list of the positive keywords in the specified | |
3585 | keyword string. All of the operators will have been stripped off, | |
3586 | and if the keyword is preceded by a minus operator (C<->), it will | |
3587 | not be in the list returned. The idea here is to get a list of the | |
3588 | keywords the user wants to see. The list will be processed to remove | |
3589 | duplicates. | |
3590 | ||
3591 | It is possible to create a string that confuses this method. For example | |
3592 | ||
3593 | frog toad -frog | |
3594 | ||
3595 | would return both C<frog> and C<toad>. If this is a problem we can deal | |
3596 | with it later. | |
3597 | ||
3598 | =over 4 | |
3599 | ||
3600 | =item keywordString | |
3601 | ||
3602 | The keyword string to be parsed. | |
3603 | ||
3604 | =item RETURN | |
3605 | ||
3606 | Returns a list of the words in the keyword string the user wants to | |
3607 | see. | |
3608 | ||
3609 | =back | |
3610 | ||
3611 | =cut | |
3612 | ||
3613 | sub SplitKeywords { | |
3614 | # Get the parameters. | |
3615 | my ($keywordString) = @_; | |
3616 | # Make a safety copy of the string. (This helps during debugging.) | |
3617 | my $workString = $keywordString; | |
3618 | # Convert operators we don't care about to spaces. | |
3619 | $workString =~ tr/+"()<>/ /; | |
3620 | # Split the rest of the string along space boundaries. Note that we | |
3621 | # eliminate any words that are zero length or begin with a minus sign. | |
3622 | my @wordList = grep { $_ && substr($_, 0, 1) ne "-" } split /\s+/, $workString; | |
3623 | # Use a hash to remove duplicates. | |
3624 | my %words = map { $_ => 1 } @wordList; | |
3625 | # Return the result. | |
3626 | return sort keys %words; | |
3627 | } | |
3628 | ||
3629 | =head3 ValidateFieldName | |
3630 | ||
3631 | my $okFlag = ERDB::ValidateFieldName($fieldName); | |
3632 | ||
3633 | Return TRUE if the specified field name is valid, else FALSE. Valid field names must | |
3634 | be hyphenated words subject to certain restrictions. | |
3635 | ||
3636 | =over 4 | |
3637 | ||
3638 | =item fieldName | |
3639 | ||
3640 | Field name to be validated. | |
3641 | ||
3642 | =item RETURN | |
3643 | ||
3644 | Returns TRUE if the field name is valid, else FALSE. | |
3645 | ||
3646 | =back | |
3647 | ||
3648 | =cut | |
3649 | ||
3650 | sub ValidateFieldName { | |
3651 | # Get the parameters. | |
3652 | my ($fieldName) = @_; | |
3653 | # Declare the return variable. The field name is valid until we hear | |
3654 | # differently. | |
3655 | my $retVal = 1; | |
3656 | # Compute the maximum name length. | |
3657 | my $maxLen = $TypeTable{'name-string'}->{maxLen}; | |
3658 | # Look for bad stuff in the name. | |
3659 | if ($fieldName =~ /--/) { | |
3660 | # Here we have a doubled minus sign. | |
3661 | Trace("Field name $fieldName has a doubled hyphen.") if T(1); | |
3662 | $retVal = 0; | |
3663 | } elsif ($fieldName !~ /^[A-Za-z]/) { | |
3664 | # Here the field name is missing the initial letter. | |
3665 | Trace("Field name $fieldName does not begin with a letter.") if T(1); | |
3666 | $retVal = 0; | |
3667 | } elsif (length($fieldName) > $maxLen) { | |
3668 | # Here the field name is too long. | |
3669 | Trace("Maximum field name length is $maxLen. Field name must be truncated to " . substr($fieldName,0, $maxLen) . "."); | |
3670 | } else { | |
3671 | # Strip out the minus signs. Everything remaining must be a letter, | |
3672 | # underscore, or digit. | |
3673 | my $strippedName = $fieldName; | |
3674 | $strippedName =~ s/-//g; | |
3675 | if ($strippedName !~ /^(\w|\d)+$/) { | |
3676 | Trace("Field name $fieldName contains illegal characters.") if T(1); | |
3677 | $retVal = 0; | |
3678 | } | |
3679 | } | |
3680 | # Return the result. | |
3681 | return $retVal; | |
3682 | } | |
3683 | ||
3684 | =head3 ReadMetaXML | |
3685 | ||
3686 | my $rawMetaData = ERDB::ReadDBD($fileName); | |
3687 | ||
3688 | This method reads a raw database definition XML file and returns it. | |
3689 | Normally, the metadata used by the ERDB system has been processed and | |
3690 | modified to make it easier to load and retrieve the data; however, | |
3691 | this method can be used to get the data in its raw form. | |
3692 | ||
3693 | =over 4 | |
3694 | ||
3695 | =item fileName | |
3696 | ||
3697 | Name of the XML file to read. | |
3698 | ||
3699 | =item RETURN | |
3700 | ||
3701 | Returns a hash reference containing the raw XML data from the specified file. | |
3702 | ||
3703 | =back | |
3704 | ||
3705 | =cut | |
3706 | ||
3707 | sub ReadMetaXML { | |
3708 | # Get the parameters. | |
3709 | my ($fileName) = @_; | |
3710 | # Read the XML. | |
3711 | my $retVal = XML::Simple::XMLin($fileName, %XmlOptions, %XmlInOpts); | |
3712 | Trace("XML metadata loaded from file $fileName.") if T(1); | |
3713 | # Return the result. | |
3714 | return $retVal; | |
3715 | } | |
3716 | ||
3717 | =head3 GetEntityFieldHash | |
3718 | ||
3719 | my $fieldHashRef = ERDB::GetEntityFieldHash($structure, $entityName); | |
3720 | ||
3721 | Get the field hash of the named entity in the specified raw XML structure. | |
3722 | The field hash may not exist, in which case we need to create it. | |
3723 | ||
3724 | =over 4 | |
3725 | ||
3726 | =item structure | |
3727 | ||
3728 | Raw XML structure defininng the database. This is not the run-time XML used by | |
3729 | an ERDB object, since that has all sorts of optimizations built-in. | |
3730 | ||
3731 | =item entityName | |
3732 | ||
3733 | Name of the entity whose field structure is desired. | |
3734 | ||
3735 | =item RETURN | |
3736 | ||
3737 | Returns the field hash used to define the entity's fields. | |
3738 | ||
3739 | =back | |
3740 | ||
3741 | =cut | |
3742 | ||
3743 | sub GetEntityFieldHash { | |
3744 | # Get the parameters. | |
3745 | my ($structure, $entityName) = @_; | |
3746 | # Get the entity structure. | |
3747 | my $entityData = $structure->{Entities}->{$entityName}; | |
3748 | # Look for a field structure. | |
3749 | my $retVal = $entityData->{Fields}; | |
3750 | # If it doesn't exist, create it. | |
3751 | if (! defined($retVal)) { | |
3752 | $entityData->{Fields} = {}; | |
3753 | $retVal = $entityData->{Fields}; | |
3754 | } | |
3755 | # Return the result. | |
3756 | return $retVal; | |
3757 | } | |
3758 | ||
3759 | =head3 WriteMetaXML | |
3760 | ||
3761 | ERDB::WriteMetaXML($structure, $fileName); | |
3762 | ||
3763 | Write the metadata XML to a file. This method is the reverse of L</ReadMetaXML>, and is | |
3764 | used to update the database definition. It must be used with care, however, since it | |
3765 | will only work on a raw structure, not on the processed structure created by an ERDB | |
3766 | constructor. | |
3767 | ||
3768 | =over 4 | |
3769 | ||
3770 | =item structure | |
3771 | ||
3772 | ID of the desired entity. | XML structure to be written to the file. |
3773 | ||
3774 | =item fileName | |
3775 | ||
3776 | Name of the output file to which the updated XML should be stored. | |
3777 | ||
3778 | =back | |
3779 | ||
3780 | =cut | |
3781 | ||
3782 | sub WriteMetaXML { | |
3783 | # Get the parameters. | |
3784 | my ($structure, $fileName) = @_; | |
3785 | # Compute the output. | |
3786 | my $fileString = XML::Simple::XMLout($structure, %XmlOptions, %XmlOutOpts); | |
3787 | # Write it to the file. | |
3788 | my $xmlOut = Open(undef, ">$fileName"); | |
3789 | print $xmlOut $fileString; | |
3790 | } | |
3791 | ||
3792 | ||
3793 | =head3 HTMLNote | |
3794 | ||
3795 | Convert a note or comment to HTML by replacing some bulletin-board codes with HTML. The codes | |
3796 | supported are C<[b]> for B<bold>, C<[i]> for I<italics>, and C<[p]> for a new paragraph. | |
3797 | Except for C<[p]>, all the codes are closed by slash-codes. So, for | |
3798 | example, C<[b]Feature[/b]> displays the string C<Feature> in boldface. | |
3799 | ||
3800 | my $realHtml = ERDB::HTMLNote($dataString); | |
3801 | ||
3802 | =over 4 | |
3803 | ||
3804 | =item dataString | |
3805 | ||
3806 | String to convert to HTML. | |
3807 | ||
3808 | =item RETURN | =item RETURN |
3809 | ||
3810 | Returns a B<DBObject> representing the desired entity instance, or an undefined value if no | An HTML string derived from the input string. |
instance is found with the specified key. | ||
3811 | ||
3812 | =back | =back |
3813 | ||
3814 | =cut | =cut |
3815 | ||
3816 | sub GetEntity { | sub HTMLNote { |
3817 | # Get the parameter. | |
3818 | my ($dataString) = @_; | |
3819 | # HTML-escape the text. | |
3820 | my $retVal = CGI::escapeHTML($dataString); | |
3821 | # Substitute the bulletin board codes. | |
3822 | $retVal =~ s!\[(/?[bi])\]!<$1>!g; | |
3823 | $retVal =~ s!\[p\]!</p><p>!g; | |
3824 | $retVal =~ s!\[link\s+([^\]]+)\]!<a href="$1">!g; | |
3825 | $retVal =~ s!\[/link\]!</a>!g; | |
3826 | # Return the result. | |
3827 | return $retVal; | |
3828 | } | |
3829 | ||
3830 | =head3 WikiNote | |
3831 | ||
3832 | Convert a note or comment to Wiki text by replacing some bulletin-board codes with HTML. The codes | |
3833 | supported are C<[b]> for B<bold>, C<[i]> for I<italics>, and C<[p]> for a |