[Bio] / Sprout / CustomAttributes.pm Repository:
ViewVC logotype

Annotation of /Sprout/CustomAttributes.pm

Parent Directory Parent Directory | Revision Log Revision Log


Revision 1.8 - (view) (download) (as text)

1 : parrello 1.1 #!/usr/bin/perl -w
2 :    
3 :     package CustomAttributes;
4 :    
5 :     require Exporter;
6 :     use ERDB;
7 : parrello 1.4 @ISA = qw(ERDB);
8 : parrello 1.1 use strict;
9 :     use Tracer;
10 :     use ERDBLoad;
11 :    
12 :     =head1 Custom SEED Attribute Manager
13 :    
14 :     =head2 Introduction
15 :    
16 :     The Custom SEED Attributes Manager allows the user to upload and retrieve
17 :     custom data for SEED objects. It uses the B<ERDB> database system to
18 :     store the attributes, which are implemented as multi-valued fields
19 :     of ERDB entities.
20 :    
21 :     The full suite of ERDB retrieval capabilities is provided. In addition,
22 :     custom methods are provided specific to this application. To get all
23 : parrello 1.6 the values of the attribute C<essential> in a specified B<Feature>, you
24 : parrello 1.1 would code
25 :    
26 : parrello 1.6 my @values = $attrDB->GetAttributes([Feature => $fid], 'essential');
27 : parrello 1.1
28 : parrello 1.2 where I<$fid> contains the ID of the desired feature. Each attribute has
29 :     an alternate index to allow searching for attributes by value.
30 : parrello 1.1
31 :     New attributes are introduced by updating the database definition at
32 :     run-time. Attribute values are stored by uploading data from files.
33 :     A web interface is provided for both these activities.
34 :    
35 :     =head2 FIG_Config Parameters
36 :    
37 :     The following configuration parameters are used to manage custom attributes.
38 :    
39 :     =over 4
40 :    
41 :     =item attrDbms
42 :    
43 :     Type of database manager used: C<mysql> for MySQL or C<pg> for PostGres.
44 :    
45 :     =item attrDbName
46 :    
47 :     Name of the attribute database.
48 :    
49 :     =item attrHost
50 :    
51 :     Name of the host server for the database. If omitted, the current host
52 :     is used.
53 :    
54 :     =item attrUser
55 :    
56 :     User name for logging in to the database.
57 :    
58 :     =item attrPass
59 :    
60 :     Password for logging in to the database.
61 :    
62 :     =item attrPort
63 :    
64 :     TCP/IP port for accessing the database.
65 :    
66 :     =item attrSock
67 :    
68 :     Socket name used to access the database. If omitted, the default socket
69 :     will be used.
70 :    
71 :     =item attrDBD
72 :    
73 :     Fully-qualified file name for the database definition XML file. This file
74 :     functions as data to the attribute management process, so if the data is
75 :     moved, this file must go with it.
76 :    
77 :     =back
78 :    
79 : parrello 1.6 The DBD file is critical, and must have reasonable contents before we can
80 :     begin using the system. In the old system, attributes were only provided
81 :     for Genomes and Features, so the initial XML file was the following.
82 :    
83 :     <Database>
84 :     <Title>SEED Custom Attribute Database</Title>
85 :     <Entities>
86 :     <Entity name="Feature" keyType="id-string">
87 :     <Notes>A [i]feature[/i] is a part of the genome
88 :     that is of special interest. Features may be spread
89 :     across multiple contigs of a genome, but never across
90 :     more than one genome. Features can be assigned to roles
91 :     via spreadsheet cells, and are the targets of
92 :     annotation.</Notes>
93 :     </Entity>
94 :     <Entity name="Genome" keyType="name-string">
95 :     <Notes>A [i]genome[/i] describes a particular individual
96 :     organism's DNA.</Notes>
97 :     </Entity>
98 :     </Entities>
99 :     </Database>
100 :    
101 :     It is not necessary to put any tables into the database; however, you should
102 :     run
103 :    
104 :     AttrDBRefresh
105 :    
106 :     periodically to insure it has the correct Genomes and Features in it. When
107 :     converting from the old system, use
108 :    
109 :     AttrDBRefresh -migrate
110 :    
111 :     to initialize the database and migrate the legacy data. You should only need
112 :     to do that once.
113 :    
114 : parrello 1.3 =head2 Implementation Note
115 : parrello 1.1
116 :     The L</Refresh> method reloads the entities in the database. If new
117 :     entity types are added, that method will need to be adjusted accordingly.
118 :    
119 :     =head2 Public Methods
120 :    
121 :     =head3 new
122 :    
123 : parrello 1.3 C<< my $attrDB = CustomAttributes->new($splitter); >>
124 : parrello 1.1
125 : parrello 1.3 Construct a new CustomAttributes object. This object cannot be used to add or
126 :     delete keys because that requires modifying the database design. To do that,
127 :     you need to use the static L</StoreAttributeKey> or L</DeleteAttributeKey>
128 :     methods.
129 :    
130 :     =over 4
131 :    
132 :     =item splitter
133 :    
134 :     Value to be used to split attribute values into sections in the
135 :     L</Fig Replacement Methods>. The default is a double colon C<::>.
136 :     If you do not use the replacement methods, you do not need to
137 :     worry about this parameter.
138 :    
139 :     =back
140 : parrello 1.1
141 :     =cut
142 :    
143 :     sub new {
144 :     # Get the parameters.
145 : parrello 1.3 my ($class, $splitter) = @_;
146 : parrello 1.1 # Connect to the database.
147 :     my $dbh = DBKernel->new($FIG_Config::attrDbms, $FIG_Config::attrDbName,
148 :     $FIG_Config::attrUser, $FIG_Config::attrPass,
149 :     $FIG_Config::attrPort, $FIG_Config::attrHost,
150 :     $FIG_Config::attrSock);
151 :     # Create the ERDB object.
152 :     my $xmlFileName = $FIG_Config::attrDBD;
153 :     my $retVal = ERDB::new($class, $dbh, $xmlFileName);
154 : parrello 1.3 # Store the splitter value.
155 :     $retVal->{splitter} = (defined($splitter) ? $splitter : '::');
156 : parrello 1.1 # Return the result.
157 :     return $retVal;
158 :     }
159 :    
160 : parrello 1.3 =head3 StoreAttributeKey
161 : parrello 1.1
162 : parrello 1.3 C<< my $attrDB = CustomAttributes::StoreAttributeKey($entityName, $attributeName, $type, $notes); >>
163 : parrello 1.1
164 :     Create or update an attribute for the database. This method will update the database definition
165 :     XML, but it will not create the table. It will connect to the database so that the caller
166 :     can upload the attribute values.
167 :    
168 :     =over 4
169 :    
170 :     =item entityName
171 :    
172 :     Name of the entity containing the attribute. The entity must exist.
173 :    
174 :     =item attributeName
175 :    
176 :     Name of the attribute. It must be a valid ERDB field name, consisting entirely of
177 :     letters, digits, and hyphens, with a letter at the beginning. If it does not
178 :     exist already, it will be created.
179 :    
180 :     =item type
181 :    
182 :     Data type of the attribute. This must be a valid ERDB data type name.
183 :    
184 :     =item notes
185 :    
186 :     Descriptive notes about the attribute. It is presumed to be raw text, not HTML.
187 :    
188 :     =item RETURN
189 :    
190 :     Returns a Custom Attribute Database object if successful. If unsuccessful, an
191 :     error will be thrown.
192 :    
193 :     =back
194 :    
195 :     =cut
196 :    
197 : parrello 1.3 sub StoreAttributeKey {
198 : parrello 1.1 # Get the parameters.
199 :     my ($entityName, $attributeName, $type, $notes) = @_;
200 : parrello 1.8 # Declare the return variable.
201 :     my $retVal;
202 : parrello 1.1 # Get the data type hash.
203 :     my %types = ERDB::GetDataTypes();
204 :     # Validate the initial input values.
205 :     if (! ERDB::ValidateFieldName($attributeName)) {
206 :     Confess("Invalid attribute name \"$attributeName\" specified.");
207 :     } elsif (! $notes || length($notes) < 25) {
208 :     Confess("Missing or incomplete description for $attributeName.");
209 :     } elsif (! exists $types{$type}) {
210 :     Confess("Invalid data type \"$type\" for $attributeName.");
211 :     }
212 :     # Our next step is to read in the XML for the database defintion. We
213 :     # need to verify that the named entity exists.
214 :     my $metadata = ERDB::ReadMetaXML($FIG_Config::attrDBD);
215 :     my $entityHash = $metadata->{Entities};
216 :     if (! exists $entityHash->{$entityName}) {
217 :     Confess("Entity $entityName not found.");
218 :     } else {
219 : parrello 1.2 # Okay, we're ready to begin. Get the entity hash and the field hash.
220 :     my $entityData = $entityHash->{$entityName};
221 : parrello 1.1 my $fieldHash = ERDB::GetEntityFieldHash($metadata, $entityName);
222 : parrello 1.8 # Compare the old attribute data to the new data.
223 :     my $bigChange = 1;
224 :     if (exists $fieldHash->{$attributeName} && $fieldHash->{$attributeName}->{type} eq $type) {
225 :     $bigChange = 0;
226 :     }
227 : parrello 1.1 # Compute the attribute's relation name.
228 : parrello 1.3 my $relName = join("", $entityName, map { ucfirst $_ } split(/-|_/, $attributeName));
229 : parrello 1.1 # Store the attribute's field data. Note the use of the "content" hash for
230 :     # the notes. This is how the XML writer knows Notes is a text tag instead of
231 :     # an attribute.
232 :     $fieldHash->{$attributeName} = { type => $type, relation => $relName,
233 :     Notes => { content => $notes } };
234 : parrello 1.2 # Insure we have an index for this attribute.
235 :     my $index = ERDB::FindIndexForEntity($metadata, $entityName, $attributeName);
236 :     if (! defined($index)) {
237 :     push @{$entityData->{Indexes}}, { IndexFields => [ { name => $attributeName, order => 'ascending' } ],
238 :     Notes => "Alternate index provided for access by $attributeName." };
239 :     }
240 : parrello 1.1 # Write the XML back out.
241 :     ERDB::WriteMetaXML($metadata, $FIG_Config::attrDBD);
242 : parrello 1.8 # Open a database with the new XML.
243 :     $retVal = CustomAttributes->new();
244 :     # Create the table if there has been a significant change.
245 :     if ($bigChange) {
246 :     $retVal->CreateTable($relName);
247 :     }
248 : parrello 1.1 }
249 :     return $retVal;
250 :     }
251 :    
252 :     =head3 Refresh
253 :    
254 : parrello 1.3 C<< $attrDB->Refresh($fig); >>
255 : parrello 1.1
256 :     Refresh the primary entity tables from the FIG data store. This method basically
257 :     drops and reloads the main tables of the custom attributes database.
258 :    
259 : parrello 1.3 =over 4
260 :    
261 :     =item fig
262 :    
263 :     FIG-like object that can be used to find genomes and features.
264 :    
265 :     =back
266 :    
267 : parrello 1.1 =cut
268 :    
269 :     sub Refresh {
270 :     # Get the parameters.
271 : parrello 1.3 my ($self, $fig) = @_;
272 : parrello 1.1 # Create load objects for the genomes and the features.
273 :     my $loadGenome = ERDBLoad->new($self, 'Genome', $FIG_Config::temp);
274 :     my $loadFeature = ERDBLoad->new($self, 'Feature', $FIG_Config::temp);
275 :     # Get the genome list.
276 :     my @genomes = $fig->genomes();
277 :     # Loop through the genomes.
278 :     for my $genomeID (@genomes) {
279 :     # Put this genome in the genome table.
280 :     $loadGenome->Put($genomeID);
281 :     Trace("Processing Genome $genomeID") if T(3);
282 :     # Put its features into the feature table. Note we have to use a hash to
283 :     # remove duplicates.
284 :     my %featureList = map { $_ => 1 } $fig->all_features($genomeID);
285 :     for my $fid (keys %featureList) {
286 :     $loadFeature->Put($fid);
287 :     }
288 :     }
289 :     # Get a variable for holding statistics objects.
290 :     my $stats;
291 :     # Finish the genome load.
292 :     Trace("Loading Genome relation.") if T(2);
293 :     $stats = $loadGenome->FinishAndLoad();
294 :     Trace("Genome table load statistics:\n" . $stats->Show()) if T(3);
295 :     # Finish the feature load.
296 :     Trace("Loading Feature relation.") if T(2);
297 :     $stats = $loadFeature->FinishAndLoad();
298 :     Trace("Feature table load statistics:\n" . $stats->Show()) if T(3);
299 :     }
300 :    
301 : parrello 1.3 =head3 LoadAttributeKey
302 : parrello 1.1
303 : parrello 1.3 C<< my $stats = $attrDB->LoadAttributeKey($entityName, $fieldName, $fh, $keyCol, $dataCol); >>
304 : parrello 1.1
305 :     Load the specified attribute from the specified file. The file should be a
306 :     tab-delimited file with internal tab and new-line characters escaped. This is
307 :     the typical TBL-style file used by most FIG applications. One of the columns
308 :     in the input file must contain the appropriate key value and the other the
309 :     corresponding attribute value.
310 :    
311 :     =over 4
312 :    
313 :     =item entityName
314 :    
315 :     Name of the entity containing the attribute.
316 :    
317 :     =item fieldName
318 :    
319 :     Name of the actual attribute.
320 :    
321 :     =item fh
322 :    
323 :     Open file handle for the input file.
324 :    
325 :     =item keyCol
326 :    
327 :     Index (0-based) of the column containing the key field. The key field should
328 :     contain the ID of an instance of the named entity.
329 :    
330 :     =item dataCol
331 :    
332 :     Index (0-based) of the column containing the data value field.
333 :    
334 :     =item RETURN
335 :    
336 :     Returns a statistics object for the load process.
337 :    
338 :     =back
339 :    
340 :     =cut
341 :    
342 : parrello 1.3 sub LoadAttributeKey {
343 : parrello 1.1 # Get the parameters.
344 :     my ($self, $entityName, $fieldName, $fh, $keyCol, $dataCol) = @_;
345 :     # Create the return variable.
346 :     my $retVal;
347 :     # Insure the entity exists.
348 :     my $found = grep { $_ eq $entityName } $self->GetEntityTypes();
349 :     if (! $found) {
350 :     Confess("Entity \"$entityName\" not found in database.");
351 :     } else {
352 :     # Get the field structure for the named entity.
353 :     my $fieldHash = $self->GetFieldTable($entityName);
354 :     # Verify that the attribute exists.
355 :     if (! exists $fieldHash->{$fieldName}) {
356 : parrello 1.3 Confess("Attribute key \"$fieldName\" does not exist in entity $entityName.");
357 : parrello 1.1 } else {
358 :     # Create a loader for the specified attribute. We need the
359 :     # relation name first.
360 :     my $relName = $fieldHash->{$fieldName}->{relation};
361 :     my $loadAttribute = ERDBLoad->new($self, $relName, $FIG_Config::temp);
362 :     # Loop through the input file.
363 :     while (! eof $fh) {
364 :     # Get the next line of the file.
365 :     my @fields = Tracer::GetLine($fh);
366 :     $loadAttribute->Add("lineIn");
367 :     # Now we need to validate the line.
368 :     if ($#fields < $dataCol) {
369 :     $loadAttribute->Add("shortLine");
370 :     } elsif (! $self->Exists($entityName, $fields[$keyCol])) {
371 :     $loadAttribute->Add("badKey");
372 :     } else {
373 :     # It's valid,so send it to the loader.
374 :     $loadAttribute->Put($fields[$keyCol], $fields[$dataCol]);
375 :     $loadAttribute->Add("lineUsed");
376 :     }
377 :     }
378 :     # Finish the load.
379 :     $retVal = $loadAttribute->FinishAndLoad();
380 :     }
381 :     }
382 :     # Return the statistics.
383 :     return $retVal;
384 :     }
385 :    
386 :    
387 : parrello 1.3 =head3 DeleteAttributeKey
388 :    
389 :     C<< CustomAttributes::DeleteAttributeKey($entityName, $attributeName); >>
390 : parrello 1.1
391 :     Delete an attribute from the custom attributes database.
392 :    
393 :     =over 4
394 :    
395 :     =item entityName
396 :    
397 :     Name of the entity possessing the attribute.
398 :    
399 :     =item attributeName
400 :    
401 :     Name of the attribute to delete.
402 :    
403 :     =back
404 :    
405 :     =cut
406 :    
407 : parrello 1.3 sub DeleteAttributeKey {
408 : parrello 1.1 # Get the parameters.
409 :     my ($entityName, $attributeName) = @_;
410 :     # Read in the XML for the database defintion. We need to verify that
411 :     # the named entity exists and it has the named attribute.
412 :     my $metadata = ERDB::ReadMetaXML($FIG_Config::attrDBD);
413 :     my $entityHash = $metadata->{Entities};
414 :     if (! exists $entityHash->{$entityName}) {
415 :     Confess("Entity \"$entityName\" not found.");
416 :     } else {
417 :     # Get the field hash.
418 :     my $fieldHash = ERDB::GetEntityFieldHash($metadata, $entityName);
419 :     if (! exists $fieldHash->{$attributeName}) {
420 : parrello 1.3 Confess("Attribute key \"$attributeName\" not found in entity $entityName.");
421 : parrello 1.1 } else {
422 :     # Get the attribute's relation name.
423 :     my $relName = $fieldHash->{$attributeName}->{relation};
424 : parrello 1.2 # Check for an index.
425 :     my $indexIdx = ERDB::FindIndexForEntity($metadata, $entityName, $attributeName);
426 :     if (defined($indexIdx)) {
427 :     Trace("Index for $attributeName found at position $indexIdx for $entityName.") if T(3);
428 :     delete $entityHash->{$entityName}->{Indexes}->[$indexIdx];
429 :     }
430 : parrello 1.1 # Delete the attribute from the field hash.
431 :     Trace("Deleting attribute $attributeName from $entityName.") if T(3);
432 :     delete $fieldHash->{$attributeName};
433 :     # Write the XML back out.
434 :     ERDB::WriteMetaXML($metadata, $FIG_Config::attrDBD);
435 :     # Insure the relation does not exist in the database. This requires connecting
436 :     # since we may have to do a table drop.
437 :     my $attrDB = CustomAttributes->new();
438 : parrello 1.2 Trace("Dropping table $relName.") if T(3);
439 : parrello 1.1 $attrDB->DropRelation($relName);
440 :     }
441 :     }
442 :     }
443 :    
444 :     =head3 ControlForm
445 :    
446 :     C<< my $formHtml = $attrDB->ControlForm($cgi, $name); >>
447 :    
448 :     Return a form that can be used to control the creation and modification of
449 :     attributes.
450 :    
451 :     =over 4
452 :    
453 :     =item cgi
454 :    
455 :     CGI query object used to create HTML.
456 :    
457 :     =item name
458 :    
459 :     Name to give to the form. This should be unique for the web page.
460 :    
461 :     =item RETURN
462 :    
463 :     Returns the HTML for a form that submits instructions to the C<Attributes.cgi> script
464 :     for loading, creating, or deleting an attribute.
465 :    
466 :     =back
467 :    
468 :     =cut
469 :    
470 :     sub ControlForm {
471 :     # Get the parameters.
472 :     my ($self, $cgi, $name) = @_;
473 :     # Declare the return list.
474 :     my @retVal = ();
475 :     # Start the form. We use multipart to support the upload control.
476 :     push @retVal, $cgi->start_multipart_form(-name => $name);
477 :     # We'll put the controls in a table. Nothing else ever seems to look nice.
478 :     push @retVal, $cgi->start_table({ border => 2, cellpadding => 2 });
479 :     # The first row is for selecting the field name.
480 :     push @retVal, $cgi->Tr($cgi->th("Select a Field"),
481 :     $cgi->td($self->FieldMenu($cgi, 10, 'fieldName', 1,
482 :     "document.$name.notes.value",
483 :     "document.$name.dataType.value")));
484 :     # Now we set up a dropdown for the data types. The values will be the
485 :     # data type names, and the labels will be the descriptions.
486 :     my %types = ERDB::GetDataTypes();
487 :     my %labelMap = map { $_ => $types{$_}->{notes} } keys %types;
488 :     my $typeMenu = $cgi->popup_menu(-name => 'dataType',
489 :     -values => [sort keys %types],
490 :     -labels => \%labelMap);
491 :     push @retVal, $cgi->Tr($cgi->th("Data type"),
492 :     $cgi->td($typeMenu));
493 :     # The next row is for the notes.
494 :     push @retVal, $cgi->Tr($cgi->th("Description"),
495 :     $cgi->td($cgi->textarea(-name => 'notes',
496 :     -rows => 6,
497 :     -columns => 80))
498 :     );
499 :     # Allow the user to specify a new field name. This is required if the
500 :     # user has selected one of the "(new)" markers.
501 :     push @retVal, $cgi->Tr($cgi->th("New Field Name"),
502 :     $cgi->td($cgi->textfield(-name => 'newName',
503 :     -size => 30)),
504 :     );
505 :     # If the user wants to upload new values for the field, then we have
506 :     # an upload file name and column indicators.
507 :     push @retVal, $cgi->Tr($cgi->th("Upload Values"),
508 :     $cgi->td($cgi->filefield(-name => 'newValueFile',
509 :     -size => 20) .
510 :     " Key&nbsp;" .
511 :     $cgi->textfield(-name => 'keyCol',
512 :     -size => 3,
513 :     -default => 0) .
514 :     " Value&nbsp;" .
515 :     $cgi->textfield(-name => 'valueCol',
516 :     -size => 3,
517 :     -default => 1)
518 :     ),
519 :     );
520 : parrello 1.7 # Now the three buttons: UPDATE, SHOW, and DELETE.
521 : parrello 1.1 push @retVal, $cgi->Tr($cgi->th("&nbsp;"),
522 :     $cgi->td({align => 'center'},
523 :     $cgi->submit(-name => 'Delete', -value => 'DELETE') . " " .
524 : parrello 1.7 $cgi->submit(-name => 'Store', -value => 'STORE') . " " .
525 :     $cgi->submit(-name => 'Show', -value => 'SHOW')
526 : parrello 1.1 )
527 :     );
528 :     # Close the table and the form.
529 :     push @retVal, $cgi->end_table();
530 :     push @retVal, $cgi->end_form();
531 :     # Return the assembled HTML.
532 :     return join("\n", @retVal, "");
533 :     }
534 :    
535 :     =head3 FieldMenu
536 :    
537 :     C<< my $menuHtml = $attrDB->FieldMenu($cgi, $height, $name, $newFlag, $noteControl, $typeControl); >>
538 :    
539 :     Return the HTML for a menu to select an attribute field. The menu will
540 :     be a standard SELECT/OPTION thing which is called "popup menu" in the
541 :     CGI package, but actually looks like a list. The list will contain
542 :     one selectable row per field, grouped by entity.
543 :    
544 :     =over 4
545 :    
546 :     =item cgi
547 :    
548 :     CGI query object used to generate HTML.
549 :    
550 :     =item height
551 :    
552 :     Number of lines to display in the list.
553 :    
554 :     =item name
555 :    
556 :     Name to give to the menu. This is the name under which the value will
557 :     appear when the form is submitted.
558 :    
559 :     =item newFlag (optional)
560 :    
561 :     If TRUE, then extra rows will be provided to allow the user to select
562 :     a new attribute. In other words, the user can select an existing
563 :     attribute, or can choose a C<(new)> marker to indicate a field to
564 :     be created in the parent entity.
565 :    
566 :     =item noteControl (optional)
567 :    
568 :     If specified, the name of a variable for displaying the notes attached
569 :     to the field. This must be in Javascript form ready for assignment.
570 :     So, for example, if you have a variable called C<notes> that
571 :     represents a paragraph element, you should code C<notes.innerHTML>.
572 :     If it actually represents a form field you should code C<notes.value>.
573 :     If an C<innerHTML> coding is used, the text will be HTML-escaped before
574 :     it is copied in. Specifying this parameter generates Javascript for
575 :     displaying the field description when a field is selected.
576 :    
577 :     =item typeControl (optional)
578 :    
579 :     If specified, the name of a variable for displaying the field's
580 :     data type. Data types are a much more controlled vocabulary than
581 :     notes, so there is no worry about HTML translation. Instead, the
582 :     raw value is put into the specified variable. Otherwise, the same
583 :     rules apply to this value that apply to I<$noteControl>.
584 :    
585 :     =item RETURN
586 :    
587 :     Returns the HTML to create a form field that can be used to select an
588 :     attribute from the custom attributes system.
589 :    
590 :     =back
591 :    
592 :     =cut
593 :    
594 :     sub FieldMenu {
595 :     # Get the parameters.
596 :     my ($self, $cgi, $height, $name, $newFlag, $noteControl, $typeControl) = @_;
597 :     # These next two hashes make everything happen. "entities"
598 :     # maps each entity name to the list of values to be put into its
599 :     # option group. "labels" maps each entity name to a map from values
600 :     # to labels.
601 :     my @entityNames = sort ($self->GetEntityTypes());
602 :     my %entities = map { $_ => [] } @entityNames;
603 :     my %labels = map { $_ => { }} @entityNames;
604 :     # Loop through the entities, adding the existing attributes.
605 :     for my $entity (@entityNames) {
606 :     # Get this entity's field table.
607 :     my $fieldHash = $self->GetFieldTable($entity);
608 :     # Get its field list in our local hashes.
609 :     my $fieldList = $entities{$entity};
610 :     my $labelList = $labels{$entity};
611 :     # Add the NEW fields if we want them.
612 :     if ($newFlag) {
613 :     push @{$fieldList}, $entity;
614 :     $labelList->{$entity} = "(new)";
615 :     }
616 :     # Loop through the fields in the hash. We only keep the ones with a
617 :     # secondary relation name. (In other words, the name of the relation
618 :     # in which the field appears cannot be the same as the entity name.)
619 :     for my $fieldName (sort keys %{$fieldHash}) {
620 :     if ($fieldHash->{$fieldName}->{relation} ne $entity) {
621 :     my $value = "$entity/$fieldName";
622 :     push @{$fieldList}, $value;
623 :     $labelList->{$value} = $fieldName;
624 :     }
625 :     }
626 :     }
627 :     # Now we have a hash and a list for each entity, and they correspond
628 :     # exactly to what the $cgi->optgroup function expects.
629 :     # The last step is to create the name for the onChange function. This function
630 :     # may not do anything, but we need to know the name to generate the HTML
631 :     # for the menu.
632 :     my $changeName = "${name}_setNotes";
633 :     my $retVal = $cgi->popup_menu({name => $name,
634 :     size => $height,
635 :     onChange => "$changeName(this.value)",
636 :     values => [map { $cgi->optgroup(-name => $_,
637 :     -values => $entities{$_},
638 :     -labels => $labels{$_})
639 :     } @entityNames]}
640 :     );
641 :     # Create the change function.
642 :     $retVal .= "\n<script language=\"javascript\">\n";
643 :     $retVal .= " function $changeName(fieldValue) {\n";
644 :     # The function only has a body if we have a notes control to store the description.
645 :     if ($noteControl || $typeControl) {
646 :     # Check to see if we're storing HTML or text into the note control.
647 :     my $htmlMode = ($noteControl && $noteControl =~ /innerHTML$/);
648 :     # We use a CASE statement based on the newly-selected field value. The
649 :     # field description will be stored in the JavaScript variable "myText"
650 :     # and the data type in "myType". Note the default data type is a normal
651 :     # string, but the default notes is an empty string.
652 :     $retVal .= " var myText = \"\";\n";
653 :     $retVal .= " var myType = \"string\";\n";
654 :     $retVal .= " switch (fieldValue) {\n";
655 :     # Loop through the entities.
656 :     for my $entity (@entityNames) {
657 :     # Get the entity's field hash. This has the notes in it.
658 :     my $fieldHash = $self->GetFieldTable($entity);
659 :     # Loop through the values we might see for this entity's fields.
660 :     my $fields = $entities{$entity};
661 :     for my $value (@{$fields}) {
662 :     # Only proceed if we have an existing field.
663 :     if ($value =~ m!/(.+)$!) {
664 :     # Get the field's hash element.
665 :     my $element = $fieldHash->{$1};
666 :     # Generate this case.
667 :     $retVal .= " case \"$value\" :\n";
668 :     # Here we either want to update the note display, the
669 :     # type display, or both.
670 :     if ($noteControl) {
671 :     # Here we want the notes updated.
672 :     my $notes = $element->{Notes}->{content};
673 :     # Insure it's in the proper form.
674 :     if ($htmlMode) {
675 :     $notes = ERDB::HTMLNote($notes);
676 :     }
677 :     # Escape it for use as a string literal.
678 :     $notes =~ s/\n/\\n/g;
679 :     $notes =~ s/"/\\"/g;
680 :     $retVal .= " myText = \"$notes\";\n";
681 :     }
682 :     if ($typeControl) {
683 :     # Here we want the type updated.
684 :     my $type = $element->{type};
685 :     $retVal .= " myType = \"$type\";\n";
686 :     }
687 :     # Close this case.
688 :     $retVal .= " break;\n";
689 :     }
690 :     }
691 :     }
692 :     # Close the CASE statement and make the appropriate assignments.
693 :     $retVal .= " }\n";
694 :     if ($noteControl) {
695 :     $retVal .= " $noteControl = myText;\n";
696 :     }
697 :     if ($typeControl) {
698 :     $retVal .= " $typeControl = myType;\n";
699 :     }
700 :     }
701 :     # Terminate the change function.
702 :     $retVal .= " }\n";
703 :     $retVal .= "</script>\n";
704 :     # Return the result.
705 :     return $retVal;
706 :     }
707 :    
708 : parrello 1.3 =head3 MatchSqlPattern
709 :    
710 : parrello 1.4 C<< my $matched = CustomAttributes::MatchSqlPattern($value, $pattern); >>
711 : parrello 1.3
712 :     Determine whether or not a specified value matches an SQL pattern. An SQL
713 :     pattern has two wild card characters: C<%> that matches multiple characters,
714 :     and C<_> that matches a single character. These can be escaped using a
715 :     backslash (C<\>). We pull this off by converting the SQL pattern to a
716 :     PERL regular expression. As per SQL rules, the match is case-insensitive.
717 :    
718 :     =over 4
719 :    
720 :     =item value
721 :    
722 : parrello 1.4 Value to be matched against the pattern. Note that an undefined or empty
723 :     value will not match anything.
724 : parrello 1.3
725 :     =item pattern
726 :    
727 : parrello 1.4 SQL pattern against which to match the value. An undefined or empty pattern will
728 : parrello 1.3 match everything.
729 :    
730 :     =item RETURN
731 :    
732 :     Returns TRUE if the value and pattern match, else FALSE.
733 :    
734 :     =back
735 :    
736 :     =cut
737 :    
738 :     sub MatchSqlPattern {
739 :     # Get the parameters.
740 :     my ($value, $pattern) = @_;
741 :     # Declare the return variable.
742 :     my $retVal;
743 :     # Insure we have a pattern.
744 : parrello 1.4 if (! defined($pattern) || $pattern eq "") {
745 : parrello 1.3 $retVal = 1;
746 :     } else {
747 :     # Break the pattern into pieces around the wildcard characters. Because we
748 :     # use parentheses in the split function's delimiter expression, we'll get
749 :     # list elements for the delimiters as well as the rest of the string.
750 :     my @pieces = split /([_%]|\\[_%])/, $pattern;
751 :     # Check some fast special cases.
752 :     if ($pattern eq '%') {
753 :     # A null pattern matches everything.
754 :     $retVal = 1;
755 :     } elsif (@pieces == 1) {
756 :     # No wildcards, so we have a literal comparison. Note we're case-insensitive.
757 :     $retVal = (lc($value) eq lc($pattern));
758 :     } elsif (@pieces == 2 && $pieces[1] eq '%') {
759 :     # A wildcard at the end, so we have a substring match. This is also case-insensitive.
760 :     $retVal = (lc(substr($value, 0, length($pieces[0]))) eq lc($pieces[0]));
761 :     } else {
762 :     # Okay, we have to do it the hard way. Convert each piece to a PERL pattern.
763 :     my $realPattern = "";
764 :     for my $piece (@pieces) {
765 :     # Determine the type of piece.
766 :     if ($piece eq "") {
767 :     # Empty pieces are ignored.
768 :     } elsif ($piece eq "%") {
769 :     # Here we have a multi-character wildcard. Note that it can match
770 :     # zero or more characters.
771 :     $realPattern .= ".*"
772 :     } elsif ($piece eq "_") {
773 :     # Here we have a single-character wildcard.
774 :     $realPattern .= ".";
775 :     } elsif ($piece eq "\\%" || $piece eq "\\_") {
776 :     # This is an escape sequence (which is a rare thing, actually).
777 :     $realPattern .= substr($piece, 1, 1);
778 :     } else {
779 :     # Here we have raw text.
780 :     $realPattern .= quotemeta($piece);
781 :     }
782 :     }
783 :     # Do the match.
784 :     $retVal = ($value =~ /^$realPattern$/i ? 1 : 0);
785 :     }
786 :     }
787 :     # Return the result.
788 :     return $retVal;
789 :     }
790 :    
791 :     =head3 MigrateAttributes
792 :    
793 :     C<< CustomAttributes::MigrateAttributes($fig); >>
794 :    
795 :     Migrate all the attributes data from the specified FIG instance. This is a long, slow
796 :     method used to convert the old attribute data to the new system. Only attribute
797 :     keys that are not already in the database will be loaded, and only for entity instances
798 :     current in the database. To get an accurate capture of the attributes in the given
799 :     instance, you may want to clear the database and the DBD before starting and
800 :     run L</Refresh> to populate the entities.
801 :    
802 :     =over 4
803 :    
804 :     =item fig
805 :    
806 :     A FIG object that can be used to retrieve attributes for migration purposes.
807 :    
808 :     =back
809 :    
810 :     =cut
811 :    
812 :     sub MigrateAttributes {
813 :     # Get the parameters.
814 :     my ($fig) = @_;
815 :     # Get a list of the objects to migrate. This requires connecting. Note we
816 :     # will map each entity type to a file name. The file will contain a list
817 :     # of the object's IDs so we can get to them when we're not connected to
818 :     # the database.
819 :     my $ca = CustomAttributes->new();
820 :     my %objects = map { $_ => "$FIG_Config::temp/$_.keys.tbl" } $ca->GetEntityTypes();
821 :     # Set up hash of the existing attribute keys for each entity type.
822 :     my %oldKeys = ();
823 :     # Finally, we have a hash that counts the IDs for each entity type.
824 :     my %idCounts = map { $_ => 0 } keys %objects;
825 :     # Loop through the list, creating key files to read back in.
826 :     for my $entityType (keys %objects) {
827 :     Trace("Retrieving keys for $entityType.") if T(2);
828 :     # Create the key file.
829 :     my $idFile = Open(undef, ">$objects{$entityType}");
830 :     # Loop through the keys.
831 :     my @ids = $ca->GetFlat([$entityType], "", [], "$entityType(id)");
832 :     for my $id (@ids) {
833 :     print $idFile "$id\n";
834 :     }
835 :     close $idFile;
836 :     # In addition to the key file, we must get a list of attributes already
837 :     # in the database. This avoids a circularity problem that might occur if the $fig
838 :     # object is retrieving from the custom attributes database already.
839 :     my %fields = $ca->GetSecondaryFields($entityType);
840 :     $oldKeys{$entityType} = \%fields;
841 :     # Finally, we have the ID count.
842 :     $idCounts{$entityType} = scalar @ids;
843 :     }
844 :     # Release the custom attributes database so we can add attributes.
845 :     undef $ca;
846 :     # Loop through the objects.
847 :     for my $entityType (keys %objects) {
848 :     # Get a hash of all the attributes already in this database. These are
849 :     # left untouched.
850 :     my $myOldKeys = $oldKeys{$entityType};
851 :     # Create a hash to control the load file names for each attribute key we find.
852 :     my %keyHash = ();
853 :     # Set up some counters so we can trace our progress.
854 :     my ($totalIDs, $processedIDs, $keyCount, $valueCount) = ($idCounts{$entityType}, 0, 0, 0);
855 :     # Open this object's ID file.
856 :     Trace("Migrating data for $entityType. $totalIDs found.") if T(3);
857 :     my $keysIn = Open(undef, "<$objects{$entityType}");
858 :     while (my $id = <$keysIn>) {
859 :     # Remove the EOL characters.
860 :     chomp $id;
861 :     # Get this object's attributes.
862 :     my @allData = $fig->get_attributes($id);
863 :     Trace(scalar(@allData) . " attribute values found for $entityType($id).") if T(4);
864 :     # Loop through the attribute values one at a time.
865 :     for my $dataTuple (@allData) {
866 :     # Get the key, value, and URL. We ignore the first element because that's the
867 :     # object ID, and we already know the object ID.
868 :     my (undef, $key, $value, $url) = @{$dataTuple};
869 : parrello 1.7 # Remove the buggy "1" for $url.
870 :     if ($url eq "1") {
871 :     $url = undef;
872 :     }
873 : parrello 1.3 # Only proceed if this is not an old key.
874 :     if (! $myOldKeys->{$key}) {
875 :     # See if we've run into this key before.
876 :     if (! exists $keyHash{$key}) {
877 :     # Here we need to create the attribute key in the database.
878 :     StoreAttributeKey($entityType, $key, 'text',
879 :     "Key migrated automatically from the FIG system. " .
880 :     "Please replace these notes as soon as possible " .
881 :     "with useful text."
882 :     );
883 :     # Compute the attribute's load file name and open it for output.
884 :     my $fileName = "$FIG_Config::temp/$entityType.$key.load.tbl";
885 :     my $fh = Open(undef, ">$fileName");
886 :     # Store the file name and handle.
887 :     $keyHash{$key} = {h => $fh, name => $fileName};
888 :     # Count this key.
889 :     $keyCount++;
890 :     }
891 :     # Smash the value and the URL together.
892 :     if (defined($url) && length($url) > 0) {
893 :     $value .= "::$url";
894 :     }
895 :     # Write the attribute value to the load file.
896 :     Tracer::PutLine($keyHash{$key}->{h}, [$id, $value]);
897 :     $valueCount++;
898 :     }
899 :     }
900 :     # Now we've finished all the attributes for this object. Count and trace it.
901 :     $processedIDs++;
902 :     if ($processedIDs % 500 == 0) {
903 :     Trace("$processedIDs of $totalIDs ${entityType}s processed.") if T(3);
904 :     Trace("$entityType has $keyCount keys and $valueCount values so far.") if T(3);
905 :     }
906 :     }
907 :     # Now we've finished all the attributes for all objects of this type.
908 :     Trace("$processedIDs ${entityType}s processed, with $keyCount keys and $valueCount values.") if T(2);
909 :     # Loop through the files, loading the keys into the database.
910 :     Trace("Connecting to database.") if T(2);
911 :     my $objectCA = CustomAttributes->new();
912 :     Trace("Loading key files.") if T(2);
913 :     for my $key (sort keys %keyHash) {
914 :     # Close the key's load file.
915 :     close $keyHash{$key}->{h};
916 :     # Reopen it for input.
917 :     my $fileName = $keyHash{$key}->{name};
918 :     my $fh = Open(undef, "<$fileName");
919 :     Trace("Loading $key from $fileName.") if T(3);
920 :     my $stats = $objectCA->LoadAttributeKey($entityType, $key, $fh, 0, 1);
921 :     Trace("Statistics for $key of $entityType:\n" . $stats->Show()) if T(3);
922 :     }
923 :     # All the keys for this entity type are now loaded.
924 :     Trace("Key files loaded for $entityType.") if T(2);
925 :     }
926 :     # All keys for all entity types are now loaded.
927 :     Trace("Migration complete.") if T(2);
928 :     }
929 :    
930 : parrello 1.4 =head3 ComputeObjectTypeFromID
931 :    
932 :     C<< my ($entityName, $id) = CustomAttributes::ComputeObjectTypeFromID($objectID); >>
933 :    
934 :     This method will compute the entity type corresponding to a specified object ID.
935 :     If the object ID begins with C<fig|>, it is presumed to be a feature ID. If it
936 :     is all digits with a single period, it is presumed to by a genome ID. Otherwise,
937 :     it must be a list reference. In this last case the first list element will be
938 :     taken as the entity type and the second will be taken as the actual ID.
939 :    
940 :     =over 4
941 :    
942 :     =item objectID
943 :    
944 :     Object ID to examine.
945 :    
946 :     =item RETURN
947 :    
948 :     Returns a 2-element list consisting of the entity type followed by the specified ID.
949 :    
950 :     =back
951 :    
952 :     =cut
953 :    
954 :     sub ComputeObjectTypeFromID {
955 :     # Get the parameters.
956 :     my ($objectID) = @_;
957 :     # Declare the return variables.
958 :     my ($entityName, $id);
959 :     # Only proceed if the object ID is defined. If it's not, we'll be returning a
960 :     # pair of undefs.
961 :     if ($objectID) {
962 :     if (ref $objectID eq 'ARRAY') {
963 :     # Here we have the new-style list reference. Pull out its pieces.
964 :     ($entityName, $id) = @{$objectID};
965 :     } else {
966 :     # Here the ID is the outgoing ID, and we need to look at its structure
967 :     # to determine the entity type.
968 :     $id = $objectID;
969 :     if ($objectID =~ /^\d+\.\d+/) {
970 :     # Digits with a single period is a genome.
971 :     $entityName = 'Genome';
972 :     } elsif ($objectID =~ /^fig\|/) {
973 :     # The "fig|" prefix indicates a feature.
974 :     $entityName = 'Feature';
975 :     } else {
976 :     # Anything else is illegal!
977 :     Confess("Invalid attribute ID specification \"$objectID\".");
978 :     }
979 :     }
980 :     }
981 :     # Return the result.
982 :     return ($entityName, $id);
983 :     }
984 :    
985 : parrello 1.3 =head2 FIG Method Replacements
986 :    
987 :     The following methods are used by B<FIG.pm> to replace the previous attribute functionality.
988 :     Some of the old functionality is no longer present. Controlled vocabulary is no longer
989 :     supported and there is no longer any searching by URL. Fortunately, neither of these
990 :     capabilities were used in the old system.
991 :    
992 : parrello 1.4 The methods here are the only ones supported by the B<RemoteCustomAttributes> object.
993 :     The idea is that these methods represent attribute manipulation allowed by all users, while
994 :     the others are only for privileged users with access to the attribute server.
995 :    
996 : parrello 1.3 In the previous implementation, an attribute had a value and a URL. In the new implementation,
997 :     there is only a value. In this implementation, each attribute has only a value. These
998 :     methods will treat the value as a list with the individual elements separated by the
999 :     value of the splitter parameter on the constructor (L</new>). The default is double
1000 :     colons C<::>.
1001 :    
1002 :     So, for example, an old-style keyword with a /value of C<essential> and a URL of
1003 :     C<http://www.sciencemag.org/cgi/content/abstract/293/5538/2266> using the default
1004 :     splitter value would be stored as
1005 :    
1006 :     essential::http://www.sciencemag.org/cgi/content/abstract/293/5538/2266
1007 :    
1008 :     The best performance is achieved by searching for a particular key for a specified
1009 :     feature or genome.
1010 :    
1011 :     =head3 GetAttributes
1012 :    
1013 :     C<< my @attributeList = $attrDB->GetAttributes($objectID, $key, @valuePatterns); >>
1014 :    
1015 :     In the database, attribute values are sectioned into pieces using a splitter
1016 :     value specified in the constructor (L</new>). This is not a requirement of
1017 :     the attribute system as a whole, merely a convenience for the purpose of
1018 :     these methods. If you are using the static method calls instead of the
1019 :     object-based calls, the splitter will always be the default value of
1020 :     double colons (C<::>). If a value has multiple sections, each section
1021 :     is matched against the correspond criterion in the I<@valuePatterns> list.
1022 :    
1023 :     This method returns a series of tuples that match the specified criteria. Each tuple
1024 :     will contain an object ID, a key, and one or more values. The parameters to this
1025 :     method therefore correspond structurally to the values expected in each tuple.
1026 :    
1027 :     my @attributeList = GetAttributes('fig|100226.1.peg.1004', 'structure%', 1, 2);
1028 :    
1029 :     would return something like
1030 :    
1031 :     ['fig}100226.1.peg.1004', 'structure', 1, 2]
1032 :     ['fig}100226.1.peg.1004', 'structure1', 1, 2]
1033 :     ['fig}100226.1.peg.1004', 'structure2', 1, 2]
1034 :     ['fig}100226.1.peg.1004', 'structureA', 1, 2]
1035 :    
1036 :     Use of C<undef> in any position acts as a wild card (all values). In addition,
1037 :     the I<$key> and I<@valuePatterns> parameters can contain SQL pattern characters: C<%>, which
1038 :     matches any sequence of characters, and C<_>, which matches any single character.
1039 :     (You can use an escape sequence C<\%> or C<\_> to match an actual percent sign or
1040 :     underscore.)
1041 :    
1042 :     In addition to values in multiple sections, a single attribute key can have multiple
1043 :     values, so even
1044 :    
1045 :     my @attributeList = GetAttributes($peg, 'virulent');
1046 :    
1047 :     which has no wildcard in the key or the object ID, may return multiple tuples.
1048 :    
1049 :     For reasons of backward compatability, we examine the structure of the object ID to
1050 :     determine the entity type. In that case the only two types allowed are C<Genome> and
1051 :     C<Feature>. An alternative method is to use a list reference, with the list consisting
1052 :     of an entity type name and the actual ID. Thus, the above example could equivalently
1053 :     be written as
1054 :    
1055 :     my @attributeList = GetAttributes([Feature => $peg], 'virulent');
1056 :    
1057 :     The list-reference approach allows us to add attributes to other entity types in
1058 :     the future. Doing so, however, will require modifying the L</Refresh> method and
1059 :     updated the database design XML.
1060 :    
1061 :     The list-reference approach also allows for a more fault-tolerant approach to
1062 :     getting all objects with a particular attribute.
1063 :    
1064 :     my @attributeList = GetAttributes([Feature => undef], 'virulent');
1065 :    
1066 :     will only return feature attributes, while
1067 :    
1068 :     my @attributeList = GetAttributes(undef, 'virulent');
1069 :    
1070 :     could at some point in the future get you attributes for genomes or even subsystems
1071 :     as well as features.
1072 :    
1073 :     =over 4
1074 :    
1075 :     =item objectID
1076 :    
1077 :     ID of the genome or feature whose attributes are desired. In general, an ID that
1078 :     starts with C<fig|> is treated as a feature ID, and an ID that is all digits with a
1079 :     single period is treated as a genome ID. For other entity types, use a list reference; in
1080 :     this case the first list element is the entity type and the second is the ID. A value of
1081 : parrello 1.4 C<undef> or an empty string here will match all objects.
1082 : parrello 1.3
1083 :     =item key
1084 :    
1085 :     Attribute key name. Since attributes are stored as fields in the database with a
1086 :     field name equal to the key name, it is very fast to find a list of all the
1087 :     matching keys. Each key's values require a separate query, however, which may
1088 :     be a performance problem if the pattern matches a lot of keys. Wild cards are
1089 : parrello 1.4 acceptable here, and a value of C<undef> or an empty string will match all
1090 :     attribute keys.
1091 : parrello 1.3
1092 :     =item valuePatterns
1093 :    
1094 :     List of the desired attribute values, section by section. If C<undef>
1095 : parrello 1.4 or an empty string is specified, all values in that section will match.
1096 : parrello 1.3
1097 :     =item RETURN
1098 :    
1099 :     Returns a list of tuples. The first element in the tuple is an object ID, the
1100 :     second is an attribute key, and the remaining elements are the sections of
1101 :     the attribute value. All of the tuples will match the criteria set forth in
1102 :     the parameter list.
1103 :    
1104 :     =back
1105 :    
1106 :     =cut
1107 :    
1108 :     sub GetAttributes {
1109 : parrello 1.4 # Get the parameters.
1110 :     my ($self, $objectID, $key, @valuePatterns) = @_;
1111 : parrello 1.3 # Declare the return variable.
1112 :     my @retVal = ();
1113 :     # Determine the entity types for our search.
1114 :     my @objects = ();
1115 :     my ($actualObjectID, $computedType);
1116 : parrello 1.4 if (! $objectID) {
1117 : parrello 1.3 push @objects, $self->GetEntityTypes();
1118 :     } else {
1119 :     ($computedType, $actualObjectID) = ComputeObjectTypeFromID($objectID);
1120 :     push @objects, $computedType;
1121 :     }
1122 :     # Loop through the entity types.
1123 :     for my $entityType (@objects) {
1124 :     # Now we need to find all the matching keys. The keys are actually stored in
1125 :     # our database object, so this process is fast. Note that our
1126 :     # MatchSqlPattern method
1127 :     my %secondaries = $self->GetSecondaryFields($entityType);
1128 :     my @fieldList = grep { MatchSqlPattern($_, $key) } keys %secondaries;
1129 : parrello 1.7 # Now we figure out whether or not we need to filter by object. We will always
1130 :     # filter by key to a limited extent, so if we're filtering by object we need an
1131 :     # AND to join the object ID filter with the key filter.
1132 : parrello 1.3 my $filter = "";
1133 :     my @params = ();
1134 : parrello 1.5 if (defined($actualObjectID)) {
1135 : parrello 1.3 # Here the caller wants to filter on object ID.
1136 : parrello 1.7 $filter = "$entityType(id) = ? AND ";
1137 : parrello 1.3 push @params, $actualObjectID;
1138 :     }
1139 :     # It's time to begin making queries. We process one attribute key at a time, because
1140 :     # each attribute is actually a different field in the database. We know here that
1141 :     # all the keys we've collected are for the correct entity because we got them from
1142 :     # the DBD. That's a good thing, because an invalid key name will cause an SQL error.
1143 :     for my $key (@fieldList) {
1144 :     # Get all of the attribute values for this key.
1145 : parrello 1.7 my @dataRows = $self->GetAll([$entityType], "$filter$entityType($key) IS NOT NULL",
1146 :     \@params, ["$entityType(id)", "$entityType($key)"]);
1147 : parrello 1.3 # Process each value separately. We need to verify the values and reformat the
1148 :     # tuples. Note that GetAll will give us one row per matching object ID,
1149 :     # with the ID first followed by a list of the data values. This is very
1150 :     # different from the structure we'll be returning, which has one row
1151 :     # per value.
1152 :     for my $dataRow (@dataRows) {
1153 :     # Get the object ID and the list of values.
1154 :     my ($rowObjectID, @dataValues) = @{$dataRow};
1155 :     # Loop through the values. There will be one result row per attribute value.
1156 :     for my $dataValue (@dataValues) {
1157 :     # Separate this value into sections.
1158 :     my @sections = split("::", $dataValue);
1159 :     # Loop through the value patterns, looking for a mismatch. Note that
1160 :     # since we're working through parallel arrays, we are using an index
1161 :     # loop. As soon as a match fails we stop checking. This means that
1162 :     # if the value pattern list is longer than the number of sections,
1163 :     # we will fail as soon as we run out of sections.
1164 :     my $match = 1;
1165 :     for (my $i = 0; $i <= $#valuePatterns && $match; $i++) {
1166 :     $match = MatchSqlPattern($sections[$i], $valuePatterns[$i]);
1167 :     }
1168 :     # If we match, we save this value in the output list.
1169 :     if ($match) {
1170 :     push @retVal, [$rowObjectID, $key, @sections];
1171 :     }
1172 :     }
1173 :     # Here we've processed all the attribute values for the current object ID.
1174 :     }
1175 :     # Here we've processed all the rows returned by GetAll. In general, there will
1176 :     # be one row per object ID.
1177 :     }
1178 :     # Here we've processed all the matching attribute keys.
1179 :     }
1180 :     # Here we've processed all the entity types. That means @retVal has all the matching
1181 :     # results.
1182 :     return @retVal;
1183 :     }
1184 :    
1185 :     =head3 AddAttribute
1186 :    
1187 :     C<< $attrDB->AddAttribute($objectID, $key, @values); >>
1188 :    
1189 :     Add an attribute key/value pair to an object. This method cannot add a new key, merely
1190 :     add a value to an existing key. Use L</StoreAttributeKey> to create a new key.
1191 :    
1192 :     =over 4
1193 :    
1194 :     =item objectID
1195 :    
1196 :     ID of the genome or feature to which the attribute is to be added. In general, an ID that
1197 :     starts with C<fig|> is treated as a feature ID, and an ID that is all digits and periods
1198 :     is treated as a genome ID. For IDs of other types, this parameter should be a reference
1199 :     to a 2-tuple consisting of the entity type name followed by the object ID.
1200 :    
1201 :     =item key
1202 :    
1203 :     Attribute key name. This corresponds to the name of a field in the database.
1204 :    
1205 :     =item values
1206 :    
1207 :     One or more values to be associated with the key. The values are joined together with
1208 :     the splitter value before being stored as field values. This enables L</GetAttributes>
1209 :     to split them apart during retrieval. The splitter value defaults to double colons C<::>.
1210 :    
1211 :     =back
1212 :    
1213 :     =cut
1214 :    
1215 :     sub AddAttribute {
1216 :     # Get the parameters.
1217 : parrello 1.4 my ($self, $objectID, $key, @values) = @_;
1218 : parrello 1.3 # Don't allow undefs.
1219 :     if (! defined($objectID)) {
1220 :     Confess("No object ID specified for AddAttribute call.");
1221 :     } elsif (! defined($key)) {
1222 :     Confess("No attribute key specified for AddAttribute call.");
1223 :     } elsif (! @values) {
1224 :     Confess("No values specified in AddAttribute call for key $key.");
1225 :     } else {
1226 :     # Okay, now we have some reason to believe we can do this. Start by
1227 :     # computing the object type and ID.
1228 :     my ($entityName, $id) = ComputeObjectTypeFromID($objectID);
1229 :     # Form the values into a scalar.
1230 :     my $valueString = join($self->{splitter}, @values);
1231 :     # Insert the value.
1232 :     $self->InsertValue($id, "$entityName($key)", $valueString);
1233 :     }
1234 :     # Return a one. We do this for backward compatability.
1235 :     return 1;
1236 :     }
1237 :    
1238 :     =head3 DeleteAttribute
1239 :    
1240 :     C<< $attrDB->DeleteAttribute($objectID, $key, @values); >>
1241 :    
1242 :     Delete the specified attribute key/value combination from the database.
1243 :    
1244 :     The first form will connect to the database and release it. The second form
1245 :     uses the database connection contained in the object.
1246 :    
1247 :     =over 4
1248 :    
1249 :     =item objectID
1250 :    
1251 :     ID of the genome or feature to which the attribute is to be added. In general, an ID that
1252 :     starts with C<fig|> is treated as a feature ID, and an ID that is all digits and periods
1253 :     is treated as a genome ID. For IDs of other types, this parameter should be a reference
1254 :     to a 2-tuple consisting of the entity type name followed by the object ID.
1255 :    
1256 :     =item key
1257 :    
1258 :     Attribute key name. This corresponds to the name of a field in the database.
1259 :    
1260 :     =item values
1261 :    
1262 :     One or more values to be associated with the key.
1263 :    
1264 :     =back
1265 :    
1266 :     =cut
1267 :    
1268 :     sub DeleteAttribute {
1269 :     # Get the parameters.
1270 : parrello 1.4 my ($self, $objectID, $key, @values) = @_;
1271 : parrello 1.3 # Don't allow undefs.
1272 :     if (! defined($objectID)) {
1273 :     Confess("No object ID specified for DeleteAttribute call.");
1274 :     } elsif (! defined($key)) {
1275 :     Confess("No attribute key specified for DeleteAttribute call.");
1276 :     } elsif (! @values) {
1277 :     Confess("No values specified in DeleteAttribute call for key $key.");
1278 :     } else {
1279 :     # Now compute the object type and ID.
1280 :     my ($entityName, $id) = ComputeObjectTypeFromID($objectID);
1281 :     # Form the values into a scalar.
1282 :     my $valueString = join($self->{splitter}, @values);
1283 :     # Delete the value.
1284 :     $self->DeleteValue($entityName, $id, $key, $valueString);
1285 :     }
1286 :     # Return a one. This is for backward compatability.
1287 :     return 1;
1288 :     }
1289 :    
1290 :     =head3 ChangeAttribute
1291 :    
1292 :     C<< $attrDB->ChangeAttribute($objectID, $key, \@oldValues, \@newValues); >>
1293 :    
1294 :     Change the value of an attribute key/value pair for an object.
1295 :    
1296 :     =over 4
1297 :    
1298 :     =item objectID
1299 :    
1300 :     ID of the genome or feature to which the attribute is to be changed. In general, an ID that
1301 :     starts with C<fig|> is treated as a feature ID, and an ID that is all digits and periods
1302 :     is treated as a genome ID. For IDs of other types, this parameter should be a reference
1303 :     to a 2-tuple consisting of the entity type name followed by the object ID.
1304 :    
1305 :     =item key
1306 :    
1307 :     Attribute key name. This corresponds to the name of a field in the database.
1308 :    
1309 :     =item oldValues
1310 :    
1311 :     One or more values identifying the key/value pair to change.
1312 :    
1313 :     =item newValues
1314 :    
1315 :     One or more values to be put in place of the old values.
1316 :    
1317 :     =back
1318 :    
1319 :     =cut
1320 :    
1321 :     sub ChangeAttribute {
1322 :     # Get the parameters.
1323 : parrello 1.4 my ($self, $objectID, $key, $oldValues, $newValues) = @_;
1324 : parrello 1.3 # Don't allow undefs.
1325 :     if (! defined($objectID)) {
1326 :     Confess("No object ID specified for ChangeAttribute call.");
1327 :     } elsif (! defined($key)) {
1328 :     Confess("No attribute key specified for ChangeAttribute call.");
1329 :     } elsif (! defined($oldValues) || ref $oldValues ne 'ARRAY') {
1330 :     Confess("No old values specified in ChangeAttribute call for key $key.");
1331 :     } elsif (! defined($newValues) || ref $newValues ne 'ARRAY') {
1332 :     Confess("No new values specified in ChangeAttribute call for key $key.");
1333 :     } else {
1334 :     # Okay, now we do the change as a delete/add.
1335 :     $self->DeleteAttribute($objectID, $key, @{$oldValues});
1336 :     $self->AddAttribute($objectID, $key, @{$newValues});
1337 :     }
1338 :     # Return a one. We do this for backward compatability.
1339 :     return 1;
1340 :     }
1341 :    
1342 : parrello 1.7 =head3 EraseAttribute
1343 :    
1344 :     C<< $attrDB->EraseAttribute($entityName, $key); >>
1345 :    
1346 :     Erase all values for the specified attribute key. This does not remove the
1347 :     key from the database; it merely removes all the values.
1348 :    
1349 :     =over 4
1350 :    
1351 :     =item entityName
1352 :    
1353 :     Name of the entity to which the key belongs. If undefined, all entities will be
1354 :     examined for the desired key.
1355 :    
1356 :     =item key
1357 :    
1358 :     Key to erase.
1359 :    
1360 :     =back
1361 :    
1362 :     =cut
1363 :    
1364 :     sub EraseAttribute {
1365 :     # Get the parameters.
1366 :     my ($self, $entityName, $key) = @_;
1367 :     # Determine the relevant entity types.
1368 :     my @objects = ();
1369 :     if (! $entityName) {
1370 :     push @objects, $self->GetEntityTypes();
1371 :     } else {
1372 :     push @objects, $entityName;
1373 :     }
1374 :     # Loop through the entity types.
1375 :     for my $entityType (@objects) {
1376 :     # Now check for this key in this entity.
1377 :     my %secondaries = $self->GetSecondaryFields($entityType);
1378 :     if (exists $secondaries{$key}) {
1379 :     # We found it, so delete all the values of the key.
1380 :     $self->DeleteValue($entityName, undef, $key);
1381 :     }
1382 :     }
1383 :     # Return a 1, for backward compatability.
1384 :     return 1;
1385 :     }
1386 :    
1387 : parrello 1.1 1;

MCS Webmaster
ViewVC Help
Powered by ViewVC 1.0.3