[Bio] / Sprout / CustomAttributes.pm Repository:
ViewVC logotype

Annotation of /Sprout/CustomAttributes.pm

Parent Directory Parent Directory | Revision Log Revision Log


Revision 1.7 - (view) (download) (as text)

1 : parrello 1.1 #!/usr/bin/perl -w
2 :    
3 :     package CustomAttributes;
4 :    
5 :     require Exporter;
6 :     use ERDB;
7 : parrello 1.4 @ISA = qw(ERDB);
8 : parrello 1.1 use strict;
9 :     use Tracer;
10 :     use ERDBLoad;
11 :    
12 :     =head1 Custom SEED Attribute Manager
13 :    
14 :     =head2 Introduction
15 :    
16 :     The Custom SEED Attributes Manager allows the user to upload and retrieve
17 :     custom data for SEED objects. It uses the B<ERDB> database system to
18 :     store the attributes, which are implemented as multi-valued fields
19 :     of ERDB entities.
20 :    
21 :     The full suite of ERDB retrieval capabilities is provided. In addition,
22 :     custom methods are provided specific to this application. To get all
23 : parrello 1.6 the values of the attribute C<essential> in a specified B<Feature>, you
24 : parrello 1.1 would code
25 :    
26 : parrello 1.6 my @values = $attrDB->GetAttributes([Feature => $fid], 'essential');
27 : parrello 1.1
28 : parrello 1.2 where I<$fid> contains the ID of the desired feature. Each attribute has
29 :     an alternate index to allow searching for attributes by value.
30 : parrello 1.1
31 :     New attributes are introduced by updating the database definition at
32 :     run-time. Attribute values are stored by uploading data from files.
33 :     A web interface is provided for both these activities.
34 :    
35 :     =head2 FIG_Config Parameters
36 :    
37 :     The following configuration parameters are used to manage custom attributes.
38 :    
39 :     =over 4
40 :    
41 :     =item attrDbms
42 :    
43 :     Type of database manager used: C<mysql> for MySQL or C<pg> for PostGres.
44 :    
45 :     =item attrDbName
46 :    
47 :     Name of the attribute database.
48 :    
49 :     =item attrHost
50 :    
51 :     Name of the host server for the database. If omitted, the current host
52 :     is used.
53 :    
54 :     =item attrUser
55 :    
56 :     User name for logging in to the database.
57 :    
58 :     =item attrPass
59 :    
60 :     Password for logging in to the database.
61 :    
62 :     =item attrPort
63 :    
64 :     TCP/IP port for accessing the database.
65 :    
66 :     =item attrSock
67 :    
68 :     Socket name used to access the database. If omitted, the default socket
69 :     will be used.
70 :    
71 :     =item attrDBD
72 :    
73 :     Fully-qualified file name for the database definition XML file. This file
74 :     functions as data to the attribute management process, so if the data is
75 :     moved, this file must go with it.
76 :    
77 :     =back
78 :    
79 : parrello 1.6 The DBD file is critical, and must have reasonable contents before we can
80 :     begin using the system. In the old system, attributes were only provided
81 :     for Genomes and Features, so the initial XML file was the following.
82 :    
83 :     <Database>
84 :     <Title>SEED Custom Attribute Database</Title>
85 :     <Entities>
86 :     <Entity name="Feature" keyType="id-string">
87 :     <Notes>A [i]feature[/i] is a part of the genome
88 :     that is of special interest. Features may be spread
89 :     across multiple contigs of a genome, but never across
90 :     more than one genome. Features can be assigned to roles
91 :     via spreadsheet cells, and are the targets of
92 :     annotation.</Notes>
93 :     </Entity>
94 :     <Entity name="Genome" keyType="name-string">
95 :     <Notes>A [i]genome[/i] describes a particular individual
96 :     organism's DNA.</Notes>
97 :     </Entity>
98 :     </Entities>
99 :     </Database>
100 :    
101 :     It is not necessary to put any tables into the database; however, you should
102 :     run
103 :    
104 :     AttrDBRefresh
105 :    
106 :     periodically to insure it has the correct Genomes and Features in it. When
107 :     converting from the old system, use
108 :    
109 :     AttrDBRefresh -migrate
110 :    
111 :     to initialize the database and migrate the legacy data. You should only need
112 :     to do that once.
113 :    
114 : parrello 1.3 =head2 Implementation Note
115 : parrello 1.1
116 :     The L</Refresh> method reloads the entities in the database. If new
117 :     entity types are added, that method will need to be adjusted accordingly.
118 :    
119 :     =head2 Public Methods
120 :    
121 :     =head3 new
122 :    
123 : parrello 1.3 C<< my $attrDB = CustomAttributes->new($splitter); >>
124 : parrello 1.1
125 : parrello 1.3 Construct a new CustomAttributes object. This object cannot be used to add or
126 :     delete keys because that requires modifying the database design. To do that,
127 :     you need to use the static L</StoreAttributeKey> or L</DeleteAttributeKey>
128 :     methods.
129 :    
130 :     =over 4
131 :    
132 :     =item splitter
133 :    
134 :     Value to be used to split attribute values into sections in the
135 :     L</Fig Replacement Methods>. The default is a double colon C<::>.
136 :     If you do not use the replacement methods, you do not need to
137 :     worry about this parameter.
138 :    
139 :     =back
140 : parrello 1.1
141 :     =cut
142 :    
143 :     sub new {
144 :     # Get the parameters.
145 : parrello 1.3 my ($class, $splitter) = @_;
146 : parrello 1.1 # Connect to the database.
147 :     my $dbh = DBKernel->new($FIG_Config::attrDbms, $FIG_Config::attrDbName,
148 :     $FIG_Config::attrUser, $FIG_Config::attrPass,
149 :     $FIG_Config::attrPort, $FIG_Config::attrHost,
150 :     $FIG_Config::attrSock);
151 :     # Create the ERDB object.
152 :     my $xmlFileName = $FIG_Config::attrDBD;
153 :     my $retVal = ERDB::new($class, $dbh, $xmlFileName);
154 : parrello 1.3 # Store the splitter value.
155 :     $retVal->{splitter} = (defined($splitter) ? $splitter : '::');
156 : parrello 1.1 # Return the result.
157 :     return $retVal;
158 :     }
159 :    
160 : parrello 1.3 =head3 StoreAttributeKey
161 : parrello 1.1
162 : parrello 1.3 C<< my $attrDB = CustomAttributes::StoreAttributeKey($entityName, $attributeName, $type, $notes); >>
163 : parrello 1.1
164 :     Create or update an attribute for the database. This method will update the database definition
165 :     XML, but it will not create the table. It will connect to the database so that the caller
166 :     can upload the attribute values.
167 :    
168 :     =over 4
169 :    
170 :     =item entityName
171 :    
172 :     Name of the entity containing the attribute. The entity must exist.
173 :    
174 :     =item attributeName
175 :    
176 :     Name of the attribute. It must be a valid ERDB field name, consisting entirely of
177 :     letters, digits, and hyphens, with a letter at the beginning. If it does not
178 :     exist already, it will be created.
179 :    
180 :     =item type
181 :    
182 :     Data type of the attribute. This must be a valid ERDB data type name.
183 :    
184 :     =item notes
185 :    
186 :     Descriptive notes about the attribute. It is presumed to be raw text, not HTML.
187 :    
188 :     =item RETURN
189 :    
190 :     Returns a Custom Attribute Database object if successful. If unsuccessful, an
191 :     error will be thrown.
192 :    
193 :     =back
194 :    
195 :     =cut
196 :    
197 : parrello 1.3 sub StoreAttributeKey {
198 : parrello 1.1 # Get the parameters.
199 :     my ($entityName, $attributeName, $type, $notes) = @_;
200 :     # Get the data type hash.
201 :     my %types = ERDB::GetDataTypes();
202 :     # Validate the initial input values.
203 :     if (! ERDB::ValidateFieldName($attributeName)) {
204 :     Confess("Invalid attribute name \"$attributeName\" specified.");
205 :     } elsif (! $notes || length($notes) < 25) {
206 :     Confess("Missing or incomplete description for $attributeName.");
207 :     } elsif (! exists $types{$type}) {
208 :     Confess("Invalid data type \"$type\" for $attributeName.");
209 :     }
210 :     # Our next step is to read in the XML for the database defintion. We
211 :     # need to verify that the named entity exists.
212 :     my $metadata = ERDB::ReadMetaXML($FIG_Config::attrDBD);
213 :     my $entityHash = $metadata->{Entities};
214 :     if (! exists $entityHash->{$entityName}) {
215 :     Confess("Entity $entityName not found.");
216 :     } else {
217 : parrello 1.2 # Okay, we're ready to begin. Get the entity hash and the field hash.
218 :     my $entityData = $entityHash->{$entityName};
219 : parrello 1.1 my $fieldHash = ERDB::GetEntityFieldHash($metadata, $entityName);
220 :     # Compute the attribute's relation name.
221 : parrello 1.3 my $relName = join("", $entityName, map { ucfirst $_ } split(/-|_/, $attributeName));
222 : parrello 1.1 # Store the attribute's field data. Note the use of the "content" hash for
223 :     # the notes. This is how the XML writer knows Notes is a text tag instead of
224 :     # an attribute.
225 :     $fieldHash->{$attributeName} = { type => $type, relation => $relName,
226 :     Notes => { content => $notes } };
227 : parrello 1.2 # Insure we have an index for this attribute.
228 :     my $index = ERDB::FindIndexForEntity($metadata, $entityName, $attributeName);
229 :     if (! defined($index)) {
230 :     push @{$entityData->{Indexes}}, { IndexFields => [ { name => $attributeName, order => 'ascending' } ],
231 :     Notes => "Alternate index provided for access by $attributeName." };
232 :     }
233 : parrello 1.1 # Write the XML back out.
234 :     ERDB::WriteMetaXML($metadata, $FIG_Config::attrDBD);
235 :     }
236 :     # Open a database with the new XML.
237 :     my $retVal = CustomAttributes->new();
238 :     return $retVal;
239 :     }
240 :    
241 :     =head3 Refresh
242 :    
243 : parrello 1.3 C<< $attrDB->Refresh($fig); >>
244 : parrello 1.1
245 :     Refresh the primary entity tables from the FIG data store. This method basically
246 :     drops and reloads the main tables of the custom attributes database.
247 :    
248 : parrello 1.3 =over 4
249 :    
250 :     =item fig
251 :    
252 :     FIG-like object that can be used to find genomes and features.
253 :    
254 :     =back
255 :    
256 : parrello 1.1 =cut
257 :    
258 :     sub Refresh {
259 :     # Get the parameters.
260 : parrello 1.3 my ($self, $fig) = @_;
261 : parrello 1.1 # Create load objects for the genomes and the features.
262 :     my $loadGenome = ERDBLoad->new($self, 'Genome', $FIG_Config::temp);
263 :     my $loadFeature = ERDBLoad->new($self, 'Feature', $FIG_Config::temp);
264 :     # Get the genome list.
265 :     my @genomes = $fig->genomes();
266 :     # Loop through the genomes.
267 :     for my $genomeID (@genomes) {
268 :     # Put this genome in the genome table.
269 :     $loadGenome->Put($genomeID);
270 :     Trace("Processing Genome $genomeID") if T(3);
271 :     # Put its features into the feature table. Note we have to use a hash to
272 :     # remove duplicates.
273 :     my %featureList = map { $_ => 1 } $fig->all_features($genomeID);
274 :     for my $fid (keys %featureList) {
275 :     $loadFeature->Put($fid);
276 :     }
277 :     }
278 :     # Get a variable for holding statistics objects.
279 :     my $stats;
280 :     # Finish the genome load.
281 :     Trace("Loading Genome relation.") if T(2);
282 :     $stats = $loadGenome->FinishAndLoad();
283 :     Trace("Genome table load statistics:\n" . $stats->Show()) if T(3);
284 :     # Finish the feature load.
285 :     Trace("Loading Feature relation.") if T(2);
286 :     $stats = $loadFeature->FinishAndLoad();
287 :     Trace("Feature table load statistics:\n" . $stats->Show()) if T(3);
288 :     }
289 :    
290 : parrello 1.3 =head3 LoadAttributeKey
291 : parrello 1.1
292 : parrello 1.3 C<< my $stats = $attrDB->LoadAttributeKey($entityName, $fieldName, $fh, $keyCol, $dataCol); >>
293 : parrello 1.1
294 :     Load the specified attribute from the specified file. The file should be a
295 :     tab-delimited file with internal tab and new-line characters escaped. This is
296 :     the typical TBL-style file used by most FIG applications. One of the columns
297 :     in the input file must contain the appropriate key value and the other the
298 :     corresponding attribute value.
299 :    
300 :     =over 4
301 :    
302 :     =item entityName
303 :    
304 :     Name of the entity containing the attribute.
305 :    
306 :     =item fieldName
307 :    
308 :     Name of the actual attribute.
309 :    
310 :     =item fh
311 :    
312 :     Open file handle for the input file.
313 :    
314 :     =item keyCol
315 :    
316 :     Index (0-based) of the column containing the key field. The key field should
317 :     contain the ID of an instance of the named entity.
318 :    
319 :     =item dataCol
320 :    
321 :     Index (0-based) of the column containing the data value field.
322 :    
323 :     =item RETURN
324 :    
325 :     Returns a statistics object for the load process.
326 :    
327 :     =back
328 :    
329 :     =cut
330 :    
331 : parrello 1.3 sub LoadAttributeKey {
332 : parrello 1.1 # Get the parameters.
333 :     my ($self, $entityName, $fieldName, $fh, $keyCol, $dataCol) = @_;
334 :     # Create the return variable.
335 :     my $retVal;
336 :     # Insure the entity exists.
337 :     my $found = grep { $_ eq $entityName } $self->GetEntityTypes();
338 :     if (! $found) {
339 :     Confess("Entity \"$entityName\" not found in database.");
340 :     } else {
341 :     # Get the field structure for the named entity.
342 :     my $fieldHash = $self->GetFieldTable($entityName);
343 :     # Verify that the attribute exists.
344 :     if (! exists $fieldHash->{$fieldName}) {
345 : parrello 1.3 Confess("Attribute key \"$fieldName\" does not exist in entity $entityName.");
346 : parrello 1.1 } else {
347 :     # Create a loader for the specified attribute. We need the
348 :     # relation name first.
349 :     my $relName = $fieldHash->{$fieldName}->{relation};
350 :     my $loadAttribute = ERDBLoad->new($self, $relName, $FIG_Config::temp);
351 :     # Loop through the input file.
352 :     while (! eof $fh) {
353 :     # Get the next line of the file.
354 :     my @fields = Tracer::GetLine($fh);
355 :     $loadAttribute->Add("lineIn");
356 :     # Now we need to validate the line.
357 :     if ($#fields < $dataCol) {
358 :     $loadAttribute->Add("shortLine");
359 :     } elsif (! $self->Exists($entityName, $fields[$keyCol])) {
360 :     $loadAttribute->Add("badKey");
361 :     } else {
362 :     # It's valid,so send it to the loader.
363 :     $loadAttribute->Put($fields[$keyCol], $fields[$dataCol]);
364 :     $loadAttribute->Add("lineUsed");
365 :     }
366 :     }
367 :     # Finish the load.
368 :     $retVal = $loadAttribute->FinishAndLoad();
369 :     }
370 :     }
371 :     # Return the statistics.
372 :     return $retVal;
373 :     }
374 :    
375 :    
376 : parrello 1.3 =head3 DeleteAttributeKey
377 :    
378 :     C<< CustomAttributes::DeleteAttributeKey($entityName, $attributeName); >>
379 : parrello 1.1
380 :     Delete an attribute from the custom attributes database.
381 :    
382 :     =over 4
383 :    
384 :     =item entityName
385 :    
386 :     Name of the entity possessing the attribute.
387 :    
388 :     =item attributeName
389 :    
390 :     Name of the attribute to delete.
391 :    
392 :     =back
393 :    
394 :     =cut
395 :    
396 : parrello 1.3 sub DeleteAttributeKey {
397 : parrello 1.1 # Get the parameters.
398 :     my ($entityName, $attributeName) = @_;
399 :     # Read in the XML for the database defintion. We need to verify that
400 :     # the named entity exists and it has the named attribute.
401 :     my $metadata = ERDB::ReadMetaXML($FIG_Config::attrDBD);
402 :     my $entityHash = $metadata->{Entities};
403 :     if (! exists $entityHash->{$entityName}) {
404 :     Confess("Entity \"$entityName\" not found.");
405 :     } else {
406 :     # Get the field hash.
407 :     my $fieldHash = ERDB::GetEntityFieldHash($metadata, $entityName);
408 :     if (! exists $fieldHash->{$attributeName}) {
409 : parrello 1.3 Confess("Attribute key \"$attributeName\" not found in entity $entityName.");
410 : parrello 1.1 } else {
411 :     # Get the attribute's relation name.
412 :     my $relName = $fieldHash->{$attributeName}->{relation};
413 : parrello 1.2 # Check for an index.
414 :     my $indexIdx = ERDB::FindIndexForEntity($metadata, $entityName, $attributeName);
415 :     if (defined($indexIdx)) {
416 :     Trace("Index for $attributeName found at position $indexIdx for $entityName.") if T(3);
417 :     delete $entityHash->{$entityName}->{Indexes}->[$indexIdx];
418 :     }
419 : parrello 1.1 # Delete the attribute from the field hash.
420 :     Trace("Deleting attribute $attributeName from $entityName.") if T(3);
421 :     delete $fieldHash->{$attributeName};
422 :     # Write the XML back out.
423 :     ERDB::WriteMetaXML($metadata, $FIG_Config::attrDBD);
424 :     # Insure the relation does not exist in the database. This requires connecting
425 :     # since we may have to do a table drop.
426 :     my $attrDB = CustomAttributes->new();
427 : parrello 1.2 Trace("Dropping table $relName.") if T(3);
428 : parrello 1.1 $attrDB->DropRelation($relName);
429 :     }
430 :     }
431 :     }
432 :    
433 :     =head3 ControlForm
434 :    
435 :     C<< my $formHtml = $attrDB->ControlForm($cgi, $name); >>
436 :    
437 :     Return a form that can be used to control the creation and modification of
438 :     attributes.
439 :    
440 :     =over 4
441 :    
442 :     =item cgi
443 :    
444 :     CGI query object used to create HTML.
445 :    
446 :     =item name
447 :    
448 :     Name to give to the form. This should be unique for the web page.
449 :    
450 :     =item RETURN
451 :    
452 :     Returns the HTML for a form that submits instructions to the C<Attributes.cgi> script
453 :     for loading, creating, or deleting an attribute.
454 :    
455 :     =back
456 :    
457 :     =cut
458 :    
459 :     sub ControlForm {
460 :     # Get the parameters.
461 :     my ($self, $cgi, $name) = @_;
462 :     # Declare the return list.
463 :     my @retVal = ();
464 :     # Start the form. We use multipart to support the upload control.
465 :     push @retVal, $cgi->start_multipart_form(-name => $name);
466 :     # We'll put the controls in a table. Nothing else ever seems to look nice.
467 :     push @retVal, $cgi->start_table({ border => 2, cellpadding => 2 });
468 :     # The first row is for selecting the field name.
469 :     push @retVal, $cgi->Tr($cgi->th("Select a Field"),
470 :     $cgi->td($self->FieldMenu($cgi, 10, 'fieldName', 1,
471 :     "document.$name.notes.value",
472 :     "document.$name.dataType.value")));
473 :     # Now we set up a dropdown for the data types. The values will be the
474 :     # data type names, and the labels will be the descriptions.
475 :     my %types = ERDB::GetDataTypes();
476 :     my %labelMap = map { $_ => $types{$_}->{notes} } keys %types;
477 :     my $typeMenu = $cgi->popup_menu(-name => 'dataType',
478 :     -values => [sort keys %types],
479 :     -labels => \%labelMap);
480 :     push @retVal, $cgi->Tr($cgi->th("Data type"),
481 :     $cgi->td($typeMenu));
482 :     # The next row is for the notes.
483 :     push @retVal, $cgi->Tr($cgi->th("Description"),
484 :     $cgi->td($cgi->textarea(-name => 'notes',
485 :     -rows => 6,
486 :     -columns => 80))
487 :     );
488 :     # Allow the user to specify a new field name. This is required if the
489 :     # user has selected one of the "(new)" markers.
490 :     push @retVal, $cgi->Tr($cgi->th("New Field Name"),
491 :     $cgi->td($cgi->textfield(-name => 'newName',
492 :     -size => 30)),
493 :     );
494 :     # If the user wants to upload new values for the field, then we have
495 :     # an upload file name and column indicators.
496 :     push @retVal, $cgi->Tr($cgi->th("Upload Values"),
497 :     $cgi->td($cgi->filefield(-name => 'newValueFile',
498 :     -size => 20) .
499 :     " Key&nbsp;" .
500 :     $cgi->textfield(-name => 'keyCol',
501 :     -size => 3,
502 :     -default => 0) .
503 :     " Value&nbsp;" .
504 :     $cgi->textfield(-name => 'valueCol',
505 :     -size => 3,
506 :     -default => 1)
507 :     ),
508 :     );
509 : parrello 1.7 # Now the three buttons: UPDATE, SHOW, and DELETE.
510 : parrello 1.1 push @retVal, $cgi->Tr($cgi->th("&nbsp;"),
511 :     $cgi->td({align => 'center'},
512 :     $cgi->submit(-name => 'Delete', -value => 'DELETE') . " " .
513 : parrello 1.7 $cgi->submit(-name => 'Store', -value => 'STORE') . " " .
514 :     $cgi->submit(-name => 'Show', -value => 'SHOW')
515 : parrello 1.1 )
516 :     );
517 :     # Close the table and the form.
518 :     push @retVal, $cgi->end_table();
519 :     push @retVal, $cgi->end_form();
520 :     # Return the assembled HTML.
521 :     return join("\n", @retVal, "");
522 :     }
523 :    
524 :     =head3 FieldMenu
525 :    
526 :     C<< my $menuHtml = $attrDB->FieldMenu($cgi, $height, $name, $newFlag, $noteControl, $typeControl); >>
527 :    
528 :     Return the HTML for a menu to select an attribute field. The menu will
529 :     be a standard SELECT/OPTION thing which is called "popup menu" in the
530 :     CGI package, but actually looks like a list. The list will contain
531 :     one selectable row per field, grouped by entity.
532 :    
533 :     =over 4
534 :    
535 :     =item cgi
536 :    
537 :     CGI query object used to generate HTML.
538 :    
539 :     =item height
540 :    
541 :     Number of lines to display in the list.
542 :    
543 :     =item name
544 :    
545 :     Name to give to the menu. This is the name under which the value will
546 :     appear when the form is submitted.
547 :    
548 :     =item newFlag (optional)
549 :    
550 :     If TRUE, then extra rows will be provided to allow the user to select
551 :     a new attribute. In other words, the user can select an existing
552 :     attribute, or can choose a C<(new)> marker to indicate a field to
553 :     be created in the parent entity.
554 :    
555 :     =item noteControl (optional)
556 :    
557 :     If specified, the name of a variable for displaying the notes attached
558 :     to the field. This must be in Javascript form ready for assignment.
559 :     So, for example, if you have a variable called C<notes> that
560 :     represents a paragraph element, you should code C<notes.innerHTML>.
561 :     If it actually represents a form field you should code C<notes.value>.
562 :     If an C<innerHTML> coding is used, the text will be HTML-escaped before
563 :     it is copied in. Specifying this parameter generates Javascript for
564 :     displaying the field description when a field is selected.
565 :    
566 :     =item typeControl (optional)
567 :    
568 :     If specified, the name of a variable for displaying the field's
569 :     data type. Data types are a much more controlled vocabulary than
570 :     notes, so there is no worry about HTML translation. Instead, the
571 :     raw value is put into the specified variable. Otherwise, the same
572 :     rules apply to this value that apply to I<$noteControl>.
573 :    
574 :     =item RETURN
575 :    
576 :     Returns the HTML to create a form field that can be used to select an
577 :     attribute from the custom attributes system.
578 :    
579 :     =back
580 :    
581 :     =cut
582 :    
583 :     sub FieldMenu {
584 :     # Get the parameters.
585 :     my ($self, $cgi, $height, $name, $newFlag, $noteControl, $typeControl) = @_;
586 :     # These next two hashes make everything happen. "entities"
587 :     # maps each entity name to the list of values to be put into its
588 :     # option group. "labels" maps each entity name to a map from values
589 :     # to labels.
590 :     my @entityNames = sort ($self->GetEntityTypes());
591 :     my %entities = map { $_ => [] } @entityNames;
592 :     my %labels = map { $_ => { }} @entityNames;
593 :     # Loop through the entities, adding the existing attributes.
594 :     for my $entity (@entityNames) {
595 :     # Get this entity's field table.
596 :     my $fieldHash = $self->GetFieldTable($entity);
597 :     # Get its field list in our local hashes.
598 :     my $fieldList = $entities{$entity};
599 :     my $labelList = $labels{$entity};
600 :     # Add the NEW fields if we want them.
601 :     if ($newFlag) {
602 :     push @{$fieldList}, $entity;
603 :     $labelList->{$entity} = "(new)";
604 :     }
605 :     # Loop through the fields in the hash. We only keep the ones with a
606 :     # secondary relation name. (In other words, the name of the relation
607 :     # in which the field appears cannot be the same as the entity name.)
608 :     for my $fieldName (sort keys %{$fieldHash}) {
609 :     if ($fieldHash->{$fieldName}->{relation} ne $entity) {
610 :     my $value = "$entity/$fieldName";
611 :     push @{$fieldList}, $value;
612 :     $labelList->{$value} = $fieldName;
613 :     }
614 :     }
615 :     }
616 :     # Now we have a hash and a list for each entity, and they correspond
617 :     # exactly to what the $cgi->optgroup function expects.
618 :     # The last step is to create the name for the onChange function. This function
619 :     # may not do anything, but we need to know the name to generate the HTML
620 :     # for the menu.
621 :     my $changeName = "${name}_setNotes";
622 :     my $retVal = $cgi->popup_menu({name => $name,
623 :     size => $height,
624 :     onChange => "$changeName(this.value)",
625 :     values => [map { $cgi->optgroup(-name => $_,
626 :     -values => $entities{$_},
627 :     -labels => $labels{$_})
628 :     } @entityNames]}
629 :     );
630 :     # Create the change function.
631 :     $retVal .= "\n<script language=\"javascript\">\n";
632 :     $retVal .= " function $changeName(fieldValue) {\n";
633 :     # The function only has a body if we have a notes control to store the description.
634 :     if ($noteControl || $typeControl) {
635 :     # Check to see if we're storing HTML or text into the note control.
636 :     my $htmlMode = ($noteControl && $noteControl =~ /innerHTML$/);
637 :     # We use a CASE statement based on the newly-selected field value. The
638 :     # field description will be stored in the JavaScript variable "myText"
639 :     # and the data type in "myType". Note the default data type is a normal
640 :     # string, but the default notes is an empty string.
641 :     $retVal .= " var myText = \"\";\n";
642 :     $retVal .= " var myType = \"string\";\n";
643 :     $retVal .= " switch (fieldValue) {\n";
644 :     # Loop through the entities.
645 :     for my $entity (@entityNames) {
646 :     # Get the entity's field hash. This has the notes in it.
647 :     my $fieldHash = $self->GetFieldTable($entity);
648 :     # Loop through the values we might see for this entity's fields.
649 :     my $fields = $entities{$entity};
650 :     for my $value (@{$fields}) {
651 :     # Only proceed if we have an existing field.
652 :     if ($value =~ m!/(.+)$!) {
653 :     # Get the field's hash element.
654 :     my $element = $fieldHash->{$1};
655 :     # Generate this case.
656 :     $retVal .= " case \"$value\" :\n";
657 :     # Here we either want to update the note display, the
658 :     # type display, or both.
659 :     if ($noteControl) {
660 :     # Here we want the notes updated.
661 :     my $notes = $element->{Notes}->{content};
662 :     # Insure it's in the proper form.
663 :     if ($htmlMode) {
664 :     $notes = ERDB::HTMLNote($notes);
665 :     }
666 :     # Escape it for use as a string literal.
667 :     $notes =~ s/\n/\\n/g;
668 :     $notes =~ s/"/\\"/g;
669 :     $retVal .= " myText = \"$notes\";\n";
670 :     }
671 :     if ($typeControl) {
672 :     # Here we want the type updated.
673 :     my $type = $element->{type};
674 :     $retVal .= " myType = \"$type\";\n";
675 :     }
676 :     # Close this case.
677 :     $retVal .= " break;\n";
678 :     }
679 :     }
680 :     }
681 :     # Close the CASE statement and make the appropriate assignments.
682 :     $retVal .= " }\n";
683 :     if ($noteControl) {
684 :     $retVal .= " $noteControl = myText;\n";
685 :     }
686 :     if ($typeControl) {
687 :     $retVal .= " $typeControl = myType;\n";
688 :     }
689 :     }
690 :     # Terminate the change function.
691 :     $retVal .= " }\n";
692 :     $retVal .= "</script>\n";
693 :     # Return the result.
694 :     return $retVal;
695 :     }
696 :    
697 : parrello 1.3 =head3 MatchSqlPattern
698 :    
699 : parrello 1.4 C<< my $matched = CustomAttributes::MatchSqlPattern($value, $pattern); >>
700 : parrello 1.3
701 :     Determine whether or not a specified value matches an SQL pattern. An SQL
702 :     pattern has two wild card characters: C<%> that matches multiple characters,
703 :     and C<_> that matches a single character. These can be escaped using a
704 :     backslash (C<\>). We pull this off by converting the SQL pattern to a
705 :     PERL regular expression. As per SQL rules, the match is case-insensitive.
706 :    
707 :     =over 4
708 :    
709 :     =item value
710 :    
711 : parrello 1.4 Value to be matched against the pattern. Note that an undefined or empty
712 :     value will not match anything.
713 : parrello 1.3
714 :     =item pattern
715 :    
716 : parrello 1.4 SQL pattern against which to match the value. An undefined or empty pattern will
717 : parrello 1.3 match everything.
718 :    
719 :     =item RETURN
720 :    
721 :     Returns TRUE if the value and pattern match, else FALSE.
722 :    
723 :     =back
724 :    
725 :     =cut
726 :    
727 :     sub MatchSqlPattern {
728 :     # Get the parameters.
729 :     my ($value, $pattern) = @_;
730 :     # Declare the return variable.
731 :     my $retVal;
732 :     # Insure we have a pattern.
733 : parrello 1.4 if (! defined($pattern) || $pattern eq "") {
734 : parrello 1.3 $retVal = 1;
735 :     } else {
736 :     # Break the pattern into pieces around the wildcard characters. Because we
737 :     # use parentheses in the split function's delimiter expression, we'll get
738 :     # list elements for the delimiters as well as the rest of the string.
739 :     my @pieces = split /([_%]|\\[_%])/, $pattern;
740 :     # Check some fast special cases.
741 :     if ($pattern eq '%') {
742 :     # A null pattern matches everything.
743 :     $retVal = 1;
744 :     } elsif (@pieces == 1) {
745 :     # No wildcards, so we have a literal comparison. Note we're case-insensitive.
746 :     $retVal = (lc($value) eq lc($pattern));
747 :     } elsif (@pieces == 2 && $pieces[1] eq '%') {
748 :     # A wildcard at the end, so we have a substring match. This is also case-insensitive.
749 :     $retVal = (lc(substr($value, 0, length($pieces[0]))) eq lc($pieces[0]));
750 :     } else {
751 :     # Okay, we have to do it the hard way. Convert each piece to a PERL pattern.
752 :     my $realPattern = "";
753 :     for my $piece (@pieces) {
754 :     # Determine the type of piece.
755 :     if ($piece eq "") {
756 :     # Empty pieces are ignored.
757 :     } elsif ($piece eq "%") {
758 :     # Here we have a multi-character wildcard. Note that it can match
759 :     # zero or more characters.
760 :     $realPattern .= ".*"
761 :     } elsif ($piece eq "_") {
762 :     # Here we have a single-character wildcard.
763 :     $realPattern .= ".";
764 :     } elsif ($piece eq "\\%" || $piece eq "\\_") {
765 :     # This is an escape sequence (which is a rare thing, actually).
766 :     $realPattern .= substr($piece, 1, 1);
767 :     } else {
768 :     # Here we have raw text.
769 :     $realPattern .= quotemeta($piece);
770 :     }
771 :     }
772 :     # Do the match.
773 :     $retVal = ($value =~ /^$realPattern$/i ? 1 : 0);
774 :     }
775 :     }
776 :     # Return the result.
777 :     return $retVal;
778 :     }
779 :    
780 :     =head3 MigrateAttributes
781 :    
782 :     C<< CustomAttributes::MigrateAttributes($fig); >>
783 :    
784 :     Migrate all the attributes data from the specified FIG instance. This is a long, slow
785 :     method used to convert the old attribute data to the new system. Only attribute
786 :     keys that are not already in the database will be loaded, and only for entity instances
787 :     current in the database. To get an accurate capture of the attributes in the given
788 :     instance, you may want to clear the database and the DBD before starting and
789 :     run L</Refresh> to populate the entities.
790 :    
791 :     =over 4
792 :    
793 :     =item fig
794 :    
795 :     A FIG object that can be used to retrieve attributes for migration purposes.
796 :    
797 :     =back
798 :    
799 :     =cut
800 :    
801 :     sub MigrateAttributes {
802 :     # Get the parameters.
803 :     my ($fig) = @_;
804 :     # Get a list of the objects to migrate. This requires connecting. Note we
805 :     # will map each entity type to a file name. The file will contain a list
806 :     # of the object's IDs so we can get to them when we're not connected to
807 :     # the database.
808 :     my $ca = CustomAttributes->new();
809 :     my %objects = map { $_ => "$FIG_Config::temp/$_.keys.tbl" } $ca->GetEntityTypes();
810 :     # Set up hash of the existing attribute keys for each entity type.
811 :     my %oldKeys = ();
812 :     # Finally, we have a hash that counts the IDs for each entity type.
813 :     my %idCounts = map { $_ => 0 } keys %objects;
814 :     # Loop through the list, creating key files to read back in.
815 :     for my $entityType (keys %objects) {
816 :     Trace("Retrieving keys for $entityType.") if T(2);
817 :     # Create the key file.
818 :     my $idFile = Open(undef, ">$objects{$entityType}");
819 :     # Loop through the keys.
820 :     my @ids = $ca->GetFlat([$entityType], "", [], "$entityType(id)");
821 :     for my $id (@ids) {
822 :     print $idFile "$id\n";
823 :     }
824 :     close $idFile;
825 :     # In addition to the key file, we must get a list of attributes already
826 :     # in the database. This avoids a circularity problem that might occur if the $fig
827 :     # object is retrieving from the custom attributes database already.
828 :     my %fields = $ca->GetSecondaryFields($entityType);
829 :     $oldKeys{$entityType} = \%fields;
830 :     # Finally, we have the ID count.
831 :     $idCounts{$entityType} = scalar @ids;
832 :     }
833 :     # Release the custom attributes database so we can add attributes.
834 :     undef $ca;
835 :     # Loop through the objects.
836 :     for my $entityType (keys %objects) {
837 :     # Get a hash of all the attributes already in this database. These are
838 :     # left untouched.
839 :     my $myOldKeys = $oldKeys{$entityType};
840 :     # Create a hash to control the load file names for each attribute key we find.
841 :     my %keyHash = ();
842 :     # Set up some counters so we can trace our progress.
843 :     my ($totalIDs, $processedIDs, $keyCount, $valueCount) = ($idCounts{$entityType}, 0, 0, 0);
844 :     # Open this object's ID file.
845 :     Trace("Migrating data for $entityType. $totalIDs found.") if T(3);
846 :     my $keysIn = Open(undef, "<$objects{$entityType}");
847 :     while (my $id = <$keysIn>) {
848 :     # Remove the EOL characters.
849 :     chomp $id;
850 :     # Get this object's attributes.
851 :     my @allData = $fig->get_attributes($id);
852 :     Trace(scalar(@allData) . " attribute values found for $entityType($id).") if T(4);
853 :     # Loop through the attribute values one at a time.
854 :     for my $dataTuple (@allData) {
855 :     # Get the key, value, and URL. We ignore the first element because that's the
856 :     # object ID, and we already know the object ID.
857 :     my (undef, $key, $value, $url) = @{$dataTuple};
858 : parrello 1.7 # Remove the buggy "1" for $url.
859 :     if ($url eq "1") {
860 :     $url = undef;
861 :     }
862 : parrello 1.3 # Only proceed if this is not an old key.
863 :     if (! $myOldKeys->{$key}) {
864 :     # See if we've run into this key before.
865 :     if (! exists $keyHash{$key}) {
866 :     # Here we need to create the attribute key in the database.
867 :     StoreAttributeKey($entityType, $key, 'text',
868 :     "Key migrated automatically from the FIG system. " .
869 :     "Please replace these notes as soon as possible " .
870 :     "with useful text."
871 :     );
872 :     # Compute the attribute's load file name and open it for output.
873 :     my $fileName = "$FIG_Config::temp/$entityType.$key.load.tbl";
874 :     my $fh = Open(undef, ">$fileName");
875 :     # Store the file name and handle.
876 :     $keyHash{$key} = {h => $fh, name => $fileName};
877 :     # Count this key.
878 :     $keyCount++;
879 :     }
880 :     # Smash the value and the URL together.
881 :     if (defined($url) && length($url) > 0) {
882 :     $value .= "::$url";
883 :     }
884 :     # Write the attribute value to the load file.
885 :     Tracer::PutLine($keyHash{$key}->{h}, [$id, $value]);
886 :     $valueCount++;
887 :     }
888 :     }
889 :     # Now we've finished all the attributes for this object. Count and trace it.
890 :     $processedIDs++;
891 :     if ($processedIDs % 500 == 0) {
892 :     Trace("$processedIDs of $totalIDs ${entityType}s processed.") if T(3);
893 :     Trace("$entityType has $keyCount keys and $valueCount values so far.") if T(3);
894 :     }
895 :     }
896 :     # Now we've finished all the attributes for all objects of this type.
897 :     Trace("$processedIDs ${entityType}s processed, with $keyCount keys and $valueCount values.") if T(2);
898 :     # Loop through the files, loading the keys into the database.
899 :     Trace("Connecting to database.") if T(2);
900 :     my $objectCA = CustomAttributes->new();
901 :     Trace("Loading key files.") if T(2);
902 :     for my $key (sort keys %keyHash) {
903 :     # Close the key's load file.
904 :     close $keyHash{$key}->{h};
905 :     # Reopen it for input.
906 :     my $fileName = $keyHash{$key}->{name};
907 :     my $fh = Open(undef, "<$fileName");
908 :     Trace("Loading $key from $fileName.") if T(3);
909 :     my $stats = $objectCA->LoadAttributeKey($entityType, $key, $fh, 0, 1);
910 :     Trace("Statistics for $key of $entityType:\n" . $stats->Show()) if T(3);
911 :     }
912 :     # All the keys for this entity type are now loaded.
913 :     Trace("Key files loaded for $entityType.") if T(2);
914 :     }
915 :     # All keys for all entity types are now loaded.
916 :     Trace("Migration complete.") if T(2);
917 :     }
918 :    
919 : parrello 1.4 =head3 ComputeObjectTypeFromID
920 :    
921 :     C<< my ($entityName, $id) = CustomAttributes::ComputeObjectTypeFromID($objectID); >>
922 :    
923 :     This method will compute the entity type corresponding to a specified object ID.
924 :     If the object ID begins with C<fig|>, it is presumed to be a feature ID. If it
925 :     is all digits with a single period, it is presumed to by a genome ID. Otherwise,
926 :     it must be a list reference. In this last case the first list element will be
927 :     taken as the entity type and the second will be taken as the actual ID.
928 :    
929 :     =over 4
930 :    
931 :     =item objectID
932 :    
933 :     Object ID to examine.
934 :    
935 :     =item RETURN
936 :    
937 :     Returns a 2-element list consisting of the entity type followed by the specified ID.
938 :    
939 :     =back
940 :    
941 :     =cut
942 :    
943 :     sub ComputeObjectTypeFromID {
944 :     # Get the parameters.
945 :     my ($objectID) = @_;
946 :     # Declare the return variables.
947 :     my ($entityName, $id);
948 :     # Only proceed if the object ID is defined. If it's not, we'll be returning a
949 :     # pair of undefs.
950 :     if ($objectID) {
951 :     if (ref $objectID eq 'ARRAY') {
952 :     # Here we have the new-style list reference. Pull out its pieces.
953 :     ($entityName, $id) = @{$objectID};
954 :     } else {
955 :     # Here the ID is the outgoing ID, and we need to look at its structure
956 :     # to determine the entity type.
957 :     $id = $objectID;
958 :     if ($objectID =~ /^\d+\.\d+/) {
959 :     # Digits with a single period is a genome.
960 :     $entityName = 'Genome';
961 :     } elsif ($objectID =~ /^fig\|/) {
962 :     # The "fig|" prefix indicates a feature.
963 :     $entityName = 'Feature';
964 :     } else {
965 :     # Anything else is illegal!
966 :     Confess("Invalid attribute ID specification \"$objectID\".");
967 :     }
968 :     }
969 :     }
970 :     # Return the result.
971 :     return ($entityName, $id);
972 :     }
973 :    
974 : parrello 1.3 =head2 FIG Method Replacements
975 :    
976 :     The following methods are used by B<FIG.pm> to replace the previous attribute functionality.
977 :     Some of the old functionality is no longer present. Controlled vocabulary is no longer
978 :     supported and there is no longer any searching by URL. Fortunately, neither of these
979 :     capabilities were used in the old system.
980 :    
981 : parrello 1.4 The methods here are the only ones supported by the B<RemoteCustomAttributes> object.
982 :     The idea is that these methods represent attribute manipulation allowed by all users, while
983 :     the others are only for privileged users with access to the attribute server.
984 :    
985 : parrello 1.3 In the previous implementation, an attribute had a value and a URL. In the new implementation,
986 :     there is only a value. In this implementation, each attribute has only a value. These
987 :     methods will treat the value as a list with the individual elements separated by the
988 :     value of the splitter parameter on the constructor (L</new>). The default is double
989 :     colons C<::>.
990 :    
991 :     So, for example, an old-style keyword with a /value of C<essential> and a URL of
992 :     C<http://www.sciencemag.org/cgi/content/abstract/293/5538/2266> using the default
993 :     splitter value would be stored as
994 :    
995 :     essential::http://www.sciencemag.org/cgi/content/abstract/293/5538/2266
996 :    
997 :     The best performance is achieved by searching for a particular key for a specified
998 :     feature or genome.
999 :    
1000 :     =head3 GetAttributes
1001 :    
1002 :     C<< my @attributeList = $attrDB->GetAttributes($objectID, $key, @valuePatterns); >>
1003 :    
1004 :     In the database, attribute values are sectioned into pieces using a splitter
1005 :     value specified in the constructor (L</new>). This is not a requirement of
1006 :     the attribute system as a whole, merely a convenience for the purpose of
1007 :     these methods. If you are using the static method calls instead of the
1008 :     object-based calls, the splitter will always be the default value of
1009 :     double colons (C<::>). If a value has multiple sections, each section
1010 :     is matched against the correspond criterion in the I<@valuePatterns> list.
1011 :    
1012 :     This method returns a series of tuples that match the specified criteria. Each tuple
1013 :     will contain an object ID, a key, and one or more values. The parameters to this
1014 :     method therefore correspond structurally to the values expected in each tuple.
1015 :    
1016 :     my @attributeList = GetAttributes('fig|100226.1.peg.1004', 'structure%', 1, 2);
1017 :    
1018 :     would return something like
1019 :    
1020 :     ['fig}100226.1.peg.1004', 'structure', 1, 2]
1021 :     ['fig}100226.1.peg.1004', 'structure1', 1, 2]
1022 :     ['fig}100226.1.peg.1004', 'structure2', 1, 2]
1023 :     ['fig}100226.1.peg.1004', 'structureA', 1, 2]
1024 :    
1025 :     Use of C<undef> in any position acts as a wild card (all values). In addition,
1026 :     the I<$key> and I<@valuePatterns> parameters can contain SQL pattern characters: C<%>, which
1027 :     matches any sequence of characters, and C<_>, which matches any single character.
1028 :     (You can use an escape sequence C<\%> or C<\_> to match an actual percent sign or
1029 :     underscore.)
1030 :    
1031 :     In addition to values in multiple sections, a single attribute key can have multiple
1032 :     values, so even
1033 :    
1034 :     my @attributeList = GetAttributes($peg, 'virulent');
1035 :    
1036 :     which has no wildcard in the key or the object ID, may return multiple tuples.
1037 :    
1038 :     For reasons of backward compatability, we examine the structure of the object ID to
1039 :     determine the entity type. In that case the only two types allowed are C<Genome> and
1040 :     C<Feature>. An alternative method is to use a list reference, with the list consisting
1041 :     of an entity type name and the actual ID. Thus, the above example could equivalently
1042 :     be written as
1043 :    
1044 :     my @attributeList = GetAttributes([Feature => $peg], 'virulent');
1045 :    
1046 :     The list-reference approach allows us to add attributes to other entity types in
1047 :     the future. Doing so, however, will require modifying the L</Refresh> method and
1048 :     updated the database design XML.
1049 :    
1050 :     The list-reference approach also allows for a more fault-tolerant approach to
1051 :     getting all objects with a particular attribute.
1052 :    
1053 :     my @attributeList = GetAttributes([Feature => undef], 'virulent');
1054 :    
1055 :     will only return feature attributes, while
1056 :    
1057 :     my @attributeList = GetAttributes(undef, 'virulent');
1058 :    
1059 :     could at some point in the future get you attributes for genomes or even subsystems
1060 :     as well as features.
1061 :    
1062 :     =over 4
1063 :    
1064 :     =item objectID
1065 :    
1066 :     ID of the genome or feature whose attributes are desired. In general, an ID that
1067 :     starts with C<fig|> is treated as a feature ID, and an ID that is all digits with a
1068 :     single period is treated as a genome ID. For other entity types, use a list reference; in
1069 :     this case the first list element is the entity type and the second is the ID. A value of
1070 : parrello 1.4 C<undef> or an empty string here will match all objects.
1071 : parrello 1.3
1072 :     =item key
1073 :    
1074 :     Attribute key name. Since attributes are stored as fields in the database with a
1075 :     field name equal to the key name, it is very fast to find a list of all the
1076 :     matching keys. Each key's values require a separate query, however, which may
1077 :     be a performance problem if the pattern matches a lot of keys. Wild cards are
1078 : parrello 1.4 acceptable here, and a value of C<undef> or an empty string will match all
1079 :     attribute keys.
1080 : parrello 1.3
1081 :     =item valuePatterns
1082 :    
1083 :     List of the desired attribute values, section by section. If C<undef>
1084 : parrello 1.4 or an empty string is specified, all values in that section will match.
1085 : parrello 1.3
1086 :     =item RETURN
1087 :    
1088 :     Returns a list of tuples. The first element in the tuple is an object ID, the
1089 :     second is an attribute key, and the remaining elements are the sections of
1090 :     the attribute value. All of the tuples will match the criteria set forth in
1091 :     the parameter list.
1092 :    
1093 :     =back
1094 :    
1095 :     =cut
1096 :    
1097 :     sub GetAttributes {
1098 : parrello 1.4 # Get the parameters.
1099 :     my ($self, $objectID, $key, @valuePatterns) = @_;
1100 : parrello 1.3 # Declare the return variable.
1101 :     my @retVal = ();
1102 :     # Determine the entity types for our search.
1103 :     my @objects = ();
1104 :     my ($actualObjectID, $computedType);
1105 : parrello 1.4 if (! $objectID) {
1106 : parrello 1.3 push @objects, $self->GetEntityTypes();
1107 :     } else {
1108 :     ($computedType, $actualObjectID) = ComputeObjectTypeFromID($objectID);
1109 :     push @objects, $computedType;
1110 :     }
1111 :     # Loop through the entity types.
1112 :     for my $entityType (@objects) {
1113 :     # Now we need to find all the matching keys. The keys are actually stored in
1114 :     # our database object, so this process is fast. Note that our
1115 :     # MatchSqlPattern method
1116 :     my %secondaries = $self->GetSecondaryFields($entityType);
1117 :     my @fieldList = grep { MatchSqlPattern($_, $key) } keys %secondaries;
1118 : parrello 1.7 # Now we figure out whether or not we need to filter by object. We will always
1119 :     # filter by key to a limited extent, so if we're filtering by object we need an
1120 :     # AND to join the object ID filter with the key filter.
1121 : parrello 1.3 my $filter = "";
1122 :     my @params = ();
1123 : parrello 1.5 if (defined($actualObjectID)) {
1124 : parrello 1.3 # Here the caller wants to filter on object ID.
1125 : parrello 1.7 $filter = "$entityType(id) = ? AND ";
1126 : parrello 1.3 push @params, $actualObjectID;
1127 :     }
1128 :     # It's time to begin making queries. We process one attribute key at a time, because
1129 :     # each attribute is actually a different field in the database. We know here that
1130 :     # all the keys we've collected are for the correct entity because we got them from
1131 :     # the DBD. That's a good thing, because an invalid key name will cause an SQL error.
1132 :     for my $key (@fieldList) {
1133 :     # Get all of the attribute values for this key.
1134 : parrello 1.7 my @dataRows = $self->GetAll([$entityType], "$filter$entityType($key) IS NOT NULL",
1135 :     \@params, ["$entityType(id)", "$entityType($key)"]);
1136 : parrello 1.3 # Process each value separately. We need to verify the values and reformat the
1137 :     # tuples. Note that GetAll will give us one row per matching object ID,
1138 :     # with the ID first followed by a list of the data values. This is very
1139 :     # different from the structure we'll be returning, which has one row
1140 :     # per value.
1141 :     for my $dataRow (@dataRows) {
1142 :     # Get the object ID and the list of values.
1143 :     my ($rowObjectID, @dataValues) = @{$dataRow};
1144 :     # Loop through the values. There will be one result row per attribute value.
1145 :     for my $dataValue (@dataValues) {
1146 :     # Separate this value into sections.
1147 :     my @sections = split("::", $dataValue);
1148 :     # Loop through the value patterns, looking for a mismatch. Note that
1149 :     # since we're working through parallel arrays, we are using an index
1150 :     # loop. As soon as a match fails we stop checking. This means that
1151 :     # if the value pattern list is longer than the number of sections,
1152 :     # we will fail as soon as we run out of sections.
1153 :     my $match = 1;
1154 :     for (my $i = 0; $i <= $#valuePatterns && $match; $i++) {
1155 :     $match = MatchSqlPattern($sections[$i], $valuePatterns[$i]);
1156 :     }
1157 :     # If we match, we save this value in the output list.
1158 :     if ($match) {
1159 :     push @retVal, [$rowObjectID, $key, @sections];
1160 :     }
1161 :     }
1162 :     # Here we've processed all the attribute values for the current object ID.
1163 :     }
1164 :     # Here we've processed all the rows returned by GetAll. In general, there will
1165 :     # be one row per object ID.
1166 :     }
1167 :     # Here we've processed all the matching attribute keys.
1168 :     }
1169 :     # Here we've processed all the entity types. That means @retVal has all the matching
1170 :     # results.
1171 :     return @retVal;
1172 :     }
1173 :    
1174 :     =head3 AddAttribute
1175 :    
1176 :     C<< $attrDB->AddAttribute($objectID, $key, @values); >>
1177 :    
1178 :     Add an attribute key/value pair to an object. This method cannot add a new key, merely
1179 :     add a value to an existing key. Use L</StoreAttributeKey> to create a new key.
1180 :    
1181 :     =over 4
1182 :    
1183 :     =item objectID
1184 :    
1185 :     ID of the genome or feature to which the attribute is to be added. In general, an ID that
1186 :     starts with C<fig|> is treated as a feature ID, and an ID that is all digits and periods
1187 :     is treated as a genome ID. For IDs of other types, this parameter should be a reference
1188 :     to a 2-tuple consisting of the entity type name followed by the object ID.
1189 :    
1190 :     =item key
1191 :    
1192 :     Attribute key name. This corresponds to the name of a field in the database.
1193 :    
1194 :     =item values
1195 :    
1196 :     One or more values to be associated with the key. The values are joined together with
1197 :     the splitter value before being stored as field values. This enables L</GetAttributes>
1198 :     to split them apart during retrieval. The splitter value defaults to double colons C<::>.
1199 :    
1200 :     =back
1201 :    
1202 :     =cut
1203 :    
1204 :     sub AddAttribute {
1205 :     # Get the parameters.
1206 : parrello 1.4 my ($self, $objectID, $key, @values) = @_;
1207 : parrello 1.3 # Don't allow undefs.
1208 :     if (! defined($objectID)) {
1209 :     Confess("No object ID specified for AddAttribute call.");
1210 :     } elsif (! defined($key)) {
1211 :     Confess("No attribute key specified for AddAttribute call.");
1212 :     } elsif (! @values) {
1213 :     Confess("No values specified in AddAttribute call for key $key.");
1214 :     } else {
1215 :     # Okay, now we have some reason to believe we can do this. Start by
1216 :     # computing the object type and ID.
1217 :     my ($entityName, $id) = ComputeObjectTypeFromID($objectID);
1218 :     # Form the values into a scalar.
1219 :     my $valueString = join($self->{splitter}, @values);
1220 :     # Insert the value.
1221 :     $self->InsertValue($id, "$entityName($key)", $valueString);
1222 :     }
1223 :     # Return a one. We do this for backward compatability.
1224 :     return 1;
1225 :     }
1226 :    
1227 :     =head3 DeleteAttribute
1228 :    
1229 :     C<< $attrDB->DeleteAttribute($objectID, $key, @values); >>
1230 :    
1231 :     Delete the specified attribute key/value combination from the database.
1232 :    
1233 :     The first form will connect to the database and release it. The second form
1234 :     uses the database connection contained in the object.
1235 :    
1236 :     =over 4
1237 :    
1238 :     =item objectID
1239 :    
1240 :     ID of the genome or feature to which the attribute is to be added. In general, an ID that
1241 :     starts with C<fig|> is treated as a feature ID, and an ID that is all digits and periods
1242 :     is treated as a genome ID. For IDs of other types, this parameter should be a reference
1243 :     to a 2-tuple consisting of the entity type name followed by the object ID.
1244 :    
1245 :     =item key
1246 :    
1247 :     Attribute key name. This corresponds to the name of a field in the database.
1248 :    
1249 :     =item values
1250 :    
1251 :     One or more values to be associated with the key.
1252 :    
1253 :     =back
1254 :    
1255 :     =cut
1256 :    
1257 :     sub DeleteAttribute {
1258 :     # Get the parameters.
1259 : parrello 1.4 my ($self, $objectID, $key, @values) = @_;
1260 : parrello 1.3 # Don't allow undefs.
1261 :     if (! defined($objectID)) {
1262 :     Confess("No object ID specified for DeleteAttribute call.");
1263 :     } elsif (! defined($key)) {
1264 :     Confess("No attribute key specified for DeleteAttribute call.");
1265 :     } elsif (! @values) {
1266 :     Confess("No values specified in DeleteAttribute call for key $key.");
1267 :     } else {
1268 :     # Now compute the object type and ID.
1269 :     my ($entityName, $id) = ComputeObjectTypeFromID($objectID);
1270 :     # Form the values into a scalar.
1271 :     my $valueString = join($self->{splitter}, @values);
1272 :     # Delete the value.
1273 :     $self->DeleteValue($entityName, $id, $key, $valueString);
1274 :     }
1275 :     # Return a one. This is for backward compatability.
1276 :     return 1;
1277 :     }
1278 :    
1279 :     =head3 ChangeAttribute
1280 :    
1281 :     C<< $attrDB->ChangeAttribute($objectID, $key, \@oldValues, \@newValues); >>
1282 :    
1283 :     Change the value of an attribute key/value pair for an object.
1284 :    
1285 :     =over 4
1286 :    
1287 :     =item objectID
1288 :    
1289 :     ID of the genome or feature to which the attribute is to be changed. In general, an ID that
1290 :     starts with C<fig|> is treated as a feature ID, and an ID that is all digits and periods
1291 :     is treated as a genome ID. For IDs of other types, this parameter should be a reference
1292 :     to a 2-tuple consisting of the entity type name followed by the object ID.
1293 :    
1294 :     =item key
1295 :    
1296 :     Attribute key name. This corresponds to the name of a field in the database.
1297 :    
1298 :     =item oldValues
1299 :    
1300 :     One or more values identifying the key/value pair to change.
1301 :    
1302 :     =item newValues
1303 :    
1304 :     One or more values to be put in place of the old values.
1305 :    
1306 :     =back
1307 :    
1308 :     =cut
1309 :    
1310 :     sub ChangeAttribute {
1311 :     # Get the parameters.
1312 : parrello 1.4 my ($self, $objectID, $key, $oldValues, $newValues) = @_;
1313 : parrello 1.3 # Don't allow undefs.
1314 :     if (! defined($objectID)) {
1315 :     Confess("No object ID specified for ChangeAttribute call.");
1316 :     } elsif (! defined($key)) {
1317 :     Confess("No attribute key specified for ChangeAttribute call.");
1318 :     } elsif (! defined($oldValues) || ref $oldValues ne 'ARRAY') {
1319 :     Confess("No old values specified in ChangeAttribute call for key $key.");
1320 :     } elsif (! defined($newValues) || ref $newValues ne 'ARRAY') {
1321 :     Confess("No new values specified in ChangeAttribute call for key $key.");
1322 :     } else {
1323 :     # Okay, now we do the change as a delete/add.
1324 :     $self->DeleteAttribute($objectID, $key, @{$oldValues});
1325 :     $self->AddAttribute($objectID, $key, @{$newValues});
1326 :     }
1327 :     # Return a one. We do this for backward compatability.
1328 :     return 1;
1329 :     }
1330 :    
1331 : parrello 1.7 =head3 EraseAttribute
1332 :    
1333 :     C<< $attrDB->EraseAttribute($entityName, $key); >>
1334 :    
1335 :     Erase all values for the specified attribute key. This does not remove the
1336 :     key from the database; it merely removes all the values.
1337 :    
1338 :     =over 4
1339 :    
1340 :     =item entityName
1341 :    
1342 :     Name of the entity to which the key belongs. If undefined, all entities will be
1343 :     examined for the desired key.
1344 :    
1345 :     =item key
1346 :    
1347 :     Key to erase.
1348 :    
1349 :     =back
1350 :    
1351 :     =cut
1352 :    
1353 :     sub EraseAttribute {
1354 :     # Get the parameters.
1355 :     my ($self, $entityName, $key) = @_;
1356 :     # Determine the relevant entity types.
1357 :     my @objects = ();
1358 :     if (! $entityName) {
1359 :     push @objects, $self->GetEntityTypes();
1360 :     } else {
1361 :     push @objects, $entityName;
1362 :     }
1363 :     # Loop through the entity types.
1364 :     for my $entityType (@objects) {
1365 :     # Now check for this key in this entity.
1366 :     my %secondaries = $self->GetSecondaryFields($entityType);
1367 :     if (exists $secondaries{$key}) {
1368 :     # We found it, so delete all the values of the key.
1369 :     $self->DeleteValue($entityName, undef, $key);
1370 :     }
1371 :     }
1372 :     # Return a 1, for backward compatability.
1373 :     return 1;
1374 :     }
1375 :    
1376 : parrello 1.1 1;

MCS Webmaster
ViewVC Help
Powered by ViewVC 1.0.3