[Bio] / Sprout / CustomAttributes.pm Repository:
ViewVC logotype

Diff of /Sprout/CustomAttributes.pm

Parent Directory Parent Directory | Revision Log Revision Log | View Patch Patch

revision 1.7, Wed Nov 15 12:04:05 2006 UTC revision 1.32, Fri Jan 25 19:00:58 2008 UTC
# Line 8  Line 8 
8      use strict;      use strict;
9      use Tracer;      use Tracer;
10      use ERDBLoad;      use ERDBLoad;
11        use Stats;
12        use Time::HiRes qw(time);
13    
14  =head1 Custom SEED Attribute Manager  =head1 Custom SEED Attribute Manager
15    
# Line 15  Line 17 
17    
18  The Custom SEED Attributes Manager allows the user to upload and retrieve  The Custom SEED Attributes Manager allows the user to upload and retrieve
19  custom data for SEED objects. It uses the B<ERDB> database system to  custom data for SEED objects. It uses the B<ERDB> database system to
20  store the attributes, which are implemented as multi-valued fields  store the attributes.
21  of ERDB entities.  
22    Attributes are organized by I<attribute key>. Attribute values are
23    assigned to I<objects>. In the real world, objects have types and IDs;
24    however, to the attribute database only the ID matters. This will create
25    a problem if we have a single ID that applies to two objects of different
26    types, but it is more consistent with the original attribute implementation
27    in the SEED (which this implementation replaces).
28    
29    The actual attribute values are stored as a relationship between the attribute
30    keys and the objects. There can be multiple values for a single key/object pair.
31    
32    =head3 Object IDs
33    
34    The object ID is normally represented as
35    
36        I<type>:I<id>
37    
38    where I<type> is the object type (C<Role>, C<Coupling>, etc.) and I<id> is
39    the actual object ID. Note that the object type must consist of only upper- and
40    lower-case letters! Thus, C<GenomeGroup> is a valid object type, but
41    C<genome_group> is not. Given that restriction, the object ID
42    
43        Family:aclame|cluster10
44    
45    would represent the FIG family C<aclame|cluster10>. For historical reasons,
46    there are three exceptions: subsystems, genomes, and features do not need
47    a type. So, for PEG 3361 of Streptomyces coelicolor A3(2), you simply code
48    
49        fig|100226.1.peg.3361
50    
51    The methods L</ParseID> and L</FormID> can be used to make this all seem
52    more consistent. Given any object ID string, L</ParseID> will convert it to an
53    object type and ID, and given any object type and ID, L</FormID> will
54    convert it to an object ID string. The attribute database is pretty
55    freewheeling about what it will allow for an ID; however, for best
56    results, the type should match an entity type from a Sprout genetics
57    database. If this rule is followed, then the database object
58    corresponding to an ID in the attribute database could be retrieved using
59    L</GetTargetObject> method.
60    
61        my $object = CustomAttributes::GetTargetObject($sprout, $idValue);
62    
63    =head3 Retrieval and Logging
64    
65  The full suite of ERDB retrieval capabilities is provided. In addition,  The full suite of ERDB retrieval capabilities is provided. In addition,
66  custom methods are provided specific to this application. To get all  custom methods are provided specific to this application. To get all
67  the values of the attribute C<essential> in a specified B<Feature>, you  the values of the attribute C<essential> in a specified B<Feature>, you
68  would code  would code
69    
70      my @values = $attrDB->GetAttributes([Feature => $fid], 'essential');      my @values = $attrDB->GetAttributes($fid, 'essential');
71    
72  where I<$fid> contains the ID of the desired feature. Each attribute has  where I<$fid> contains the ID of the desired feature.
 an alternate index to allow searching for attributes by value.  
73    
74  New attributes are introduced by updating the database definition at  Keys can be split into two pieces using the splitter value defined in the
75  run-time. Attribute values are stored by uploading data from files.  constructor (the default is C<::>). The first piece of the key is called
76  A web interface is provided for both these activities.  the I<real key>. This portion of the key must be defined using the
77    web interface (C<Attributes.cgi>). The second portion of the key is called
78    the I<sub key>, and can take any value.
79    
80    Major attribute activity is recorded in a log (C<attributes.log>) in the
81    C<$FIG_Config::var> directory. The log reports the user name, time, and
82    the details of the operation. The user name will almost always be unknown,
83    the exception being when it is specified in this object's constructor
84    (see L</new>).
85    
86  =head2 FIG_Config Parameters  =head2 FIG_Config Parameters
87    
# Line 76  Line 127 
127    
128  =back  =back
129    
 The DBD file is critical, and must have reasonable contents before we can  
 begin using the system. In the old system, attributes were only provided  
 for Genomes and Features, so the initial XML file was the following.  
   
     <Database>  
       <Title>SEED Custom Attribute Database</Title>  
       <Entities>  
         <Entity name="Feature" keyType="id-string">  
           <Notes>A [i]feature[/i] is a part of the genome  
           that is of special interest. Features may be spread  
           across multiple contigs of a genome, but never across  
           more than one genome. Features can be assigned to roles  
           via spreadsheet cells, and are the targets of  
           annotation.</Notes>  
         </Entity>  
         <Entity name="Genome" keyType="name-string">  
           <Notes>A [i]genome[/i] describes a particular individual  
           organism's DNA.</Notes>  
         </Entity>  
       </Entities>  
     </Database>  
   
 It is not necessary to put any tables into the database; however, you should  
 run  
   
     AttrDBRefresh  
   
 periodically to insure it has the correct Genomes and Features in it. When  
 converting from the old system, use  
   
     AttrDBRefresh -migrate  
   
 to initialize the database and migrate the legacy data. You should only need  
 to do that once.  
   
 =head2 Implementation Note  
   
 The L</Refresh> method reloads the entities in the database. If new  
 entity types are added, that method will need to be adjusted accordingly.  
   
130  =head2 Public Methods  =head2 Public Methods
131    
132  =head3 new  =head3 new
133    
134  C<< my $attrDB = CustomAttributes->new($splitter); >>      my $attrDB = CustomAttributes->new(%options);
135    
136  Construct a new CustomAttributes object. This object cannot be used to add or  Construct a new CustomAttributes object. The following options are
137  delete keys because that requires modifying the database design. To do that,  supported.
 you need to use the static L</StoreAttributeKey> or L</DeleteAttributeKey>  
 methods.  
138    
139  =over 4  =over 4
140    
141  =item splitter  =item splitter
142    
143  Value to be used to split attribute values into sections in the  Value to be used to split attribute values into sections in the
144  L</Fig Replacement Methods>. The default is a double colon C<::>.  L</Fig Replacement Methods>. The default is a double colon C<::>,
145  If you do not use the replacement methods, you do not need to  and should only be overridden in extreme circumstances.
146  worry about this parameter.  
147    =item user
148    
149    Name of the current user. This will appear in the attribute log.
150    
151  =back  =back
152    
# Line 142  Line 154 
154    
155  sub new {  sub new {
156      # Get the parameters.      # Get the parameters.
157      my ($class, $splitter) = @_;      my ($class, %options) = @_;
158      # Connect to the database.      # Connect to the database.
159      my $dbh = DBKernel->new($FIG_Config::attrDbms, $FIG_Config::attrDbName,      my $dbh = DBKernel->new($FIG_Config::attrDbms, $FIG_Config::attrDbName,
160                              $FIG_Config::attrUser, $FIG_Config::attrPass,                              $FIG_Config::attrUser, $FIG_Config::attrPass,
# Line 152  Line 164 
164      my $xmlFileName = $FIG_Config::attrDBD;      my $xmlFileName = $FIG_Config::attrDBD;
165      my $retVal = ERDB::new($class, $dbh, $xmlFileName);      my $retVal = ERDB::new($class, $dbh, $xmlFileName);
166      # Store the splitter value.      # Store the splitter value.
167      $retVal->{splitter} = (defined($splitter) ? $splitter : '::');      $retVal->{splitter} = $options{splitter} || '::';
168        # Store the user name.
169        $retVal->{user} = $options{user} || '<unknown>';
170        Trace("User $retVal->{user} selected for attribute object.") if T(3);
171      # Return the result.      # Return the result.
172      return $retVal;      return $retVal;
173  }  }
174    
175  =head3 StoreAttributeKey  =head3 StoreAttributeKey
176    
177  C<< my $attrDB = CustomAttributes::StoreAttributeKey($entityName, $attributeName, $type, $notes); >>      $attrDB->StoreAttributeKey($attributeName, $type, $notes, \@groups);
178    
179  Create or update an attribute for the database. This method will update the database definition  Create or update an attribute for the database.
 XML, but it will not create the table. It will connect to the database so that the caller  
 can upload the attribute values.  
180    
181  =over 4  =over 4
182    
 =item entityName  
   
 Name of the entity containing the attribute. The entity must exist.  
   
183  =item attributeName  =item attributeName
184    
185  Name of the attribute. It must be a valid ERDB field name, consisting entirely of  Name of the attribute (the real key). If it does not exist already, it will be created.
 letters, digits, and hyphens, with a letter at the beginning. If it does not  
 exist already, it will be created.  
186    
187  =item type  =item type
188    
# Line 185  Line 192 
192    
193  Descriptive notes about the attribute. It is presumed to be raw text, not HTML.  Descriptive notes about the attribute. It is presumed to be raw text, not HTML.
194    
195  =item RETURN  =item groups
196    
197  Returns a Custom Attribute Database object if successful. If unsuccessful, an  Reference to a list of the groups to which the attribute should be associated.
198  error will be thrown.  This will replace any groups to which the attribute is currently attached.
199    
200  =back  =back
201    
# Line 196  Line 203 
203    
204  sub StoreAttributeKey {  sub StoreAttributeKey {
205      # Get the parameters.      # Get the parameters.
206      my ($entityName, $attributeName, $type, $notes) = @_;      my ($self, $attributeName, $type, $notes, $groups) = @_;
207        # Declare the return variable.
208        my $retVal;
209      # Get the data type hash.      # Get the data type hash.
210      my %types = ERDB::GetDataTypes();      my %types = ERDB::GetDataTypes();
211      # Validate the initial input values.      # Validate the initial input values.
212      if (! ERDB::ValidateFieldName($attributeName)) {      if ($attributeName =~ /$self->{splitter}/) {
213          Confess("Invalid attribute name \"$attributeName\" specified.");          Confess("Invalid attribute name \"$attributeName\" specified.");
214      } elsif (! $notes || length($notes) < 25) {      } elsif (! $notes || length($notes) < 25) {
215          Confess("Missing or incomplete description for $attributeName.");          Confess("Missing or incomplete description for $attributeName.");
216      } elsif (! exists $types{$type}) {      } elsif (! exists $types{$type}) {
217          Confess("Invalid data type \"$type\" for $attributeName.");          Confess("Invalid data type \"$type\" for $attributeName.");
218        } else {
219            # Create a variable to hold the action to be displayed for the log (Add or Update).
220            my $action;
221            # Okay, we're ready to begin. See if this key exists.
222            my $attribute = $self->GetEntity('AttributeKey', $attributeName);
223            if (defined($attribute)) {
224                # It does, so we do an update.
225                $action = "Update Key";
226                $self->UpdateEntity('AttributeKey', $attributeName,
227                                    { description => $notes, 'data-type' => $type });
228                # Detach the key from its current groups.
229                $self->Disconnect('IsInGroup', 'AttributeKey', $attributeName);
230            } else {
231                # It doesn't, so we do an insert.
232                $action = "Insert Key";
233                $self->InsertObject('AttributeKey', { id => $attributeName,
234                                    description => $notes, 'data-type' => $type });
235            }
236            # Attach the key to the specified groups. (We presume the groups already
237            # exist.)
238            for my $group (@{$groups}) {
239                $self->InsertObject('IsInGroup', { 'from-link' => $attributeName,
240                                                   'to-link'   => $group });
241      }      }
242      # Our next step is to read in the XML for the database defintion. We          # Log the operation.
243      # need to verify that the named entity exists.          $self->LogOperation($action, $attributeName, "Group list is " . join(" ", @{$groups}));
     my $metadata = ERDB::ReadMetaXML($FIG_Config::attrDBD);  
     my $entityHash = $metadata->{Entities};  
     if (! exists $entityHash->{$entityName}) {  
         Confess("Entity $entityName not found.");  
     } else {  
         # Okay, we're ready to begin. Get the entity hash and the field hash.  
         my $entityData = $entityHash->{$entityName};  
         my $fieldHash = ERDB::GetEntityFieldHash($metadata, $entityName);  
         # Compute the attribute's relation name.  
         my $relName = join("", $entityName, map { ucfirst $_ } split(/-|_/, $attributeName));  
         # Store the attribute's field data. Note the use of the "content" hash for  
         # the notes. This is how the XML writer knows Notes is a text tag instead of  
         # an attribute.  
         $fieldHash->{$attributeName} = { type => $type, relation => $relName,  
                                          Notes => { content => $notes } };  
         # Insure we have an index for this attribute.  
         my $index = ERDB::FindIndexForEntity($metadata, $entityName, $attributeName);  
         if (! defined($index)) {  
             push @{$entityData->{Indexes}}, { IndexFields => [ { name => $attributeName, order => 'ascending' } ],  
                                               Notes       => "Alternate index provided for access by $attributeName." };  
         }  
         # Write the XML back out.  
         ERDB::WriteMetaXML($metadata, $FIG_Config::attrDBD);  
244      }      }
     # Open a database with the new XML.  
     my $retVal = CustomAttributes->new();  
     return $retVal;  
245  }  }
246    
 =head3 Refresh  
247    
248  C<< $attrDB->Refresh($fig); >>  =head3 DeleteAttributeKey
249    
250        my $stats = $attrDB->DeleteAttributeKey($attributeName);
251    
252  Refresh the primary entity tables from the FIG data store. This method basically  Delete an attribute from the custom attributes database.
 drops and reloads the main tables of the custom attributes database.  
253    
254  =over 4  =over 4
255    
256  =item fig  =item attributeName
257    
258    Name of the attribute to delete.
259    
260    =item RETURN
261    
262  FIG-like object that can be used to find genomes and features.  Returns a statistics object describing the effects of the deletion.
263    
264  =back  =back
265    
266  =cut  =cut
267    
268  sub Refresh {  sub DeleteAttributeKey {
269      # Get the parameters.      # Get the parameters.
270      my ($self, $fig) = @_;      my ($self, $attributeName) = @_;
271      # Create load objects for the genomes and the features.      # Delete the attribute key.
272      my $loadGenome = ERDBLoad->new($self, 'Genome', $FIG_Config::temp);      my $retVal = $self->Delete('AttributeKey', $attributeName);
273      my $loadFeature = ERDBLoad->new($self, 'Feature', $FIG_Config::temp);      # Log this operation.
274      # Get the genome list.      $self->LogOperation("Delete Key", $attributeName, "Key will no longer be available for use by anyone.");
275      my @genomes = $fig->genomes();      # Return the result.
276      # Loop through the genomes.      return $retVal;
277      for my $genomeID (@genomes) {  
         # Put this genome in the genome table.  
         $loadGenome->Put($genomeID);  
         Trace("Processing Genome $genomeID") if T(3);  
         # Put its features into the feature table. Note we have to use a hash to  
         # remove duplicates.  
         my %featureList = map { $_ => 1 } $fig->all_features($genomeID);  
         for my $fid (keys %featureList) {  
             $loadFeature->Put($fid);  
         }  
278      }      }
279      # Get a variable for holding statistics objects.  
280      my $stats;  =head3 NewName
281      # Finish the genome load.  
282      Trace("Loading Genome relation.") if T(2);      my $text = CustomAttributes::NewName();
283      $stats = $loadGenome->FinishAndLoad();  
284      Trace("Genome table load statistics:\n" . $stats->Show()) if T(3);  Return the string used to indicate the user wants to add a new attribute.
285      # Finish the feature load.  
286      Trace("Loading Feature relation.") if T(2);  =cut
287      $stats = $loadFeature->FinishAndLoad();  
288      Trace("Feature table load statistics:\n" . $stats->Show()) if T(3);  sub NewName {
289        return "(new)";
290  }  }
291    
292  =head3 LoadAttributeKey  =head3 ControlForm
293    
294  C<< my $stats = $attrDB->LoadAttributeKey($entityName, $fieldName, $fh, $keyCol, $dataCol); >>      my $formHtml = $attrDB->ControlForm($cgi, $name, \%keys);
295    
296  Load the specified attribute from the specified file. The file should be a  Return a form that can be used to control the creation and modification of
297  tab-delimited file with internal tab and new-line characters escaped. This is  attributes. Only a subset of the attribute keys will be displayed, as
298  the typical TBL-style file used by most FIG applications. One of the columns  determined by the incoming list.
 in the input file must contain the appropriate key value and the other the  
 corresponding attribute value.  
299    
300  =over 4  =over 4
301    
302  =item entityName  =item cgi
303    
304    CGI query object used to create HTML.
305    
306    =item name
307    
308    Name to give to the form. This should be unique for the web page.
309    
310    =item keys
311    
312    Reference to a hash mapping attribute keys to n-tuples. Each tuple will contain the
313    attribute's data type, its description, and a list of the groups in which it participates.
314    
315    =item RETURN
316    
317    Returns the HTML for a form that can be used to  submit instructions to the C<Attributes.cgi> script
318    for loading, creating, displaying, changing, or deleting an attribute. Note that only the form
319    controls are generated. The form tags are left to the caller.
320    
321    =back
322    
323    =cut
324    
325  Name of the entity containing the attribute.  sub ControlForm {
326        # Get the parameters.
327        my ($self, $cgi, $name, $keys) = @_;
328        # Declare the return list.
329        my @retVal = ();
330        # We'll put the controls in a table. Nothing else ever seems to look nice.
331        push @retVal, $cgi->start_table({ border => 2, cellpadding => 2 });
332        # The first row is for selecting the field name.
333        push @retVal, $cgi->Tr($cgi->th("Select a Field"),
334                               $cgi->td($self->FieldMenu($cgi, 10, 'fieldName', $keys,
335                                                         new => 1,
336                                                         notes => "document.$name.notes.value",
337                                                         type => "document.$name.dataType.value",
338                                                         groups => "document.$name.groups")));
339        # Now we set up a dropdown for the data types. The values will be the
340        # data type names, and the labels will be the descriptions.
341        my %types = ERDB::GetDataTypes();
342        my %labelMap = map { $_ => $types{$_}->{notes} } keys %types;
343        my $typeMenu = $cgi->popup_menu(-name   => 'dataType',
344                                        -values => [sort keys %types],
345                                        -labels => \%labelMap,
346                                        -default => 'string');
347        # Allow the user to specify a new field name. This is required if the
348        # user has selected the "(new)" marker.
349        my $fieldField = "document.$name.fieldName";
350        my $newName = "\"" . NewName() . "\"";
351        push @retVal, $cgi->Tr($cgi->th("New Field Name"),
352                               $cgi->td($cgi->textfield(-name => 'newName',
353                                                        -size => 30,
354                                                        -value => "",
355                                                        -onFocus => "setIfEmpty($fieldField, $newName);")),
356                                        );
357        push @retVal, $cgi->Tr($cgi->th("Data type"),
358                               $cgi->td($typeMenu));
359        # The next row is for the notes.
360        push @retVal, $cgi->Tr($cgi->th("Description"),
361                               $cgi->td($cgi->textarea(-name => 'notes',
362                                                       -rows => 6,
363                                                       -columns => 80))
364                              );
365        # Now we have the groups, which are implemented as a checkbox group.
366        my @groups = $self->GetGroups();
367        push @retVal, $cgi->Tr($cgi->th("Groups"),
368                               $cgi->td($cgi->checkbox_group(-name=>'groups',
369                                        -values=> \@groups))
370                              );
371        # Now the four buttons: STORE, SHOW, ERASE, and DELETE.
372        push @retVal, $cgi->Tr($cgi->th("&nbsp;"),
373                               $cgi->td({align => 'center'}, join(" ",
374                                        $cgi->submit(-name => 'Delete', -value => 'DELETE'),
375                                        $cgi->submit(-name => 'Store',  -value => 'STORE'),
376                                        $cgi->submit(-name => 'Erase',  -value => 'ERASE'),
377                                        $cgi->submit(-name => 'Show',   -value => 'SHOW')
378                                       ))
379                              );
380        # Close the table and the form.
381        push @retVal, $cgi->end_table();
382        # Return the assembled HTML.
383        return join("\n", @retVal, "");
384    }
385    
386  =item fieldName  =head3 LoadAttributesFrom
387    
388  Name of the actual attribute.  C<< my $stats = $attrDB->LoadAttributesFrom($fileName, %options); >>
389    
390  =item fh  Load attributes from the specified tab-delimited file. Each line of the file must
391    contain an object ID in the first column, an attribute key name in the second
392    column, and attribute values in the remaining columns. The attribute values will
393    be assembled into a single value using the splitter code. In addition, the key names may
394    contain a splitter. If this is the case, the portion of the key after the splitter is
395    treated as a subkey.
396    
397  Open file handle for the input file.  =over 4
398    
399  =item keyCol  =item fileName
400    
401  Index (0-based) of the column containing the key field. The key field should  Name of the file from which to load the attributes, or an open handle for the file.
402  contain the ID of an instance of the named entity.  (This last enables the method to be used in conjunction with the CGI form upload
403    control.)
404    
405  =item dataCol  =item options
406    
407  Index (0-based) of the column containing the data value field.  Hash of options for modifying the load process.
408    
409  =item RETURN  =item RETURN
410    
411  Returns a statistics object for the load process.  Returns a statistics object describing the load.
412    
413    =back
414    
415    Permissible option values are as follows.
416    
417    =over 4
418    
419    =item append
420    
421    If TRUE, then the attributes will be appended to existing data; otherwise, the
422    first time a key name is encountered, it will be erased.
423    
424    =item archive
425    
426    If specified, the name of a file into which the incoming data should be saved.
427    If I<resume> is also specified, only the lines actually loaded will be put
428    into this file.
429    
430    =item objectType
431    
432    If specified, the specified object type will be prefixed to each object ID.
433    
434    =item resume
435    
436    If specified, key-value pairs already in the database will not be reinserted.
437    Specify a number to start checking after the specified number of lines and
438    then admit everything after the first line not yet loaded. Specify C<careful>
439    to check every single line. Specify C<none> to ignore this option. The default
440    is C<none>. So, if you believe that a previous load failed somewhere after 50000
441    lines, a resume value of C<50000> would skip 50000 lines in the file, then
442    check each line after that until it finds one not already in the database. The
443    first such line found and all lines after that will be loaded. On the other
444    hand, if you have a file of 100000 records, and some have been loaded and some
445    not, you would use the word C<careful>, so that every line would be checked before
446    it is inserted. A resume of C<0> will start checking the first line of the
447    input file and then begin loading once it finds a line not in the database.
448    
449    =item chunkSize
450    
451    Number of lines to load in each burst. The default is 10,000.
452    
453  =back  =back
454    
455  =cut  =cut
456    
457  sub LoadAttributeKey {  sub LoadAttributesFrom {
458      # Get the parameters.      # Get the parameters.
459      my ($self, $entityName, $fieldName, $fh, $keyCol, $dataCol) = @_;      my ($self, $fileName, %options) = @_;
460      # Create the return variable.      # Declare the return variable.
461      my $retVal;      my $retVal = Stats->new('keys', 'values', 'linesOut');
462      # Insure the entity exists.      # Initialize the timers.
463      my $found = grep { $_ eq $entityName } $self->GetEntityTypes();      my ($insertTime, $eraseTime, $archiveTime, $checkTime) = (0, 0, 0, 0);
464      if (! $found) {      # Check for append mode.
465          Confess("Entity \"$entityName\" not found in database.");      my $append = ($options{append} ? 1 : 0);
466      } else {      # Check for resume mode.
467          # Get the field structure for the named entity.      my $resume = (defined($options{resume}) ? $options{resume} : 'none');
468          my $fieldHash = $self->GetFieldTable($entityName);      # Create a hash of key names found.
469          # Verify that the attribute exists.      my %keyHash = ();
470          if (! exists $fieldHash->{$fieldName}) {      # Compute the chunk size.
471              Confess("Attribute key \"$fieldName\" does not exist in entity $entityName.");      my $chunkSize = ($options{chunkSize} ? $options{chunkSize} : 10000);
472          } else {      # Open the file for input. Note we must anticipate the possibility of an
473              # Create a loader for the specified attribute. We need the      # open filehandle being passed in.
474              # relation name first.      my $fh;
475              my $relName = $fieldHash->{$fieldName}->{relation};      if (ref $fileName) {
476              my $loadAttribute = ERDBLoad->new($self, $relName, $FIG_Config::temp);          Trace("Using file opened by caller.") if T(3);
477              # Loop through the input file.          $fh = $fileName;
478        } else {
479            Trace("Attributes will be loaded from $fileName.") if T(3);
480            $fh = Open(undef, "<$fileName");
481        }
482        # Trace the mode.
483        Trace("Mode is $options{mode}.") if $options{mode} && T(3);
484        Trace("No mode specified.") if T(3) && ! $options{mode};
485        # Now check to see if we need to archive.
486        my $ah;
487        if (exists $options{archive}) {
488            my $ah = Open(undef, ">$options{archive}");
489            Trace("Load file will be archived to $options{archive}.") if T(3);
490        }
491        # This next file is used to cache the attribute data before loading it.
492        # To avoid problems, we use a series of small files instead of one
493        # big one.
494        my $tempFileName = "$FIG_Config::temp/attributeLoadFile$$.tbl";
495        # Insure we recover from errors.
496        eval {
497            # Open the temporary file and start a counter.
498            my $th = Tracer::Open(undef, ">$tempFileName");
499            my $chunkLinesLeft = $chunkSize;
500            # If we have a resume number, process it here.
501            if ($resume =~ /\d+/) {
502                Trace("Skipping $resume lines.") if T(2);
503                my $startTime = time();
504                # Skip the specified number of lines.
505                for (my $skipped = 0; ! eof($fh) && $skipped < $resume; $skipped++) {
506                    my $line = <$fh>;
507                    $retVal->Add(skipped => 1);
508                }
509                $checkTime += time() - $startTime;
510            }
511            # Loop through the file.
512            Trace("Starting load.") if T(2);
513              while (! eof $fh) {              while (! eof $fh) {
514                  # Get the next line of the file.              # Read the current line.
515                  my @fields = Tracer::GetLine($fh);              my ($id, $key, @values) = Tracer::GetLine($fh);
516                  $loadAttribute->Add("lineIn");              $retVal->Add(linesIn => 1);
517                  # Now we need to validate the line.              # Do some validation.
518                  if ($#fields < $dataCol) {              if (! $id) {
519                      $loadAttribute->Add("shortLine");                  # We ignore blank lines.
520                  } elsif (! $self->Exists($entityName, $fields[$keyCol])) {                  $retVal->Add(blankLines => 1);
521                      $loadAttribute->Add("badKey");              } elsif (substr($id, 0, 1) eq '#') {
522                    # A line beginning with a pound sign is a comment.
523                    $retVal->Add(comments => 1);
524                } elsif (! defined($key)) {
525                    # An ID without a key is a serious error.
526                    my $lines = $retVal->Ask('linesIn');
527                    Confess("Line $lines in $fileName has no attribute key.");
528                } elsif (! @values) {
529                    # A line with no values is not allowed.
530                    my $lines = $retVal->Ask('linesIn');
531                    Trace("Line $lines for key $key has no attribute values.") if T(1);
532                    $retVal->Add(skipped => 1);
533                  } else {                  } else {
534                      # It's valid,so send it to the loader.                  # Check to see if we need to fix up the object ID.
535                      $loadAttribute->Put($fields[$keyCol], $fields[$dataCol]);                  if ($options{objectType}) {
536                      $loadAttribute->Add("lineUsed");                      $id = "$options{objectType}:$id";
537                    }
538                    # The key contains a real part and an optional sub-part. We need the real part.
539                    my ($realKey, $subKey) = $self->SplitKey($key);
540                    # Now we need to check for a new key.
541                    if (! exists $keyHash{$realKey}) {
542                        my $keyObject = $self->GetEntity(AttributeKey => $realKey);
543                        if (! defined($keyObject)) {
544                            # Here the specified key does not exist, which is an error.
545                            my $line = $retVal->Ask('linesIn');
546                            Confess("Attribute \"$realKey\" on line $line of $fileName not found in database.");
547                        } else {
548                            # Make sure we know this is no longer a new key.
549                            $keyHash{$realKey} = 1;
550                            $retVal->Add(keys => 1);
551                            # If this is NOT append mode, erase the key. This does not delete the key
552                            # itself; it just clears out all the values.
553                            if (! $append) {
554                                my $startTime = time();
555                                $self->EraseAttribute($realKey);
556                                $eraseTime += time() - $startTime;
557                                Trace("Attribute $realKey erased.") if T(3);
558                            }
559                        }
560                        Trace("Key $realKey found.") if T(3);
561                    }
562                    # If we're in resume mode, check to see if this insert is redundant.
563                    my $ok = 1;
564                    if ($resume ne 'none') {
565                        my $startTime = time();
566                        my $count = $self->GetAttributes($id, $key, @values);
567                        if ($count) {
568                            # Here the record is found, so we skip it.
569                            $ok = 0;
570                            $retVal->Add(skipped => 1);
571                        } else {
572                            # Here the record is not found. If we're in non-careful mode, we
573                            # stop resume checking at this point.
574                            if ($resume ne 'careful') {
575                                $resume = 'none';
576                            }
577                        }
578                        $checkTime += time() - $startTime;
579                    }
580                    if ($ok) {
581                        # We're in business. First, archive this row.
582                        if (defined $ah) {
583                            my $startTime = time();
584                            Tracer::PutLine($ah, [$id, $key, @values]);
585                            $archiveTime += time() - $startTime;
586                        }
587                        # We need to format the attribute data so it will work
588                        # as if it were a load file. This means we join the
589                        # values.
590                        my $valueString = join('::', @values);
591                        # Everything is all set up, so put the value in the temporary file and
592                        # count it.
593                        my $startTime = time();
594                        Tracer::PutLine($th, [$realKey, $id, $subKey, $valueString]);
595                        $archiveTime += time() - $startTime;
596                        $retVal->Add(linesOut => 1);
597                        # Check to see if it's time to output a chunk.
598                        $chunkLinesLeft--;
599                        if ($chunkLinesLeft <= 0) {
600                            close $th;
601                            # Now we load the table from the file. Note that we don't do an analyze.
602                            # The analyze is done only after loading the residual.
603                            my $startTime = time();
604                            Trace("Loading attributes from $tempFileName: " . (-s $tempFileName) .
605                                  " characters.") if T(3);
606                            my $loadStats = $self->LoadTable($tempFileName, 'HasValueFor',
607                                                             mode => $options{mode}, partial => 1);
608                            $retVal->Add(insertTime => time() - $startTime);
609                            # Re-open the file and restart the counter.
610                            $th = Tracer::Open(undef, ">$tempFileName");
611                            $chunkLinesLeft = $chunkSize;
612                            $retVal->Add(chunks => 1);
613                  }                  }
614                    } else {
615                        # Here we skipped because of resume mode.
616                        $retVal->Add(resumeSkip => 1);
617              }              }
618              # Finish the load.                  my $progress = $retVal->Add(values => 1);
619              $retVal = $loadAttribute->FinishAndLoad();                  Trace("$progress values processed.") if T(3) && ($progress % 1000 == 0);
620          }          }
621      }      }
622      # Return the statistics.          # Now we close the archive file. Note we undefine the handle so the error methods know
623            # not to worry.
624            if (defined $ah) {
625                close $ah;
626                undef $ah;
627            }
628            # Now we load the residual from the temporary file (if any). This time we'll do an
629            # analyze as well.
630            close $th;
631            my $startTime = time();
632            Trace("Loading residual attributes from $tempFileName: " . (-s $tempFileName) .
633                  " characters.") if T(3);
634            my $loadStats = $self->LoadTable($tempFileName, 'HasValueFor', mode => $options{mode}, partial => 1);
635            $retVal->Add(insertTime => time() - $startTime);
636            $retVal->Add(chunks => 1);
637            Trace("Attribute load successful.") if T(2);
638        };
639        # Check for an error.
640        if ($@) {
641            # Here we have an error. Display the error message.
642            my $message = $@;
643            Trace("Error during attribute load: $message") if T(0);
644            $retVal->AddMessage($message);
645            # Close the archive file if it's open. The archive file can sometimes provide
646            # clues as to what happened.
647            if (defined $ah) {
648                close $ah;
649            }
650        }
651        # Store the timers.
652        $retVal->Add(eraseTime   => $eraseTime);
653        $retVal->Add(insertTime  => $insertTime);
654        $retVal->Add(archiveTime => $archiveTime);
655        $retVal->Add(checkTime   => $checkTime);
656        # Return the result.
657      return $retVal;      return $retVal;
658  }  }
659    
660    =head3 BackupKeys
661    
662  =head3 DeleteAttributeKey      my $stats = $attrDB->BackupKeys($fileName, %options);
   
 C<< CustomAttributes::DeleteAttributeKey($entityName, $attributeName); >>  
663    
664  Delete an attribute from the custom attributes database.  Backup the attribute key information from the attribute database.
665    
666  =over 4  =over 4
667    
668  =item entityName  =item fileName
669    
670  Name of the entity possessing the attribute.  Name of the output file.
671    
672  =item attributeName  =item options
673    
674  Name of the attribute to delete.  Options for modifying the backup process.
675    
676    =item RETURN
677    
678    Returns a statistics object for the backup.
679    
680  =back  =back
681    
682    Currently there are no options. The backup is straight to a text file in
683    tab-delimited format. Each key is backup up to two lines. The first line
684    is all of the data from the B<AttributeKey> table. The second is a
685    tab-delimited list of all the groups.
686    
687  =cut  =cut
688    
689  sub DeleteAttributeKey {  sub BackupKeys {
690        # Get the parameters.
691        my ($self, $fileName, %options) = @_;
692        # Declare the return variable.
693        my $retVal = Stats->new();
694        # Open the output file.
695        my $fh = Open(undef, ">$fileName");
696        # Set up to read the keys.
697        my $keyQuery = $self->Get(['AttributeKey'], "", []);
698        # Loop through the keys.
699        while (my $keyData = $keyQuery->Fetch()) {
700            $retVal->Add(key => 1);
701            # Get the fields.
702            my ($id, $type, $description) = $keyData->Values(['AttributeKey(id)', 'AttributeKey(data-type)',
703                                                              'AttributeKey(description)']);
704            # Escape any tabs or new-lines in the description.
705            my $escapedDescription = Tracer::Escape($description);
706            # Write the key data to the output.
707            Tracer::PutLine($fh, [$id, $type, $escapedDescription]);
708            # Get the key's groups.
709            my @groups = $self->GetFlat(['IsInGroup'], "IsInGroup(from-link) = ?", [$id],
710                                        'IsInGroup(to-link)');
711            $retVal->Add(memberships => scalar(@groups));
712            # Write them to the output. Note we put a marker at the beginning to insure the line
713            # is nonempty.
714            Tracer::PutLine($fh, ['#GROUPS', @groups]);
715        }
716        # Log the operation.
717        $self->LogOperation("Backup Keys", $fileName, $retVal->Display());
718        # Return the result.
719        return $retVal;
720    }
721    
722    =head3 RestoreKeys
723    
724        my $stats = $attrDB->RestoreKeys($fileName, %options);
725    
726    Restore the attribute keys and groups from a backup file.
727    
728    =over 4
729    
730    =item fileName
731    
732    Name of the file containing the backed-up keys. Each key has a pair of lines,
733    one containing the key data and one listing its groups.
734    
735    =back
736    
737    =cut
738    
739    sub RestoreKeys {
740      # Get the parameters.      # Get the parameters.
741      my ($entityName, $attributeName) = @_;      my ($self, $fileName, %options) = @_;
742      # Read in the XML for the database defintion. We need to verify that      # Declare the return variable.
743      # the named entity exists and it has the named attribute.      my $retVal = Stats->new();
744      my $metadata = ERDB::ReadMetaXML($FIG_Config::attrDBD);      # Set up a hash to hold the group IDs.
745      my $entityHash = $metadata->{Entities};      my %groups = ();
746      if (! exists $entityHash->{$entityName}) {      # Open the file.
747          Confess("Entity \"$entityName\" not found.");      my $fh = Open(undef, "<$fileName");
748      } else {      # Loop until we're done.
749          # Get the field hash.      while (! eof $fh) {
750          my $fieldHash = ERDB::GetEntityFieldHash($metadata, $entityName);          # Get a key record.
751          if (! exists $fieldHash->{$attributeName}) {          my ($id, $dataType, $description) = Tracer::GetLine($fh);
752              Confess("Attribute key \"$attributeName\" not found in entity $entityName.");          if ($id eq '#GROUPS') {
753          } else {              Confess("Group record found when key record expected.");
754              # Get the attribute's relation name.          } elsif (! defined($description)) {
755              my $relName = $fieldHash->{$attributeName}->{relation};              Confess("Invalid format found for key record.");
756              # Check for an index.          } else {
757              my $indexIdx = ERDB::FindIndexForEntity($metadata, $entityName, $attributeName);              $retVal->Add("keyIn" => 1);
758              if (defined($indexIdx)) {              # Add this key to the database.
759                  Trace("Index for $attributeName found at position $indexIdx for $entityName.") if T(3);              $self->InsertObject('AttributeKey', { id => $id, 'data-type' => $dataType,
760                  delete $entityHash->{$entityName}->{Indexes}->[$indexIdx];                                                    description => Tracer::UnEscape($description) });
761              }              Trace("Attribute $id stored.") if T(3);
762              # Delete the attribute from the field hash.              # Get the group line.
763              Trace("Deleting attribute $attributeName from $entityName.") if T(3);              my ($marker, @groups) = Tracer::GetLine($fh);
764              delete $fieldHash->{$attributeName};              if (! defined($marker)) {
765              # Write the XML back out.                  Confess("End of file found where group record expected.");
766              ERDB::WriteMetaXML($metadata, $FIG_Config::attrDBD);              } elsif ($marker ne '#GROUPS') {
767              # Insure the relation does not exist in the database. This requires connecting                  Confess("Group record not found after key record.");
768              # since we may have to do a table drop.              } else {
769              my $attrDB = CustomAttributes->new();                  $retVal->Add(memberships => scalar(@groups));
770              Trace("Dropping table $relName.") if T(3);                  # Connect the groups.
771              $attrDB->DropRelation($relName);                  for my $group (@groups) {
772                        # Find out if this is a new group.
773                        if (! $groups{$group}) {
774                            $retVal->Add(newGroup => 1);
775                            # Add the group.
776                            $self->InsertObject('AttributeGroup', { id => $group });
777                            Trace("Group $group created.") if T(3);
778                            # Make sure we know it's not new.
779                            $groups{$group} = 1;
780                        }
781                        # Connect the group to our key.
782                        $self->InsertObject('IsInGroup', { 'from-link' => $id, 'to-link' => $group });
783                    }
784                    Trace("$id added to " . scalar(@groups) . " groups.") if T(3);
785          }          }
786      }      }
787  }  }
788        # Log the operation.
789        $self->LogOperation("Backup Keys", $fileName, $retVal->Display());
790        # Return the result.
791        return $retVal;
792    }
793    
794  =head3 ControlForm  =head3 ArchiveFileName
795    
796  C<< my $formHtml = $attrDB->ControlForm($cgi, $name); >>      my $fileName = $ca->ArchiveFileName();
797    
798  Return a form that can be used to control the creation and modification of  Compute a file name for archiving attribute input data. The file will be in the attribute log directory
799  attributes.  
800    =cut
801    
802    sub ArchiveFileName {
803        # Get the parameters.
804        my ($self) = @_;
805        # Declare the return variable.
806        my $retVal;
807        # We start by turning the timestamp into something usable as a file name.
808        my $now = Tracer::Now();
809        $now =~ tr/ :\//___/;
810        # Next we get the directory name.
811        my $dir = "$FIG_Config::var/attributes";
812        if (! -e $dir) {
813            Trace("Creating attribute file directory $dir.") if T(1);
814            mkdir $dir;
815        }
816        # Put it together with the field name and the time stamp.
817        $retVal = "$dir/upload.$now";
818        # Modify the file name to insure it's unique.
819        my $seq = 0;
820        while (-e "$retVal.$seq.tbl") { $seq++ }
821        # Use the computed sequence number to get the correct file name.
822        $retVal .= ".$seq.tbl";
823        # Return the result.
824        return $retVal;
825    }
826    
827    =head3 BackupAllAttributes
828    
829        my $stats = $attrDB->BackupAllAttributes($fileName, %options);
830    
831    Backup all of the attributes to a file. The attributes will be stored in a
832    tab-delimited file suitable for reloading via L</LoadAttributesFrom>.
833    
834  =over 4  =over 4
835    
836  =item cgi  =item fileName
837    
838  CGI query object used to create HTML.  Name of the file to which the attribute data should be backed up.
839    
840  =item name  =item options
841    
842  Name to give to the form. This should be unique for the web page.  Hash of options for the backup.
843    
844  =item RETURN  =item RETURN
845    
846  Returns the HTML for a form that submits instructions to the C<Attributes.cgi> script  Returns a statistics object describing the backup.
 for loading, creating, or deleting an attribute.  
847    
848  =back  =back
849    
850    Currently there are no options defined.
851    
852  =cut  =cut
853    
854  sub ControlForm {  sub BackupAllAttributes {
855      # Get the parameters.      # Get the parameters.
856      my ($self, $cgi, $name) = @_;      my ($self, $fileName, %options) = @_;
857      # Declare the return list.      # Declare the return variable.
858      my @retVal = ();      my $retVal = Stats->new();
859      # Start the form. We use multipart to support the upload control.      # Get a list of the keys.
860      push @retVal, $cgi->start_multipart_form(-name => $name);      my @keys = $self->GetFlat(['AttributeKey'], "", [], 'AttributeKey(id)');
861      # We'll put the controls in a table. Nothing else ever seems to look nice.      Trace(scalar(@keys) . " keys found during backup.") if T(2);
862      push @retVal, $cgi->start_table({ border => 2, cellpadding => 2 });      # Open the file for output.
863      # The first row is for selecting the field name.      my $fh = Open(undef, ">$fileName");
864      push @retVal, $cgi->Tr($cgi->th("Select a Field"),      # Loop through the keys.
865                             $cgi->td($self->FieldMenu($cgi, 10, 'fieldName', 1,      for my $key (@keys) {
866                                                       "document.$name.notes.value",          Trace("Backing up attribute $key.") if T(3);
867                                                       "document.$name.dataType.value")));          $retVal->Add(keys => 1);
868      # Now we set up a dropdown for the data types. The values will be the          # Loop through this key's values.
869      # data type names, and the labels will be the descriptions.          my $query = $self->Get(['HasValueFor'], "HasValueFor(from-link) = ?", [$key]);
870      my %types = ERDB::GetDataTypes();          my $valuesFound = 0;
871      my %labelMap = map { $_ => $types{$_}->{notes} } keys %types;          while (my $line = $query->Fetch()) {
872      my $typeMenu = $cgi->popup_menu(-name   => 'dataType',              $valuesFound++;
873                                      -values => [sort keys %types],              # Get this row's data.
874                                      -labels => \%labelMap);              my ($id, $key, $subKey, $value) = $line->Values(['HasValueFor(to-link)',
875      push @retVal, $cgi->Tr($cgi->th("Data type"),                                                               'HasValueFor(from-link)',
876                             $cgi->td($typeMenu));                                                               'HasValueFor(subkey)',
877      # The next row is for the notes.                                                               'HasValueFor(value)']);
878      push @retVal, $cgi->Tr($cgi->th("Description"),              # Check for a subkey.
879                             $cgi->td($cgi->textarea(-name => 'notes',              if ($subKey ne '') {
880                                                     -rows => 6,                  $key = "$key$self->{splitter}$subKey";
881                                                     -columns => 80))              }
882                            );              # Write it to the file.
883      # Allow the user to specify a new field name. This is required if the              Tracer::PutLine($fh, [$id, $key, $value]);
884      # user has selected one of the "(new)" markers.          }
885      push @retVal, $cgi->Tr($cgi->th("New Field Name"),          Trace("$valuesFound values backed up for key $key.") if T(3);
886                             $cgi->td($cgi->textfield(-name => 'newName',          $retVal->Add(values => $valuesFound);
887                                                      -size => 30)),      }
888                                      );      # Log the operation.
889      # If the user wants to upload new values for the field, then we have      $self->LogOperation("Backup Data", $fileName, $retVal->Display());
890      # an upload file name and column indicators.      # Return the result.
891      push @retVal, $cgi->Tr($cgi->th("Upload Values"),      return $retVal;
                            $cgi->td($cgi->filefield(-name => 'newValueFile',  
                                                     -size => 20) .  
                                     " Key&nbsp;" .  
                                     $cgi->textfield(-name => 'keyCol',  
                                                     -size => 3,  
                                                     -default => 0) .  
                                     " Value&nbsp;" .  
                                     $cgi->textfield(-name => 'valueCol',  
                                                     -size => 3,  
                                                     -default => 1)  
                                    ),  
                           );  
     # Now the three buttons: UPDATE, SHOW, and DELETE.  
     push @retVal, $cgi->Tr($cgi->th("&nbsp;"),  
                            $cgi->td({align => 'center'},  
                                     $cgi->submit(-name => 'Delete', -value => 'DELETE') . " " .  
                                     $cgi->submit(-name => 'Store',  -value => 'STORE') . " " .  
                                     $cgi->submit(-name => 'Show',   -value => 'SHOW')  
                                    )  
                           );  
     # Close the table and the form.  
     push @retVal, $cgi->end_table();  
     push @retVal, $cgi->end_form();  
     # Return the assembled HTML.  
     return join("\n", @retVal, "");  
892  }  }
893    
894  =head3 FieldMenu  =head3 FieldMenu
895    
896  C<< my $menuHtml = $attrDB->FieldMenu($cgi, $height, $name, $newFlag, $noteControl, $typeControl); >>      my $menuHtml = $attrDB->FieldMenu($cgi, $height, $name, $keys, %options);
897    
898  Return the HTML for a menu to select an attribute field. The menu will  Return the HTML for a menu to select an attribute field. The menu will
899  be a standard SELECT/OPTION thing which is called "popup menu" in the  be a standard SELECT/OPTION thing which is called "popup menu" in the
900  CGI package, but actually looks like a list. The list will contain  CGI package, but actually looks like a list. The list will contain
901  one selectable row per field, grouped by entity.  one selectable row per field.
902    
903  =over 4  =over 4
904    
# Line 545  Line 915 
915  Name to give to the menu. This is the name under which the value will  Name to give to the menu. This is the name under which the value will
916  appear when the form is submitted.  appear when the form is submitted.
917    
918  =item newFlag (optional)  =item keys
919    
920    Reference to a hash mapping each attribute key name to a list reference,
921    the list itself consisting of the attribute data type, its description,
922    and a list of its groups.
923    
924    =item options
925    
926    Hash containing options that modify the generation of the menu.
927    
928    =item RETURN
929    
930    Returns the HTML to create a form field that can be used to select an
931    attribute from the custom attributes system.
932    
933    =back
934    
935    The permissible options are as follows.
936    
937    =over 4
938    
939    =item new
940    
941  If TRUE, then extra rows will be provided to allow the user to select  If TRUE, then extra rows will be provided to allow the user to select
942  a new attribute. In other words, the user can select an existing  a new attribute. In other words, the user can select an existing
943  attribute, or can choose a C<(new)> marker to indicate a field to  attribute, or can choose a C<(new)> marker to indicate a field to
944  be created in the parent entity.  be created in the parent entity.
945    
946  =item noteControl (optional)  =item notes
947    
948  If specified, the name of a variable for displaying the notes attached  If specified, the name of a variable for displaying the notes attached
949  to the field. This must be in Javascript form ready for assignment.  to the field. This must be in Javascript form ready for assignment.
# Line 563  Line 954 
954  it is copied in. Specifying this parameter generates Javascript for  it is copied in. Specifying this parameter generates Javascript for
955  displaying the field description when a field is selected.  displaying the field description when a field is selected.
956    
957  =item typeControl (optional)  =item type
958    
959  If specified, the name of a variable for displaying the field's  If specified, the name of a variable for displaying the field's
960  data type. Data types are a much more controlled vocabulary than  data type. Data types are a much more controlled vocabulary than
# Line 571  Line 962 
962  raw value is put into the specified variable. Otherwise, the same  raw value is put into the specified variable. Otherwise, the same
963  rules apply to this value that apply to I<$noteControl>.  rules apply to this value that apply to I<$noteControl>.
964    
965  =item RETURN  =item groups
966    
967  Returns the HTML to create a form field that can be used to select an  If specified, the name of a multiple-selection list control (also called
968  attribute from the custom attributes system.  a popup menu) which shall be used to display the selected groups.
969    
970  =back  =back
971    
# Line 582  Line 973 
973    
974  sub FieldMenu {  sub FieldMenu {
975      # Get the parameters.      # Get the parameters.
976      my ($self, $cgi, $height, $name, $newFlag, $noteControl, $typeControl) = @_;      my ($self, $cgi, $height, $name, $keys, %options) = @_;
977      # These next two hashes make everything happen. "entities"      # Reformat the list of keys.
978      # maps each entity name to the list of values to be put into its      my %keys = %{$keys};
979      # option group. "labels" maps each entity name to a map from values      # Add the (new) key, if needed.
980      # to labels.      if ($options{new}) {
981      my @entityNames = sort ($self->GetEntityTypes());          $keys{NewName()} = ["string", ""];
982      my %entities = map { $_ => [] } @entityNames;      }
983      my %labels = map { $_ => { }} @entityNames;      # Get a sorted list of key.
984      # Loop through the entities, adding the existing attributes.      my @keys = sort keys %keys;
985      for my $entity (@entityNames) {      # We need to create the name for the onChange function. This function
         # Get this entity's field table.  
         my $fieldHash = $self->GetFieldTable($entity);  
         # Get its field list in our local hashes.  
         my $fieldList = $entities{$entity};  
         my $labelList = $labels{$entity};  
         # Add the NEW fields if we want them.  
         if ($newFlag) {  
             push @{$fieldList}, $entity;  
             $labelList->{$entity} = "(new)";  
         }  
         # Loop through the fields in the hash. We only keep the ones with a  
         # secondary relation name. (In other words, the name of the relation  
         # in which the field appears cannot be the same as the entity name.)  
         for my $fieldName (sort keys %{$fieldHash}) {  
             if ($fieldHash->{$fieldName}->{relation} ne $entity) {  
                 my $value = "$entity/$fieldName";  
                 push @{$fieldList}, $value;  
                 $labelList->{$value} = $fieldName;  
             }  
         }  
     }  
     # Now we have a hash and a list for each entity, and they correspond  
     # exactly to what the $cgi->optgroup function expects.  
     # The last step is to create the name for the onChange function. This function  
986      # may not do anything, but we need to know the name to generate the HTML      # may not do anything, but we need to know the name to generate the HTML
987      # for the menu.      # for the menu.
988      my $changeName = "${name}_setNotes";      my $changeName = "${name}_setNotes";
989      my $retVal = $cgi->popup_menu({name => $name,      my $retVal = $cgi->popup_menu({name => $name,
990                                     size => $height,                                     size => $height,
991                                     onChange => "$changeName(this.value)",                                     onChange => "$changeName(this.value)",
992                                     values => [map { $cgi->optgroup(-name => $_,                                     values => \@keys,
993                                                                     -values => $entities{$_},                                    });
                                                                    -labels => $labels{$_})  
                                                   } @entityNames]}  
                                  );  
994      # Create the change function.      # Create the change function.
995      $retVal .= "\n<script language=\"javascript\">\n";      $retVal .= "\n<script language=\"javascript\">\n";
996      $retVal .= "    function $changeName(fieldValue) {\n";      $retVal .= "    function $changeName(fieldValue) {\n";
997      # The function only has a body if we have a notes control to store the description.      # The function only has a body if we have a control to store data about the
998      if ($noteControl || $typeControl) {      # attribute.
999        if ($options{notes} || $options{type} || $options{groups}) {
1000          # Check to see if we're storing HTML or text into the note control.          # Check to see if we're storing HTML or text into the note control.
1001            my $noteControl = $options{notes};
1002          my $htmlMode = ($noteControl && $noteControl =~ /innerHTML$/);          my $htmlMode = ($noteControl && $noteControl =~ /innerHTML$/);
1003          # We use a CASE statement based on the newly-selected field value. The          # We use a CASE statement based on the newly-selected field value. The
1004          # field description will be stored in the JavaScript variable "myText"          # field description will be stored in the JavaScript variable "myText"
# Line 641  Line 1007 
1007          $retVal .= "        var myText = \"\";\n";          $retVal .= "        var myText = \"\";\n";
1008          $retVal .= "        var myType = \"string\";\n";          $retVal .= "        var myType = \"string\";\n";
1009          $retVal .= "        switch (fieldValue) {\n";          $retVal .= "        switch (fieldValue) {\n";
1010          # Loop through the entities.          # Loop through the keys.
1011          for my $entity (@entityNames) {          for my $key (@keys) {
             # Get the entity's field hash. This has the notes in it.  
             my $fieldHash = $self->GetFieldTable($entity);  
             # Loop through the values we might see for this entity's fields.  
             my $fields = $entities{$entity};  
             for my $value (@{$fields}) {  
                 # Only proceed if we have an existing field.  
                 if ($value =~ m!/(.+)$!) {  
                     # Get the field's hash element.  
                     my $element = $fieldHash->{$1};  
1012                      # Generate this case.                      # Generate this case.
1013                      $retVal .= "        case \"$value\" :\n";              $retVal .= "        case \"$key\" :\n";
1014                      # Here we either want to update the note display, the                      # Here we either want to update the note display, the
1015                      # type display, or both.              # type display, the group list, or a combination of them.
1016                my ($type, $notes, @groups) = @{$keys{$key}};
1017                      if ($noteControl) {                      if ($noteControl) {
                         # Here we want the notes updated.  
                         my $notes = $element->{Notes}->{content};  
1018                          # Insure it's in the proper form.                          # Insure it's in the proper form.
1019                          if ($htmlMode) {                          if ($htmlMode) {
1020                              $notes = ERDB::HTMLNote($notes);                              $notes = ERDB::HTMLNote($notes);
# Line 668  Line 1024 
1024                          $notes =~ s/"/\\"/g;                          $notes =~ s/"/\\"/g;
1025                          $retVal .= "           myText = \"$notes\";\n";                          $retVal .= "           myText = \"$notes\";\n";
1026                      }                      }
1027                      if ($typeControl) {              if ($options{type}) {
1028                          # Here we want the type updated.                          # Here we want the type updated.
                         my $type = $element->{type};  
1029                          $retVal .= "           myType = \"$type\";\n";                          $retVal .= "           myType = \"$type\";\n";
1030                      }                      }
1031                if ($options{groups}) {
1032                    # Here we want the groups shown. Get a list of this attribute's groups.
1033                    # We'll search through this list for each group to see if it belongs with
1034                    # our attribute.
1035                    my $groupLiteral = "=" . join("=", @groups) . "=";
1036                    # Now we need some variables containing useful code for the javascript. It's
1037                    # worth knowing we go through a bit of pain to insure $groupField[i] isn't
1038                    # parsed as an array element.
1039                    my $groupField = $options{groups};
1040                    my $currentField = $groupField . "[i]";
1041                    # Do the javascript.
1042                    $retVal .= "           var groupList = \"$groupLiteral\";\n";
1043                    $retVal .= "           for (var i = 0; i < $groupField.length; i++) {\n";
1044                    $retVal .= "              var srchString = \"=\" + $currentField.value + \"=\";\n";
1045                    $retVal .= "              var srchLoc = groupList.indexOf(srchString);\n";
1046                    $retVal .= "              $currentField.checked = (srchLoc >= 0);\n";
1047                    $retVal .= "           }\n";
1048                }
1049                      # Close this case.                      # Close this case.
1050                      $retVal .= "           break;\n";                      $retVal .= "           break;\n";
1051                  }                  }
             }  
         }  
1052          # Close the CASE statement and make the appropriate assignments.          # Close the CASE statement and make the appropriate assignments.
1053          $retVal .= "        }\n";          $retVal .= "        }\n";
1054          if ($noteControl) {          if ($noteControl) {
1055              $retVal .= "        $noteControl = myText;\n";              $retVal .= "        $noteControl = myText;\n";
1056          }          }
1057          if ($typeControl) {          if ($options{type}) {
1058              $retVal .= "        $typeControl = myType;\n";              $retVal .= "        $options{type} = myType;\n";
1059          }          }
1060      }      }
1061      # Terminate the change function.      # Terminate the change function.
# Line 694  Line 1065 
1065      return $retVal;      return $retVal;
1066  }  }
1067    
1068  =head3 MatchSqlPattern  =head3 GetGroups
1069    
1070  C<< my $matched = CustomAttributes::MatchSqlPattern($value, $pattern); >>      my @groups = $attrDB->GetGroups();
1071    
1072  Determine whether or not a specified value matches an SQL pattern. An SQL  Return a list of the available groups.
 pattern has two wild card characters: C<%> that matches multiple characters,  
 and C<_> that matches a single character. These can be escaped using a  
 backslash (C<\>). We pull this off by converting the SQL pattern to a  
 PERL regular expression. As per SQL rules, the match is case-insensitive.  
1073    
1074  =over 4  =cut
1075    
1076  =item value  sub GetGroups {
1077        # Get the parameters.
1078        my ($self) = @_;
1079        # Get the groups.
1080        my @retVal = $self->GetFlat(['AttributeGroup'], "", [], 'AttributeGroup(id)');
1081        # Return them.
1082        return @retVal;
1083    }
1084    
1085    =head3 GetAttributeData
1086    
1087        my %keys = $attrDB->GetAttributeData($type, @list);
1088    
1089    Return attribute data for the selected attributes. The attribute
1090    data is a hash mapping each attribute key name to a n-tuple containing the
1091    data type, the description, and the groups. This is the same format expected in
1092    the L</FieldMenu> and L</ControlForm> methods for the list of attributes to display.
1093    
1094    =over 4
1095    
1096  Value to be matched against the pattern. Note that an undefined or empty  =item type
1097  value will not match anything.  
1098    Type of attribute criterion: C<name> for attributes whose names begin with the
1099    specified string, or C<group> for attributes in the specified group.
1100    
1101  =item pattern  =item list
1102    
1103  SQL pattern against which to match the value. An undefined or empty pattern will  List containing the names of the groups or keys for the desired attributes.
 match everything.  
1104    
1105  =item RETURN  =item RETURN
1106    
1107  Returns TRUE if the value and pattern match, else FALSE.  Returns a hash mapping each attribute key name to its data type, description, and
1108    parent groups.
1109    
1110  =back  =back
1111    
1112  =cut  =cut
1113    
1114  sub MatchSqlPattern {  sub GetAttributeData {
1115      # Get the parameters.      # Get the parameters.
1116      my ($value, $pattern) = @_;      my ($self, $type, @list) = @_;
1117      # Declare the return variable.      # Set up a hash to store the attribute data.
1118      my $retVal;      my %retVal = ();
1119      # Insure we have a pattern.      # Loop through the list items.
1120      if (! defined($pattern) || $pattern eq "") {      for my $item (@list) {
1121          $retVal = 1;          # Set up a query for the desired attributes.
1122      } else {          my $query;
1123          # Break the pattern into pieces around the wildcard characters. Because we          if ($type eq 'name') {
1124          # use parentheses in the split function's delimiter expression, we'll get              # Here we're doing a generic name search. We need to escape it and then tack
1125          # list elements for the delimiters as well as the rest of the string.              # on a %.
1126          my @pieces = split /([_%]|\\[_%])/, $pattern;              my $parm = $item;
1127          # Check some fast special cases.              $parm =~ s/_/\\_/g;
1128          if ($pattern eq '%') {              $parm =~ s/%/\\%/g;
1129              # A null pattern matches everything.              $parm .= "%";
1130              $retVal = 1;              # Ask for matching attributes. (Note that if the user passed in a null string
1131          } elsif (@pieces == 1) {              # he'll get everything.)
1132              # No wildcards, so we have a literal comparison. Note we're case-insensitive.              $query = $self->Get(['AttributeKey'], "AttributeKey(id) LIKE ?", [$parm]);
1133              $retVal = (lc($value) eq lc($pattern));          } elsif ($type eq 'group') {
1134          } elsif (@pieces == 2 && $pieces[1] eq '%') {              $query = $self->Get(['IsInGroup', 'AttributeKey'], "IsInGroup(to-link) = ?", [$item]);
             # A wildcard at the end, so we have a substring match. This is also case-insensitive.  
             $retVal = (lc(substr($value, 0, length($pieces[0]))) eq lc($pieces[0]));  
         } else {  
             # Okay, we have to do it the hard way. Convert each piece to a PERL pattern.  
             my $realPattern = "";  
             for my $piece (@pieces) {  
                 # Determine the type of piece.  
                 if ($piece eq "") {  
                     # Empty pieces are ignored.  
                 } elsif ($piece eq "%") {  
                     # Here we have a multi-character wildcard. Note that it can match  
                     # zero or more characters.  
                     $realPattern .= ".*"  
                 } elsif ($piece eq "_") {  
                     # Here we have a single-character wildcard.  
                     $realPattern .= ".";  
                 } elsif ($piece eq "\\%" || $piece eq "\\_") {  
                     # This is an escape sequence (which is a rare thing, actually).  
                     $realPattern .= substr($piece, 1, 1);  
1135                  } else {                  } else {
1136                      # Here we have raw text.              Confess("Unknown attribute query type \"$type\".");
                     $realPattern .= quotemeta($piece);  
1137                  }                  }
1138            while (my $row = $query->Fetch()) {
1139                # Get this attribute's data.
1140                my ($key, $type, $notes) = $row->Values(['AttributeKey(id)', 'AttributeKey(data-type)',
1141                                                         'AttributeKey(description)']);
1142                # If it's new, get its groups and add it to the return hash.
1143                if (! exists $retVal{$key}) {
1144                    my @groups = $self->GetFlat(['IsInGroup'], "IsInGroup(from-link) = ?",
1145                                                [$key], 'IsInGroup(to-link)');
1146                    $retVal{$key} = [$type, $notes, @groups];
1147              }              }
             # Do the match.  
             $retVal = ($value =~ /^$realPattern$/i ? 1 : 0);  
1148          }          }
1149      }      }
1150      # Return the result.      # Return the result.
1151      return $retVal;      return %retVal;
1152  }  }
1153    
1154  =head3 MigrateAttributes  =head3 LogOperation
1155    
1156  C<< CustomAttributes::MigrateAttributes($fig); >>      $ca->LogOperation($action, $target, $description);
1157    
1158  Migrate all the attributes data from the specified FIG instance. This is a long, slow  Write an operation description to the attribute activity log (C<$FIG_Config::var/attributes.log>).
 method used to convert the old attribute data to the new system. Only attribute  
 keys that are not already in the database will be loaded, and only for entity instances  
 current in the database. To get an accurate capture of the attributes in the given  
 instance, you may want to clear the database and the DBD before starting and  
 run L</Refresh> to populate the entities.  
1159    
1160  =over 4  =over 4
1161    
1162  =item fig  =item action
1163    
1164  A FIG object that can be used to retrieve attributes for migration purposes.  Action being logged (e.g. C<Delete Group> or C<Load Key>).
1165    
1166    =item target
1167    
1168    ID of the key or group affected.
1169    
1170    =item description
1171    
1172    Short description of the action.
1173    
1174  =back  =back
1175    
1176  =cut  =cut
1177    
1178  sub MigrateAttributes {  sub LogOperation {
1179      # Get the parameters.      # Get the parameters.
1180      my ($fig) = @_;      my ($self, $action, $target, $description) = @_;
1181      # Get a list of the objects to migrate. This requires connecting. Note we      # Get the user ID.
1182      # will map each entity type to a file name. The file will contain a list      my $user = $self->{user};
1183      # of the object's IDs so we can get to them when we're not connected to      # Get a timestamp.
1184      # the database.      my $timeString = Tracer::Now();
1185      my $ca = CustomAttributes->new();      # Open the log file for appending.
1186      my %objects = map { $_ => "$FIG_Config::temp/$_.keys.tbl" } $ca->GetEntityTypes();      my $oh = Open(undef, ">>$FIG_Config::var/attributes.log");
1187      # Set up hash of the existing attribute keys for each entity type.      # Write the data to it.
1188      my %oldKeys = ();      Tracer::PutLine($oh, [$timeString, $user, $action, $target, $description]);
1189      # Finally, we have a hash that counts the IDs for each entity type.      # Close the log file.
1190      my %idCounts = map { $_ => 0 } keys %objects;      close $oh;
1191      # Loop through the list, creating key files to read back in.  }
1192      for my $entityType (keys %objects) {  
1193          Trace("Retrieving keys for $entityType.") if T(2);  =head2 Internal Utility Methods
1194          # Create the key file.  
1195          my $idFile = Open(undef, ">$objects{$entityType}");  =head3 _KeywordString
1196          # Loop through the keys.  
1197          my @ids = $ca->GetFlat([$entityType], "", [], "$entityType(id)");      my $keywordString = $ca->_KeywordString($key, $value);
1198          for my $id (@ids) {  
1199              print $idFile "$id\n";  Compute the keyword string for a specified key/value pair. This consists of the
1200          }  key name and value converted to lower case with underscores translated to spaces.
1201          close $idFile;  
1202          # In addition to the key file, we must get a list of attributes already  This method is for internal use only. It is called whenever we need to update or
1203          # in the database. This avoids a circularity problem that might occur if the $fig  insert a B<HasValueFor> record.
1204          # object is retrieving from the custom attributes database already.  
1205          my %fields = $ca->GetSecondaryFields($entityType);  =over 4
1206          $oldKeys{$entityType} = \%fields;  
1207          # Finally, we have the ID count.  =item key
1208          $idCounts{$entityType} = scalar @ids;  
1209      }  Name of the relevant attribute key.
1210      # Release the custom attributes database so we can add attributes.  
1211      undef $ca;  =item target
1212      # Loop through the objects.  
1213      for my $entityType (keys %objects) {  ID of the target object to which this key/value pair will be associated.
1214          # Get a hash of all the attributes already in this database. These are  
1215          # left untouched.  =item value
1216          my $myOldKeys = $oldKeys{$entityType};  
1217          # Create a hash to control the load file names for each attribute key we find.  The value to store for this key/object combination.
1218          my %keyHash = ();  
1219          # Set up some counters so we can trace our progress.  =item RETURN
1220          my ($totalIDs, $processedIDs, $keyCount, $valueCount) = ($idCounts{$entityType}, 0, 0, 0);  
1221          # Open this object's ID file.  Returns the value that should be stored as the keyword string for the specified
1222          Trace("Migrating data for $entityType. $totalIDs found.") if T(3);  key/value pair.
1223          my $keysIn = Open(undef, "<$objects{$entityType}");  
1224          while (my $id = <$keysIn>) {  =back
1225              # Remove the EOL characters.  
1226              chomp $id;  =cut
1227              # Get this object's attributes.  
1228              my @allData = $fig->get_attributes($id);  sub _KeywordString {
1229              Trace(scalar(@allData) . " attribute values found for $entityType($id).") if T(4);      # Get the parameters.
1230              # Loop through the attribute values one at a time.      my ($self, $key, $value) = @_;
1231              for my $dataTuple (@allData) {      # Get a copy of the key name and convert underscores to spaces.
1232                  # Get the key, value, and URL. We ignore the first element because that's the      my $keywordString = $key;
1233                  # object ID, and we already know the object ID.      $keywordString =~ s/_/ /g;
1234                  my (undef, $key, $value, $url) = @{$dataTuple};      # Add the value convert it all to lower case.
1235                  # Remove the buggy "1" for $url.      my $retVal = lc "$keywordString $value";
1236                  if ($url eq "1") {      # Return the result.
1237                      $url = undef;      return $retVal;
                 }  
                 # Only proceed if this is not an old key.  
                 if (! $myOldKeys->{$key}) {  
                     # See if we've run into this key before.  
                     if (! exists $keyHash{$key}) {  
                         # Here we need to create the attribute key in the database.  
                         StoreAttributeKey($entityType, $key, 'text',  
                                           "Key migrated automatically from the FIG system. " .  
                                           "Please replace these notes as soon as possible " .  
                                           "with useful text."  
                                          );  
                         # Compute the attribute's load file name and open it for output.  
                         my $fileName = "$FIG_Config::temp/$entityType.$key.load.tbl";  
                         my $fh = Open(undef, ">$fileName");  
                         # Store the file name and handle.  
                         $keyHash{$key} = {h => $fh, name => $fileName};  
                         # Count this key.  
                         $keyCount++;  
                     }  
                     # Smash the value and the URL together.  
                     if (defined($url) && length($url) > 0) {  
                         $value .= "::$url";  
                     }  
                     # Write the attribute value to the load file.  
                     Tracer::PutLine($keyHash{$key}->{h}, [$id, $value]);  
                     $valueCount++;  
                 }  
             }  
             # Now we've finished all the attributes for this object. Count and trace it.  
             $processedIDs++;  
             if ($processedIDs % 500 == 0) {  
                 Trace("$processedIDs of $totalIDs ${entityType}s processed.") if T(3);  
                 Trace("$entityType has $keyCount keys and $valueCount values so far.") if T(3);  
             }  
         }  
         # Now we've finished all the attributes for all objects of this type.  
         Trace("$processedIDs ${entityType}s processed, with $keyCount keys and $valueCount values.") if T(2);  
         # Loop through the files, loading the keys into the database.  
         Trace("Connecting to database.") if T(2);  
         my $objectCA = CustomAttributes->new();  
         Trace("Loading key files.") if T(2);  
         for my $key (sort keys %keyHash) {  
             # Close the key's load file.  
             close $keyHash{$key}->{h};  
             # Reopen it for input.  
             my $fileName = $keyHash{$key}->{name};  
             my $fh = Open(undef, "<$fileName");  
             Trace("Loading $key from $fileName.") if T(3);  
             my $stats = $objectCA->LoadAttributeKey($entityType, $key, $fh, 0, 1);  
             Trace("Statistics for $key of $entityType:\n" . $stats->Show()) if T(3);  
         }  
         # All the keys for this entity type are now loaded.  
         Trace("Key files loaded for $entityType.") if T(2);  
     }  
     # All keys for all entity types are now loaded.  
     Trace("Migration complete.") if T(2);  
1238  }  }
1239    
1240  =head3 ComputeObjectTypeFromID  =head3 _QueryResults
1241    
1242  C<< my ($entityName, $id) = CustomAttributes::ComputeObjectTypeFromID($objectID); >>      my @attributeList = $attrDB->_QueryResults($query, @values);
1243    
1244  This method will compute the entity type corresponding to a specified object ID.  Match the results of a B<HasValueFor> query against value criteria and return
1245  If the object ID begins with C<fig|>, it is presumed to be a feature ID. If it  the results. This is an internal method that splits the values coming back
1246  is all digits with a single period, it is presumed to by a genome ID. Otherwise,  and matches the sections against the specified section patterns. It serves
1247  it must be a list reference. In this last case the first list element will be  as the back end to L</GetAttributes> and L</FindAttributes>.
 taken as the entity type and the second will be taken as the actual ID.  
1248    
1249  =over 4  =over 4
1250    
1251  =item objectID  =item query
1252    
1253  Object ID to examine.  A query object that will return the desired B<HasValueFor> records.
1254    
1255    =item values
1256    
1257    List of the desired attribute values, section by section. If C<undef>
1258    or an empty string is specified, all values in that section will match. A
1259    generic match can be requested by placing a percent sign (C<%>) at the end.
1260    In that case, all values that match up to and not including the percent sign
1261    will match. You may also specify a regular expression enclosed
1262    in slashes. All values that match the regular expression will be returned. For
1263    performance reasons, only values have this extra capability.
1264    
1265  =item RETURN  =item RETURN
1266    
1267  Returns a 2-element list consisting of the entity type followed by the specified ID.  Returns a list of tuples. The first element in the tuple is an object ID, the
1268    second is an attribute key, and the remaining elements are the sections of
1269    the attribute value. All of the tuples will match the criteria set forth in
1270    the parameter list.
1271    
1272  =back  =back
1273    
1274  =cut  =cut
1275    
1276  sub ComputeObjectTypeFromID {  sub _QueryResults {
1277      # Get the parameters.      # Get the parameters.
1278      my ($objectID) = @_;      my ($self, $query, @values) = @_;
1279      # Declare the return variables.      # Declare the return value.
1280      my ($entityName, $id);      my @retVal = ();
1281      # Only proceed if the object ID is defined. If it's not, we'll be returning a      # Get the number of value sections we have to match.
1282      # pair of undefs.      my $sectionCount = scalar(@values);
1283      if ($objectID) {      # Loop through the assignments found.
1284          if (ref $objectID eq 'ARRAY') {      while (my $row = $query->Fetch()) {
1285              # Here we have the new-style list reference. Pull out its pieces.          # Get the current row's data.
1286              ($entityName, $id) = @{$objectID};          my ($id, $realKey, $subKey, $valueString) = $row->Values(['HasValueFor(to-link)',
1287          } else {                                                                    'HasValueFor(from-link)',
1288              # Here the ID is the outgoing ID, and we need to look at its structure                                                                    'HasValueFor(subkey)',
1289              # to determine the entity type.                                                                    'HasValueFor(value)'
1290              $id = $objectID;                                                                  ]);
1291              if ($objectID =~ /^\d+\.\d+/) {          # Form the key from the real key and the sub key.
1292                  # Digits with a single period is a genome.          my $key = $self->JoinKey($realKey, $subKey);
1293                  $entityName = 'Genome';          # Break the value into sections.
1294              } elsif ($objectID =~ /^fig\|/) {          my @sections = split($self->{splitter}, $valueString);
1295                  # The "fig|" prefix indicates a feature.          # Match each section against the incoming values. We'll assume we're
1296                  $entityName = 'Feature';          # okay unless we learn otherwise.
1297            my $matching = 1;
1298            for (my $i = 0; $i < $sectionCount && $matching; $i++) {
1299                # We need to check to see if this section is generic.
1300                my $value = $values[$i];
1301                Trace("Current value pattern is \"$value\".") if T(4);
1302                if (substr($value, -1, 1) eq '%') {
1303                    Trace("Generic match used.") if T(4);
1304                    # Here we have a generic match.
1305                    my $matchLen = length($values[$i]) - 1;
1306                    $matching = substr($sections[$i], 0, $matchLen) eq
1307                                substr($values[$i], 0, $matchLen);
1308                } elsif ($value =~ m#^/(.+)/[a-z]*$#) {
1309                    Trace("Regular expression detected.") if T(4);
1310                    # Here we have a regular expression match.
1311                    my $section = $sections[$i];
1312                    $matching = eval("\$section =~ $value");
1313              } else {              } else {
1314                  # Anything else is illegal!                  # Here we have a strict match.
1315                  Confess("Invalid attribute ID specification \"$objectID\".");                  Trace("Strict match used.") if T(4);
1316                    $matching = ($sections[$i] eq $values[$i]);
1317              }              }
1318          }          }
1319            # If we match, output this row to the return list.
1320            if ($matching) {
1321                push @retVal, [$id, $key, @sections];
1322      }      }
1323      # Return the result.      }
1324      return ($entityName, $id);      # Return the rows found.
1325        return @retVal;
1326  }  }
1327    
1328  =head2 FIG Method Replacements  =head2 FIG Method Replacements
1329    
1330  The following methods are used by B<FIG.pm> to replace the previous attribute functionality.  The following methods are used by B<FIG.pm> to replace the previous attribute functionality.
1331  Some of the old functionality is no longer present. Controlled vocabulary is no longer  Some of the old functionality is no longer present: controlled vocabulary is no longer
1332  supported and there is no longer any searching by URL. Fortunately, neither of these  supported and there is no longer any searching by URL. Fortunately, neither of these
1333  capabilities were used in the old system.  capabilities were used in the old system.
1334    
# Line 982  Line 1336 
1336  The idea is that these methods represent attribute manipulation allowed by all users, while  The idea is that these methods represent attribute manipulation allowed by all users, while
1337  the others are only for privileged users with access to the attribute server.  the others are only for privileged users with access to the attribute server.
1338    
1339  In the previous implementation, an attribute had a value and a URL. In the new implementation,  In the previous implementation, an attribute had a value and a URL. In this implementation,
1340  there is only a value. In this implementation, each attribute has only a value. These  each attribute has only a value. These methods will treat the value as a list with the individual
1341  methods will treat the value as a list with the individual elements separated by the  elements separated by the value of the splitter parameter on the constructor (L</new>). The default
1342  value of the splitter parameter on the constructor (L</new>). The default is double  is double colons C<::>.
 colons C<::>.  
1343    
1344  So, for example, an old-style keyword with a /value of C<essential> and a URL of  So, for example, an old-style keyword with a value of C<essential> and a URL of
1345  C<http://www.sciencemag.org/cgi/content/abstract/293/5538/2266> using the default  C<http://www.sciencemag.org/cgi/content/abstract/293/5538/2266> using the default
1346  splitter value would be stored as  splitter value would be stored as
1347    
# Line 999  Line 1352 
1352    
1353  =head3 GetAttributes  =head3 GetAttributes
1354    
1355  C<< my @attributeList = $attrDB->GetAttributes($objectID, $key, @valuePatterns); >>      my @attributeList = $attrDB->GetAttributes($objectID, $key, @values);
1356    
1357  In the database, attribute values are sectioned into pieces using a splitter  In the database, attribute values are sectioned into pieces using a splitter
1358  value specified in the constructor (L</new>). This is not a requirement of  value specified in the constructor (L</new>). This is not a requirement of
1359  the attribute system as a whole, merely a convenience for the purpose of  the attribute system as a whole, merely a convenience for the purpose of
1360  these methods. If you are using the static method calls instead of the  these methods. If a value has multiple sections, each section
1361  object-based calls, the splitter will always be the default value of  is matched against the corresponding criterion in the I<@valuePatterns> list.
 double colons (C<::>). If a value has multiple sections, each section  
 is matched against the correspond criterion in the I<@valuePatterns> list.  
1362    
1363  This method returns a series of tuples that match the specified criteria. Each tuple  This method returns a series of tuples that match the specified criteria. Each tuple
1364  will contain an object ID, a key, and one or more values. The parameters to this  will contain an object ID, a key, and one or more values. The parameters to this
1365  method therefore correspond structurally to the values expected in each tuple.  method therefore correspond structurally to the values expected in each tuple. In
1366    addition, you can ask for a generic search by suffixing a percent sign (C<%>) to any
1367    of the parameters. So, for example,
1368    
1369      my @attributeList = GetAttributes('fig|100226.1.peg.1004', 'structure%', 1, 2);      my @attributeList = $attrDB->GetAttributes('fig|100226.1.peg.1004', 'structure%', 1, 2);
1370    
1371  would return something like  would return something like
1372    
# Line 1022  Line 1375 
1375      ['fig}100226.1.peg.1004', 'structure2', 1, 2]      ['fig}100226.1.peg.1004', 'structure2', 1, 2]
1376      ['fig}100226.1.peg.1004', 'structureA', 1, 2]      ['fig}100226.1.peg.1004', 'structureA', 1, 2]
1377    
1378  Use of C<undef> in any position acts as a wild card (all values). In addition,  Use of C<undef> in any position acts as a wild card (all values). You can also specify
1379  the I<$key> and I<@valuePatterns> parameters can contain SQL pattern characters: C<%>, which  a list reference in the ID column. Thus,
1380  matches any sequence of characters, and C<_>, which matches any single character.  
1381  (You can use an escape sequence C<\%> or C<\_> to match an actual percent sign or      my @attributeList = $attrDB->GetAttributes(['100226.1', 'fig|100226.1.%'], 'PUBMED');
1382  underscore.)  
1383    would get the PUBMED attribute data for Streptomyces coelicolor A3(2) and all its
1384    features.
1385    
1386  In addition to values in multiple sections, a single attribute key can have multiple  In addition to values in multiple sections, a single attribute key can have multiple
1387  values, so even  values, so even
1388    
1389      my @attributeList = GetAttributes($peg, 'virulent');      my @attributeList = $attrDB->GetAttributes($peg, 'virulent');
1390    
1391  which has no wildcard in the key or the object ID, may return multiple tuples.  which has no wildcard in the key or the object ID, may return multiple tuples.
1392    
1393  For reasons of backward compatability, we examine the structure of the object ID to  Value matching in this system works very poorly, because of the way multiple values are
1394  determine the entity type. In that case the only two types allowed are C<Genome> and  stored. For the object ID, key name, and first value, we create queries that filter for the
1395  C<Feature>. An alternative method is to use a list reference, with the list consisting  desired results. On any filtering by value, we must do a comparison after the attributes are
1396  of an entity type name and the actual ID. Thus, the above example could equivalently  retrieved from the database, since the database has no notion of the multiple values, which
1397  be written as  are stored in a single string. As a result, queries in which filter only on value end up
1398    reading a lot more than they need to.
     my @attributeList = GetAttributes([Feature => $peg], 'virulent');  
   
 The list-reference approach allows us to add attributes to other entity types in  
 the future. Doing so, however, will require modifying the L</Refresh> method and  
 updated the database design XML.  
   
 The list-reference approach also allows for a more fault-tolerant approach to  
 getting all objects with a particular attribute.  
   
     my @attributeList = GetAttributes([Feature => undef], 'virulent');  
   
 will only return feature attributes, while  
   
     my @attributeList = GetAttributes(undef, 'virulent');  
   
 could at some point in the future get you attributes for genomes or even subsystems  
 as well as features.  
1399    
1400  =over 4  =over 4
1401    
1402  =item objectID  =item objectID
1403    
1404  ID of the genome or feature whose attributes are desired. In general, an ID that  ID of object whose attributes are desired. If the attributes are desired for multiple
1405  starts with C<fig|> is treated as a feature ID, and an ID that is all digits with a  objects, this parameter can be specified as a list reference. If the attributes are
1406  single period is treated as a genome ID. For other entity types, use a list reference; in  desired for all objects, specify C<undef> or an empty string. Finally, you can specify
1407  this case the first list element is the entity type and the second is the ID. A value of  attributes for a range of object IDs by putting a percent sign (C<%>) at the end.
 C<undef> or an empty string here will match all objects.  
1408    
1409  =item key  =item key
1410    
1411  Attribute key name. Since attributes are stored as fields in the database with a  Attribute key name. A value of C<undef> or an empty string will match all
1412  field name equal to the key name, it is very fast to find a list of all the  attribute keys. If the values are desired for multiple keys, this parameter can be
1413  matching keys. Each key's values require a separate query, however, which may  specified as a list reference. Finally, you can specify attributes for a range of
1414  be a performance problem if the pattern matches a lot of keys. Wild cards are  keys by putting a percent sign (C<%>) at the end.
 acceptable here, and a value of C<undef> or an empty string will match all  
 attribute keys.  
1415    
1416  =item valuePatterns  =item values
1417    
1418  List of the desired attribute values, section by section. If C<undef>  List of the desired attribute values, section by section. If C<undef>
1419  or an empty string is specified, all values in that section will match.  or an empty string is specified, all values in that section will match. A
1420    generic match can be requested by placing a percent sign (C<%>) at the end.
1421    In that case, all values that match up to and not including the percent sign
1422    will match. You may also specify a regular expression enclosed
1423    in slashes. All values that match the regular expression will be returned. For
1424    performance reasons, only values have this extra capability.
1425    
1426  =item RETURN  =item RETURN
1427    
# Line 1096  Line 1436 
1436    
1437  sub GetAttributes {  sub GetAttributes {
1438      # Get the parameters.      # Get the parameters.
1439      my ($self, $objectID, $key, @valuePatterns) = @_;      my ($self, $objectID, $key, @values) = @_;
1440      # Declare the return variable.      # This hash will map "HasValueFor" fields to patterns. We use it to build the
1441      my @retVal = ();      # SQL statement.
1442      # Determine the entity types for our search.      my %data;
1443      my @objects = ();      # Before we do anything else, we must parse the key. The key is treated by the
1444      my ($actualObjectID, $computedType);      # user as a single field, but to us it's actually a real key and a subkey.
1445      if (! $objectID) {      # If the key has no splitter and is exact, the real key is the original key
1446          push @objects, $self->GetEntityTypes();      # and the subkey is an empty string. If the key has a splitter, it is
1447      } else {      # split into two pieces and each piece is processed separately. If the key has
1448          ($computedType, $actualObjectID) = ComputeObjectTypeFromID($objectID);      # no splitter and is generic, the real key is the incoming key and the subkey
1449          push @objects, $computedType;      # is allowed to be wild. Of course, this only matters if an actual key has
1450      }      # been specified.
1451      # Loop through the entity types.      if (defined $key) {
1452      for my $entityType (@objects) {          if ($key =~ /$self->{splitter}/) {
1453          # Now we need to find all the matching keys. The keys are actually stored in              # Here we have a two-part key, so we split it normally.
1454          # our database object, so this process is fast. Note that our              my ($realKey, $subKey) = $self->SplitKey($key);
1455          # MatchSqlPattern method              $data{'HasValueFor(from-link)'} = $realKey;
1456          my %secondaries = $self->GetSecondaryFields($entityType);              $data{'HasValueFor(subkey)'} = $subKey;
1457          my @fieldList = grep { MatchSqlPattern($_, $key) } keys %secondaries;          } elsif (substr($key, -1, 1) eq '%') {
1458          # Now we figure out whether or not we need to filter by object. We will always              $data{'HasValueFor(from-link)'} = $key;
1459          # filter by key to a limited extent, so if we're filtering by object we need an          } else {
1460          # AND to join the object ID filter with the key filter.              $data{'HasValueFor(from-link)'} = $key;
1461          my $filter = "";              $data{'HasValueFor(subkey)'} = '';
1462          my @params = ();          }
1463          if (defined($actualObjectID)) {      }
1464              # Here the caller wants to filter on object ID.      # Add the object ID to the key information.
1465              $filter = "$entityType(id) = ? AND ";      $data{'HasValueFor(to-link)'} = $objectID;
1466              push @params, $actualObjectID;      # The first value represents a problem, because we can search it using SQL, but not
1467          }      # in the normal way. If the user specifies a generic search or exact match for
1468          # It's time to begin making queries. We process one attribute key at a time, because      # every alternative value (remember, the values may be specified as a list),
1469          # each attribute is actually a different field in the database. We know here that      # then we can create SQL filtering for it. If any of the values are specified
1470          # all the keys we've collected are for the correct entity because we got them from      # as a regular expression, however, that's a problem, because we need to read
1471          # the DBD. That's a good thing, because an invalid key name will cause an SQL error.      # every value to verify a match.
1472          for my $key (@fieldList) {      if (@values > 0) {
1473              # Get all of the attribute values for this key.          # Get the first value and put its alternatives in an array.
1474              my @dataRows = $self->GetAll([$entityType], "$filter$entityType($key) IS NOT NULL",          my $valueParm = $values[0];
1475                                           \@params, ["$entityType(id)", "$entityType($key)"]);          my @valueList;
1476              # Process each value separately. We need to verify the values and reformat the          if (ref $valueParm eq 'ARRAY') {
1477              # tuples. Note that GetAll will give us one row per matching object ID,              @valueList = @{$valueParm};
1478              # with the ID first followed by a list of the data values. This is very          } else {
1479              # different from the structure we'll be returning, which has one row              @valueList = ($valueParm);
1480              # per value.          }
1481              for my $dataRow (@dataRows) {          # Okay, now we have all the possible criteria for the first value in the list
1482                  # Get the object ID and the list of values.          # @valueList. We'll copy the values to a new array in which they have been
1483                  my ($rowObjectID, @dataValues) = @{$dataRow};          # converted to generic requests. If we find a regular-expression match
1484                  # Loop through the values. There will be one result row per attribute value.          # anywhere in the list, we toss the whole thing.
1485                  for my $dataValue (@dataValues) {          my @valuePatterns = ();
1486                      # Separate this value into sections.          my $okValues = 1;
1487                      my @sections = split("::", $dataValue);          for my $valuePattern (@valueList) {
1488                      # Loop through the value patterns, looking for a mismatch. Note that              # Check the pattern type.
1489                      # since we're working through parallel arrays, we are using an index              if (substr($valuePattern, 0, 1) eq '/') {
1490                      # loop. As soon as a match fails we stop checking. This means that                  # Regular expressions invalidate the entire process.
1491                      # if the value pattern list is longer than the number of sections,                  $okValues = 0;
1492                      # we will fail as soon as we run out of sections.              } elsif (substr($valuePattern, -1, 1) eq '%') {
1493                      my $match = 1;                  # A Generic pattern is passed in unmodified.
1494                      for (my $i = 0; $i <= $#valuePatterns && $match; $i++) {                  push @valuePatterns, $valuePattern;
1495                          $match = MatchSqlPattern($sections[$i], $valuePatterns[$i]);              } else {
1496                      }                  # An exact match is converted to generic.
1497                      # If we match, we save this value in the output list.                  push @valuePatterns, "$valuePattern%";
1498                      if ($match) {              }
1499                          push @retVal, [$rowObjectID, $key, @sections];          }
1500                      }          # If everything works, add the value data to the filtering hash.
1501                  }          if ($okValues) {
1502                  # Here we've processed all the attribute values for the current object ID.              $data{'HasValueFor(value)'} = \@valuePatterns;
1503            }
1504        }
1505        # Create some lists to contain the filter fragments and parameter values.
1506        my @filter = ();
1507        my @parms = ();
1508        # This next loop goes through the different fields that can be specified in the
1509        # parameter list and generates filters for each. The %data hash that we built above
1510        # contains all the necessary information to do this.
1511        for my $field (keys %data) {
1512            # Accumulate filter information for this field. We will OR together all the
1513            # elements accumulated to create the final result.
1514            my @fieldFilter = ();
1515            # Get the specified data from the caller.
1516            my $fieldPattern = $data{$field};
1517            # Only proceed if the pattern is one that won't match everything.
1518            if (defined($fieldPattern) && $fieldPattern ne "" && $fieldPattern ne "%") {
1519                # Convert the pattern to an array.
1520                my @patterns = ();
1521                if (ref $fieldPattern eq 'ARRAY') {
1522                    push @patterns, @{$fieldPattern};
1523                } else {
1524                    push @patterns, $fieldPattern;
1525                }
1526                # Only proceed if the array is nonempty. The loop will work fine if the
1527                # array is empty, but when we build the filter string at the end we'll
1528                # get "()" in the filter list, which will result in an SQL syntax error.
1529                if (@patterns) {
1530                    # Loop through the individual patterns.
1531                    for my $pattern (@patterns) {
1532                        # Check for a generic request.
1533                        if (substr($pattern, -1, 1) ne '%') {
1534                            # Here we have a normal request.
1535                            push @fieldFilter, "$field = ?";
1536                            push @parms, $pattern;
1537                        } else {
1538                            # Here we have a generic request, so we will use the LIKE operator to
1539                            # filter the field to this value pattern.
1540                            push @fieldFilter, "$field LIKE ?";
1541                            # We must convert the pattern value to an SQL match pattern. First
1542                            # we get a copy of it.
1543                            my $actualPattern = $pattern;
1544                            # Now we escape the underscores. Underscores are an SQL wild card
1545                            # character, but they are used frequently in key names and object IDs.
1546                            $actualPattern =~ s/_/\\_/g;
1547                            # Add the escaped pattern to the bound parameter list.
1548                            push @parms, $actualPattern;
1549                        }
1550                    }
1551                    # Form the filter for this field.
1552                    my $fieldFilterString = join(" OR ", @fieldFilter);
1553                    push @filter, "($fieldFilterString)";
1554              }              }
             # Here we've processed all the rows returned by GetAll. In general, there will  
             # be one row per object ID.  
1555          }          }
         # Here we've processed all the matching attribute keys.  
1556      }      }
1557      # Here we've processed all the entity types. That means @retVal has all the matching      # Now @filter contains one or more filter strings and @parms contains the parameter
1558      # results.      # values to bind to them.
1559        my $actualFilter = join(" AND ", @filter);
1560        # Insure we have at least one filter.
1561        if (! $actualFilter) {
1562            Confess("No filter specified in GetAttributes query.");
1563        }
1564        # Now we're ready to make our query.
1565        my $query = $self->Get(['HasValueFor'], $actualFilter, \@parms);
1566        # Format the results.
1567        my @retVal = $self->_QueryResults($query, @values);
1568        # Return the rows found.
1569      return @retVal;      return @retVal;
1570  }  }
1571    
1572  =head3 AddAttribute  =head3 AddAttribute
1573    
1574  C<< $attrDB->AddAttribute($objectID, $key, @values); >>      $attrDB->AddAttribute($objectID, $key, @values);
1575    
1576  Add an attribute key/value pair to an object. This method cannot add a new key, merely  Add an attribute key/value pair to an object. This method cannot add a new key, merely
1577  add a value to an existing key. Use L</StoreAttributeKey> to create a new key.  add a value to an existing key. Use L</StoreAttributeKey> to create a new key.
# Line 1182  Line 1580 
1580    
1581  =item objectID  =item objectID
1582    
1583  ID of the genome or feature to which the attribute is to be added. In general, an ID that  ID of the object to which the attribute is to be added.
 starts with C<fig|> is treated as a feature ID, and an ID that is all digits and periods  
 is treated as a genome ID. For IDs of other types, this parameter should be a reference  
 to a 2-tuple consisting of the entity type name followed by the object ID.  
1584    
1585  =item key  =item key
1586    
1587  Attribute key name. This corresponds to the name of a field in the database.  Attribute key name.
1588    
1589  =item values  =item values
1590    
# Line 1212  Line 1607 
1607      } elsif (! @values) {      } elsif (! @values) {
1608          Confess("No values specified in AddAttribute call for key $key.");          Confess("No values specified in AddAttribute call for key $key.");
1609      } else {      } else {
1610          # Okay, now we have some reason to believe we can do this. Start by          # Okay, now we have some reason to believe we can do this. Form the values
1611          # computing the object type and ID.          # into a scalar.
         my ($entityName, $id) = ComputeObjectTypeFromID($objectID);  
         # Form the values into a scalar.  
1612          my $valueString = join($self->{splitter}, @values);          my $valueString = join($self->{splitter}, @values);
1613          # Insert the value.          # Split up the key.
1614          $self->InsertValue($id, "$entityName($key)", $valueString);          my ($realKey, $subKey) = $self->SplitKey($key);
1615            # Connect the object to the key.
1616            $self->InsertObject('HasValueFor', { 'from-link' => $realKey,
1617                                                 'to-link'   => $objectID,
1618                                                 'subkey'    => $subKey,
1619                                                 'value'     => $valueString,
1620                                           });
1621      }      }
1622      # Return a one. We do this for backward compatability.      # Return a one, indicating success. We do this for backward compatability.
1623      return 1;      return 1;
1624  }  }
1625    
1626  =head3 DeleteAttribute  =head3 DeleteAttribute
1627    
1628  C<< $attrDB->DeleteAttribute($objectID, $key, @values); >>      $attrDB->DeleteAttribute($objectID, $key, @values);
1629    
1630  Delete the specified attribute key/value combination from the database.  Delete the specified attribute key/value combination from the database.
1631    
 The first form will connect to the database and release it. The second form  
 uses the database connection contained in the object.  
   
1632  =over 4  =over 4
1633    
1634  =item objectID  =item objectID
1635    
1636  ID of the genome or feature to which the attribute is to be added. In general, an ID that  ID of the object whose attribute is to be deleted.
 starts with C<fig|> is treated as a feature ID, and an ID that is all digits and periods  
 is treated as a genome ID. For IDs of other types, this parameter should be a reference  
 to a 2-tuple consisting of the entity type name followed by the object ID.  
1637    
1638  =item key  =item key
1639    
1640  Attribute key name. This corresponds to the name of a field in the database.  Attribute key name.
1641    
1642  =item values  =item values
1643    
1644  One or more values to be associated with the key.  One or more values associated with the key. If no values are specified, then all values
1645    will be deleted. Otherwise, only a matching value will be deleted.
1646    
1647  =back  =back
1648    
# Line 1262  Line 1656 
1656          Confess("No object ID specified for DeleteAttribute call.");          Confess("No object ID specified for DeleteAttribute call.");
1657      } elsif (! defined($key)) {      } elsif (! defined($key)) {
1658          Confess("No attribute key specified for DeleteAttribute call.");          Confess("No attribute key specified for DeleteAttribute call.");
     } elsif (! @values) {  
         Confess("No values specified in DeleteAttribute call for key $key.");  
1659      } else {      } else {
1660          # Now compute the object type and ID.          # Split the key into the real key and the subkey.
1661          my ($entityName, $id) = ComputeObjectTypeFromID($objectID);          my ($realKey, $subKey) = $self->SplitKey($key);
1662          # Form the values into a scalar.          if ($subKey eq '' && scalar(@values) == 0) {
1663                # Here we erase the entire key for this object.
1664                $self->DeleteRow('HasValueFor', $key, $objectID);
1665            } else {
1666                # Here we erase the matching values.
1667          my $valueString = join($self->{splitter}, @values);          my $valueString = join($self->{splitter}, @values);
1668          # Delete the value.              $self->DeleteRow('HasValueFor', $realKey, $objectID,
1669          $self->DeleteValue($entityName, $id, $key, $valueString);                               { subkey => $subKey, value => $valueString });
1670            }
1671      }      }
1672      # Return a one. This is for backward compatability.      # Return a one. This is for backward compatability.
1673      return 1;      return 1;
1674  }  }
1675    
1676    =head3 DeleteMatchingAttributes
1677    
1678        my @deleted = $attrDB->DeleteMatchingAttributes($objectID, $key, @values);
1679    
1680    Delete all attributes that match the specified criteria. This is equivalent to
1681    calling L</GetAttributes> and then invoking L</DeleteAttribute> for each
1682    row found.
1683    
1684    =over 4
1685    
1686    =item objectID
1687    
1688    ID of object whose attributes are to be deleted. If the attributes for multiple
1689    objects are to be deleted, this parameter can be specified as a list reference. If
1690    attributes are to be deleted for all objects, specify C<undef> or an empty string.
1691    Finally, you can delete attributes for a range of object IDs by putting a percent
1692    sign (C<%>) at the end.
1693    
1694    =item key
1695    
1696    Attribute key name. A value of C<undef> or an empty string will match all
1697    attribute keys. If the values are to be deletedfor multiple keys, this parameter can be
1698    specified as a list reference. Finally, you can delete attributes for a range of
1699    keys by putting a percent sign (C<%>) at the end.
1700    
1701    =item values
1702    
1703    List of the desired attribute values, section by section. If C<undef>
1704    or an empty string is specified, all values in that section will match. A
1705    generic match can be requested by placing a percent sign (C<%>) at the end.
1706    In that case, all values that match up to and not including the percent sign
1707    will match. You may also specify a regular expression enclosed
1708    in slashes. All values that match the regular expression will be deleted. For
1709    performance reasons, only values have this extra capability.
1710    
1711    =item RETURN
1712    
1713    Returns a list of tuples for the attributes that were deleted, in the
1714    same form as L</GetAttributes>.
1715    
1716    =back
1717    
1718    =cut
1719    
1720    sub DeleteMatchingAttributes {
1721        # Get the parameters.
1722        my ($self, $objectID, $key, @values) = @_;
1723        # Get the matching attributes.
1724        my @retVal = $self->GetAttributes($objectID, $key, @values);
1725        # Loop through the attributes, deleting them.
1726        for my $tuple (@retVal) {
1727            $self->DeleteAttribute(@{$tuple});
1728        }
1729        # Log this operation.
1730        my $count = @retVal;
1731        $self->LogOperation("Mass Delete", $key, "$count matching attributes deleted.");
1732        # Return the deleted attributes.
1733        return @retVal;
1734    }
1735    
1736  =head3 ChangeAttribute  =head3 ChangeAttribute
1737    
1738  C<< $attrDB->ChangeAttribute($objectID, $key, \@oldValues, \@newValues); >>      $attrDB->ChangeAttribute($objectID, $key, \@oldValues, \@newValues);
1739    
1740  Change the value of an attribute key/value pair for an object.  Change the value of an attribute key/value pair for an object.
1741    
# Line 1320  Line 1777 
1777      } elsif (! defined($newValues) || ref $newValues ne 'ARRAY') {      } elsif (! defined($newValues) || ref $newValues ne 'ARRAY') {
1778          Confess("No new values specified in ChangeAttribute call for key $key.");          Confess("No new values specified in ChangeAttribute call for key $key.");
1779      } else {      } else {
1780          # Okay, now we do the change as a delete/add.          # We do the change as a delete/add.
1781          $self->DeleteAttribute($objectID, $key, @{$oldValues});          $self->DeleteAttribute($objectID, $key, @{$oldValues});
1782          $self->AddAttribute($objectID, $key, @{$newValues});          $self->AddAttribute($objectID, $key, @{$newValues});
1783      }      }
# Line 1330  Line 1787 
1787    
1788  =head3 EraseAttribute  =head3 EraseAttribute
1789    
1790  C<< $attrDB->EraseAttribute($entityName, $key); >>      $attrDB->EraseAttribute($key);
1791    
1792  Erase all values for the specified attribute key. This does not remove the  Erase all values for the specified attribute key. This does not remove the
1793  key from the database; it merely removes all the values.  key from the database; it merely removes all the values.
1794    
1795  =over 4  =over 4
1796    
1797  =item entityName  =item key
1798    
1799    Key to erase. This must be a real key; that is, it cannot have a subkey
1800    component.
1801    
1802    =back
1803    
1804    =cut
1805    
1806    sub EraseAttribute {
1807        # Get the parameters.
1808        my ($self, $key) = @_;
1809        # Delete everything connected to the key.
1810        $self->Disconnect('HasValueFor', 'AttributeKey', $key);
1811        # Log the operation.
1812        $self->LogOperation("Erase Data", $key);
1813        # Return a 1, for backward compatability.
1814        return 1;
1815    }
1816    
1817    =head3 GetAttributeKeys
1818    
1819        my @keyList = $attrDB->GetAttributeKeys($groupName);
1820    
1821    Return a list of the attribute keys for a particular group.
1822    
1823    =over 4
1824    
1825    =item groupName
1826    
1827    Name of the group whose keys are desired.
1828    
1829    =item RETURN
1830    
1831    Returns a list of the attribute keys for the specified group.
1832    
1833    =back
1834    
1835    =cut
1836    
1837    sub GetAttributeKeys {
1838        # Get the parameters.
1839        my ($self, $groupName) = @_;
1840        # Get the attributes for the specified group.
1841        my @groups = $self->GetFlat(['IsInGroup'], "IsInGroup(to-link) = ?", [$groupName],
1842                                    'IsInGroup(from-link)');
1843        # Return the keys.
1844        return sort @groups;
1845    }
1846    
1847    =head3 QueryAttributes
1848    
1849        my @attributeData = $ca->QueryAttributes($filter, $filterParms);
1850    
1851    Return the attribute data based on an SQL filter clause. In the filter clause,
1852    the name C<$object> should be used for the object ID, C<$key> should be used for
1853    the key name, C<$subkey> for the subkey value, and C<$value> for the value field.
1854    
1855    =over 4
1856    
1857    =item filter
1858    
1859    Filter clause in the standard ERDB format, except that the field names are C<$object> for
1860    the object ID field, C<$key> for the key name field, C<$subkey> for the subkey field,
1861    and C<$value> for the value field. This abstraction enables us to hide the details of
1862    the database construction from the user.
1863    
1864    =item filterParms
1865    
1866    Parameters for the filter clause.
1867    
1868    =item RETURN
1869    
1870    Returns a list of tuples. Each tuple consists of an object ID, a key (with optional subkey), and
1871    one or more attribute values.
1872    
1873    =back
1874    
1875    =cut
1876    
1877    # This hash is used to drive the substitution process.
1878    my %AttributeParms = (object => 'HasValueFor(to-link)',
1879                          key    => 'HasValueFor(from-link)',
1880                          subkey => 'HasValueFor(subkey)',
1881                          value  => 'HasValueFor(value)');
1882    
1883    sub QueryAttributes {
1884        # Get the parameters.
1885        my ($self, $filter, $filterParms) = @_;
1886        # Declare the return variable.
1887        my @retVal = ();
1888        # Make sue we have filter parameters.
1889        my $realParms = (defined($filterParms) ? $filterParms : []);
1890        # Create the query by converting the filter.
1891        my $realFilter = $filter;
1892        for my $name (keys %AttributeParms) {
1893            $realFilter =~ s/\$$name/$AttributeParms{$name}/g;
1894        }
1895        my $query = $self->Get(['HasValueFor'], $realFilter, $realParms);
1896        # Loop through the results, forming the output attribute tuples.
1897        while (my $result = $query->Fetch()) {
1898            # Get the four values from this query result row.
1899            my ($objectID, $key, $subkey, $value) = $result->Values([$AttributeParms{object},
1900                                                                    $AttributeParms{key},
1901                                                                    $AttributeParms{subkey},
1902                                                                    $AttributeParms{value}]);
1903            # Combine the key and the subkey.
1904            my $realKey = ($subkey ? $key . $self->{splitter} . $subkey : $key);
1905            # Split the value.
1906            my @values = split $self->{splitter}, $value;
1907            # Output the result.
1908            push @retVal, [$objectID, $realKey, @values];
1909        }
1910        # Return the result.
1911        return @retVal;
1912    }
1913    
1914    =head2 Key and ID Manipulation Methods
1915    
1916    =head3 ParseID
1917    
1918        my ($type, $id) = CustomAttributes::ParseID($idValue);
1919    
1920    Determine the type and object ID corresponding to an ID value from the attribute database.
1921    Most ID values consist of a type name and an ID, separated by a colon (e.g. C<Family:aclame|cluster10>);
1922    however, Genomes, Features, and Subsystems are not stored with a type name, so we need to
1923    deduce the type from the ID value structure.
1924    
1925    The theory here is that you can plug the ID and type directly into a Sprout database method, as
1926    follows
1927    
1928        my ($type, $id) = CustomAttributes::ParseID($attrList[$num]->[0]);
1929        my $target = $sprout->GetEntity($type, $id);
1930    
1931    =over 4
1932    
1933    =item idValue
1934    
1935    ID value taken from the attribute database.
1936    
1937    =item RETURN
1938    
1939    Returns a two-element list. The first element is the type of object indicated by the ID value,
1940    and the second element is the actual object ID.
1941    
1942    =back
1943    
1944    =cut
1945    
1946    sub ParseID {
1947        # Get the parameters.
1948        my ($idValue) = @_;
1949        # Declare the return variables.
1950        my ($type, $id);
1951        # Parse the incoming ID. We first check for the presence of an entity name. Entity names
1952        # can only contain letters, which helps to insure typed object IDs don't collide with
1953        # subsystem names (which are untyped).
1954        if ($idValue =~ /^([A-Za-z]+):(.+)/) {
1955            # Here we have a typed ID.
1956            ($type, $id) = ($1, $2);
1957            # Fix the case sensitivity on PDB IDs.
1958            if ($type eq 'PDB') { $id = lc $id; }
1959        } elsif ($idValue =~ /fig\|/) {
1960            # Here we have a feature ID.
1961            ($type, $id) = (Feature => $idValue);
1962        } elsif ($idValue =~ /\d+\.\d+/) {
1963            # Here we have a genome ID.
1964            ($type, $id) = (Genome => $idValue);
1965        } else {
1966            # The default is a subsystem ID.
1967            ($type, $id) = (Subsystem => $idValue);
1968        }
1969        # Return the results.
1970        return ($type, $id);
1971    }
1972    
1973    =head3 FormID
1974    
1975        my $idValue = CustomAttributes::FormID($type, $id);
1976    
1977    Convert an object type and ID pair into an object ID string for the attribute system. Subsystems,
1978    genomes, and features are stored in the database without type information, but all other object IDs
1979    must be prefixed with the object type.
1980    
1981    =over 4
1982    
1983    =item type
1984    
1985    Relevant object type.
1986    
1987    =item id
1988    
1989    ID of the object in question.
1990    
1991    =item RETURN
1992    
1993    Returns a string that will be recognized as an object ID in the attribute database.
1994    
1995    =back
1996    
1997    =cut
1998    
1999    sub FormID {
2000        # Get the parameters.
2001        my ($type, $id) = @_;
2002        # Declare the return variable.
2003        my $retVal;
2004        # Compute the ID string from the type.
2005        if (grep { $type eq $_ } qw(Feature Genome Subsystem)) {
2006            $retVal = $id;
2007        } else {
2008            $retVal = "$type:$id";
2009        }
2010        # Return the result.
2011        return $retVal;
2012    }
2013    
2014    =head3 GetTargetObject
2015    
2016        my $object = CustomAttributes::GetTargetObject($erdb, $idValue);
2017    
2018    Return the database object corresponding to the specified attribute object ID. The
2019    object type associated with the ID value must correspond to an entity name in the
2020    specified database.
2021    
2022    =over 4
2023    
2024    =item erdb
2025    
2026    B<ERDB> object for accessing the target database.
2027    
2028    =item idValue
2029    
2030    ID value retrieved from the attribute database.
2031    
2032    =item RETURN
2033    
2034    Returns a B<ERDBObject> for the attribute value's target object.
2035    
2036  Name of the entity to which the key belongs. If undefined, all entities will be  =back
2037  examined for the desired key.  
2038    =cut
2039    
2040    sub GetTargetObject {
2041        # Get the parameters.
2042        my ($erdb, $idValue) = @_;
2043        # Declare the return variable.
2044        my $retVal;
2045        # Get the type and ID for the target object.
2046        my ($type, $id) = ParseID($idValue);
2047        # Plug them into the GetEntity method.
2048        $retVal = $erdb->GetEntity($type, $id);
2049        # Return the resulting object.
2050        return $retVal;
2051    }
2052    
2053    =head3 SplitKey
2054    
2055        my ($realKey, $subKey) = $ca->SplitKey($key);
2056    
2057    Split an external key (that is, one passed in by a caller) into the real key and the sub key.
2058    The real and sub keys are separated by a splitter value (usually C<::>). If there is no splitter,
2059    then the sub key is presumed to be an empty string.
2060    
2061    =over 4
2062    
2063  =item key  =item key
2064    
2065  Key to erase.  Incoming key to be split.
2066    
2067    =item RETURN
2068    
2069    Returns a two-element list, the first element of which is the real key and the second element of
2070    which is the sub key.
2071    
2072  =back  =back
2073    
2074  =cut  =cut
2075    
2076  sub EraseAttribute {  sub SplitKey {
2077      # Get the parameters.      # Get the parameters.
2078      my ($self, $entityName, $key) = @_;      my ($self, $key) = @_;
2079      # Determine the relevant entity types.      # Do the split.
2080      my @objects = ();      my ($realKey, $subKey) = split($self->{splitter}, $key, 2);
2081      if (! $entityName) {      # Insure the subkey has a value.
2082          push @objects, $self->GetEntityTypes();      if (! defined $subKey) {
2083      } else {          $subKey = '';
         push @objects, $entityName;  
     }  
     # Loop through the entity types.  
     for my $entityType (@objects) {  
         # Now check for this key in this entity.  
         my %secondaries = $self->GetSecondaryFields($entityType);  
         if (exists $secondaries{$key}) {  
             # We found it, so delete all the values of the key.  
             $self->DeleteValue($entityName, undef, $key);  
2084          }          }
2085        # Return the results.
2086        return ($realKey, $subKey);
2087      }      }
2088      # Return a 1, for backward compatability.  
2089      return 1;  =head3 JoinKey
2090    
2091        my $key = $ca->JoinKey($realKey, $subKey);
2092    
2093    Join a real key and a subkey together to make an external key. The external key is the attribute key
2094    used by the caller. The real key and the subkey are how the keys are represented in the database. The
2095    real key is the key to the B<AttributeKey> entity. The subkey is an attribute of the B<HasValueFor>
2096    relationship.
2097    
2098    =over 4
2099    
2100    =item realKey
2101    
2102    The real attribute key.
2103    
2104    =item subKey
2105    
2106    The subordinate portion of the attribute key.
2107    
2108    =item RETURN
2109    
2110    Returns a single string representing both keys.
2111    
2112    =back
2113    
2114    =cut
2115    
2116    sub JoinKey {
2117        # Get the parameters.
2118        my ($self, $realKey, $subKey) = @_;
2119        # Declare the return variable.
2120        my $retVal;
2121        # Check for a subkey.
2122        if ($subKey eq '') {
2123            # No subkey, so the real key is the key.
2124            $retVal = $realKey;
2125        } else {
2126            # Subkey found, so the two pieces must be joined by a splitter.
2127            $retVal = "$realKey$self->{splitter}$subKey";
2128  }  }
2129        # Return the result.
2130        return $retVal;
2131    }
2132    
2133    
2134    =head3 AttributeTable
2135    
2136        my $tableHtml = CustomAttributes::AttributeTable($cgi, @attrList);
2137    
2138    Format the attribute data into an HTML table.
2139    
2140    =over 4
2141    
2142    =item cgi
2143    
2144    CGI query object used to generate the HTML
2145    
2146    =item attrList
2147    
2148    List of attribute results, in the format returned by the L</GetAttributes> or
2149    L</QueryAttributes> methods.
2150    
2151    =item RETURN
2152    
2153    Returns an HTML table displaying the attribute keys and values.
2154    
2155    =back
2156    
2157    =cut
2158    
2159    sub AttributeTable {
2160        # Get the parameters.
2161        my ($cgi, @attrList) = @_;
2162        # Accumulate the table rows.
2163        my @html = ();
2164        for my $attrData (@attrList) {
2165            # Format the object ID and key.
2166            my @columns = map { CGI::escapeHTML($_) } @{$attrData}[0,1];
2167            # Now we format the values. These remain unchanged unless one of them is a URL.
2168            my $lastValue = scalar(@{$attrData}) - 1;
2169            push @columns, map { $_ =~ /^http:/ ? $cgi->a({ href => $_ }, $_) : $_ } @{$attrData}[2 .. $lastValue];
2170            # Assemble the values into a table row.
2171            push @html, $cgi->Tr($cgi->td(\@columns));
2172        }
2173        # Format the table in the return variable.
2174        my $retVal = $cgi->table({ border => 2 }, $cgi->Tr($cgi->th(['Object', 'Key', 'Values'])), @html);
2175        # Return it.
2176        return $retVal;
2177    }
2178  1;  1;

Legend:
Removed from v.1.7  
changed lines
  Added in v.1.32

MCS Webmaster
ViewVC Help
Powered by ViewVC 1.0.3