[Bio] / Sprout / CustomAttributes.pm Repository:
ViewVC logotype

Diff of /Sprout/CustomAttributes.pm

Parent Directory Parent Directory | Revision Log Revision Log | View Patch Patch

revision 1.7, Wed Nov 15 12:04:05 2006 UTC revision 1.19, Fri Feb 9 22:59:18 2007 UTC
# Line 8  Line 8 
8      use strict;      use strict;
9      use Tracer;      use Tracer;
10      use ERDBLoad;      use ERDBLoad;
11        use Stats;
12    
13  =head1 Custom SEED Attribute Manager  =head1 Custom SEED Attribute Manager
14    
# Line 15  Line 16 
16    
17  The Custom SEED Attributes Manager allows the user to upload and retrieve  The Custom SEED Attributes Manager allows the user to upload and retrieve
18  custom data for SEED objects. It uses the B<ERDB> database system to  custom data for SEED objects. It uses the B<ERDB> database system to
19  store the attributes, which are implemented as multi-valued fields  store the attributes.
20  of ERDB entities.  
21    Attributes are organized by I<attribute key>. Attribute values are
22    assigned to I<objects>. In the real world, objects have types and IDs;
23    however, to the attribute database only the ID matters. This will create
24    a problem if we have a single ID that applies to two objects of different
25    types, but it is more consistent with the original attribute implementation
26    in the SEED (which this implementation replaces).
27    
28    The actual attribute values are stored as a relationship between the attribute
29    keys and the objects. There can be multiple values for a single key/object pair.
30    
31    =head3 Object IDs
32    
33    The object ID is normally represented as
34    
35        I<type>:I<id>
36    
37    where I<type> is the object type (C<Role>, C<Coupling>, etc.) and I<id> is
38    the actual object ID. Note that the object type must consist of only upper- and
39    lower-case letters! Thus, C<GenomeGroup> is a valid object type, but
40    C<genome_group> is not. Given that restriction, the object ID
41    
42        Family:aclame|cluster10
43    
44    would represent the FIG family C<aclame|cluster10>. For historical reasons,
45    there are three exceptions: subsystems, genomes, and features do not need
46    a type. So, for PEG 3361 of Streptomyces coelicolor A3(2), you simply code
47    
48        fig|100226.1.peg.3361
49    
50    The methods L</ParseID> and L</FormID> can be used to make this all seem
51    more consistent. Given any object ID string, L</ParseID> will convert it to an
52    object type and ID, and given any object type and ID, L</FormID> will
53    convert it to an object ID string. The attribute database is pretty
54    freewheeling about what it will allow for an ID; however, for best
55    results, the type should match an entity type from a Sprout genetics
56    database. If this rule is followed, then the database object
57    corresponding to an ID in the attribute database could be retrieved using
58    L</GetTargetObject> method.
59    
60        my $object = CustomAttributes::GetTargetObject($sprout, $idValue);
61    
62    =head3 Retrieval and Logging
63    
64  The full suite of ERDB retrieval capabilities is provided. In addition,  The full suite of ERDB retrieval capabilities is provided. In addition,
65  custom methods are provided specific to this application. To get all  custom methods are provided specific to this application. To get all
66  the values of the attribute C<essential> in a specified B<Feature>, you  the values of the attribute C<essential> in a specified B<Feature>, you
67  would code  would code
68    
69      my @values = $attrDB->GetAttributes([Feature => $fid], 'essential');      my @values = $attrDB->GetAttributes($fid, 'essential');
70    
71    where I<$fid> contains the ID of the desired feature.
72    
73  where I<$fid> contains the ID of the desired feature. Each attribute has  New attribute keys must be defined before they can be used. A web interface
74  an alternate index to allow searching for attributes by value.  is provided for this purpose.
75    
76  New attributes are introduced by updating the database definition at  Major attribute activity is recorded in a log (C<attributes.log>) in the
77  run-time. Attribute values are stored by uploading data from files.  C<$FIG_Config::var> directory. The log reports the user name, time, and
78  A web interface is provided for both these activities.  the details of the operation. The user name will almost always be unknown,
79    except when it is specified in this object's constructor (see L</new>).
80    
81  =head2 FIG_Config Parameters  =head2 FIG_Config Parameters
82    
# Line 76  Line 122 
122    
123  =back  =back
124    
 The DBD file is critical, and must have reasonable contents before we can  
 begin using the system. In the old system, attributes were only provided  
 for Genomes and Features, so the initial XML file was the following.  
   
     <Database>  
       <Title>SEED Custom Attribute Database</Title>  
       <Entities>  
         <Entity name="Feature" keyType="id-string">  
           <Notes>A [i]feature[/i] is a part of the genome  
           that is of special interest. Features may be spread  
           across multiple contigs of a genome, but never across  
           more than one genome. Features can be assigned to roles  
           via spreadsheet cells, and are the targets of  
           annotation.</Notes>  
         </Entity>  
         <Entity name="Genome" keyType="name-string">  
           <Notes>A [i]genome[/i] describes a particular individual  
           organism's DNA.</Notes>  
         </Entity>  
       </Entities>  
     </Database>  
   
 It is not necessary to put any tables into the database; however, you should  
 run  
   
     AttrDBRefresh  
   
 periodically to insure it has the correct Genomes and Features in it. When  
 converting from the old system, use  
   
     AttrDBRefresh -migrate  
   
 to initialize the database and migrate the legacy data. You should only need  
 to do that once.  
   
 =head2 Implementation Note  
   
 The L</Refresh> method reloads the entities in the database. If new  
 entity types are added, that method will need to be adjusted accordingly.  
   
125  =head2 Public Methods  =head2 Public Methods
126    
127  =head3 new  =head3 new
128    
129  C<< my $attrDB = CustomAttributes->new($splitter); >>  C<< my $attrDB = CustomAttributes->new(%options); >>
130    
131  Construct a new CustomAttributes object. This object cannot be used to add or  Construct a new CustomAttributes object. The following options are
132  delete keys because that requires modifying the database design. To do that,  supported.
 you need to use the static L</StoreAttributeKey> or L</DeleteAttributeKey>  
 methods.  
133    
134  =over 4  =over 4
135    
136  =item splitter  =item splitter
137    
138  Value to be used to split attribute values into sections in the  Value to be used to split attribute values into sections in the
139  L</Fig Replacement Methods>. The default is a double colon C<::>.  L</Fig Replacement Methods>. The default is a double colon C<::>,
140  If you do not use the replacement methods, you do not need to  and should only be overridden in extreme circumstances.
141  worry about this parameter.  
142    =item user
143    
144    Name of the current user. This will appear in the attribute log.
145    
146  =back  =back
147    
# Line 142  Line 149 
149    
150  sub new {  sub new {
151      # Get the parameters.      # Get the parameters.
152      my ($class, $splitter) = @_;      my ($class, %options) = @_;
153      # Connect to the database.      # Connect to the database.
154      my $dbh = DBKernel->new($FIG_Config::attrDbms, $FIG_Config::attrDbName,      my $dbh = DBKernel->new($FIG_Config::attrDbms, $FIG_Config::attrDbName,
155                              $FIG_Config::attrUser, $FIG_Config::attrPass,                              $FIG_Config::attrUser, $FIG_Config::attrPass,
# Line 152  Line 159 
159      my $xmlFileName = $FIG_Config::attrDBD;      my $xmlFileName = $FIG_Config::attrDBD;
160      my $retVal = ERDB::new($class, $dbh, $xmlFileName);      my $retVal = ERDB::new($class, $dbh, $xmlFileName);
161      # Store the splitter value.      # Store the splitter value.
162      $retVal->{splitter} = (defined($splitter) ? $splitter : '::');      $retVal->{splitter} = $options{splitter} || '::';
163        # Store the user name.
164        $retVal->{user} = $options{user} || '<unknown>';
165        Trace("User $retVal->{user} selected for attribute object.") if T(3);
166      # Return the result.      # Return the result.
167      return $retVal;      return $retVal;
168  }  }
169    
170  =head3 StoreAttributeKey  =head3 StoreAttributeKey
171    
172  C<< my $attrDB = CustomAttributes::StoreAttributeKey($entityName, $attributeName, $type, $notes); >>  C<< $attrDB->StoreAttributeKey($attributeName, $type, $notes, \@groups); >>
173    
174  Create or update an attribute for the database. This method will update the database definition  Create or update an attribute for the database.
 XML, but it will not create the table. It will connect to the database so that the caller  
 can upload the attribute values.  
175    
176  =over 4  =over 4
177    
 =item entityName  
   
 Name of the entity containing the attribute. The entity must exist.  
   
178  =item attributeName  =item attributeName
179    
180  Name of the attribute. It must be a valid ERDB field name, consisting entirely of  Name of the attribute. It must be a valid ERDB field name, consisting entirely of
# Line 185  Line 189 
189    
190  Descriptive notes about the attribute. It is presumed to be raw text, not HTML.  Descriptive notes about the attribute. It is presumed to be raw text, not HTML.
191    
192  =item RETURN  =item groups
193    
194  Returns a Custom Attribute Database object if successful. If unsuccessful, an  Reference to a list of the groups to which the attribute should be associated.
195  error will be thrown.  This will replace any groups to which the attribute is currently attached.
196    
197  =back  =back
198    
# Line 196  Line 200 
200    
201  sub StoreAttributeKey {  sub StoreAttributeKey {
202      # Get the parameters.      # Get the parameters.
203      my ($entityName, $attributeName, $type, $notes) = @_;      my ($self, $attributeName, $type, $notes, $groups) = @_;
204        # Declare the return variable.
205        my $retVal;
206      # Get the data type hash.      # Get the data type hash.
207      my %types = ERDB::GetDataTypes();      my %types = ERDB::GetDataTypes();
208      # Validate the initial input values.      # Validate the initial input values.
# Line 206  Line 212 
212          Confess("Missing or incomplete description for $attributeName.");          Confess("Missing or incomplete description for $attributeName.");
213      } elsif (! exists $types{$type}) {      } elsif (! exists $types{$type}) {
214          Confess("Invalid data type \"$type\" for $attributeName.");          Confess("Invalid data type \"$type\" for $attributeName.");
     }  
     # Our next step is to read in the XML for the database defintion. We  
     # need to verify that the named entity exists.  
     my $metadata = ERDB::ReadMetaXML($FIG_Config::attrDBD);  
     my $entityHash = $metadata->{Entities};  
     if (! exists $entityHash->{$entityName}) {  
         Confess("Entity $entityName not found.");  
215      } else {      } else {
216          # Okay, we're ready to begin. Get the entity hash and the field hash.          # Create a variable to hold the action to be displayed for the log (Add or Update).
217          my $entityData = $entityHash->{$entityName};          my $action;
218          my $fieldHash = ERDB::GetEntityFieldHash($metadata, $entityName);          # Okay, we're ready to begin. See if this key exists.
219          # Compute the attribute's relation name.          my $attribute = $self->GetEntity('AttributeKey', $attributeName);
220          my $relName = join("", $entityName, map { ucfirst $_ } split(/-|_/, $attributeName));          if (defined($attribute)) {
221          # Store the attribute's field data. Note the use of the "content" hash for              # It does, so we do an update.
222          # the notes. This is how the XML writer knows Notes is a text tag instead of              $action = "Update Key";
223          # an attribute.              $self->UpdateEntity('AttributeKey', $attributeName,
224          $fieldHash->{$attributeName} = { type => $type, relation => $relName,                                  { description => $notes, 'data-type' => $type });
225                                           Notes => { content => $notes } };              # Detach the key from its current groups.
226          # Insure we have an index for this attribute.              $self->Disconnect('IsInGroup', 'AttributeKey', $attributeName);
227          my $index = ERDB::FindIndexForEntity($metadata, $entityName, $attributeName);          } else {
228          if (! defined($index)) {              # It doesn't, so we do an insert.
229              push @{$entityData->{Indexes}}, { IndexFields => [ { name => $attributeName, order => 'ascending' } ],              $action = "Insert Key";
230                                                Notes       => "Alternate index provided for access by $attributeName." };              $self->InsertObject('AttributeKey', { id => $attributeName,
231          }                                  description => $notes, 'data-type' => $type });
232          # Write the XML back out.          }
233          ERDB::WriteMetaXML($metadata, $FIG_Config::attrDBD);          # Attach the key to the specified groups. (We presume the groups already
234      }          # exist.)
235      # Open a database with the new XML.          for my $group (@{$groups}) {
236      my $retVal = CustomAttributes->new();              $self->InsertObject('IsInGroup', { 'from-link' => $attributeName,
237      return $retVal;                                                 'to-link'   => $group });
 }  
   
 =head3 Refresh  
   
 C<< $attrDB->Refresh($fig); >>  
   
 Refresh the primary entity tables from the FIG data store. This method basically  
 drops and reloads the main tables of the custom attributes database.  
   
 =over 4  
   
 =item fig  
   
 FIG-like object that can be used to find genomes and features.  
   
 =back  
   
 =cut  
   
 sub Refresh {  
     # Get the parameters.  
     my ($self, $fig) = @_;  
     # Create load objects for the genomes and the features.  
     my $loadGenome = ERDBLoad->new($self, 'Genome', $FIG_Config::temp);  
     my $loadFeature = ERDBLoad->new($self, 'Feature', $FIG_Config::temp);  
     # Get the genome list.  
     my @genomes = $fig->genomes();  
     # Loop through the genomes.  
     for my $genomeID (@genomes) {  
         # Put this genome in the genome table.  
         $loadGenome->Put($genomeID);  
         Trace("Processing Genome $genomeID") if T(3);  
         # Put its features into the feature table. Note we have to use a hash to  
         # remove duplicates.  
         my %featureList = map { $_ => 1 } $fig->all_features($genomeID);  
         for my $fid (keys %featureList) {  
             $loadFeature->Put($fid);  
238          }          }
239            # Log the operation.
240            $self->LogOperation($action, $attributeName, "Group list is " . join(" ", @{$groups}));
241      }      }
     # Get a variable for holding statistics objects.  
     my $stats;  
     # Finish the genome load.  
     Trace("Loading Genome relation.") if T(2);  
     $stats = $loadGenome->FinishAndLoad();  
     Trace("Genome table load statistics:\n" . $stats->Show()) if T(3);  
     # Finish the feature load.  
     Trace("Loading Feature relation.") if T(2);  
     $stats = $loadFeature->FinishAndLoad();  
     Trace("Feature table load statistics:\n" . $stats->Show()) if T(3);  
242  }  }
243    
244  =head3 LoadAttributeKey  =head3 LoadAttributeKey
245    
246  C<< my $stats = $attrDB->LoadAttributeKey($entityName, $fieldName, $fh, $keyCol, $dataCol); >>  C<< my $stats = $attrDB->LoadAttributeKey($keyName, $fh, $keyCol, $dataCol, %options); >>
247    
248  Load the specified attribute from the specified file. The file should be a  Load the specified attribute from the specified file. The file should be a
249  tab-delimited file with internal tab and new-line characters escaped. This is  tab-delimited file with internal tab and new-line characters escaped. This is
250  the typical TBL-style file used by most FIG applications. One of the columns  the typical TBL-style file used by most FIG applications. One of the columns
251  in the input file must contain the appropriate key value and the other the  in the input file must contain the appropriate object id value and the other the
252  corresponding attribute value.  corresponding attribute value. The current contents of the attribute database will
253    be erased before loading, unless the options are used to override that behavior.
254    
255  =over 4  =over 4
256    
257  =item entityName  =item keyName
   
 Name of the entity containing the attribute.  
258    
259  =item fieldName  Key of the attribute to load.
   
 Name of the actual attribute.  
260    
261  =item fh  =item fh
262    
263  Open file handle for the input file.  Open file handle for the input file.
264    
265  =item keyCol  =item idCol
266    
267  Index (0-based) of the column containing the key field. The key field should  Index (0-based) of the column containing the ID field. The ID field should
268  contain the ID of an instance of the named entity.  contain the ID of an instance of the named entity.
269    
270  =item dataCol  =item dataCol
271    
272  Index (0-based) of the column containing the data value field.  Index (0-based) of the column containing the data value field.
273    
274    =item options
275    
276    Hash specifying the options for this load.
277    
278  =item RETURN  =item RETURN
279    
280  Returns a statistics object for the load process.  Returns a statistics object for the load process.
281    
282  =back  =back
283    
284    The available options are as follows.
285    
286    =over 4
287    
288    =item keep
289    
290    If specified, the existing attribute values will not be erased.
291    
292    =item archive
293    
294    If specified, the name of a file into which the incoming file should be saved.
295    
296    =back
297    
298  =cut  =cut
299    
300  sub LoadAttributeKey {  sub LoadAttributeKey {
301      # Get the parameters.      # Get the parameters.
302      my ($self, $entityName, $fieldName, $fh, $keyCol, $dataCol) = @_;      my ($self, $keyName, $fh, $idCol, $dataCol, %options) = @_;
303      # Create the return variable.      # Create the return variable.
304      my $retVal;      my $retVal = Stats->new("lineIn", "shortLine");
305      # Insure the entity exists.      # Compute the minimum number of fields required in each input line. The user specifies two
306      my $found = grep { $_ eq $entityName } $self->GetEntityTypes();      # columns, and we need to make sure both columns are in every record.
307      if (! $found) {      my $minCols = ($idCol < $dataCol ? $dataCol : $idCol) + 1;
308          Confess("Entity \"$entityName\" not found in database.");      Trace("Minimum column count is $minCols.") if T(3);
309      } else {      #
310          # Get the field structure for the named entity.      # Insure the attribute key exists.
311          my $fieldHash = $self->GetFieldTable($entityName);      my $found = $self->GetEntity('AttributeKey', $keyName);
312          # Verify that the attribute exists.      if (! defined $found) {
313          if (! exists $fieldHash->{$fieldName}) {          Confess("Attribute key \"$keyName\" not found in database.");
             Confess("Attribute key \"$fieldName\" does not exist in entity $entityName.");  
314          } else {          } else {
315              # Create a loader for the specified attribute. We need the          # Erase the key's current values (unless, of course, the caller specified the "keep" option.
316              # relation name first.          if (! $options{keep}) {
317              my $relName = $fieldHash->{$fieldName}->{relation};              $self->EraseAttribute($keyName);
318              my $loadAttribute = ERDBLoad->new($self, $relName, $FIG_Config::temp);          }
319            # Check for a save file. In the main loop, we'll know a save file is needed if $sh is
320            # defined.
321            my $sh;
322            if ($options{archive}) {
323                $sh = Open(undef, ">$options{archive}");
324                Trace("Attribute $keyName upload saved in $options{archive}.") if T(2);
325            }
326            # Save a list of the object IDs we need to add.
327            my %objectIDs = ();
328              # Loop through the input file.              # Loop through the input file.
329              while (! eof $fh) {              while (! eof $fh) {
330                  # Get the next line of the file.                  # Get the next line of the file.
331                  my @fields = Tracer::GetLine($fh);                  my @fields = Tracer::GetLine($fh);
332                  $loadAttribute->Add("lineIn");              $retVal->Add(lineIn => 1);
333                  # Now we need to validate the line.              my $count = scalar @fields;
334                  if ($#fields < $dataCol) {              Trace("Field count is $count. First field is \"$fields[0]\".") if T(4);
335                      $loadAttribute->Add("shortLine");              # Archive it if necessary.
336                  } elsif (! $self->Exists($entityName, $fields[$keyCol])) {              if (defined $sh) {
337                      $loadAttribute->Add("badKey");                  Tracer::PutLine($sh, \@fields);
338                }
339                # Now we need to check for comments and validate the line.
340                if ($fields[0] =~ /^\s*$/) {
341                    # Blank line: skip it.
342                    $retVal->Add(blank => 1);
343                } elsif (substr($fields[0],0,1) eq '#') {
344                    # Comment line: skip it.
345                    $retVal->Add(comment => 1);
346                } elsif ($count < $minCols) {
347                    # Line is too short: we have an error.
348                    $retVal->Add(shortLine => 1);
349                  } else {                  } else {
350                      # It's valid,so send it to the loader.                  # It's valid, so get the ID and value.
351                      $loadAttribute->Put($fields[$keyCol], $fields[$dataCol]);                  my ($id, $value) = ($fields[$idCol], $fields[$dataCol]);
352                      $loadAttribute->Add("lineUsed");                  # Denote we're using this input line.
353                  }                  $retVal->Add(lineUsed => 1);
354              }                  # Now we insert the attribute.
355              # Finish the load.                  $self->InsertObject('HasValueFor', { 'from-link' => $keyName,
356              $retVal = $loadAttribute->FinishAndLoad();                                                       'to-link' => $id,
357                                                         value => $value });
358                    $retVal->Add(newValue => 1);
359                }
360            }
361            # Log this operation.
362            $self->LogOperation("Load Key", $keyName, $retVal->Display());
363            # If there's an archive, close it.
364            if (defined $sh) {
365                close $sh;
366          }          }
367      }      }
368      # Return the statistics.      # Return the statistics.
# Line 375  Line 372 
372    
373  =head3 DeleteAttributeKey  =head3 DeleteAttributeKey
374    
375  C<< CustomAttributes::DeleteAttributeKey($entityName, $attributeName); >>  C<< my $stats = $attrDB->DeleteAttributeKey($attributeName); >>
376    
377  Delete an attribute from the custom attributes database.  Delete an attribute from the custom attributes database.
378    
379  =over 4  =over 4
380    
 =item entityName  
   
 Name of the entity possessing the attribute.  
   
381  =item attributeName  =item attributeName
382    
383  Name of the attribute to delete.  Name of the attribute to delete.
384    
385    =item RETURN
386    
387    Returns a statistics object describing the effects of the deletion.
388    
389  =back  =back
390    
391  =cut  =cut
392    
393  sub DeleteAttributeKey {  sub DeleteAttributeKey {
394      # Get the parameters.      # Get the parameters.
395      my ($entityName, $attributeName) = @_;      my ($self, $attributeName) = @_;
396      # Read in the XML for the database defintion. We need to verify that      # Delete the attribute key.
397      # the named entity exists and it has the named attribute.      my $retVal = $self->Delete('AttributeKey', $attributeName);
398      my $metadata = ERDB::ReadMetaXML($FIG_Config::attrDBD);      # Log this operation.
399      my $entityHash = $metadata->{Entities};      $self->LogOperation("Delete Key", $attributeName, "Key will no longer be available for use by anyone.");
400      if (! exists $entityHash->{$entityName}) {      # Return the result.
401          Confess("Entity \"$entityName\" not found.");      return $retVal;
402      } else {  
         # Get the field hash.  
         my $fieldHash = ERDB::GetEntityFieldHash($metadata, $entityName);  
         if (! exists $fieldHash->{$attributeName}) {  
             Confess("Attribute key \"$attributeName\" not found in entity $entityName.");  
         } else {  
             # Get the attribute's relation name.  
             my $relName = $fieldHash->{$attributeName}->{relation};  
             # Check for an index.  
             my $indexIdx = ERDB::FindIndexForEntity($metadata, $entityName, $attributeName);  
             if (defined($indexIdx)) {  
                 Trace("Index for $attributeName found at position $indexIdx for $entityName.") if T(3);  
                 delete $entityHash->{$entityName}->{Indexes}->[$indexIdx];  
             }  
             # Delete the attribute from the field hash.  
             Trace("Deleting attribute $attributeName from $entityName.") if T(3);  
             delete $fieldHash->{$attributeName};  
             # Write the XML back out.  
             ERDB::WriteMetaXML($metadata, $FIG_Config::attrDBD);  
             # Insure the relation does not exist in the database. This requires connecting  
             # since we may have to do a table drop.  
             my $attrDB = CustomAttributes->new();  
             Trace("Dropping table $relName.") if T(3);  
             $attrDB->DropRelation($relName);  
         }  
403      }      }
404    
405    =head3 NewName
406    
407    C<< my $text = CustomAttributes::NewName(); >>
408    
409    Return the string used to indicate the user wants to add a new attribute.
410    
411    =cut
412    
413    sub NewName {
414        return "(new)";
415  }  }
416    
417  =head3 ControlForm  =head3 ControlForm
418    
419  C<< my $formHtml = $attrDB->ControlForm($cgi, $name); >>  C<< my $formHtml = $attrDB->ControlForm($cgi, $name, \%keys); >>
420    
421  Return a form that can be used to control the creation and modification of  Return a form that can be used to control the creation and modification of
422  attributes.  attributes. Only a subset of the attribute keys will be displayed, as
423    determined by the incoming list.
424    
425  =over 4  =over 4
426    
# Line 447  Line 432 
432    
433  Name to give to the form. This should be unique for the web page.  Name to give to the form. This should be unique for the web page.
434    
435    =item keys
436    
437    Reference to a hash mapping attribute keys to n-tuples. Each tuple will contain the
438    attribute's data type, its description, and a list of the groups in which it participates.
439    
440  =item RETURN  =item RETURN
441    
442  Returns the HTML for a form that submits instructions to the C<Attributes.cgi> script  Returns the HTML for a form that can be used to  submit instructions to the C<Attributes.cgi> script
443  for loading, creating, or deleting an attribute.  for loading, creating, displaying, changing, or deleting an attribute. Note that only the form
444    controls are generated. The form tags are left to the caller.
445    
446  =back  =back
447    
# Line 458  Line 449 
449    
450  sub ControlForm {  sub ControlForm {
451      # Get the parameters.      # Get the parameters.
452      my ($self, $cgi, $name) = @_;      my ($self, $cgi, $name, $keys) = @_;
453      # Declare the return list.      # Declare the return list.
454      my @retVal = ();      my @retVal = ();
     # Start the form. We use multipart to support the upload control.  
     push @retVal, $cgi->start_multipart_form(-name => $name);  
455      # We'll put the controls in a table. Nothing else ever seems to look nice.      # We'll put the controls in a table. Nothing else ever seems to look nice.
456      push @retVal, $cgi->start_table({ border => 2, cellpadding => 2 });      push @retVal, $cgi->start_table({ border => 2, cellpadding => 2 });
457      # The first row is for selecting the field name.      # The first row is for selecting the field name.
458      push @retVal, $cgi->Tr($cgi->th("Select a Field"),      push @retVal, $cgi->Tr($cgi->th("Select a Field"),
459                             $cgi->td($self->FieldMenu($cgi, 10, 'fieldName', 1,                             $cgi->td($self->FieldMenu($cgi, 10, 'fieldName', $keys,
460                                                       "document.$name.notes.value",                                                       new => 1,
461                                                       "document.$name.dataType.value")));                                                       notes => "document.$name.notes.value",
462                                                         type => "document.$name.dataType.value",
463                                                         groups => "document.$name.groups")));
464      # Now we set up a dropdown for the data types. The values will be the      # Now we set up a dropdown for the data types. The values will be the
465      # data type names, and the labels will be the descriptions.      # data type names, and the labels will be the descriptions.
466      my %types = ERDB::GetDataTypes();      my %types = ERDB::GetDataTypes();
467      my %labelMap = map { $_ => $types{$_}->{notes} } keys %types;      my %labelMap = map { $_ => $types{$_}->{notes} } keys %types;
468      my $typeMenu = $cgi->popup_menu(-name   => 'dataType',      my $typeMenu = $cgi->popup_menu(-name   => 'dataType',
469                                      -values => [sort keys %types],                                      -values => [sort keys %types],
470                                      -labels => \%labelMap);                                      -labels => \%labelMap,
471                                        -default => 'string');
472        # Allow the user to specify a new field name. This is required if the
473        # user has selected the "(new)" marker. We put a little scriptlet in here that
474        # selects the (new) marker when the user enters the field.
475        push @retVal, "<script language=\"javaScript\">";
476        my $fieldField = "document.$name.fieldName";
477        my $newName = "\"" . NewName() . "\"";
478        push @retVal, $cgi->Tr($cgi->th("New Field Name"),
479                               $cgi->td($cgi->textfield(-name => 'newName',
480                                                        -size => 30,
481                                                        -value => "",
482                                                        -onFocus => "setIfEmpty($fieldField, $newName);")),
483                                        );
484      push @retVal, $cgi->Tr($cgi->th("Data type"),      push @retVal, $cgi->Tr($cgi->th("Data type"),
485                             $cgi->td($typeMenu));                             $cgi->td($typeMenu));
486      # The next row is for the notes.      # The next row is for the notes.
# Line 485  Line 489 
489                                                     -rows => 6,                                                     -rows => 6,
490                                                     -columns => 80))                                                     -columns => 80))
491                            );                            );
492      # Allow the user to specify a new field name. This is required if the      # Now we have the groups, which are implemented as a checkbox group.
493      # user has selected one of the "(new)" markers.      my @groups = $self->GetGroups();
494      push @retVal, $cgi->Tr($cgi->th("New Field Name"),      push @retVal, $cgi->Tr($cgi->th("Groups"),
495                             $cgi->td($cgi->textfield(-name => 'newName',                             $cgi->td($cgi->checkbox_group(-name=>'groups',
496                                                      -size => 30)),                                      -values=> \@groups))
497                                      );                                      );
498      # If the user wants to upload new values for the field, then we have      # If the user wants to upload new values for the field, then we have
499      # an upload file name and column indicators.      # an upload file name and column indicators.
# Line 506  Line 510 
510                                                      -default => 1)                                                      -default => 1)
511                                     ),                                     ),
512                            );                            );
513      # Now the three buttons: UPDATE, SHOW, and DELETE.      # Now the three buttons: STORE, SHOW, and DELETE.
514      push @retVal, $cgi->Tr($cgi->th("&nbsp;"),      push @retVal, $cgi->Tr($cgi->th("&nbsp;"),
515                             $cgi->td({align => 'center'},                             $cgi->td({align => 'center'},
516                                      $cgi->submit(-name => 'Delete', -value => 'DELETE') . " " .                                      $cgi->submit(-name => 'Delete', -value => 'DELETE') . " " .
# Line 516  Line 520 
520                            );                            );
521      # Close the table and the form.      # Close the table and the form.
522      push @retVal, $cgi->end_table();      push @retVal, $cgi->end_table();
     push @retVal, $cgi->end_form();  
523      # Return the assembled HTML.      # Return the assembled HTML.
524      return join("\n", @retVal, "");      return join("\n", @retVal, "");
525  }  }
526    
527    =head3 LoadAttributesFrom
528    
529    C<< my $stats = $attrDB->LoadAttributesFrom($fileName, %options); >>
530    
531    Load attributes from the specified tab-delimited file. Each line of the file must
532    contain an object ID in the first column, an attribute key name in the second
533    column, and attribute values in the remaining columns. The attribute values will
534    be assembled into a single value using the splitter code.
535    
536    =over 4
537    
538    =item fileName
539    
540    Name of the file from which to load the attributes.
541    
542    =item options
543    
544    Hash of options for modifying the load process.
545    
546    =item RETURN
547    
548    Returns a statistics object describing the load.
549    
550    =back
551    
552    Permissible option values are as follows.
553    
554    =over 4
555    
556    =item append
557    
558    If TRUE, then the attributes will be appended to existing data; otherwise, the
559    first time a key name is encountered, it will be erased.
560    
561    =back
562    
563    =cut
564    
565    sub LoadAttributesFrom {
566        # Get the parameters.
567        my ($self, $fileName, %options) = @_;
568        # Declare the return variable.
569        my $retVal = Stats->new('keys', 'values');
570        # Check for append mode.
571        my $append = ($options{append} ? 1 : 0);
572        # Create a hash of key names found.
573        my %keyHash = ();
574        # Open the file for input.
575        my $fh = Open(undef, "<$fileName");
576        # Loop through the file.
577        while (! eof $fh) {
578            my ($id, $key, @values) = Tracer::GetLine($fh);
579            $retVal->Add(linesIn => 1);
580            # Do some validation.
581            if (! defined($id)) {
582                # We ignore blank lines.
583                $retVal->Add(blankLines => 1);
584            } elsif (! defined($key)) {
585                # An ID without a key is a serious error.
586                my $lines = $retVal->Ask('linesIn');
587                Confess("Line $lines in $fileName has no attribute key.");
588            } else {
589                # Now we need to check for a new key.
590                if (! exists $keyHash{$key}) {
591                    # This is a new key. Verify that it exists.
592                    if (! $self->Exists('AttributeKey', $key)) {
593                        my $line = $retVal->Ask('linesIn');
594                        Confess("Attribute \"$key\" on line $line of $fileName not found in database.");
595                    } else {
596                        # Make sure we know this is no longer a new key.
597                        $keyHash{$key} = 1;
598                        $retVal->Add(keys => 1);
599                        # If this is NOT append mode, erase the key.
600                        if (! $append) {
601                            $self->EraseAttribute($key);
602                        }
603                    }
604                    Trace("Key $key found.") if T(3);
605                }
606                # Now we know the key is valid. Add this value.
607                $self->AddAttribute($id, $key, @values);
608                my $progress = $retVal->Add(values => 1);
609                Trace("$progress values loaded.") if T(3) && ($progress % 1000 == 0);
610    
611            }
612        }
613        # Return the result.
614        return $retVal;
615    }
616    
617    =head3 BackupKeys
618    
619    C<< my $stats = $attrDB->BackupKeys($fileName, %options); >>
620    
621    Backup the attribute key information from the attribute database.
622    
623    =over 4
624    
625    =item fileName
626    
627    Name of the output file.
628    
629    =item options
630    
631    Options for modifying the backup process.
632    
633    =item RETURN
634    
635    Returns a statistics object for the backup.
636    
637    =back
638    
639    Currently there are no options. The backup is straight to a text file in
640    tab-delimited format. Each key is backup up to two lines. The first line
641    is all of the data from the B<AttributeKey> table. The second is a
642    tab-delimited list of all the groups.
643    
644    =cut
645    
646    sub BackupKeys {
647        # Get the parameters.
648        my ($self, $fileName, %options) = @_;
649        # Declare the return variable.
650        my $retVal = Stats->new();
651        # Open the output file.
652        my $fh = Open(undef, ">$fileName");
653        # Set up to read the keys.
654        my $keyQuery = $self->Get(['AttributeKey'], "", []);
655        # Loop through the keys.
656        while (my $keyData = $keyQuery->Fetch()) {
657            $retVal->Add(key => 1);
658            # Get the fields.
659            my ($id, $type, $description) = $keyData->Values(['AttributeKey(id)', 'AttributeKey(data-type)',
660                                                              'AttributeKey(description)']);
661            # Escape any tabs or new-lines in the description.
662            my $escapedDescription = Tracer::Escape($description);
663            # Write the key data to the output.
664            Tracer::PutLine($fh, [$id, $type, $escapedDescription]);
665            # Get the key's groups.
666            my @groups = $self->GetFlat(['IsInGroup'], "IsInGroup(from-link) = ?", [$id],
667                                        'IsInGroup(to-link)');
668            $retVal->Add(memberships => scalar(@groups));
669            # Write them to the output. Note we put a marker at the beginning to insure the line
670            # is nonempty.
671            Tracer::PutLine($fh, ['#GROUPS', @groups]);
672        }
673        # Log the operation.
674        $self->LogOperation("Backup Keys", $fileName, $retVal->Display());
675        # Return the result.
676        return $retVal;
677    }
678    
679    =head3 RestoreKeys
680    
681    C<< my $stats = $attrDB->RestoreKeys($fileName, %options); >>
682    
683    Restore the attribute keys and groups from a backup file.
684    
685    =over 4
686    
687    =item fileName
688    
689    Name of the file containing the backed-up keys. Each key has a pair of lines,
690    one containing the key data and one listing its groups.
691    
692    =back
693    
694    =cut
695    
696    sub RestoreKeys {
697        # Get the parameters.
698        my ($self, $fileName, %options) = @_;
699        # Declare the return variable.
700        my $retVal = Stats->new();
701        # Set up a hash to hold the group IDs.
702        my %groups = ();
703        # Open the file.
704        my $fh = Open(undef, "<$fileName");
705        # Loop until we're done.
706        while (! eof $fh) {
707            # Get a key record.
708            my ($id, $dataType, $description) = Tracer::GetLine($fh);
709            if ($id eq '#GROUPS') {
710                Confess("Group record found when key record expected.");
711            } elsif (! defined($description)) {
712                Confess("Invalid format found for key record.");
713            } else {
714                $retVal->Add("keyIn" => 1);
715                # Add this key to the database.
716                $self->InsertObject('AttributeKey', { id => $id, 'data-type' => $dataType,
717                                                      description => Tracer::UnEscape($description) });
718                Trace("Attribute $id stored.") if T(3);
719                # Get the group line.
720                my ($marker, @groups) = Tracer::GetLine($fh);
721                if (! defined($marker)) {
722                    Confess("End of file found where group record expected.");
723                } elsif ($marker ne '#GROUPS') {
724                    Confess("Group record not found after key record.");
725                } else {
726                    $retVal->Add(memberships => scalar(@groups));
727                    # Connect the groups.
728                    for my $group (@groups) {
729                        # Find out if this is a new group.
730                        if (! $groups{$group}) {
731                            $retVal->Add(newGroup => 1);
732                            # Add the group.
733                            $self->InsertObject('AttributeGroup', { id => $group });
734                            Trace("Group $group created.") if T(3);
735                            # Make sure we know it's not new.
736                            $groups{$group} = 1;
737                        }
738                        # Connect the group to our key.
739                        $self->InsertObject('IsInGroup', { 'from-link' => $id, 'to-link' => $group });
740                    }
741                    Trace("$id added to " . scalar(@groups) . " groups.") if T(3);
742                }
743            }
744        }
745        # Log the operation.
746        $self->LogOperation("Backup Keys", $fileName, $retVal->Display());
747        # Return the result.
748        return $retVal;
749    }
750    
751    
752    =head3 BackupAllAttributes
753    
754    C<< my $stats = $attrDB->BackupAllAttributes($fileName, %options); >>
755    
756    Backup all of the attributes to a file. The attributes will be stored in a
757    tab-delimited file suitable for reloading via L</LoadAttributesFrom>.
758    
759    =over 4
760    
761    =item fileName
762    
763    Name of the file to which the attribute data should be backed up.
764    
765    =item options
766    
767    Hash of options for the backup.
768    
769    =item RETURN
770    
771    Returns a statistics object describing the backup.
772    
773    =back
774    
775    Currently there are no options defined.
776    
777    =cut
778    
779    sub BackupAllAttributes {
780        # Get the parameters.
781        my ($self, $fileName, %options) = @_;
782        # Declare the return variable.
783        my $retVal = Stats->new();
784        # Get a list of the keys.
785        my @keys = $self->GetFlat(['AttributeKey'], "", [], 'AttributeKey(id)');
786        Trace(scalar(@keys) . " keys found during backup.") if T(2);
787        # Open the file for output.
788        my $fh = Open(undef, ">$fileName");
789        # Loop through the keys.
790        for my $key (@keys) {
791            Trace("Backing up attribute $key.") if T(3);
792            $retVal->Add(keys => 1);
793            # Loop through this key's values.
794            my $query = $self->Get(['HasValueFor'], "HasValueFor(from-link) = ?", [$key]);
795            my $valuesFound = 0;
796            while (my $line = $query->Fetch()) {
797                $valuesFound++;
798                # Get this row's data.
799                my @row = $line->Values(['HasValueFor(to-link)', 'HasValueFor(from-link)',
800                                         'HasValueFor(value)']);
801                # Write it to the file.
802                Tracer::PutLine($fh, \@row);
803            }
804            Trace("$valuesFound values backed up for key $key.") if T(3);
805            $retVal->Add(values => $valuesFound);
806        }
807        # Log the operation.
808        $self->LogOperation("Backup Data", $fileName, $retVal->Display());
809        # Return the result.
810        return $retVal;
811    }
812    
813  =head3 FieldMenu  =head3 FieldMenu
814    
815  C<< my $menuHtml = $attrDB->FieldMenu($cgi, $height, $name, $newFlag, $noteControl, $typeControl); >>  C<< my $menuHtml = $attrDB->FieldMenu($cgi, $height, $name, $keys, %options); >>
816    
817  Return the HTML for a menu to select an attribute field. The menu will  Return the HTML for a menu to select an attribute field. The menu will
818  be a standard SELECT/OPTION thing which is called "popup menu" in the  be a standard SELECT/OPTION thing which is called "popup menu" in the
819  CGI package, but actually looks like a list. The list will contain  CGI package, but actually looks like a list. The list will contain
820  one selectable row per field, grouped by entity.  one selectable row per field.
821    
822  =over 4  =over 4
823    
# Line 545  Line 834 
834  Name to give to the menu. This is the name under which the value will  Name to give to the menu. This is the name under which the value will
835  appear when the form is submitted.  appear when the form is submitted.
836    
837  =item newFlag (optional)  =item keys
838    
839    Reference to a hash mapping each attribute key name to a list reference,
840    the list itself consisting of the attribute data type, its description,
841    and a list of its groups.
842    
843    =item options
844    
845    Hash containing options that modify the generation of the menu.
846    
847    =item RETURN
848    
849    Returns the HTML to create a form field that can be used to select an
850    attribute from the custom attributes system.
851    
852    =back
853    
854    The permissible options are as follows.
855    
856    =over 4
857    
858    =item new
859    
860  If TRUE, then extra rows will be provided to allow the user to select  If TRUE, then extra rows will be provided to allow the user to select
861  a new attribute. In other words, the user can select an existing  a new attribute. In other words, the user can select an existing
862  attribute, or can choose a C<(new)> marker to indicate a field to  attribute, or can choose a C<(new)> marker to indicate a field to
863  be created in the parent entity.  be created in the parent entity.
864    
865  =item noteControl (optional)  =item notes
866    
867  If specified, the name of a variable for displaying the notes attached  If specified, the name of a variable for displaying the notes attached
868  to the field. This must be in Javascript form ready for assignment.  to the field. This must be in Javascript form ready for assignment.
# Line 563  Line 873 
873  it is copied in. Specifying this parameter generates Javascript for  it is copied in. Specifying this parameter generates Javascript for
874  displaying the field description when a field is selected.  displaying the field description when a field is selected.
875    
876  =item typeControl (optional)  =item type
877    
878  If specified, the name of a variable for displaying the field's  If specified, the name of a variable for displaying the field's
879  data type. Data types are a much more controlled vocabulary than  data type. Data types are a much more controlled vocabulary than
# Line 571  Line 881 
881  raw value is put into the specified variable. Otherwise, the same  raw value is put into the specified variable. Otherwise, the same
882  rules apply to this value that apply to I<$noteControl>.  rules apply to this value that apply to I<$noteControl>.
883    
884  =item RETURN  =item groups
885    
886  Returns the HTML to create a form field that can be used to select an  If specified, the name of a multiple-selection list control (also called
887  attribute from the custom attributes system.  a popup menu) which shall be used to display the selected groups.
888    
889  =back  =back
890    
# Line 582  Line 892 
892    
893  sub FieldMenu {  sub FieldMenu {
894      # Get the parameters.      # Get the parameters.
895      my ($self, $cgi, $height, $name, $newFlag, $noteControl, $typeControl) = @_;      my ($self, $cgi, $height, $name, $keys, %options) = @_;
896      # These next two hashes make everything happen. "entities"      # Reformat the list of keys.
897      # maps each entity name to the list of values to be put into its      my %keys = %{$keys};
898      # option group. "labels" maps each entity name to a map from values      # Add the (new) key, if needed.
899      # to labels.      if ($options{new}) {
900      my @entityNames = sort ($self->GetEntityTypes());          $keys{NewName()} = ["string", ""];
901      my %entities = map { $_ => [] } @entityNames;      }
902      my %labels = map { $_ => { }} @entityNames;      # Get a sorted list of key.
903      # Loop through the entities, adding the existing attributes.      my @keys = sort keys %keys;
904      for my $entity (@entityNames) {      # We need to create the name for the onChange function. This function
         # Get this entity's field table.  
         my $fieldHash = $self->GetFieldTable($entity);  
         # Get its field list in our local hashes.  
         my $fieldList = $entities{$entity};  
         my $labelList = $labels{$entity};  
         # Add the NEW fields if we want them.  
         if ($newFlag) {  
             push @{$fieldList}, $entity;  
             $labelList->{$entity} = "(new)";  
         }  
         # Loop through the fields in the hash. We only keep the ones with a  
         # secondary relation name. (In other words, the name of the relation  
         # in which the field appears cannot be the same as the entity name.)  
         for my $fieldName (sort keys %{$fieldHash}) {  
             if ($fieldHash->{$fieldName}->{relation} ne $entity) {  
                 my $value = "$entity/$fieldName";  
                 push @{$fieldList}, $value;  
                 $labelList->{$value} = $fieldName;  
             }  
         }  
     }  
     # Now we have a hash and a list for each entity, and they correspond  
     # exactly to what the $cgi->optgroup function expects.  
     # The last step is to create the name for the onChange function. This function  
905      # may not do anything, but we need to know the name to generate the HTML      # may not do anything, but we need to know the name to generate the HTML
906      # for the menu.      # for the menu.
907      my $changeName = "${name}_setNotes";      my $changeName = "${name}_setNotes";
908      my $retVal = $cgi->popup_menu({name => $name,      my $retVal = $cgi->popup_menu({name => $name,
909                                     size => $height,                                     size => $height,
910                                     onChange => "$changeName(this.value)",                                     onChange => "$changeName(this.value)",
911                                     values => [map { $cgi->optgroup(-name => $_,                                     values => \@keys,
912                                                                     -values => $entities{$_},                                    });
                                                                    -labels => $labels{$_})  
                                                   } @entityNames]}  
                                  );  
913      # Create the change function.      # Create the change function.
914      $retVal .= "\n<script language=\"javascript\">\n";      $retVal .= "\n<script language=\"javascript\">\n";
915      $retVal .= "    function $changeName(fieldValue) {\n";      $retVal .= "    function $changeName(fieldValue) {\n";
916      # The function only has a body if we have a notes control to store the description.      # The function only has a body if we have a control to store data about the
917      if ($noteControl || $typeControl) {      # attribute.
918        if ($options{notes} || $options{type} || $options{groups}) {
919          # Check to see if we're storing HTML or text into the note control.          # Check to see if we're storing HTML or text into the note control.
920            my $noteControl = $options{notes};
921          my $htmlMode = ($noteControl && $noteControl =~ /innerHTML$/);          my $htmlMode = ($noteControl && $noteControl =~ /innerHTML$/);
922          # We use a CASE statement based on the newly-selected field value. The          # We use a CASE statement based on the newly-selected field value. The
923          # field description will be stored in the JavaScript variable "myText"          # field description will be stored in the JavaScript variable "myText"
# Line 641  Line 926 
926          $retVal .= "        var myText = \"\";\n";          $retVal .= "        var myText = \"\";\n";
927          $retVal .= "        var myType = \"string\";\n";          $retVal .= "        var myType = \"string\";\n";
928          $retVal .= "        switch (fieldValue) {\n";          $retVal .= "        switch (fieldValue) {\n";
929          # Loop through the entities.          # Loop through the keys.
930          for my $entity (@entityNames) {          for my $key (@keys) {
             # Get the entity's field hash. This has the notes in it.  
             my $fieldHash = $self->GetFieldTable($entity);  
             # Loop through the values we might see for this entity's fields.  
             my $fields = $entities{$entity};  
             for my $value (@{$fields}) {  
                 # Only proceed if we have an existing field.  
                 if ($value =~ m!/(.+)$!) {  
                     # Get the field's hash element.  
                     my $element = $fieldHash->{$1};  
931                      # Generate this case.                      # Generate this case.
932                      $retVal .= "        case \"$value\" :\n";              $retVal .= "        case \"$key\" :\n";
933                      # Here we either want to update the note display, the                      # Here we either want to update the note display, the
934                      # type display, or both.              # type display, the group list, or a combination of them.
935                my ($type, $notes, @groups) = @{$keys{$key}};
936                      if ($noteControl) {                      if ($noteControl) {
                         # Here we want the notes updated.  
                         my $notes = $element->{Notes}->{content};  
937                          # Insure it's in the proper form.                          # Insure it's in the proper form.
938                          if ($htmlMode) {                          if ($htmlMode) {
939                              $notes = ERDB::HTMLNote($notes);                              $notes = ERDB::HTMLNote($notes);
# Line 668  Line 943 
943                          $notes =~ s/"/\\"/g;                          $notes =~ s/"/\\"/g;
944                          $retVal .= "           myText = \"$notes\";\n";                          $retVal .= "           myText = \"$notes\";\n";
945                      }                      }
946                      if ($typeControl) {              if ($options{type}) {
947                          # Here we want the type updated.                          # Here we want the type updated.
                         my $type = $element->{type};  
948                          $retVal .= "           myType = \"$type\";\n";                          $retVal .= "           myType = \"$type\";\n";
949                      }                      }
950                if ($options{groups}) {
951                    # Here we want the groups shown. Get a list of this attribute's groups.
952                    # We'll search through this list for each group to see if it belongs with
953                    # our attribute.
954                    my $groupLiteral = "=" . join("=", @groups) . "=";
955                    # Now we need some variables containing useful code for the javascript. It's
956                    # worth knowing we go through a bit of pain to insure $groupField[i] isn't
957                    # parsed as an array element.
958                    my $groupField = $options{groups};
959                    my $currentField = $groupField . "[i]";
960                    # Do the javascript.
961                    $retVal .= "           var groupList = \"$groupLiteral\";\n";
962                    $retVal .= "           for (var i = 0; i < $groupField.length; i++) {\n";
963                    $retVal .= "              var srchString = \"=\" + $currentField.value + \"=\";\n";
964                    $retVal .= "              var srchLoc = groupList.indexOf(srchString);\n";
965                    $retVal .= "              $currentField.checked = (srchLoc >= 0);\n";
966                    $retVal .= "           }\n";
967                }
968                      # Close this case.                      # Close this case.
969                      $retVal .= "           break;\n";                      $retVal .= "           break;\n";
970                  }                  }
             }  
         }  
971          # Close the CASE statement and make the appropriate assignments.          # Close the CASE statement and make the appropriate assignments.
972          $retVal .= "        }\n";          $retVal .= "        }\n";
973          if ($noteControl) {          if ($noteControl) {
974              $retVal .= "        $noteControl = myText;\n";              $retVal .= "        $noteControl = myText;\n";
975          }          }
976          if ($typeControl) {          if ($options{type}) {
977              $retVal .= "        $typeControl = myType;\n";              $retVal .= "        $options{type} = myType;\n";
978          }          }
979      }      }
980      # Terminate the change function.      # Terminate the change function.
# Line 694  Line 984 
984      return $retVal;      return $retVal;
985  }  }
986    
987  =head3 MatchSqlPattern  =head3 GetGroups
988    
989    C<< my @groups = $attrDB->GetGroups(); >>
990    
991    Return a list of the available groups.
992    
993    =cut
994    
995    sub GetGroups {
996        # Get the parameters.
997        my ($self) = @_;
998        # Get the groups.
999        my @retVal = $self->GetFlat(['AttributeGroup'], "", [], 'AttributeGroup(id)');
1000        # Return them.
1001        return @retVal;
1002    }
1003    
1004    =head3 GetAttributeData
1005    
1006  C<< my $matched = CustomAttributes::MatchSqlPattern($value, $pattern); >>  C<< my %keys = $attrDB->GetAttributeData($type, @list); >>
1007    
1008  Determine whether or not a specified value matches an SQL pattern. An SQL  Return attribute data for the selected attributes. The attribute
1009  pattern has two wild card characters: C<%> that matches multiple characters,  data is a hash mapping each attribute key name to a n-tuple containing the
1010  and C<_> that matches a single character. These can be escaped using a  data type, the description, and the groups. This is the same format expected in
1011  backslash (C<\>). We pull this off by converting the SQL pattern to a  the L</FieldMenu> and L</ControlForm> methods for the list of attributes to display.
 PERL regular expression. As per SQL rules, the match is case-insensitive.  
1012    
1013  =over 4  =over 4
1014    
1015  =item value  =item type
1016    
1017  Value to be matched against the pattern. Note that an undefined or empty  Type of attribute criterion: C<name> for attributes whose names begin with the
1018  value will not match anything.  specified string, or C<group> for attributes in the specified group.
1019    
1020  =item pattern  =item list
1021    
1022  SQL pattern against which to match the value. An undefined or empty pattern will  List containing the names of the groups or keys for the desired attributes.
 match everything.  
1023    
1024  =item RETURN  =item RETURN
1025    
1026  Returns TRUE if the value and pattern match, else FALSE.  Returns a hash mapping each attribute key name to its data type, description, and
1027    parent groups.
1028    
1029  =back  =back
1030    
1031  =cut  =cut
1032    
1033  sub MatchSqlPattern {  sub GetAttributeData {
1034      # Get the parameters.      # Get the parameters.
1035      my ($value, $pattern) = @_;      my ($self, $type, @list) = @_;
1036      # Declare the return variable.      # Set up a hash to store the attribute data.
1037      my $retVal;      my %retVal = ();
1038      # Insure we have a pattern.      # Loop through the list items.
1039      if (! defined($pattern) || $pattern eq "") {      for my $item (@list) {
1040          $retVal = 1;          # Set up a query for the desired attributes.
1041      } else {          my $query;
1042          # Break the pattern into pieces around the wildcard characters. Because we          if ($type eq 'name') {
1043          # use parentheses in the split function's delimiter expression, we'll get              # Here we're doing a generic name search. We need to escape it and then tack
1044          # list elements for the delimiters as well as the rest of the string.              # on a %.
1045          my @pieces = split /([_%]|\\[_%])/, $pattern;              my $parm = $item;
1046          # Check some fast special cases.              $parm =~ s/_/\\_/g;
1047          if ($pattern eq '%') {              $parm =~ s/%/\\%/g;
1048              # A null pattern matches everything.              $parm .= "%";
1049              $retVal = 1;              # Ask for matching attributes. (Note that if the user passed in a null string
1050          } elsif (@pieces == 1) {              # he'll get everything.)
1051              # No wildcards, so we have a literal comparison. Note we're case-insensitive.              $query = $self->Get(['AttributeKey'], "AttributeKey(id) LIKE ?", [$parm]);
1052              $retVal = (lc($value) eq lc($pattern));          } elsif ($type eq 'group') {
1053          } elsif (@pieces == 2 && $pieces[1] eq '%') {              $query = $self->Get(['IsInGroup', 'AttributeKey'], "IsInGroup(to-link) = ?", [$item]);
             # A wildcard at the end, so we have a substring match. This is also case-insensitive.  
             $retVal = (lc(substr($value, 0, length($pieces[0]))) eq lc($pieces[0]));  
         } else {  
             # Okay, we have to do it the hard way. Convert each piece to a PERL pattern.  
             my $realPattern = "";  
             for my $piece (@pieces) {  
                 # Determine the type of piece.  
                 if ($piece eq "") {  
                     # Empty pieces are ignored.  
                 } elsif ($piece eq "%") {  
                     # Here we have a multi-character wildcard. Note that it can match  
                     # zero or more characters.  
                     $realPattern .= ".*"  
                 } elsif ($piece eq "_") {  
                     # Here we have a single-character wildcard.  
                     $realPattern .= ".";  
                 } elsif ($piece eq "\\%" || $piece eq "\\_") {  
                     # This is an escape sequence (which is a rare thing, actually).  
                     $realPattern .= substr($piece, 1, 1);  
1054                  } else {                  } else {
1055                      # Here we have raw text.              Confess("Unknown attribute query type \"$type\".");
                     $realPattern .= quotemeta($piece);  
1056                  }                  }
1057            while (my $row = $query->Fetch()) {
1058                # Get this attribute's data.
1059                my ($key, $type, $notes) = $row->Values(['AttributeKey(id)', 'AttributeKey(data-type)',
1060                                                         'AttributeKey(description)']);
1061                # If it's new, get its groups and add it to the return hash.
1062                if (! exists $retVal{$key}) {
1063                    my @groups = $self->GetFlat(['IsInGroup'], "IsInGroup(from-link) = ?",
1064                                                [$key], 'IsInGroup(to-link)');
1065                    $retVal{$key} = [$type, $notes, @groups];
1066              }              }
             # Do the match.  
             $retVal = ($value =~ /^$realPattern$/i ? 1 : 0);  
1067          }          }
1068      }      }
1069      # Return the result.      # Return the result.
1070      return $retVal;      return %retVal;
1071  }  }
1072    
1073  =head3 MigrateAttributes  =head3 LogOperation
1074    
1075  C<< CustomAttributes::MigrateAttributes($fig); >>  C<< $ca->LogOperation($action, $target, $description); >>
1076    
1077  Migrate all the attributes data from the specified FIG instance. This is a long, slow  Write an operation description to the attribute activity log (C<$FIG_Config::var/attributes.log>).
 method used to convert the old attribute data to the new system. Only attribute  
 keys that are not already in the database will be loaded, and only for entity instances  
 current in the database. To get an accurate capture of the attributes in the given  
 instance, you may want to clear the database and the DBD before starting and  
 run L</Refresh> to populate the entities.  
1078    
1079  =over 4  =over 4
1080    
1081  =item fig  =item action
1082    
1083    Action being logged (e.g. C<Delete Group> or C<Load Key>).
1084    
1085    =item target
1086    
1087    ID of the key or group affected.
1088    
1089  A FIG object that can be used to retrieve attributes for migration purposes.  =item description
1090    
1091    Short description of the action.
1092    
1093  =back  =back
1094    
1095  =cut  =cut
1096    
1097  sub MigrateAttributes {  sub LogOperation {
1098      # Get the parameters.      # Get the parameters.
1099      my ($fig) = @_;      my ($self, $action, $target, $description) = @_;
1100      # Get a list of the objects to migrate. This requires connecting. Note we      # Get the user ID.
1101      # will map each entity type to a file name. The file will contain a list      my $user = $self->{user};
1102      # of the object's IDs so we can get to them when we're not connected to      # Get a timestamp.
1103      # the database.      my $timeString = Tracer::Now();
1104      my $ca = CustomAttributes->new();      # Open the log file for appending.
1105      my %objects = map { $_ => "$FIG_Config::temp/$_.keys.tbl" } $ca->GetEntityTypes();      my $oh = Open(undef, ">>$FIG_Config::var/attributes.log");
1106      # Set up hash of the existing attribute keys for each entity type.      # Write the data to it.
1107      my %oldKeys = ();      Tracer::PutLine($oh, [$timeString, $user, $action, $target, $description]);
1108      # Finally, we have a hash that counts the IDs for each entity type.      # Close the log file.
1109      my %idCounts = map { $_ => 0 } keys %objects;      close $oh;
1110      # Loop through the list, creating key files to read back in.  }
1111      for my $entityType (keys %objects) {  
1112          Trace("Retrieving keys for $entityType.") if T(2);  =head2 Internal Utility Methods
1113          # Create the key file.  
1114          my $idFile = Open(undef, ">$objects{$entityType}");  =head3 _KeywordString
1115          # Loop through the keys.  
1116          my @ids = $ca->GetFlat([$entityType], "", [], "$entityType(id)");  C<< my $keywordString = $ca->_KeywordString($key, $value); >>
1117          for my $id (@ids) {  
1118              print $idFile "$id\n";  Compute the keyword string for a specified key/value pair. This consists of the
1119          }  key name and value converted to lower case with underscores translated to spaces.
1120          close $idFile;  
1121          # In addition to the key file, we must get a list of attributes already  This method is for internal use only. It is called whenever we need to update or
1122          # in the database. This avoids a circularity problem that might occur if the $fig  insert a B<HasValueFor> record.
1123          # object is retrieving from the custom attributes database already.  
1124          my %fields = $ca->GetSecondaryFields($entityType);  =over 4
1125          $oldKeys{$entityType} = \%fields;  
1126          # Finally, we have the ID count.  =item key
1127          $idCounts{$entityType} = scalar @ids;  
1128      }  Name of the relevant attribute key.
1129      # Release the custom attributes database so we can add attributes.  
1130      undef $ca;  =item target
1131      # Loop through the objects.  
1132      for my $entityType (keys %objects) {  ID of the target object to which this key/value pair will be associated.
1133          # Get a hash of all the attributes already in this database. These are  
1134          # left untouched.  =item value
1135          my $myOldKeys = $oldKeys{$entityType};  
1136          # Create a hash to control the load file names for each attribute key we find.  The value to store for this key/object combination.
1137          my %keyHash = ();  
1138          # Set up some counters so we can trace our progress.  =item RETURN
1139          my ($totalIDs, $processedIDs, $keyCount, $valueCount) = ($idCounts{$entityType}, 0, 0, 0);  
1140          # Open this object's ID file.  Returns the value that should be stored as the keyword string for the specified
1141          Trace("Migrating data for $entityType. $totalIDs found.") if T(3);  key/value pair.
1142          my $keysIn = Open(undef, "<$objects{$entityType}");  
1143          while (my $id = <$keysIn>) {  =back
1144              # Remove the EOL characters.  
1145              chomp $id;  =cut
1146              # Get this object's attributes.  
1147              my @allData = $fig->get_attributes($id);  sub _KeywordString {
1148              Trace(scalar(@allData) . " attribute values found for $entityType($id).") if T(4);      # Get the parameters.
1149              # Loop through the attribute values one at a time.      my ($self, $key, $value) = @_;
1150              for my $dataTuple (@allData) {      # Get a copy of the key name and convert underscores to spaces.
1151                  # Get the key, value, and URL. We ignore the first element because that's the      my $keywordString = $key;
1152                  # object ID, and we already know the object ID.      $keywordString =~ s/_/ /g;
1153                  my (undef, $key, $value, $url) = @{$dataTuple};      # Add the value convert it all to lower case.
1154                  # Remove the buggy "1" for $url.      my $retVal = lc "$keywordString $value";
1155                  if ($url eq "1") {      # Return the result.
1156                      $url = undef;      return $retVal;
                 }  
                 # Only proceed if this is not an old key.  
                 if (! $myOldKeys->{$key}) {  
                     # See if we've run into this key before.  
                     if (! exists $keyHash{$key}) {  
                         # Here we need to create the attribute key in the database.  
                         StoreAttributeKey($entityType, $key, 'text',  
                                           "Key migrated automatically from the FIG system. " .  
                                           "Please replace these notes as soon as possible " .  
                                           "with useful text."  
                                          );  
                         # Compute the attribute's load file name and open it for output.  
                         my $fileName = "$FIG_Config::temp/$entityType.$key.load.tbl";  
                         my $fh = Open(undef, ">$fileName");  
                         # Store the file name and handle.  
                         $keyHash{$key} = {h => $fh, name => $fileName};  
                         # Count this key.  
                         $keyCount++;  
                     }  
                     # Smash the value and the URL together.  
                     if (defined($url) && length($url) > 0) {  
                         $value .= "::$url";  
                     }  
                     # Write the attribute value to the load file.  
                     Tracer::PutLine($keyHash{$key}->{h}, [$id, $value]);  
                     $valueCount++;  
                 }  
             }  
             # Now we've finished all the attributes for this object. Count and trace it.  
             $processedIDs++;  
             if ($processedIDs % 500 == 0) {  
                 Trace("$processedIDs of $totalIDs ${entityType}s processed.") if T(3);  
                 Trace("$entityType has $keyCount keys and $valueCount values so far.") if T(3);  
             }  
         }  
         # Now we've finished all the attributes for all objects of this type.  
         Trace("$processedIDs ${entityType}s processed, with $keyCount keys and $valueCount values.") if T(2);  
         # Loop through the files, loading the keys into the database.  
         Trace("Connecting to database.") if T(2);  
         my $objectCA = CustomAttributes->new();  
         Trace("Loading key files.") if T(2);  
         for my $key (sort keys %keyHash) {  
             # Close the key's load file.  
             close $keyHash{$key}->{h};  
             # Reopen it for input.  
             my $fileName = $keyHash{$key}->{name};  
             my $fh = Open(undef, "<$fileName");  
             Trace("Loading $key from $fileName.") if T(3);  
             my $stats = $objectCA->LoadAttributeKey($entityType, $key, $fh, 0, 1);  
             Trace("Statistics for $key of $entityType:\n" . $stats->Show()) if T(3);  
         }  
         # All the keys for this entity type are now loaded.  
         Trace("Key files loaded for $entityType.") if T(2);  
     }  
     # All keys for all entity types are now loaded.  
     Trace("Migration complete.") if T(2);  
1157  }  }
1158    
1159  =head3 ComputeObjectTypeFromID  =head3 _QueryResults
1160    
1161  C<< my ($entityName, $id) = CustomAttributes::ComputeObjectTypeFromID($objectID); >>  C<< my @attributeList = $attrDB->_QueryResults($query, @values); >>
1162    
1163  This method will compute the entity type corresponding to a specified object ID.  Match the results of a B<HasValueFor> query against value criteria and return
1164  If the object ID begins with C<fig|>, it is presumed to be a feature ID. If it  the results. This is an internal method that splits the values coming back
1165  is all digits with a single period, it is presumed to by a genome ID. Otherwise,  and matches the sections against the specified section patterns. It serves
1166  it must be a list reference. In this last case the first list element will be  as the back end to L</GetAttributes> and L</FindAttributes>.
 taken as the entity type and the second will be taken as the actual ID.  
1167    
1168  =over 4  =over 4
1169    
1170  =item objectID  =item query
1171    
1172    A query object that will return the desired B<HasValueFor> records.
1173    
1174  Object ID to examine.  =item values
1175    
1176    List of the desired attribute values, section by section. If C<undef>
1177    or an empty string is specified, all values in that section will match. A
1178    generic match can be requested by placing a percent sign (C<%>) at the end.
1179    In that case, all values that match up to and not including the percent sign
1180    will match. You may also specify a regular expression enclosed
1181    in slashes. All values that match the regular expression will be returned. For
1182    performance reasons, only values have this extra capability.
1183    
1184  =item RETURN  =item RETURN
1185    
1186  Returns a 2-element list consisting of the entity type followed by the specified ID.  Returns a list of tuples. The first element in the tuple is an object ID, the
1187    second is an attribute key, and the remaining elements are the sections of
1188    the attribute value. All of the tuples will match the criteria set forth in
1189    the parameter list.
1190    
1191  =back  =back
1192    
1193  =cut  =cut
1194    
1195  sub ComputeObjectTypeFromID {  sub _QueryResults {
1196      # Get the parameters.      # Get the parameters.
1197      my ($objectID) = @_;      my ($self, $query, @values) = @_;
1198      # Declare the return variables.      # Declare the return value.
1199      my ($entityName, $id);      my @retVal = ();
1200      # Only proceed if the object ID is defined. If it's not, we'll be returning a      # Get the number of value sections we have to match.
1201      # pair of undefs.      my $sectionCount = scalar(@values);
1202      if ($objectID) {      # Loop through the assignments found.
1203          if (ref $objectID eq 'ARRAY') {      while (my $row = $query->Fetch()) {
1204              # Here we have the new-style list reference. Pull out its pieces.          # Get the current row's data.
1205              ($entityName, $id) = @{$objectID};          my ($id, $key, $valueString) = $row->Values(['HasValueFor(to-link)', 'HasValueFor(from-link)',
1206          } else {                                                        'HasValueFor(value)']);
1207              # Here the ID is the outgoing ID, and we need to look at its structure          # Break the value into sections.
1208              # to determine the entity type.          my @sections = split($self->{splitter}, $valueString);
1209              $id = $objectID;          # Match each section against the incoming values. We'll assume we're
1210              if ($objectID =~ /^\d+\.\d+/) {          # okay unless we learn otherwise.
1211                  # Digits with a single period is a genome.          my $matching = 1;
1212                  $entityName = 'Genome';          for (my $i = 0; $i < $sectionCount && $matching; $i++) {
1213              } elsif ($objectID =~ /^fig\|/) {              # We need to check to see if this section is generic.
1214                  # The "fig|" prefix indicates a feature.              my $value = $values[$i];
1215                  $entityName = 'Feature';              Trace("Current value pattern is \"$value\".") if T(4);
1216                if (substr($value, -1, 1) eq '%') {
1217                    Trace("Generic match used.") if T(4);
1218                    # Here we have a generic match.
1219                    my $matchLen = length($values[$i] - 1);
1220                    $matching = substr($sections[$i], 0, $matchLen) eq
1221                                substr($values[$i], 0, $matchLen);
1222                } elsif ($value =~ m#^/(.+)/[a-z]*$#) {
1223                    Trace("Regular expression detected.") if T(4);
1224                    # Here we have a regular expression match.
1225                    my $section = $sections[$i];
1226                    $matching = eval("\$section =~ $value");
1227              } else {              } else {
1228                  # Anything else is illegal!                  # Here we have a strict match.
1229                  Confess("Invalid attribute ID specification \"$objectID\".");                  Trace("Strict match used.") if T(4);
1230                    $matching = ($sections[$i] eq $values[$i]);
1231              }              }
1232          }          }
1233            # If we match, output this row to the return list.
1234            if ($matching) {
1235                push @retVal, [$id, $key, @sections];
1236      }      }
1237      # Return the result.      }
1238      return ($entityName, $id);      # Return the rows found.
1239        return @retVal;
1240  }  }
1241    
1242  =head2 FIG Method Replacements  =head2 FIG Method Replacements
1243    
1244  The following methods are used by B<FIG.pm> to replace the previous attribute functionality.  The following methods are used by B<FIG.pm> to replace the previous attribute functionality.
1245  Some of the old functionality is no longer present. Controlled vocabulary is no longer  Some of the old functionality is no longer present: controlled vocabulary is no longer
1246  supported and there is no longer any searching by URL. Fortunately, neither of these  supported and there is no longer any searching by URL. Fortunately, neither of these
1247  capabilities were used in the old system.  capabilities were used in the old system.
1248    
# Line 988  Line 1256 
1256  value of the splitter parameter on the constructor (L</new>). The default is double  value of the splitter parameter on the constructor (L</new>). The default is double
1257  colons C<::>.  colons C<::>.
1258    
1259  So, for example, an old-style keyword with a /value of C<essential> and a URL of  So, for example, an old-style keyword with a value of C<essential> and a URL of
1260  C<http://www.sciencemag.org/cgi/content/abstract/293/5538/2266> using the default  C<http://www.sciencemag.org/cgi/content/abstract/293/5538/2266> using the default
1261  splitter value would be stored as  splitter value would be stored as
1262    
# Line 999  Line 1267 
1267    
1268  =head3 GetAttributes  =head3 GetAttributes
1269    
1270  C<< my @attributeList = $attrDB->GetAttributes($objectID, $key, @valuePatterns); >>  C<< my @attributeList = $attrDB->GetAttributes($objectID, $key, @values); >>
1271    
1272  In the database, attribute values are sectioned into pieces using a splitter  In the database, attribute values are sectioned into pieces using a splitter
1273  value specified in the constructor (L</new>). This is not a requirement of  value specified in the constructor (L</new>). This is not a requirement of
1274  the attribute system as a whole, merely a convenience for the purpose of  the attribute system as a whole, merely a convenience for the purpose of
1275  these methods. If you are using the static method calls instead of the  these methods. If a value has multiple sections, each section
1276  object-based calls, the splitter will always be the default value of  is matched against the corresponding criterion in the I<@valuePatterns> list.
 double colons (C<::>). If a value has multiple sections, each section  
 is matched against the correspond criterion in the I<@valuePatterns> list.  
1277    
1278  This method returns a series of tuples that match the specified criteria. Each tuple  This method returns a series of tuples that match the specified criteria. Each tuple
1279  will contain an object ID, a key, and one or more values. The parameters to this  will contain an object ID, a key, and one or more values. The parameters to this
1280  method therefore correspond structurally to the values expected in each tuple.  method therefore correspond structurally to the values expected in each tuple. In
1281    addition, you can ask for a generic search by suffixing a percent sign (C<%>) to any
1282    of the parameters. So, for example,
1283    
1284      my @attributeList = GetAttributes('fig|100226.1.peg.1004', 'structure%', 1, 2);      my @attributeList = $attrDB->GetAttributes('fig|100226.1.peg.1004', 'structure%', 1, 2);
1285    
1286  would return something like  would return something like
1287    
# Line 1022  Line 1290 
1290      ['fig}100226.1.peg.1004', 'structure2', 1, 2]      ['fig}100226.1.peg.1004', 'structure2', 1, 2]
1291      ['fig}100226.1.peg.1004', 'structureA', 1, 2]      ['fig}100226.1.peg.1004', 'structureA', 1, 2]
1292    
1293  Use of C<undef> in any position acts as a wild card (all values). In addition,  Use of C<undef> in any position acts as a wild card (all values). You can also specify
1294  the I<$key> and I<@valuePatterns> parameters can contain SQL pattern characters: C<%>, which  a list reference in the ID column. Thus,
1295  matches any sequence of characters, and C<_>, which matches any single character.  
1296  (You can use an escape sequence C<\%> or C<\_> to match an actual percent sign or      my @attributeList = $attrDB->GetAttributes(['100226.1', 'fig|100226.1.%'], 'PUBMED');
1297  underscore.)  
1298    would get the PUBMED attribute data for Streptomyces coelicolor A3(2) and all its
1299    features.
1300    
1301  In addition to values in multiple sections, a single attribute key can have multiple  In addition to values in multiple sections, a single attribute key can have multiple
1302  values, so even  values, so even
1303    
1304      my @attributeList = GetAttributes($peg, 'virulent');      my @attributeList = $attrDB->GetAttributes($peg, 'virulent');
1305    
1306  which has no wildcard in the key or the object ID, may return multiple tuples.  which has no wildcard in the key or the object ID, may return multiple tuples.
1307    
1308  For reasons of backward compatability, we examine the structure of the object ID to  Value matching in this system works very poorly, because of the way multiple values are
1309  determine the entity type. In that case the only two types allowed are C<Genome> and  stored. For the object ID and key name, we create queries that filter for the desired
1310  C<Feature>. An alternative method is to use a list reference, with the list consisting  results. For the values, we do a comparison after the attributes are retrieved from the
1311  of an entity type name and the actual ID. Thus, the above example could equivalently  database. As a result, queries in which filter only on value end up reading the entire
1312  be written as  attribute table to find the desired results.
   
     my @attributeList = GetAttributes([Feature => $peg], 'virulent');  
   
 The list-reference approach allows us to add attributes to other entity types in  
 the future. Doing so, however, will require modifying the L</Refresh> method and  
 updated the database design XML.  
   
 The list-reference approach also allows for a more fault-tolerant approach to  
 getting all objects with a particular attribute.  
   
     my @attributeList = GetAttributes([Feature => undef], 'virulent');  
   
 will only return feature attributes, while  
   
     my @attributeList = GetAttributes(undef, 'virulent');  
   
 could at some point in the future get you attributes for genomes or even subsystems  
 as well as features.  
1313    
1314  =over 4  =over 4
1315    
1316  =item objectID  =item objectID
1317    
1318  ID of the genome or feature whose attributes are desired. In general, an ID that  ID of object whose attributes are desired. If the attributes are desired for multiple
1319  starts with C<fig|> is treated as a feature ID, and an ID that is all digits with a  objects, this parameter can be specified as a list reference. If the attributes are
1320  single period is treated as a genome ID. For other entity types, use a list reference; in  desired for all objects, specify C<undef> or an empty string. Finally, you can specify
1321  this case the first list element is the entity type and the second is the ID. A value of  attributes for a range of object IDs by putting a percent sign (C<%>) at the end.
 C<undef> or an empty string here will match all objects.  
1322    
1323  =item key  =item key
1324    
1325  Attribute key name. Since attributes are stored as fields in the database with a  Attribute key name. A value of C<undef> or an empty string will match all
1326  field name equal to the key name, it is very fast to find a list of all the  attribute keys. If the values are desired for multiple keys, this parameter can be
1327  matching keys. Each key's values require a separate query, however, which may  specified as a list reference. Finally, you can specify attributes for a range of
1328  be a performance problem if the pattern matches a lot of keys. Wild cards are  keys by putting a percent sign (C<%>) at the end.
 acceptable here, and a value of C<undef> or an empty string will match all  
 attribute keys.  
1329    
1330  =item valuePatterns  =item values
1331    
1332  List of the desired attribute values, section by section. If C<undef>  List of the desired attribute values, section by section. If C<undef>
1333  or an empty string is specified, all values in that section will match.  or an empty string is specified, all values in that section will match. A
1334    generic match can be requested by placing a percent sign (C<%>) at the end.
1335    In that case, all values that match up to and not including the percent sign
1336    will match. You may also specify a regular expression enclosed
1337    in slashes. All values that match the regular expression will be returned. For
1338    performance reasons, only values have this extra capability.
1339    
1340  =item RETURN  =item RETURN
1341    
# Line 1096  Line 1350 
1350    
1351  sub GetAttributes {  sub GetAttributes {
1352      # Get the parameters.      # Get the parameters.
1353      my ($self, $objectID, $key, @valuePatterns) = @_;      my ($self, $objectID, $key, @values) = @_;
1354      # Declare the return variable.      # We will create one big honking query. The following hash will build the filter
1355      my @retVal = ();      # clause and a parameter list.
1356      # Determine the entity types for our search.      my %data = ('HasValueFor(from-link)' => $key, 'HasValueFor(to-link)' => $objectID);
1357      my @objects = ();      my @filter = ();
1358      my ($actualObjectID, $computedType);      my @parms = ();
1359      if (! $objectID) {      # This next loop goes through the different fields that can be specified in the
1360          push @objects, $self->GetEntityTypes();      # parameter list and generates filters for each.
1361        for my $field (keys %data) {
1362            # Accumulate filter information for this field. We will OR together all the
1363            # elements accumulated to create the final result.
1364            my @fieldFilter = ();
1365            # Get the specified data from the caller.
1366            my $fieldPattern = $data{$field};
1367            # Only proceed if the pattern is one that won't match everything.
1368            if (defined($fieldPattern) && $fieldPattern ne "" && $fieldPattern ne "%") {
1369                # Convert the pattern to an array.
1370                my @patterns = ();
1371                if (ref $fieldPattern eq 'ARRAY') {
1372                    push @patterns, @{$fieldPattern};
1373      } else {      } else {
1374          ($computedType, $actualObjectID) = ComputeObjectTypeFromID($objectID);                  push @patterns, $fieldPattern;
         push @objects, $computedType;  
1375      }      }
1376      # Loop through the entity types.              # Only proceed if the array is nonempty. The loop will work fine if the
1377      for my $entityType (@objects) {              # array is empty, but when we build the filter string at the end we'll
1378          # Now we need to find all the matching keys. The keys are actually stored in              # get "()" in the filter list, which will result in an SQL syntax error.
1379          # our database object, so this process is fast. Note that our              if (@patterns) {
1380          # MatchSqlPattern method                  # Loop through the individual patterns.
1381          my %secondaries = $self->GetSecondaryFields($entityType);                  for my $pattern (@patterns) {
1382          my @fieldList = grep { MatchSqlPattern($_, $key) } keys %secondaries;                      # Check for a generic request.
1383          # Now we figure out whether or not we need to filter by object. We will always                      if (substr($pattern, -1, 1) ne '%') {
1384          # filter by key to a limited extent, so if we're filtering by object we need an                          # Here we have a normal request.
1385          # AND to join the object ID filter with the key filter.                          push @fieldFilter, "$field = ?";
1386          my $filter = "";                          push @parms, $pattern;
1387          my @params = ();                      } else {
1388          if (defined($actualObjectID)) {                          # Here we have a generate request, so we will use the LIKE operator to
1389              # Here the caller wants to filter on object ID.                          # filter the field to this value pattern.
1390              $filter = "$entityType(id) = ? AND ";                          push @fieldFilter, "$field LIKE ?";
1391              push @params, $actualObjectID;                          # We must convert the pattern value to an SQL match pattern. First
1392          }                          # we get a copy of it.
1393          # It's time to begin making queries. We process one attribute key at a time, because                          my $actualPattern = $pattern;
1394          # each attribute is actually a different field in the database. We know here that                          # Now we escape the underscores. Underscores are an SQL wild card
1395          # all the keys we've collected are for the correct entity because we got them from                          # character, but they are used frequently in key names and object IDs.
1396          # the DBD. That's a good thing, because an invalid key name will cause an SQL error.                          $actualPattern =~ s/_/\\_/g;
1397          for my $key (@fieldList) {                          # Add the escaped pattern to the bound parameter list.
1398              # Get all of the attribute values for this key.                          push @parms, $actualPattern;
1399              my @dataRows = $self->GetAll([$entityType], "$filter$entityType($key) IS NOT NULL",                      }
1400                                           \@params, ["$entityType(id)", "$entityType($key)"]);                  }
1401              # Process each value separately. We need to verify the values and reformat the                  # Form the filter for this field.
1402              # tuples. Note that GetAll will give us one row per matching object ID,                  my $fieldFilterString = join(" OR ", @fieldFilter);
1403              # with the ID first followed by a list of the data values. This is very                  push @filter, "($fieldFilterString)";
             # different from the structure we'll be returning, which has one row  
             # per value.  
             for my $dataRow (@dataRows) {  
                 # Get the object ID and the list of values.  
                 my ($rowObjectID, @dataValues) = @{$dataRow};  
                 # Loop through the values. There will be one result row per attribute value.  
                 for my $dataValue (@dataValues) {  
                     # Separate this value into sections.  
                     my @sections = split("::", $dataValue);  
                     # Loop through the value patterns, looking for a mismatch. Note that  
                     # since we're working through parallel arrays, we are using an index  
                     # loop. As soon as a match fails we stop checking. This means that  
                     # if the value pattern list is longer than the number of sections,  
                     # we will fail as soon as we run out of sections.  
                     my $match = 1;  
                     for (my $i = 0; $i <= $#valuePatterns && $match; $i++) {  
                         $match = MatchSqlPattern($sections[$i], $valuePatterns[$i]);  
                     }  
                     # If we match, we save this value in the output list.  
                     if ($match) {  
                         push @retVal, [$rowObjectID, $key, @sections];  
                     }  
                 }  
                 # Here we've processed all the attribute values for the current object ID.  
1404              }              }
             # Here we've processed all the rows returned by GetAll. In general, there will  
             # be one row per object ID.  
1405          }          }
         # Here we've processed all the matching attribute keys.  
1406      }      }
1407      # Here we've processed all the entity types. That means @retVal has all the matching      # Now @filter contains one or more filter strings and @parms contains the parameter
1408      # results.      # values to bind to them.
1409        my $actualFilter = join(" AND ", @filter);
1410        # Now we're ready to make our query.
1411        my $query = $self->Get(['HasValueFor'], $actualFilter, \@parms);
1412        # Format the results.
1413        my @retVal = $self->_QueryResults($query, @values);
1414        # Return the rows found.
1415      return @retVal;      return @retVal;
1416  }  }
1417    
# Line 1182  Line 1426 
1426    
1427  =item objectID  =item objectID
1428    
1429  ID of the genome or feature to which the attribute is to be added. In general, an ID that  ID of the object to which the attribute is to be added.
 starts with C<fig|> is treated as a feature ID, and an ID that is all digits and periods  
 is treated as a genome ID. For IDs of other types, this parameter should be a reference  
 to a 2-tuple consisting of the entity type name followed by the object ID.  
1430    
1431  =item key  =item key
1432    
1433  Attribute key name. This corresponds to the name of a field in the database.  Attribute key name.
1434    
1435  =item values  =item values
1436    
# Line 1212  Line 1453 
1453      } elsif (! @values) {      } elsif (! @values) {
1454          Confess("No values specified in AddAttribute call for key $key.");          Confess("No values specified in AddAttribute call for key $key.");
1455      } else {      } else {
1456          # Okay, now we have some reason to believe we can do this. Start by          # Okay, now we have some reason to believe we can do this. Form the values
1457          # computing the object type and ID.          # into a scalar.
         my ($entityName, $id) = ComputeObjectTypeFromID($objectID);  
         # Form the values into a scalar.  
1458          my $valueString = join($self->{splitter}, @values);          my $valueString = join($self->{splitter}, @values);
1459          # Insert the value.          # Connect the object to the key.
1460          $self->InsertValue($id, "$entityName($key)", $valueString);          $self->InsertObject('HasValueFor', { 'from-link' => $key,
1461                                                 'to-link'   => $objectID,
1462                                                 'value'     => $valueString,
1463                                           });
1464      }      }
1465      # Return a one. We do this for backward compatability.      # Return a one, indicating success. We do this for backward compatability.
1466      return 1;      return 1;
1467  }  }
1468    
# Line 1230  Line 1472 
1472    
1473  Delete the specified attribute key/value combination from the database.  Delete the specified attribute key/value combination from the database.
1474    
 The first form will connect to the database and release it. The second form  
 uses the database connection contained in the object.  
   
1475  =over 4  =over 4
1476    
1477  =item objectID  =item objectID
1478    
1479  ID of the genome or feature to which the attribute is to be added. In general, an ID that  ID of the object whose attribute is to be deleted.
 starts with C<fig|> is treated as a feature ID, and an ID that is all digits and periods  
 is treated as a genome ID. For IDs of other types, this parameter should be a reference  
 to a 2-tuple consisting of the entity type name followed by the object ID.  
1480    
1481  =item key  =item key
1482    
1483  Attribute key name. This corresponds to the name of a field in the database.  Attribute key name.
1484    
1485  =item values  =item values
1486    
1487  One or more values to be associated with the key.  One or more values associated with the key. If no values are specified, then all values
1488    will be deleted. Otherwise, only a matching value will be deleted.
1489    
1490  =back  =back
1491    
# Line 1262  Line 1499 
1499          Confess("No object ID specified for DeleteAttribute call.");          Confess("No object ID specified for DeleteAttribute call.");
1500      } elsif (! defined($key)) {      } elsif (! defined($key)) {
1501          Confess("No attribute key specified for DeleteAttribute call.");          Confess("No attribute key specified for DeleteAttribute call.");
1502      } elsif (! @values) {      } elsif (scalar(@values) == 0) {
1503          Confess("No values specified in DeleteAttribute call for key $key.");          # Here we erase the entire key for this object.
1504            $self->DeleteRow('HasValueFor', $key, $objectID);
1505      } else {      } else {
1506          # Now compute the object type and ID.          # Here we erase the matching values.
         my ($entityName, $id) = ComputeObjectTypeFromID($objectID);  
         # Form the values into a scalar.  
1507          my $valueString = join($self->{splitter}, @values);          my $valueString = join($self->{splitter}, @values);
1508          # Delete the value.          $self->DeleteRow('HasValueFor', $key, $objectID, { value => $valueString });
         $self->DeleteValue($entityName, $id, $key, $valueString);  
1509      }      }
1510      # Return a one. This is for backward compatability.      # Return a one. This is for backward compatability.
1511      return 1;      return 1;
1512  }  }
1513    
1514    =head3 DeleteMatchingAttributes
1515    
1516    C<< my @deleted = $attrDB->DeleteMatchingAttributes($objectID, $key, @values); >>
1517    
1518    Delete all attributes that match the specified criteria. This is equivalent to
1519    calling L</GetAttributes> and then invoking L</DeleteAttribute> for each
1520    row found.
1521    
1522    =over 4
1523    
1524    =item objectID
1525    
1526    ID of object whose attributes are to be deleted. If the attributes for multiple
1527    objects are to be deleted, this parameter can be specified as a list reference. If
1528    attributes are to be deleted for all objects, specify C<undef> or an empty string.
1529    Finally, you can delete attributes for a range of object IDs by putting a percent
1530    sign (C<%>) at the end.
1531    
1532    =item key
1533    
1534    Attribute key name. A value of C<undef> or an empty string will match all
1535    attribute keys. If the values are to be deletedfor multiple keys, this parameter can be
1536    specified as a list reference. Finally, you can delete attributes for a range of
1537    keys by putting a percent sign (C<%>) at the end.
1538    
1539    =item values
1540    
1541    List of the desired attribute values, section by section. If C<undef>
1542    or an empty string is specified, all values in that section will match. A
1543    generic match can be requested by placing a percent sign (C<%>) at the end.
1544    In that case, all values that match up to and not including the percent sign
1545    will match. You may also specify a regular expression enclosed
1546    in slashes. All values that match the regular expression will be deleted. For
1547    performance reasons, only values have this extra capability.
1548    
1549    =item RETURN
1550    
1551    Returns a list of tuples for the attributes that were deleted, in the
1552    same form as L</GetAttributes>.
1553    
1554    =back
1555    
1556    =cut
1557    
1558    sub DeleteMatchingAttributes {
1559        # Get the parameters.
1560        my ($self, $objectID, $key, @values) = @_;
1561        # Get the matching attributes.
1562        my @retVal = $self->GetAttributes($objectID, $key, @values);
1563        # Loop through the attributes, deleting them.
1564        for my $tuple (@retVal) {
1565            $self->DeleteAttribute(@{$tuple});
1566        }
1567        # Log this operation.
1568        my $count = @retVal;
1569        $self->LogOperation("Mass Delete", $key, "$count matching attributes deleted.");
1570        # Return the deleted attributes.
1571        return @retVal;
1572    }
1573    
1574  =head3 ChangeAttribute  =head3 ChangeAttribute
1575    
1576  C<< $attrDB->ChangeAttribute($objectID, $key, \@oldValues, \@newValues); >>  C<< $attrDB->ChangeAttribute($objectID, $key, \@oldValues, \@newValues); >>
# Line 1320  Line 1615 
1615      } elsif (! defined($newValues) || ref $newValues ne 'ARRAY') {      } elsif (! defined($newValues) || ref $newValues ne 'ARRAY') {
1616          Confess("No new values specified in ChangeAttribute call for key $key.");          Confess("No new values specified in ChangeAttribute call for key $key.");
1617      } else {      } else {
1618          # Okay, now we do the change as a delete/add.          # We do the change as a delete/add.
1619          $self->DeleteAttribute($objectID, $key, @{$oldValues});          $self->DeleteAttribute($objectID, $key, @{$oldValues});
1620          $self->AddAttribute($objectID, $key, @{$newValues});          $self->AddAttribute($objectID, $key, @{$newValues});
1621      }      }
# Line 1330  Line 1625 
1625    
1626  =head3 EraseAttribute  =head3 EraseAttribute
1627    
1628  C<< $attrDB->EraseAttribute($entityName, $key); >>  C<< $attrDB->EraseAttribute($key); >>
1629    
1630  Erase all values for the specified attribute key. This does not remove the  Erase all values for the specified attribute key. This does not remove the
1631  key from the database; it merely removes all the values.  key from the database; it merely removes all the values.
1632    
1633  =over 4  =over 4
1634    
 =item entityName  
   
 Name of the entity to which the key belongs. If undefined, all entities will be  
 examined for the desired key.  
   
1635  =item key  =item key
1636    
1637  Key to erase.  Key to erase.
# Line 1352  Line 1642 
1642    
1643  sub EraseAttribute {  sub EraseAttribute {
1644      # Get the parameters.      # Get the parameters.
1645      my ($self, $entityName, $key) = @_;      my ($self, $key) = @_;
1646      # Determine the relevant entity types.      # Delete everything connected to the key.
1647      my @objects = ();      $self->Disconnect('HasValueFor', 'AttributeKey', $key);
1648      if (! $entityName) {      # Log the operation.
1649          push @objects, $self->GetEntityTypes();      $self->LogOperation("Erase Data", $key);
1650        # Return a 1, for backward compatability.
1651        return 1;
1652    }
1653    
1654    =head3 GetAttributeKeys
1655    
1656    C<< my @keyList = $attrDB->GetAttributeKeys($groupName); >>
1657    
1658    Return a list of the attribute keys for a particular group.
1659    
1660    =over 4
1661    
1662    =item groupName
1663    
1664    Name of the group whose keys are desired.
1665    
1666    =item RETURN
1667    
1668    Returns a list of the attribute keys for the specified group.
1669    
1670    =back
1671    
1672    =cut
1673    
1674    sub GetAttributeKeys {
1675        # Get the parameters.
1676        my ($self, $groupName) = @_;
1677        # Get the attributes for the specified group.
1678        my @groups = $self->GetFlat(['IsInGroup'], "IsInGroup(to-link) = ?", [$groupName],
1679                                    'IsInGroup(from-link)');
1680        # Return the keys.
1681        return sort @groups;
1682    }
1683    
1684    =head3 ParseID
1685    
1686    C<< my ($type, $id) = CustomAttributes::ParseID($idValue); >>
1687    
1688    Determine the type and object ID corresponding to an ID value from the attribute database.
1689    Most ID values consist of a type name and an ID, separated by a colon (e.g. C<Family:aclame|cluster10>);
1690    however, Genomes, Features, and Subsystems are not stored with a type name, so we need to
1691    deduce the type from the ID value structure.
1692    
1693    The theory here is that you can plug the ID and type directly into a Sprout database method, as
1694    follows
1695    
1696        my ($type, $id) = CustomAttributes::ParseID($attrList[$num]->[0]);
1697        my $target = $sprout->GetEntity($type, $id);
1698    
1699    =over 4
1700    
1701    =item idValue
1702    
1703    ID value taken from the attribute database.
1704    
1705    =item RETURN
1706    
1707    Returns a two-element list. The first element is the type of object indicated by the ID value,
1708    and the second element is the actual object ID.
1709    
1710    =back
1711    
1712    =cut
1713    
1714    sub ParseID {
1715        # Get the parameters.
1716        my ($idValue) = @_;
1717        # Declare the return variables.
1718        my ($type, $id);
1719        # Parse the incoming ID. We first check for the presence of an entity name. Entity names
1720        # can only contain letters, which helps to insure typed object IDs don't collide with
1721        # subsystem names (which are untyped).
1722        if ($idValue =~ /^([A-Za-z]+):(.+)/) {
1723            # Here we have a typed ID.
1724            ($type, $id) = ($1, $2);
1725        } elsif ($idValue =~ /fig\|/) {
1726            # Here we have a feature ID.
1727            ($type, $id) = (Feature => $idValue);
1728        } elsif ($idValue =~ /\d+\.\d+/) {
1729            # Here we have a genome ID.
1730            ($type, $id) = (Genome => $idValue);
1731      } else {      } else {
1732          push @objects, $entityName;          # The default is a subsystem ID.
1733            ($type, $id) = (Subsystem => $idValue);
1734      }      }
1735      # Loop through the entity types.      # Return the results.
1736      for my $entityType (@objects) {      return ($type, $id);
         # Now check for this key in this entity.  
         my %secondaries = $self->GetSecondaryFields($entityType);  
         if (exists $secondaries{$key}) {  
             # We found it, so delete all the values of the key.  
             $self->DeleteValue($entityName, undef, $key);  
1737          }          }
1738    
1739    =head3 FormID
1740    
1741    C<< my $idValue = CustomAttributes::FormID($type, $id); >>
1742    
1743    Convert an object type and ID pair into an object ID string for the attribute system. Subsystems,
1744    genomes, and features are stored in the database without type information, but all other object IDs
1745    must be prefixed with the object type.
1746    
1747    =over 4
1748    
1749    =item type
1750    
1751    Relevant object type.
1752    
1753    =item id
1754    
1755    ID of the object in question.
1756    
1757    =item RETURN
1758    
1759    Returns a string that will be recognized as an object ID in the attribute database.
1760    
1761    =back
1762    
1763    =cut
1764    
1765    sub FormID {
1766        # Get the parameters.
1767        my ($type, $id) = @_;
1768        # Declare the return variable.
1769        my $retVal;
1770        # Compute the ID string from the type.
1771        if (grep { $type eq $_ } qw(Feature Genome Subsystem)) {
1772            $retVal = $id;
1773        } else {
1774            $retVal = "$type:$id";
1775      }      }
1776      # Return a 1, for backward compatability.      # Return the result.
1777      return 1;      return $retVal;
1778    }
1779    
1780    =head3 GetTargetObject
1781    
1782    C<< my $object = CustomAttributes::GetTargetObject($erdb, $idValue); >>
1783    
1784    Return the database object corresponding to the specified attribute object ID. The
1785    object type associated with the ID value must correspond to an entity name in the
1786    specified database.
1787    
1788    =over 4
1789    
1790    =item erdb
1791    
1792    B<ERDB> object for accessing the target database.
1793    
1794    =item idValue
1795    
1796    ID value retrieved from the attribute database.
1797    
1798    =item RETURN
1799    
1800    Returns a B<DBObject> for the attribute value's target object.
1801    
1802    =back
1803    
1804    =cut
1805    
1806    sub GetTargetObject {
1807        # Get the parameters.
1808        my ($erdb, $idValue) = @_;
1809        # Declare the return variable.
1810        my $retVal;
1811        # Get the type and ID for the target object.
1812        my ($type, $id) = ParseID($idValue);
1813        # Plug them into the GetEntity method.
1814        $retVal = $erdb->GetEntity($type, $id);
1815        # Return the resulting object.
1816        return $retVal;
1817  }  }
1818    
1819  1;  1;

Legend:
Removed from v.1.7  
changed lines
  Added in v.1.19

MCS Webmaster
ViewVC Help
Powered by ViewVC 1.0.3