[Bio] / Sprout / CustomAttributes.pm Repository:
ViewVC logotype

View of /Sprout/CustomAttributes.pm

Parent Directory Parent Directory | Revision Log Revision Log


Revision 1.27 - (download) (as text) (annotate)
Sun Sep 30 03:46:30 2007 UTC (12 years, 1 month ago) by parrello
Branch: MAIN
Changes since 1.26: +13 -0 lines
Added timings to the statistics output.

#!/usr/bin/perl -w

package CustomAttributes;

    require Exporter;
    use ERDB;
    @ISA = qw(ERDB);
    use strict;
    use Tracer;
    use ERDBLoad;
    use Stats;
    use Time::HiRes;

=head1 Custom SEED Attribute Manager

=head2 Introduction

The Custom SEED Attributes Manager allows the user to upload and retrieve
custom data for SEED objects. It uses the B<ERDB> database system to
store the attributes.

Attributes are organized by I<attribute key>. Attribute values are
assigned to I<objects>. In the real world, objects have types and IDs;
however, to the attribute database only the ID matters. This will create
a problem if we have a single ID that applies to two objects of different
types, but it is more consistent with the original attribute implementation
in the SEED (which this implementation replaces).

The actual attribute values are stored as a relationship between the attribute
keys and the objects. There can be multiple values for a single key/object pair.

=head3 Object IDs

The object ID is normally represented as

    I<type>:I<id>

where I<type> is the object type (C<Role>, C<Coupling>, etc.) and I<id> is
the actual object ID. Note that the object type must consist of only upper- and
lower-case letters! Thus, C<GenomeGroup> is a valid object type, but
C<genome_group> is not. Given that restriction, the object ID

    Family:aclame|cluster10

would represent the FIG family C<aclame|cluster10>. For historical reasons,
there are three exceptions: subsystems, genomes, and features do not need
a type. So, for PEG 3361 of Streptomyces coelicolor A3(2), you simply code

    fig|100226.1.peg.3361

The methods L</ParseID> and L</FormID> can be used to make this all seem
more consistent. Given any object ID string, L</ParseID> will convert it to an
object type and ID, and given any object type and ID, L</FormID> will
convert it to an object ID string. The attribute database is pretty
freewheeling about what it will allow for an ID; however, for best
results, the type should match an entity type from a Sprout genetics
database. If this rule is followed, then the database object
corresponding to an ID in the attribute database could be retrieved using
L</GetTargetObject> method.

    my $object = CustomAttributes::GetTargetObject($sprout, $idValue);

=head3 Retrieval and Logging

The full suite of ERDB retrieval capabilities is provided. In addition,
custom methods are provided specific to this application. To get all
the values of the attribute C<essential> in a specified B<Feature>, you
would code

    my @values = $attrDB->GetAttributes($fid, 'essential');

where I<$fid> contains the ID of the desired feature.

Keys can be split into two pieces using the splitter value defined in the
constructor (the default is C<::>). The first piece of the key is called
the I<real key>. This portion of the key must be defined using the
web interface (C<Attributes.cgi>). The second portion of the key is called
the I<sub key>, and can take any value.

Major attribute activity is recorded in a log (C<attributes.log>) in the
C<$FIG_Config::var> directory. The log reports the user name, time, and
the details of the operation. The user name will almost always be unknown,
the exception being when it is specified in this object's constructor
(see L</new>).

=head2 FIG_Config Parameters

The following configuration parameters are used to manage custom attributes.

=over 4

=item attrDbms

Type of database manager used: C<mysql> for MySQL or C<pg> for PostGres.

=item attrDbName

Name of the attribute database.

=item attrHost

Name of the host server for the database. If omitted, the current host
is used.

=item attrUser

User name for logging in to the database.

=item attrPass

Password for logging in to the database.

=item attrPort

TCP/IP port for accessing the database.

=item attrSock

Socket name used to access the database. If omitted, the default socket
will be used.

=item attrDBD

Fully-qualified file name for the database definition XML file. This file
functions as data to the attribute management process, so if the data is
moved, this file must go with it.

=back

=head2 Public Methods

=head3 new

C<< my $attrDB = CustomAttributes->new(%options); >>

Construct a new CustomAttributes object. The following options are
supported.

=over 4

=item splitter

Value to be used to split attribute values into sections in the
L</Fig Replacement Methods>. The default is a double colon C<::>,
and should only be overridden in extreme circumstances.

=item user

Name of the current user. This will appear in the attribute log.

=back

=cut

sub new {
    # Get the parameters.
    my ($class, %options) = @_;
    # Connect to the database.
    my $dbh = DBKernel->new($FIG_Config::attrDbms, $FIG_Config::attrDbName,
                            $FIG_Config::attrUser, $FIG_Config::attrPass,
                            $FIG_Config::attrPort, $FIG_Config::attrHost,
                            $FIG_Config::attrSock);
    # Create the ERDB object.
    my $xmlFileName = $FIG_Config::attrDBD;
    my $retVal = ERDB::new($class, $dbh, $xmlFileName);
    # Store the splitter value.
    $retVal->{splitter} = $options{splitter} || '::';
    # Store the user name.
    $retVal->{user} = $options{user} || '<unknown>';
    Trace("User $retVal->{user} selected for attribute object.") if T(3);
    # Return the result.
    return $retVal;
}

=head3 StoreAttributeKey

C<< $attrDB->StoreAttributeKey($attributeName, $type, $notes, \@groups); >>

Create or update an attribute for the database.

=over 4

=item attributeName

Name of the attribute (the real key). If it does not exist already, it will be created.

=item type

Data type of the attribute. This must be a valid ERDB data type name.

=item notes

Descriptive notes about the attribute. It is presumed to be raw text, not HTML.

=item groups

Reference to a list of the groups to which the attribute should be associated.
This will replace any groups to which the attribute is currently attached.

=back

=cut

sub StoreAttributeKey {
    # Get the parameters.
    my ($self, $attributeName, $type, $notes, $groups) = @_;
    # Declare the return variable.
    my $retVal;
    # Get the data type hash.
    my %types = ERDB::GetDataTypes();
    # Validate the initial input values.
    if ($attributeName =~ /$self->{splitter}/) {
        Confess("Invalid attribute name \"$attributeName\" specified.");
    } elsif (! $notes || length($notes) < 25) {
        Confess("Missing or incomplete description for $attributeName.");
    } elsif (! exists $types{$type}) {
        Confess("Invalid data type \"$type\" for $attributeName.");
    } else {
        # Create a variable to hold the action to be displayed for the log (Add or Update).
        my $action;
        # Okay, we're ready to begin. See if this key exists.
        my $attribute = $self->GetEntity('AttributeKey', $attributeName);
        if (defined($attribute)) {
            # It does, so we do an update.
            $action = "Update Key";
            $self->UpdateEntity('AttributeKey', $attributeName,
                                { description => $notes, 'data-type' => $type });
            # Detach the key from its current groups.
            $self->Disconnect('IsInGroup', 'AttributeKey', $attributeName);
        } else {
            # It doesn't, so we do an insert.
            $action = "Insert Key";
            $self->InsertObject('AttributeKey', { id => $attributeName,
                                description => $notes, 'data-type' => $type });
        }
        # Attach the key to the specified groups. (We presume the groups already
        # exist.)
        for my $group (@{$groups}) {
            $self->InsertObject('IsInGroup', { 'from-link' => $attributeName,
                                               'to-link'   => $group });
        }
        # Log the operation.
        $self->LogOperation($action, $attributeName, "Group list is " . join(" ", @{$groups}));
    }
}


=head3 DeleteAttributeKey

C<< my $stats = $attrDB->DeleteAttributeKey($attributeName); >>

Delete an attribute from the custom attributes database.

=over 4

=item attributeName

Name of the attribute to delete.

=item RETURN

Returns a statistics object describing the effects of the deletion.

=back

=cut

sub DeleteAttributeKey {
    # Get the parameters.
    my ($self, $attributeName) = @_;
    # Delete the attribute key.
    my $retVal = $self->Delete('AttributeKey', $attributeName);
    # Log this operation.
    $self->LogOperation("Delete Key", $attributeName, "Key will no longer be available for use by anyone.");
    # Return the result.
    return $retVal;
    
}

=head3 NewName

C<< my $text = CustomAttributes::NewName(); >>

Return the string used to indicate the user wants to add a new attribute.

=cut

sub NewName {
    return "(new)";
}

=head3 ControlForm

C<< my $formHtml = $attrDB->ControlForm($cgi, $name, \%keys); >>

Return a form that can be used to control the creation and modification of
attributes. Only a subset of the attribute keys will be displayed, as
determined by the incoming list.

=over 4

=item cgi

CGI query object used to create HTML.

=item name

Name to give to the form. This should be unique for the web page.

=item keys

Reference to a hash mapping attribute keys to n-tuples. Each tuple will contain the
attribute's data type, its description, and a list of the groups in which it participates.

=item RETURN

Returns the HTML for a form that can be used to  submit instructions to the C<Attributes.cgi> script
for loading, creating, displaying, changing, or deleting an attribute. Note that only the form
controls are generated. The form tags are left to the caller.

=back

=cut

sub ControlForm {
    # Get the parameters.
    my ($self, $cgi, $name, $keys) = @_;
    # Declare the return list.
    my @retVal = ();
    # We'll put the controls in a table. Nothing else ever seems to look nice.
    push @retVal, $cgi->start_table({ border => 2, cellpadding => 2 });
    # The first row is for selecting the field name.
    push @retVal, $cgi->Tr($cgi->th("Select a Field"),
                           $cgi->td($self->FieldMenu($cgi, 10, 'fieldName', $keys,
                                                     new => 1,
                                                     notes => "document.$name.notes.value",
                                                     type => "document.$name.dataType.value",
                                                     groups => "document.$name.groups")));
    # Now we set up a dropdown for the data types. The values will be the
    # data type names, and the labels will be the descriptions.
    my %types = ERDB::GetDataTypes();
    my %labelMap = map { $_ => $types{$_}->{notes} } keys %types;
    my $typeMenu = $cgi->popup_menu(-name   => 'dataType',
                                    -values => [sort keys %types],
                                    -labels => \%labelMap,
                                    -default => 'string');
    # Allow the user to specify a new field name. This is required if the
    # user has selected the "(new)" marker.
    my $fieldField = "document.$name.fieldName";
    my $newName = "\"" . NewName() . "\"";
    push @retVal, $cgi->Tr($cgi->th("New Field Name"),
                           $cgi->td($cgi->textfield(-name => 'newName',
                                                    -size => 30,
                                                    -value => "",
                                                    -onFocus => "setIfEmpty($fieldField, $newName);")),
                                    );
    push @retVal, $cgi->Tr($cgi->th("Data type"),
                           $cgi->td($typeMenu));
    # The next row is for the notes.
    push @retVal, $cgi->Tr($cgi->th("Description"),
                           $cgi->td($cgi->textarea(-name => 'notes',
                                                   -rows => 6,
                                                   -columns => 80))
                          );
    # Now we have the groups, which are implemented as a checkbox group.
    my @groups = $self->GetGroups();
    push @retVal, $cgi->Tr($cgi->th("Groups"),
                           $cgi->td($cgi->checkbox_group(-name=>'groups',
                                    -values=> \@groups))
                          );
    # Now the four buttons: STORE, SHOW, ERASE, and DELETE.
    push @retVal, $cgi->Tr($cgi->th("&nbsp;"),
                           $cgi->td({align => 'center'}, join(" ",
                                    $cgi->submit(-name => 'Delete', -value => 'DELETE'),
                                    $cgi->submit(-name => 'Store',  -value => 'STORE'),
                                    $cgi->submit(-name => 'Erase',  -value => 'ERASE'),
                                    $cgi->submit(-name => 'Show',   -value => 'SHOW')
                                   ))
                          );
    # Close the table and the form.
    push @retVal, $cgi->end_table();
    # Return the assembled HTML.
    return join("\n", @retVal, "");
}

=head3 LoadAttributesFrom

C<< my $stats = $attrDB->LoadAttributesFrom($fileName, %options); >>
s
Load attributes from the specified tab-delimited file. Each line of the file must
contain an object ID in the first column, an attribute key name in the second
column, and attribute values in the remaining columns. The attribute values will
be assembled into a single value using the splitter code. In addition, the key names may
contain a splitter. If this is the case, the portion of the key after the splitter is
treated as a subkey.

=over 4

=item fileName

Name of the file from which to load the attributes, or an open handle for the file.
(This last enables the method to be used in conjunction with the CGI form upload
control.)

=item options

Hash of options for modifying the load process.

=item RETURN

Returns a statistics object describing the load.

=back

Permissible option values are as follows.

=over 4

=item append

If TRUE, then the attributes will be appended to existing data; otherwise, the
first time a key name is encountered, it will be erased.

=item archive

If specified, the name of a file into which the incoming data file should be saved.

=item objectType

If specified, the specified object type will be prefixed to each object ID.

=back

=cut

sub LoadAttributesFrom {
    # Get the parameters.
    my ($self, $fileName, %options) = @_;
    # Declare the return variable.
    my $retVal = Stats->new('keys', 'values');
    # Initialize the timers.
    my ($insertTime, $eraseTime, $archiveTime) = (0, 0, 0);
    # Check for append mode.
    my $append = ($options{append} ? 1 : 0);
    # Create a hash of key names found.
    my %keyHash = ();
    # Open the file for input. Note we must anticipate the possibility of an
    # open filehandle being passed in.
    my $fh;
    if (ref $fileName) {
        Trace("Using file opened by caller.") if T(3);
        $fh = $fileName;
    } else {
        Trace("Attributes will be loaded from $fileName.") if T(3);
        $fh = Open(undef, "<$fileName");
    }
    # Now check to see if we need to archive.
    my $ah;
    if ($options{archive}) {
        $ah = Open(undef, ">$options{archive}");
        Trace("Load file will be archived to $options{archive}.") if T(3);
    }
    # Finally, open a database transaction.
    $self->BeginTran();
    # Insure we recover from errors. If an error occurs, we will delete the archive file and
    # roll back the updates.
    eval {
        # Loop through the file.
        while (! eof $fh) {
            # Read the current line.
            my ($id, $key, @values) = Tracer::GetLine($fh);
            $retVal->Add(linesIn => 1);
            # Check to see if we need to fix up the object ID.
            if ($options{objectType}) {
                $id = "$options{objectType}:$id";
            }
            # Archive the line (if necessary).
            if (defined $ah) {
                my $startTime = time();
                Tracer::PutLine($ah, [$id, $key, @values]);
                $archiveTime += time() - $startTime;
            }
            # Do some validation.
            if (! $id) {
                # We ignore blank lines.
                $retVal->Add(blankLines => 1);
            } elsif (substr($id, 0, 1) eq '#') {
                # A line beginning with a pound sign is a comment.
                $retVal->Add(comments => 1);
            } elsif (! defined($key)) {
                # An ID without a key is a serious error.
                my $lines = $retVal->Ask('linesIn');
                Confess("Line $lines in $fileName has no attribute key.");
            } elsif (! @values) {
                # A line with no values is not allowed.
                my $lines = $retVal->Ask('linesIn');
                Trace("Line $lines for key $key has no attribute values.") if T(1);
                $retVal->Add(skipped => 1);
            } else {
                # The key contains a real part and an optional sub-part. We need the real part.
                my ($realKey, $subKey) = $self->SplitKey($key);
                # Now we need to check for a new key.
                if (! exists $keyHash{$realKey}) {
                    if (! $self->Exists('AttributeKey', $realKey)) {
                        my $line = $retVal->Ask('linesIn');
                        Confess("Attribute \"$realKey\" on line $line of $fileName not found in database.");
                    } else {
                        # Make sure we know this is no longer a new key.
                        $keyHash{$realKey} = 1;
                        $retVal->Add(keys => 1);
                        # If this is NOT append mode, erase the key.
                        if (! $append) {
                            my $startTime = time();
                            $self->EraseAttribute($realKey);
                            $eraseTime += time() - $startTime;
                            Trace("Attribute $realKey erased.") if T(3);
                        }
                    }
                    Trace("Key $realKey found.") if T(3);
                }
                # Everything is all set up, so add the value.
                my $startTime = time();
                $self->AddAttribute($id, $key, @values);
                $insertTime += time() - $startTime;
                my $progress = $retVal->Add(values => 1);
                Trace("$progress values loaded.") if T(3) && ($progress % 1000 == 0);
            }
        }
        $retVal->Add(eraseTime  =>  $eraseTime);
        $retVal->Add(insertTime =>  $insertTime);
        $retVal->Add(archiveTime => $archiveTime);
    };
    # Check for an error.
    if ($@) {
        # Here we have an error. Roll back the transaction and delete the archive file.
        my $message = $@;
        Trace("Rolling back attribute updates due to error.") if T(1);
        $self->RollbackTran();
        if (defined $ah) {
            Trace("Deleting archive file $options{archive}.") if T(1);
            close $ah;
            unlink $options{archive};
        }
        Confess("Error during attribute load: $message");
    } else {
        # Here the load worked. Commit the transaction and close the archive file.
        Trace("Committing attribute upload.") if T(2);
        $self->CommitTran();
        if (defined $ah) {
            Trace("Closing archive file $options{archive}.") if T(2);
            close $ah;
        }
    }
    # Return the result.
    return $retVal;
}

=head3 BackupKeys

C<< my $stats = $attrDB->BackupKeys($fileName, %options); >>

Backup the attribute key information from the attribute database.

=over 4

=item fileName

Name of the output file.

=item options

Options for modifying the backup process.

=item RETURN

Returns a statistics object for the backup.

=back

Currently there are no options. The backup is straight to a text file in
tab-delimited format. Each key is backup up to two lines. The first line
is all of the data from the B<AttributeKey> table. The second is a
tab-delimited list of all the groups.

=cut

sub BackupKeys {
    # Get the parameters.
    my ($self, $fileName, %options) = @_;
    # Declare the return variable.
    my $retVal = Stats->new();
    # Open the output file.
    my $fh = Open(undef, ">$fileName");
    # Set up to read the keys.
    my $keyQuery = $self->Get(['AttributeKey'], "", []);
    # Loop through the keys.
    while (my $keyData = $keyQuery->Fetch()) {
        $retVal->Add(key => 1);
        # Get the fields.
        my ($id, $type, $description) = $keyData->Values(['AttributeKey(id)', 'AttributeKey(data-type)',
                                                          'AttributeKey(description)']);
        # Escape any tabs or new-lines in the description.
        my $escapedDescription = Tracer::Escape($description);
        # Write the key data to the output.
        Tracer::PutLine($fh, [$id, $type, $escapedDescription]);
        # Get the key's groups.
        my @groups = $self->GetFlat(['IsInGroup'], "IsInGroup(from-link) = ?", [$id],
                                    'IsInGroup(to-link)');
        $retVal->Add(memberships => scalar(@groups));
        # Write them to the output. Note we put a marker at the beginning to insure the line
        # is nonempty.
        Tracer::PutLine($fh, ['#GROUPS', @groups]);
    }
    # Log the operation.
    $self->LogOperation("Backup Keys", $fileName, $retVal->Display());
    # Return the result.
    return $retVal;
}

=head3 RestoreKeys

C<< my $stats = $attrDB->RestoreKeys($fileName, %options); >>

Restore the attribute keys and groups from a backup file.

=over 4

=item fileName

Name of the file containing the backed-up keys. Each key has a pair of lines,
one containing the key data and one listing its groups.

=back

=cut

sub RestoreKeys {
    # Get the parameters.
    my ($self, $fileName, %options) = @_;
    # Declare the return variable.
    my $retVal = Stats->new();
    # Set up a hash to hold the group IDs.
    my %groups = ();
    # Open the file.
    my $fh = Open(undef, "<$fileName");
    # Loop until we're done.
    while (! eof $fh) {
        # Get a key record.
        my ($id, $dataType, $description) = Tracer::GetLine($fh);
        if ($id eq '#GROUPS') {
            Confess("Group record found when key record expected.");
        } elsif (! defined($description)) {
            Confess("Invalid format found for key record.");
        } else {
            $retVal->Add("keyIn" => 1);
            # Add this key to the database.
            $self->InsertObject('AttributeKey', { id => $id, 'data-type' => $dataType,
                                                  description => Tracer::UnEscape($description) });
            Trace("Attribute $id stored.") if T(3);
            # Get the group line.
            my ($marker, @groups) = Tracer::GetLine($fh);
            if (! defined($marker)) {
                Confess("End of file found where group record expected.");
            } elsif ($marker ne '#GROUPS') {
                Confess("Group record not found after key record.");
            } else {
                $retVal->Add(memberships => scalar(@groups));
                # Connect the groups.
                for my $group (@groups) {
                    # Find out if this is a new group.
                    if (! $groups{$group}) {
                        $retVal->Add(newGroup => 1);
                        # Add the group.
                        $self->InsertObject('AttributeGroup', { id => $group });
                        Trace("Group $group created.") if T(3);
                        # Make sure we know it's not new.
                        $groups{$group} = 1;
                    }
                    # Connect the group to our key.
                    $self->InsertObject('IsInGroup', { 'from-link' => $id, 'to-link' => $group });
                }
                Trace("$id added to " . scalar(@groups) . " groups.") if T(3);
            }
        }
    }
    # Log the operation.
    $self->LogOperation("Backup Keys", $fileName, $retVal->Display());
    # Return the result.
    return $retVal;
}

=head3 ArchiveFileName

C<< my $fileName = $ca->ArchiveFileName(); >>

Compute a file name for archiving attribute input data. The file will be in the attribute log directory

=cut

sub ArchiveFileName {
    # Get the parameters.
    my ($self) = @_;
    # Declare the return variable.
    my $retVal;
    # We start by turning the timestamp into something usable as a file name.
    my $now = Tracer::Now();
    $now =~ tr/ :\//___/;
    # Next we get the directory name.
    my $dir = "$FIG_Config::var/attributes";
    if (! -e $dir) {
        Trace("Creating attribute file directory $dir.") if T(1);
        mkdir $dir;
    }
    # Put it together with the field name and the time stamp.
    $retVal = "$dir/upload.$now";
    # Modify the file name to insure it's unique.
    my $seq = 0;
    while (-e "$retVal.$seq.tbl") { $seq++ }
    # Use the computed sequence number to get the correct file name.
    $retVal .= ".$seq.tbl";
    # Return the result.
    return $retVal;
}

=head3 BackupAllAttributes

C<< my $stats = $attrDB->BackupAllAttributes($fileName, %options); >>

Backup all of the attributes to a file. The attributes will be stored in a
tab-delimited file suitable for reloading via L</LoadAttributesFrom>.

=over 4

=item fileName

Name of the file to which the attribute data should be backed up.

=item options

Hash of options for the backup.

=item RETURN

Returns a statistics object describing the backup.

=back

Currently there are no options defined.

=cut

sub BackupAllAttributes {
    # Get the parameters.
    my ($self, $fileName, %options) = @_;
    # Declare the return variable.
    my $retVal = Stats->new();
    # Get a list of the keys.
    my @keys = $self->GetFlat(['AttributeKey'], "", [], 'AttributeKey(id)');
    Trace(scalar(@keys) . " keys found during backup.") if T(2);
    # Open the file for output.
    my $fh = Open(undef, ">$fileName");
    # Loop through the keys.
    for my $key (@keys) {
        Trace("Backing up attribute $key.") if T(3);
        $retVal->Add(keys => 1);
        # Loop through this key's values.
        my $query = $self->Get(['HasValueFor'], "HasValueFor(from-link) = ?", [$key]);
        my $valuesFound = 0;
        while (my $line = $query->Fetch()) {
            $valuesFound++;
            # Get this row's data.
            my ($id, $key, $subKey, $value) = $line->Values(['HasValueFor(to-link)',
                                                             'HasValueFor(from-link)',
                                                             'HasValueFor(subkey)',
                                                             'HasValueFor(value)']);
            # Check for a subkey.
            if ($subKey ne '') {
                $key = "$key$self->{splitter}$subKey";
            }   
            # Write it to the file.
            Tracer::PutLine($fh, [$id, $key, $value]);
        }
        Trace("$valuesFound values backed up for key $key.") if T(3);
        $retVal->Add(values => $valuesFound);
    }
    # Log the operation.
    $self->LogOperation("Backup Data", $fileName, $retVal->Display());
    # Return the result.
    return $retVal;
}

=head3 FieldMenu

C<< my $menuHtml = $attrDB->FieldMenu($cgi, $height, $name, $keys, %options); >>

Return the HTML for a menu to select an attribute field. The menu will
be a standard SELECT/OPTION thing which is called "popup menu" in the
CGI package, but actually looks like a list. The list will contain
one selectable row per field.

=over 4

=item cgi

CGI query object used to generate HTML.

=item height

Number of lines to display in the list.

=item name

Name to give to the menu. This is the name under which the value will
appear when the form is submitted.

=item keys

Reference to a hash mapping each attribute key name to a list reference,
the list itself consisting of the attribute data type, its description,
and a list of its groups.

=item options

Hash containing options that modify the generation of the menu.

=item RETURN

Returns the HTML to create a form field that can be used to select an
attribute from the custom attributes system.

=back

The permissible options are as follows.

=over 4

=item new

If TRUE, then extra rows will be provided to allow the user to select
a new attribute. In other words, the user can select an existing
attribute, or can choose a C<(new)> marker to indicate a field to
be created in the parent entity.

=item notes

If specified, the name of a variable for displaying the notes attached
to the field. This must be in Javascript form ready for assignment.
So, for example, if you have a variable called C<notes> that
represents a paragraph element, you should code C<notes.innerHTML>.
If it actually represents a form field you should code C<notes.value>.
If an C<innerHTML> coding is used, the text will be HTML-escaped before
it is copied in. Specifying this parameter generates Javascript for
displaying the field description when a field is selected.

=item type

If specified, the name of a variable for displaying the field's
data type. Data types are a much more controlled vocabulary than
notes, so there is no worry about HTML translation. Instead, the
raw value is put into the specified variable. Otherwise, the same
rules apply to this value that apply to I<$noteControl>.

=item groups

If specified, the name of a multiple-selection list control (also called
a popup menu) which shall be used to display the selected groups.

=back

=cut

sub FieldMenu {
    # Get the parameters.
    my ($self, $cgi, $height, $name, $keys, %options) = @_;
    # Reformat the list of keys.
    my %keys = %{$keys};
    # Add the (new) key, if needed.
    if ($options{new}) {
        $keys{NewName()} = ["string", ""];
    }
    # Get a sorted list of key.
    my @keys = sort keys %keys;
    # We need to create the name for the onChange function. This function
    # may not do anything, but we need to know the name to generate the HTML
    # for the menu.
    my $changeName = "${name}_setNotes";
    my $retVal = $cgi->popup_menu({name => $name,
                                   size => $height,
                                   onChange => "$changeName(this.value)",
                                   values => \@keys,
                                  });
    # Create the change function.
    $retVal .= "\n<script language=\"javascript\">\n";
    $retVal .= "    function $changeName(fieldValue) {\n";
    # The function only has a body if we have a control to store data about the
    # attribute.
    if ($options{notes} || $options{type} || $options{groups}) {
        # Check to see if we're storing HTML or text into the note control.
        my $noteControl = $options{notes};
        my $htmlMode = ($noteControl && $noteControl =~ /innerHTML$/);
        # We use a CASE statement based on the newly-selected field value. The
        # field description will be stored in the JavaScript variable "myText"
        # and the data type in "myType". Note the default data type is a normal
        # string, but the default notes is an empty string.
        $retVal .= "        var myText = \"\";\n";
        $retVal .= "        var myType = \"string\";\n";
        $retVal .= "        switch (fieldValue) {\n";
        # Loop through the keys.
        for my $key (@keys) {
            # Generate this case.
            $retVal .= "        case \"$key\" :\n";
            # Here we either want to update the note display, the
            # type display, the group list, or a combination of them.
            my ($type, $notes, @groups) = @{$keys{$key}};
            if ($noteControl) {
                # Insure it's in the proper form.
                if ($htmlMode) {
                    $notes = ERDB::HTMLNote($notes);
                }
                # Escape it for use as a string literal.
                $notes =~ s/\n/\\n/g;
                $notes =~ s/"/\\"/g;
                $retVal .= "           myText = \"$notes\";\n";
            }
            if ($options{type}) {
                # Here we want the type updated.
                $retVal .= "           myType = \"$type\";\n";
            }
            if ($options{groups}) {
                # Here we want the groups shown. Get a list of this attribute's groups.
                # We'll search through this list for each group to see if it belongs with
                # our attribute.
                my $groupLiteral = "=" . join("=", @groups) . "=";
                # Now we need some variables containing useful code for the javascript. It's
                # worth knowing we go through a bit of pain to insure $groupField[i] isn't
                # parsed as an array element.
                my $groupField = $options{groups};
                my $currentField = $groupField . "[i]";
                # Do the javascript.
                $retVal .= "           var groupList = \"$groupLiteral\";\n";
                $retVal .= "           for (var i = 0; i < $groupField.length; i++) {\n";
                $retVal .= "              var srchString = \"=\" + $currentField.value + \"=\";\n";
                $retVal .= "              var srchLoc = groupList.indexOf(srchString);\n";
                $retVal .= "              $currentField.checked = (srchLoc >= 0);\n";
                $retVal .= "           }\n";
            }
            # Close this case.
            $retVal .= "           break;\n";
        }
        # Close the CASE statement and make the appropriate assignments.
        $retVal .= "        }\n";
        if ($noteControl) {
            $retVal .= "        $noteControl = myText;\n";
        }
        if ($options{type}) {
            $retVal .= "        $options{type} = myType;\n";
        }
    }
    # Terminate the change function.
    $retVal .= "    }\n";
    $retVal .= "</script>\n";
    # Return the result.
    return $retVal;
}

=head3 GetGroups

C<< my @groups = $attrDB->GetGroups(); >>

Return a list of the available groups.

=cut

sub GetGroups {
    # Get the parameters.
    my ($self) = @_;
    # Get the groups.
    my @retVal = $self->GetFlat(['AttributeGroup'], "", [], 'AttributeGroup(id)');
    # Return them.
    return @retVal;
}

=head3 GetAttributeData

C<< my %keys = $attrDB->GetAttributeData($type, @list); >>

Return attribute data for the selected attributes. The attribute
data is a hash mapping each attribute key name to a n-tuple containing the
data type, the description, and the groups. This is the same format expected in
the L</FieldMenu> and L</ControlForm> methods for the list of attributes to display.

=over 4

=item type

Type of attribute criterion: C<name> for attributes whose names begin with the
specified string, or C<group> for attributes in the specified group.

=item list

List containing the names of the groups or keys for the desired attributes.

=item RETURN

Returns a hash mapping each attribute key name to its data type, description, and
parent groups.

=back

=cut

sub GetAttributeData {
    # Get the parameters.
    my ($self, $type, @list) = @_;
    # Set up a hash to store the attribute data.
    my %retVal = ();
    # Loop through the list items.
    for my $item (@list) {
        # Set up a query for the desired attributes.
        my $query;
        if ($type eq 'name') {
            # Here we're doing a generic name search. We need to escape it and then tack
            # on a %.
            my $parm = $item;
            $parm =~ s/_/\\_/g;
            $parm =~ s/%/\\%/g;
            $parm .= "%";
            # Ask for matching attributes. (Note that if the user passed in a null string
            # he'll get everything.)
            $query = $self->Get(['AttributeKey'], "AttributeKey(id) LIKE ?", [$parm]);
        } elsif ($type eq 'group') {
            $query = $self->Get(['IsInGroup', 'AttributeKey'], "IsInGroup(to-link) = ?", [$item]);
        } else {
            Confess("Unknown attribute query type \"$type\".");
        }
        while (my $row = $query->Fetch()) {
            # Get this attribute's data.
            my ($key, $type, $notes) = $row->Values(['AttributeKey(id)', 'AttributeKey(data-type)',
                                                     'AttributeKey(description)']);
            # If it's new, get its groups and add it to the return hash.
            if (! exists $retVal{$key}) {
                my @groups = $self->GetFlat(['IsInGroup'], "IsInGroup(from-link) = ?",
                                            [$key], 'IsInGroup(to-link)');
                $retVal{$key} = [$type, $notes, @groups];
            }
        }
    }
    # Return the result.
    return %retVal;
}

=head3 LogOperation

C<< $ca->LogOperation($action, $target, $description); >>

Write an operation description to the attribute activity log (C<$FIG_Config::var/attributes.log>).

=over 4

=item action

Action being logged (e.g. C<Delete Group> or C<Load Key>).

=item target

ID of the key or group affected.

=item description

Short description of the action.

=back

=cut

sub LogOperation {
    # Get the parameters.
    my ($self, $action, $target, $description) = @_;
    # Get the user ID.
    my $user = $self->{user};
    # Get a timestamp.
    my $timeString = Tracer::Now();
    # Open the log file for appending.
    my $oh = Open(undef, ">>$FIG_Config::var/attributes.log");
    # Write the data to it.
    Tracer::PutLine($oh, [$timeString, $user, $action, $target, $description]);
    # Close the log file.
    close $oh;
}

=head2 Internal Utility Methods

=head3 _KeywordString

C<< my $keywordString = $ca->_KeywordString($key, $value); >>

Compute the keyword string for a specified key/value pair. This consists of the
key name and value converted to lower case with underscores translated to spaces.

This method is for internal use only. It is called whenever we need to update or
insert a B<HasValueFor> record.

=over 4

=item key

Name of the relevant attribute key.

=item target

ID of the target object to which this key/value pair will be associated.

=item value

The value to store for this key/object combination.

=item RETURN

Returns the value that should be stored as the keyword string for the specified
key/value pair.

=back

=cut

sub _KeywordString {
    # Get the parameters.
    my ($self, $key, $value) = @_;
    # Get a copy of the key name and convert underscores to spaces.
    my $keywordString = $key;
    $keywordString =~ s/_/ /g;
    # Add the value convert it all to lower case.
    my $retVal = lc "$keywordString $value";
    # Return the result.
    return $retVal;
}

=head3 _QueryResults

C<< my @attributeList = $attrDB->_QueryResults($query, @values); >>

Match the results of a B<HasValueFor> query against value criteria and return
the results. This is an internal method that splits the values coming back
and matches the sections against the specified section patterns. It serves
as the back end to L</GetAttributes> and L</FindAttributes>.

=over 4

=item query

A query object that will return the desired B<HasValueFor> records.

=item values

List of the desired attribute values, section by section. If C<undef>
or an empty string is specified, all values in that section will match. A
generic match can be requested by placing a percent sign (C<%>) at the end.
In that case, all values that match up to and not including the percent sign
will match. You may also specify a regular expression enclosed
in slashes. All values that match the regular expression will be returned. For
performance reasons, only values have this extra capability.

=item RETURN

Returns a list of tuples. The first element in the tuple is an object ID, the
second is an attribute key, and the remaining elements are the sections of
the attribute value. All of the tuples will match the criteria set forth in
the parameter list.

=back

=cut

sub _QueryResults {
    # Get the parameters.
    my ($self, $query, @values) = @_;
    # Declare the return value.
    my @retVal = ();
    # Get the number of value sections we have to match.
    my $sectionCount = scalar(@values);
    # Loop through the assignments found.
    while (my $row = $query->Fetch()) {
        # Get the current row's data.
        my ($id, $realKey, $subKey, $valueString) = $row->Values(['HasValueFor(to-link)',
                                                                  'HasValueFor(from-link)',
                                                                  'HasValueFor(subkey)',
                                                                  'HasValueFor(value)'
                                                                ]);
        # Form the key from the real key and the sub key.
        my $key = $self->JoinKey($realKey, $subKey);
        # Break the value into sections.
        my @sections = split($self->{splitter}, $valueString);
        # Match each section against the incoming values. We'll assume we're
        # okay unless we learn otherwise.
        my $matching = 1;
        for (my $i = 0; $i < $sectionCount && $matching; $i++) {
            # We need to check to see if this section is generic.
            my $value = $values[$i];
            Trace("Current value pattern is \"$value\".") if T(4);
            if (substr($value, -1, 1) eq '%') {
                Trace("Generic match used.") if T(4);
                # Here we have a generic match.
                my $matchLen = length($values[$i]) - 1;
                $matching = substr($sections[$i], 0, $matchLen) eq
                            substr($values[$i], 0, $matchLen);
            } elsif ($value =~ m#^/(.+)/[a-z]*$#) {
                Trace("Regular expression detected.") if T(4);
                # Here we have a regular expression match.
                my $section = $sections[$i];
                $matching = eval("\$section =~ $value");
            } else {
                # Here we have a strict match.
                Trace("Strict match used.") if T(4);
                $matching = ($sections[$i] eq $values[$i]);
            }
        }
        # If we match, output this row to the return list.
        if ($matching) {
            push @retVal, [$id, $key, @sections];
        }
    }
    # Return the rows found.
    return @retVal;
}

=head2 FIG Method Replacements

The following methods are used by B<FIG.pm> to replace the previous attribute functionality.
Some of the old functionality is no longer present: controlled vocabulary is no longer
supported and there is no longer any searching by URL. Fortunately, neither of these
capabilities were used in the old system.

The methods here are the only ones supported by the B<RemoteCustomAttributes> object.
The idea is that these methods represent attribute manipulation allowed by all users, while
the others are only for privileged users with access to the attribute server.

In the previous implementation, an attribute had a value and a URL. In this implementation,
each attribute has only a value. These methods will treat the value as a list with the individual
elements separated by the value of the splitter parameter on the constructor (L</new>). The default
is double colons C<::>.

So, for example, an old-style keyword with a value of C<essential> and a URL of
C<http://www.sciencemag.org/cgi/content/abstract/293/5538/2266> using the default
splitter value would be stored as

    essential::http://www.sciencemag.org/cgi/content/abstract/293/5538/2266

The best performance is achieved by searching for a particular key for a specified
feature or genome.

=head3 GetAttributes

C<< my @attributeList = $attrDB->GetAttributes($objectID, $key, @values); >>

In the database, attribute values are sectioned into pieces using a splitter
value specified in the constructor (L</new>). This is not a requirement of
the attribute system as a whole, merely a convenience for the purpose of
these methods. If a value has multiple sections, each section
is matched against the corresponding criterion in the I<@valuePatterns> list.

This method returns a series of tuples that match the specified criteria. Each tuple
will contain an object ID, a key, and one or more values. The parameters to this
method therefore correspond structurally to the values expected in each tuple. In
addition, you can ask for a generic search by suffixing a percent sign (C<%>) to any
of the parameters. So, for example,

    my @attributeList = $attrDB->GetAttributes('fig|100226.1.peg.1004', 'structure%', 1, 2);

would return something like

    ['fig}100226.1.peg.1004', 'structure', 1, 2]
    ['fig}100226.1.peg.1004', 'structure1', 1, 2]
    ['fig}100226.1.peg.1004', 'structure2', 1, 2]
    ['fig}100226.1.peg.1004', 'structureA', 1, 2]

Use of C<undef> in any position acts as a wild card (all values). You can also specify
a list reference in the ID column. Thus,

    my @attributeList = $attrDB->GetAttributes(['100226.1', 'fig|100226.1.%'], 'PUBMED');

would get the PUBMED attribute data for Streptomyces coelicolor A3(2) and all its
features.

In addition to values in multiple sections, a single attribute key can have multiple
values, so even

    my @attributeList = $attrDB->GetAttributes($peg, 'virulent');

which has no wildcard in the key or the object ID, may return multiple tuples.

Value matching in this system works very poorly, because of the way multiple values are
stored. For the object ID, key name, and first value, we create queries that filter for the
desired results. On any filtering by value, we must do a comparison after the attributes are
retrieved from the database, since the database has no notion of the multiple values, which
are stored in a single string. As a result, queries in which filter only on value end up
reading a lot more than they need to.

=over 4

=item objectID

ID of object whose attributes are desired. If the attributes are desired for multiple
objects, this parameter can be specified as a list reference. If the attributes are
desired for all objects, specify C<undef> or an empty string. Finally, you can specify
attributes for a range of object IDs by putting a percent sign (C<%>) at the end.

=item key

Attribute key name. A value of C<undef> or an empty string will match all
attribute keys. If the values are desired for multiple keys, this parameter can be
specified as a list reference. Finally, you can specify attributes for a range of
keys by putting a percent sign (C<%>) at the end.

=item values

List of the desired attribute values, section by section. If C<undef>
or an empty string is specified, all values in that section will match. A
generic match can be requested by placing a percent sign (C<%>) at the end.
In that case, all values that match up to and not including the percent sign
will match. You may also specify a regular expression enclosed
in slashes. All values that match the regular expression will be returned. For
performance reasons, only values have this extra capability.

=item RETURN

Returns a list of tuples. The first element in the tuple is an object ID, the
second is an attribute key, and the remaining elements are the sections of
the attribute value. All of the tuples will match the criteria set forth in
the parameter list.

=back

=cut

sub GetAttributes {
    # Get the parameters.
    my ($self, $objectID, $key, @values) = @_;
    # This hash will map "HasValueFor" fields to patterns. We use it to build the
    # SQL statement.
    my %data;
    # Before we do anything else, we must parse the key. The key is treated by the
    # user as a single field, but to us it's actually a real key and a subkey.
    # If the key has no splitter and is exact, the real key is the original key
    # and the subkey is an empty string. If the key has a splitter, it is
    # split into two pieces and each piece is processed separately. If the key has
    # no splitter and is generic, the real key is the incoming key and the subkey
    # is allowed to be wild. Of course, this only matters if an actual key has
    # been specified.
    if (defined $key) {
        if ($key =~ /$self->{splitter}/) {
            # Here we have a two-part key, so we split it normally.
            my ($realKey, $subKey) = $self->SplitKey($key);
            $data{'HasValueFor(from-link)'} = $realKey;
            $data{'HasValueFor(subkey)'} = $subKey;
        } elsif (substr($key, -1, 1) eq '%') {
            $data{'HasValueFor(from-link)'} = $key;
        } else {
            $data{'HasValueFor(from-link)'} = $key;
            $data{'HasValueFor(subkey)'} = '';
        }
    }
    # Add the object ID to the key information.
    $data{'HasValueFor(to-link)'} = $objectID;
    # The first value represents a problem, because we can search it using SQL, but not
    # in the normal way. If the user specifies a generic search or exact match for
    # every alternative value (remember, the values may be specified as a list),
    # then we can create SQL filtering for it. If any of the values are specified
    # as a regular expression, however, that's a problem, because we need to read
    # every value to verify a match.
    if (@values > 0) {
        # Get the first value and put its alternatives in an array.
        my $valueParm = $values[0];
        my @valueList;
        if (ref $valueParm eq 'ARRAY') {
            @valueList = @{$valueParm};
        } else {
            @valueList = ($valueParm);
        }
        # Okay, now we have all the possible criteria for the first value in the list
        # @valueList. We'll copy the values to a new array in which they have been
        # converted to generic requests. If we find a regular-expression match
        # anywhere in the list, we toss the whole thing.
        my @valuePatterns = ();
        my $okValues = 1;
        for my $valuePattern (@valueList) {
            # Check the pattern type.
            if (substr($valuePattern, 0, 1) eq '/') {
                # Regular expressions invalidate the entire process.
                $okValues = 0;
            } elsif (substr($valuePattern, -1, 1) eq '%') {
                # A Generic pattern is passed in unmodified.
                push @valuePatterns, $valuePattern;
            } else {
                # An exact match is converted to generic.
                push @valuePatterns, "$valuePattern%";
            }
        }
        # If everything works, add the value data to the filtering hash.
        if ($okValues) {
            $data{'HasValueFor(value)'} = \@valuePatterns;
        }
    }
    # Create some lists to contain the filter fragments and parameter values.
    my @filter = ();
    my @parms = ();
    # This next loop goes through the different fields that can be specified in the
    # parameter list and generates filters for each. The %data hash that we built above
    # contains all the necessary information to do this.
    for my $field (keys %data) {
        # Accumulate filter information for this field. We will OR together all the
        # elements accumulated to create the final result.
        my @fieldFilter = ();
        # Get the specified data from the caller.
        my $fieldPattern = $data{$field};
        # Only proceed if the pattern is one that won't match everything.
        if (defined($fieldPattern) && $fieldPattern ne "" && $fieldPattern ne "%") {
            # Convert the pattern to an array.
            my @patterns = ();
            if (ref $fieldPattern eq 'ARRAY') {
                push @patterns, @{$fieldPattern};
            } else {
                push @patterns, $fieldPattern;
            }
            # Only proceed if the array is nonempty. The loop will work fine if the
            # array is empty, but when we build the filter string at the end we'll
            # get "()" in the filter list, which will result in an SQL syntax error.
            if (@patterns) {
                # Loop through the individual patterns.
                for my $pattern (@patterns) {
                    # Check for a generic request.
                    if (substr($pattern, -1, 1) ne '%') {
                        # Here we have a normal request.
                        push @fieldFilter, "$field = ?";
                        push @parms, $pattern;
                    } else {
                        # Here we have a generic request, so we will use the LIKE operator to
                        # filter the field to this value pattern.
                        push @fieldFilter, "$field LIKE ?";
                        # We must convert the pattern value to an SQL match pattern. First
                        # we get a copy of it.
                        my $actualPattern = $pattern;
                        # Now we escape the underscores. Underscores are an SQL wild card
                        # character, but they are used frequently in key names and object IDs.
                        $actualPattern =~ s/_/\\_/g;
                        # Add the escaped pattern to the bound parameter list.
                        push @parms, $actualPattern;
                    }
                }
                # Form the filter for this field.
                my $fieldFilterString = join(" OR ", @fieldFilter);
                push @filter, "($fieldFilterString)";
            }
        }
    }
    # Now @filter contains one or more filter strings and @parms contains the parameter
    # values to bind to them.
    my $actualFilter = join(" AND ", @filter);
    # Now we're ready to make our query.
    my $query = $self->Get(['HasValueFor'], $actualFilter, \@parms);
    # Format the results.
    my @retVal = $self->_QueryResults($query, @values);
    # Return the rows found.
    return @retVal;
}

=head3 AddAttribute

C<< $attrDB->AddAttribute($objectID, $key, @values); >>

Add an attribute key/value pair to an object. This method cannot add a new key, merely
add a value to an existing key. Use L</StoreAttributeKey> to create a new key.

=over 4

=item objectID

ID of the object to which the attribute is to be added.

=item key

Attribute key name.

=item values

One or more values to be associated with the key. The values are joined together with
the splitter value before being stored as field values. This enables L</GetAttributes>
to split them apart during retrieval. The splitter value defaults to double colons C<::>.

=back

=cut

sub AddAttribute {
    # Get the parameters.
    my ($self, $objectID, $key, @values) = @_;
    # Don't allow undefs.
    if (! defined($objectID)) {
        Confess("No object ID specified for AddAttribute call.");
    } elsif (! defined($key)) {
        Confess("No attribute key specified for AddAttribute call.");
    } elsif (! @values) {
        Confess("No values specified in AddAttribute call for key $key.");
    } else {
        # Okay, now we have some reason to believe we can do this. Form the values
        # into a scalar.
        my $valueString = join($self->{splitter}, @values);
        # Split up the key.
        my ($realKey, $subKey) = $self->SplitKey($key);
        # Connect the object to the key.
        $self->InsertObject('HasValueFor', { 'from-link' => $realKey,
                                             'to-link'   => $objectID,
                                             'subkey'    => $subKey,
                                             'value'     => $valueString,
                                       });
    }
    # Return a one, indicating success. We do this for backward compatability.
    return 1;
}

=head3 DeleteAttribute

C<< $attrDB->DeleteAttribute($objectID, $key, @values); >>

Delete the specified attribute key/value combination from the database.

=over 4

=item objectID

ID of the object whose attribute is to be deleted.

=item key

Attribute key name.

=item values

One or more values associated with the key. If no values are specified, then all values
will be deleted. Otherwise, only a matching value will be deleted.

=back

=cut

sub DeleteAttribute {
    # Get the parameters.
    my ($self, $objectID, $key, @values) = @_;
    # Don't allow undefs.
    if (! defined($objectID)) {
        Confess("No object ID specified for DeleteAttribute call.");
    } elsif (! defined($key)) {
        Confess("No attribute key specified for DeleteAttribute call.");
    } else {
        # Split the key into the real key and the subkey.
        my ($realKey, $subKey) = $self->SplitKey($key);
        if ($subKey eq '' && scalar(@values) == 0) {
            # Here we erase the entire key for this object.
            $self->DeleteRow('HasValueFor', $key, $objectID);
        } else {
            # Here we erase the matching values.
            my $valueString = join($self->{splitter}, @values);
            $self->DeleteRow('HasValueFor', $realKey, $objectID,
                             { subkey => $subKey, value => $valueString });
        }
    }
    # Return a one. This is for backward compatability.
    return 1;
}

=head3 DeleteMatchingAttributes

C<< my @deleted = $attrDB->DeleteMatchingAttributes($objectID, $key, @values); >>

Delete all attributes that match the specified criteria. This is equivalent to
calling L</GetAttributes> and then invoking L</DeleteAttribute> for each
row found.

=over 4

=item objectID

ID of object whose attributes are to be deleted. If the attributes for multiple
objects are to be deleted, this parameter can be specified as a list reference. If
attributes are to be deleted for all objects, specify C<undef> or an empty string.
Finally, you can delete attributes for a range of object IDs by putting a percent
sign (C<%>) at the end.

=item key

Attribute key name. A value of C<undef> or an empty string will match all
attribute keys. If the values are to be deletedfor multiple keys, this parameter can be
specified as a list reference. Finally, you can delete attributes for a range of
keys by putting a percent sign (C<%>) at the end.

=item values

List of the desired attribute values, section by section. If C<undef>
or an empty string is specified, all values in that section will match. A
generic match can be requested by placing a percent sign (C<%>) at the end.
In that case, all values that match up to and not including the percent sign
will match. You may also specify a regular expression enclosed
in slashes. All values that match the regular expression will be deleted. For
performance reasons, only values have this extra capability.

=item RETURN

Returns a list of tuples for the attributes that were deleted, in the
same form as L</GetAttributes>.

=back

=cut

sub DeleteMatchingAttributes {
    # Get the parameters.
    my ($self, $objectID, $key, @values) = @_;
    # Get the matching attributes.
    my @retVal = $self->GetAttributes($objectID, $key, @values);
    # Loop through the attributes, deleting them.
    for my $tuple (@retVal) {
        $self->DeleteAttribute(@{$tuple});
    }
    # Log this operation.
    my $count = @retVal;
    $self->LogOperation("Mass Delete", $key, "$count matching attributes deleted.");
    # Return the deleted attributes.
    return @retVal;
}

=head3 ChangeAttribute

C<< $attrDB->ChangeAttribute($objectID, $key, \@oldValues, \@newValues); >>

Change the value of an attribute key/value pair for an object.

=over 4

=item objectID

ID of the genome or feature to which the attribute is to be changed. In general, an ID that
starts with C<fig|> is treated as a feature ID, and an ID that is all digits and periods
is treated as a genome ID. For IDs of other types, this parameter should be a reference
to a 2-tuple consisting of the entity type name followed by the object ID.

=item key

Attribute key name. This corresponds to the name of a field in the database.

=item oldValues

One or more values identifying the key/value pair to change.

=item newValues

One or more values to be put in place of the old values.

=back

=cut

sub ChangeAttribute {
    # Get the parameters.
    my ($self, $objectID, $key, $oldValues, $newValues) = @_;
    # Don't allow undefs.
    if (! defined($objectID)) {
        Confess("No object ID specified for ChangeAttribute call.");
    } elsif (! defined($key)) {
        Confess("No attribute key specified for ChangeAttribute call.");
    } elsif (! defined($oldValues) || ref $oldValues ne 'ARRAY') {
        Confess("No old values specified in ChangeAttribute call for key $key.");
    } elsif (! defined($newValues) || ref $newValues ne 'ARRAY') {
        Confess("No new values specified in ChangeAttribute call for key $key.");
    } else {
        # We do the change as a delete/add.
        $self->DeleteAttribute($objectID, $key, @{$oldValues});
        $self->AddAttribute($objectID, $key, @{$newValues});
    }
    # Return a one. We do this for backward compatability.
    return 1;
}

=head3 EraseAttribute

C<< $attrDB->EraseAttribute($key); >>

Erase all values for the specified attribute key. This does not remove the
key from the database; it merely removes all the values.

=over 4

=item key

Key to erase. This must be a real key; that is, it cannot have a subkey
component.

=back

=cut

sub EraseAttribute {
    # Get the parameters.
    my ($self, $key) = @_;
    # Delete everything connected to the key.
    $self->Disconnect('HasValueFor', 'AttributeKey', $key);
    # Log the operation.
    $self->LogOperation("Erase Data", $key);
    # Return a 1, for backward compatability.
    return 1;
}

=head3 GetAttributeKeys

C<< my @keyList = $attrDB->GetAttributeKeys($groupName); >>

Return a list of the attribute keys for a particular group.

=over 4

=item groupName

Name of the group whose keys are desired.

=item RETURN

Returns a list of the attribute keys for the specified group.

=back

=cut

sub GetAttributeKeys {
    # Get the parameters.
    my ($self, $groupName) = @_;
    # Get the attributes for the specified group.
    my @groups = $self->GetFlat(['IsInGroup'], "IsInGroup(to-link) = ?", [$groupName],
                                'IsInGroup(from-link)');
    # Return the keys.
    return sort @groups;
}

=head3 QueryAttributes

C<< my @attributeData = $ca->QueryAttributes($filter, $filterParms); >>

Return the attribute data based on an SQL filter clause. In the filter clause,
the name C<$object> should be used for the object ID, C<$key> should be used for
the key name, C<$subkey> for the subkey value, and C<$value> for the value field.

=over 4

=item filter

Filter clause in the standard ERDB format, except that the field names are C<$object> for
the object ID field, C<$key> for the key name field, C<$subkey> for the subkey field,
and C<$value> for the value field. This abstraction enables us to hide the details of
the database construction from the user.

=item filterParms

Parameters for the filter clause.

=item RETURN

Returns a list of tuples. Each tuple consists of an object ID, a key (with optional subkey), and
one or more attribute values.

=back

=cut

# This hash is used to drive the substitution process.
my %AttributeParms = (object => 'HasValueFor(to-link)',
                      key    => 'HasValueFor(from-link)',
                      subkey => 'HasValueFor(subkey)',
                      value  => 'HasValueFor(value)');

sub QueryAttributes {
    # Get the parameters.
    my ($self, $filter, $filterParms) = @_;
    # Declare the return variable.
    my @retVal = ();
    # Make sue we have filter parameters.
    my $realParms = (defined($filterParms) ? $filterParms : []);
    # Create the query by converting the filter.
    my $realFilter = $filter;
    for my $name (keys %AttributeParms) {
        $realFilter =~ s/\$$name/$AttributeParms{$name}/g;
    }
    my $query = $self->Get(['HasValueFor'], $realFilter, $realParms);
    # Loop through the results, forming the output attribute tuples.
    while (my $result = $query->Fetch()) {
        # Get the four values from this query result row.
        my ($objectID, $key, $subkey, $value) = $result->Values([$AttributeParms{object},
                                                                $AttributeParms{key},
                                                                $AttributeParms{subkey},
                                                                $AttributeParms{value}]);
        # Combine the key and the subkey.
        my $realKey = ($subkey ? $key . $self->{splitter} . $subkey : $key);
        # Split the value.
        my @values = split $self->{splitter}, $value;
        # Output the result.
        push @retVal, [$objectID, $realKey, @values];
    }
    # Return the result.
    return @retVal;
}

=head2 Key and ID Manipulation Methods

=head3 ParseID

C<< my ($type, $id) = CustomAttributes::ParseID($idValue); >>

Determine the type and object ID corresponding to an ID value from the attribute database.
Most ID values consist of a type name and an ID, separated by a colon (e.g. C<Family:aclame|cluster10>);
however, Genomes, Features, and Subsystems are not stored with a type name, so we need to
deduce the type from the ID value structure.

The theory here is that you can plug the ID and type directly into a Sprout database method, as
follows

    my ($type, $id) = CustomAttributes::ParseID($attrList[$num]->[0]);
    my $target = $sprout->GetEntity($type, $id);

=over 4

=item idValue

ID value taken from the attribute database.

=item RETURN

Returns a two-element list. The first element is the type of object indicated by the ID value,
and the second element is the actual object ID.

=back

=cut

sub ParseID {
    # Get the parameters.
    my ($idValue) = @_;
    # Declare the return variables.
    my ($type, $id);
    # Parse the incoming ID. We first check for the presence of an entity name. Entity names
    # can only contain letters, which helps to insure typed object IDs don't collide with
    # subsystem names (which are untyped).
    if ($idValue =~ /^([A-Za-z]+):(.+)/) {
        # Here we have a typed ID.
        ($type, $id) = ($1, $2);
        # Fix the case sensitivity on PDB IDs.
        if ($type eq 'PDB') { $id = lc $id; }
    } elsif ($idValue =~ /fig\|/) {
        # Here we have a feature ID.
        ($type, $id) = (Feature => $idValue);
    } elsif ($idValue =~ /\d+\.\d+/) {
        # Here we have a genome ID.
        ($type, $id) = (Genome => $idValue);
    } else {
        # The default is a subsystem ID.
        ($type, $id) = (Subsystem => $idValue);
    }
    # Return the results.
    return ($type, $id);
}

=head3 FormID

C<< my $idValue = CustomAttributes::FormID($type, $id); >>

Convert an object type and ID pair into an object ID string for the attribute system. Subsystems,
genomes, and features are stored in the database without type information, but all other object IDs
must be prefixed with the object type.

=over 4

=item type

Relevant object type.

=item id

ID of the object in question.

=item RETURN

Returns a string that will be recognized as an object ID in the attribute database.

=back

=cut

sub FormID {
    # Get the parameters.
    my ($type, $id) = @_;
    # Declare the return variable.
    my $retVal;
    # Compute the ID string from the type.
    if (grep { $type eq $_ } qw(Feature Genome Subsystem)) {
        $retVal = $id;
    } else {
        $retVal = "$type:$id";
    }
    # Return the result.
    return $retVal;
}

=head3 GetTargetObject

C<< my $object = CustomAttributes::GetTargetObject($erdb, $idValue); >>

Return the database object corresponding to the specified attribute object ID. The
object type associated with the ID value must correspond to an entity name in the
specified database.

=over 4

=item erdb

B<ERDB> object for accessing the target database.

=item idValue

ID value retrieved from the attribute database.

=item RETURN

Returns a B<ERDBObject> for the attribute value's target object.

=back

=cut

sub GetTargetObject {
    # Get the parameters.
    my ($erdb, $idValue) = @_;
    # Declare the return variable.
    my $retVal;
    # Get the type and ID for the target object.
    my ($type, $id) = ParseID($idValue);
    # Plug them into the GetEntity method.
    $retVal = $erdb->GetEntity($type, $id);
    # Return the resulting object.
    return $retVal;
}

=head3 SplitKey

C<< my ($realKey, $subKey) = $ca->SplitKey($key); >>

Split an external key (that is, one passed in by a caller) into the real key and the sub key.
The real and sub keys are separated by a splitter value (usually C<::>). If there is no splitter,
then the sub key is presumed to be an empty string.

=over 4

=item key

Incoming key to be split.

=item RETURN

Returns a two-element list, the first element of which is the real key and the second element of
which is the sub key.

=back

=cut

sub SplitKey {
    # Get the parameters.
    my ($self, $key) = @_;
    # Do the split.
    my ($realKey, $subKey) = split($self->{splitter}, $key, 2);
    # Insure the subkey has a value.
    if (! defined $subKey) {
        $subKey = '';
    }
    # Return the results.
    return ($realKey, $subKey);
}

=head3 JoinKey

C<< my $key = $ca->JoinKey($realKey, $subKey); >>

Join a real key and a subkey together to make an external key. The external key is the attribute key
used by the caller. The real key and the subkey are how the keys are represented in the database. The
real key is the key to the B<AttributeKey> entity. The subkey is an attribute of the B<HasValueFor>
relationship.

=over 4

=item realKey

The real attribute key.

=item subKey

The subordinate portion of the attribute key.

=item RETURN

Returns a single string representing both keys.

=back

=cut

sub JoinKey {
    # Get the parameters.
    my ($self, $realKey, $subKey) = @_;
    # Declare the return variable.
    my $retVal;
    # Check for a subkey.
    if ($subKey eq '') {
        # No subkey, so the real key is the key.
        $retVal = $realKey;
    } else {
        # Subkey found, so the two pieces must be joined by a splitter.
        $retVal = "$realKey$self->{splitter}$subKey";
    }
    # Return the result.
    return $retVal;
}


=head3 AttributeTable

C<< my $tableHtml = CustomAttributes::AttributeTable($cgi, @attrList); >>

Format the attribute data into an HTML table.

=over 4

=item cgi

CGI query object used to generate the HTML

=item attrList

List of attribute results, in the format returned by the L</GetAttributes> or
L</QueryAttributes> methods.

=item RETURN

Returns an HTML table displaying the attribute keys and values.

=back

=cut

sub AttributeTable {
    # Get the parameters.
    my ($cgi, @attrList) = @_;
    # Accumulate the table rows.
    my @html = ();
    for my $attrData (@attrList) {
        # Format the object ID and key.
        my @columns = map { CGI::escapeHTML($_) } @{$attrData}[0,1];
        # Now we format the values. These remain unchanged unless one of them is a URL.
        my $lastValue = scalar(@{$attrData}) - 1;
        push @columns, map { $_ =~ /^http:/ ? $cgi->a({ href => $_ }, $_) : $_ } @{$attrData}[2 .. $lastValue];
        # Assemble the values into a table row.
        push @html, $cgi->Tr($cgi->td(\@columns));
    }
    # Format the table in the return variable.
    my $retVal = $cgi->table({ border => 2 }, $cgi->Tr($cgi->th(['Object', 'Key', 'Values'])), @html);
    # Return it.
    return $retVal;
}
1;

MCS Webmaster
ViewVC Help
Powered by ViewVC 1.0.3