[Bio] / Sprout / CustomAttributes.pm Repository:
ViewVC logotype

View of /Sprout/CustomAttributes.pm

Parent Directory Parent Directory | Revision Log Revision Log

Revision 1.2 - (download) (as text) (annotate)
Fri Nov 3 16:50:04 2006 UTC (13 years, 5 months ago) by parrello
Branch: MAIN
Changes since 1.1: +17 -2 lines
Enhanced to automatically index each attribute.

#!/usr/bin/perl -w

package CustomAttributes;

    require Exporter;
    use ERDB;
    @ISA = qw(Exporter ERDB);
    use strict;
    use Tracer;
    use FIG;
    use ERDBLoad;

=head1 Custom SEED Attribute Manager

=head2 Introduction

The Custom SEED Attributes Manager allows the user to upload and retrieve
custom data for SEED objects. It uses the B<ERDB> database system to
store the attributes, which are implemented as multi-valued fields
of ERDB entities.

The full suite of ERDB retrieval capabilities is provided. In addition,
custom methods are provided specific to this application. To get all
the values of the attribute C<essential> in the B<Feature> entity, you
would code

    my @values = $attrDB->GetAttributes($fid, Feature => 'essential');

where I<$fid> contains the ID of the desired feature. Each attribute has
an alternate index to allow searching for attributes by value.

New attributes are introduced by updating the database definition at
run-time. Attribute values are stored by uploading data from files.
A web interface is provided for both these activities.

=head2 FIG_Config Parameters

The following configuration parameters are used to manage custom attributes.

=over 4

=item attrDbms

Type of database manager used: C<mysql> for MySQL or C<pg> for PostGres.

=item attrDbName

Name of the attribute database.

=item attrHost

Name of the host server for the database. If omitted, the current host
is used.

=item attrUser

User name for logging in to the database.

=item attrPass

Password for logging in to the database.

=item attrPort

TCP/IP port for accessing the database.

=item attrSock

Socket name used to access the database. If omitted, the default socket
will be used.

=item attrDBD

Fully-qualified file name for the database definition XML file. This file
functions as data to the attribute management process, so if the data is
moved, this file must go with it.


=head2 Impliementation Note

The L</Refresh> method reloads the entities in the database. If new
entity types are added, that method will need to be adjusted accordingly.

=head2 Public Methods

=head3 new

C<< my $attrDB = CustomAttributes->new(); >>

Construct a new CustomAttributes object. This object is only used to load
or access data. To add new attributes, use the static L</NewAttribute>


sub new {
    # Get the parameters.
    my ($class) = @_;
    # Connect to the database.
    my $dbh = DBKernel->new($FIG_Config::attrDbms, $FIG_Config::attrDbName,
                            $FIG_Config::attrUser, $FIG_Config::attrPass,
                            $FIG_Config::attrPort, $FIG_Config::attrHost,
    # Create the ERDB object.
    my $xmlFileName = $FIG_Config::attrDBD;
    my $retVal = ERDB::new($class, $dbh, $xmlFileName);
    # Return the result.
    return $retVal;

=head3 GetAttributes

C<< my @values = $attrDB->GetAttributes($id, $entityName => $attributeName); >>

Return all the values of the specified attribute for the specified entity instance.
A list of vaues will be returned. If the entity instance does not exist or the
attribute has no values, an empty list will be returned. If the attribute name
does not exist, an SQL error will occur.

A typical invocation would look like this:

    my @values = $sttrDB->GetAttributes($fid, Feature => 'essential');

Here the user is asking for the values of the C<essential> attribute for the
B<Feature> with the specified ID. If the identified feature is not essential,
the list returned will be empty. If it is essential, then one or more values
will be returned that describe the essentiality.

=over 4

=item id

ID of the desired entity instance. This identifies the specific object to
be interrogated for attribute values.

=item entityName

Name of the entity. This identifies the the type of the object to be
interrogated for attribute values.

=item attributeName

Name of the desired attribute.

=item RETURN

Returns zero or more strings, each representing a value of the named attribute
for the specified entity instance.



sub GetAttributes {
    # Get the parameters.
    my ($self, $id, $entityName, $attributeName) = @_;
    # Get the data.
    my @retVal = $self->GetEntityValues($entityName, $id, ["$entityName($attributeName)"]);
    # Return the result.
    return @retVal;

=head3 StoreAttribute

C<< my $attrDB = CustomAttributes::StoreAttribute($entityName, $attributeName, $type, $notes); >>

Create or update an attribute for the database. This method will update the database definition
XML, but it will not create the table. It will connect to the database so that the caller
can upload the attribute values.

=over 4

=item entityName

Name of the entity containing the attribute. The entity must exist.

=item attributeName

Name of the attribute. It must be a valid ERDB field name, consisting entirely of
letters, digits, and hyphens, with a letter at the beginning. If it does not
exist already, it will be created.

=item type

Data type of the attribute. This must be a valid ERDB data type name.

=item notes

Descriptive notes about the attribute. It is presumed to be raw text, not HTML.

=item RETURN

Returns a Custom Attribute Database object if successful. If unsuccessful, an
error will be thrown.



sub StoreAttribute {
    # Get the parameters.
    my ($entityName, $attributeName, $type, $notes) = @_;
    # Get the data type hash.
    my %types = ERDB::GetDataTypes();
    # Validate the initial input values.
    if (! ERDB::ValidateFieldName($attributeName)) {
        Confess("Invalid attribute name \"$attributeName\" specified.");
    } elsif (! $notes || length($notes) < 25) {
        Confess("Missing or incomplete description for $attributeName.");
    } elsif (! exists $types{$type}) {
        Confess("Invalid data type \"$type\" for $attributeName.");
    # Our next step is to read in the XML for the database defintion. We
    # need to verify that the named entity exists.
    my $metadata = ERDB::ReadMetaXML($FIG_Config::attrDBD);
    my $entityHash = $metadata->{Entities};
    if (! exists $entityHash->{$entityName}) {
        Confess("Entity $entityName not found.");
    } else {
        # Okay, we're ready to begin. Get the entity hash and the field hash.
        my $entityData = $entityHash->{$entityName};
        my $fieldHash = ERDB::GetEntityFieldHash($metadata, $entityName);
        # Compute the attribute's relation name.
        my $relName = join("", $entityName, map { ucfirst $_ } split(/-/, $attributeName));
        # Store the attribute's field data. Note the use of the "content" hash for
        # the notes. This is how the XML writer knows Notes is a text tag instead of
        # an attribute.
        $fieldHash->{$attributeName} = { type => $type, relation => $relName,
                                         Notes => { content => $notes } };
        # Insure we have an index for this attribute.
        my $index = ERDB::FindIndexForEntity($metadata, $entityName, $attributeName);
        if (! defined($index)) {
            push @{$entityData->{Indexes}}, { IndexFields => [ { name => $attributeName, order => 'ascending' } ],
                                              Notes       => "Alternate index provided for access by $attributeName." };
        # Write the XML back out.
        ERDB::WriteMetaXML($metadata, $FIG_Config::attrDBD);
    # Open a database with the new XML.
    my $retVal = CustomAttributes->new();
    return $retVal;

=head3 Refresh

C<< $attrDB->Refresh(); >>

Refresh the primary entity tables from the FIG data store. This method basically
drops and reloads the main tables of the custom attributes database.


sub Refresh {
    # Get the parameters.
    my ($self) = @_;
    # Create load objects for the genomes and the features.
    my $loadGenome = ERDBLoad->new($self, 'Genome', $FIG_Config::temp);
    my $loadFeature = ERDBLoad->new($self, 'Feature', $FIG_Config::temp);
    # Get a FIG object. We'll use this to create the data.
    my $fig = FIG->new();
    # Get the genome list.
    my @genomes = $fig->genomes();
    # Loop through the genomes.
    for my $genomeID (@genomes) {
        # Put this genome in the genome table.
        Trace("Processing Genome $genomeID") if T(3);
        # Put its features into the feature table. Note we have to use a hash to
        # remove duplicates.
        my %featureList = map { $_ => 1 } $fig->all_features($genomeID);
        for my $fid (keys %featureList) {
    # Get a variable for holding statistics objects.
    my $stats;
    # Finish the genome load.
    Trace("Loading Genome relation.") if T(2);
    $stats = $loadGenome->FinishAndLoad();
    Trace("Genome table load statistics:\n" . $stats->Show()) if T(3);
    # Finish the feature load.
    Trace("Loading Feature relation.") if T(2);
    $stats = $loadFeature->FinishAndLoad();
    Trace("Feature table load statistics:\n" . $stats->Show()) if T(3);

=head3 LoadAttribute

C<< my $stats = $attrDB->LoadAttribute($entityName, $fieldName, $fh, $keyCol, $dataCol); >>

Load the specified attribute from the specified file. The file should be a
tab-delimited file with internal tab and new-line characters escaped. This is
the typical TBL-style file used by most FIG applications. One of the columns
in the input file must contain the appropriate key value and the other the
corresponding attribute value.

=over 4

=item entityName

Name of the entity containing the attribute.

=item fieldName

Name of the actual attribute.

=item fh

Open file handle for the input file.

=item keyCol

Index (0-based) of the column containing the key field. The key field should
contain the ID of an instance of the named entity.

=item dataCol

Index (0-based) of the column containing the data value field.

=item RETURN

Returns a statistics object for the load process.



sub LoadAttribute {
    # Get the parameters.
    my ($self, $entityName, $fieldName, $fh, $keyCol, $dataCol) = @_;
    # Create the return variable.
    my $retVal;
    # Insure the entity exists.
    my $found = grep { $_ eq $entityName } $self->GetEntityTypes();
    if (! $found) {
        Confess("Entity \"$entityName\" not found in database.");
    } else {
        # Get the field structure for the named entity.
        my $fieldHash = $self->GetFieldTable($entityName);
        # Verify that the attribute exists.
        if (! exists $fieldHash->{$fieldName}) {
            Confess("Attribute \"$fieldName\" does not exist in entity $entityName.");
        } else {
            # Create a loader for the specified attribute. We need the
            # relation name first.
            my $relName = $fieldHash->{$fieldName}->{relation};
            my $loadAttribute = ERDBLoad->new($self, $relName, $FIG_Config::temp);
            # Loop through the input file.
            while (! eof $fh) {
                # Get the next line of the file.
                my @fields = Tracer::GetLine($fh);
                # Now we need to validate the line.
                if ($#fields < $dataCol) {
                } elsif (! $self->Exists($entityName, $fields[$keyCol])) {
                } else {
                    # It's valid,so send it to the loader.
                    $loadAttribute->Put($fields[$keyCol], $fields[$dataCol]);
            # Finish the load.
            $retVal = $loadAttribute->FinishAndLoad();
    # Return the statistics.
    return $retVal;

=head3 DeleteAttribute

C<< CustomAttributes::DeleteAttribute($entityName, $attributeName); >>

Delete an attribute from the custom attributes database.

=over 4

=item entityName

Name of the entity possessing the attribute.

=item attributeName

Name of the attribute to delete.



sub DeleteAttribute {
    # Get the parameters.
    my ($entityName, $attributeName) = @_;
    # Read in the XML for the database defintion. We need to verify that
    # the named entity exists and it has the named attribute.
    my $metadata = ERDB::ReadMetaXML($FIG_Config::attrDBD);
    my $entityHash = $metadata->{Entities};
    if (! exists $entityHash->{$entityName}) {
        Confess("Entity \"$entityName\" not found.");
    } else {
        # Get the field hash.
        my $fieldHash = ERDB::GetEntityFieldHash($metadata, $entityName);
        if (! exists $fieldHash->{$attributeName}) {
            Confess("Attribute \"$attributeName\" not found in entity $entityName.");
        } else {
            # Get the attribute's relation name.
            my $relName = $fieldHash->{$attributeName}->{relation};
            # Check for an index.
            my $indexIdx = ERDB::FindIndexForEntity($metadata, $entityName, $attributeName);
            if (defined($indexIdx)) {
                Trace("Index for $attributeName found at position $indexIdx for $entityName.") if T(3);
                delete $entityHash->{$entityName}->{Indexes}->[$indexIdx];
            # Delete the attribute from the field hash.
            Trace("Deleting attribute $attributeName from $entityName.") if T(3);
            delete $fieldHash->{$attributeName};
            # Write the XML back out.
            ERDB::WriteMetaXML($metadata, $FIG_Config::attrDBD);
            # Insure the relation does not exist in the database. This requires connecting
            # since we may have to do a table drop.
            my $attrDB = CustomAttributes->new();
            Trace("Dropping table $relName.") if T(3);

=head3 ControlForm

C<< my $formHtml = $attrDB->ControlForm($cgi, $name); >>

Return a form that can be used to control the creation and modification of

=over 4

=item cgi

CGI query object used to create HTML.

=item name

Name to give to the form. This should be unique for the web page.

=item RETURN

Returns the HTML for a form that submits instructions to the C<Attributes.cgi> script
for loading, creating, or deleting an attribute.



sub ControlForm {
    # Get the parameters.
    my ($self, $cgi, $name) = @_;
    # Declare the return list.
    my @retVal = ();
    # Start the form. We use multipart to support the upload control.
    push @retVal, $cgi->start_multipart_form(-name => $name);
    # We'll put the controls in a table. Nothing else ever seems to look nice.
    push @retVal, $cgi->start_table({ border => 2, cellpadding => 2 });
    # The first row is for selecting the field name.
    push @retVal, $cgi->Tr($cgi->th("Select a Field"),
                           $cgi->td($self->FieldMenu($cgi, 10, 'fieldName', 1,
    # Now we set up a dropdown for the data types. The values will be the
    # data type names, and the labels will be the descriptions.
    my %types = ERDB::GetDataTypes();
    my %labelMap = map { $_ => $types{$_}->{notes} } keys %types;
    my $typeMenu = $cgi->popup_menu(-name   => 'dataType',
                                    -values => [sort keys %types],
                                    -labels => \%labelMap);
    push @retVal, $cgi->Tr($cgi->th("Data type"),
    # The next row is for the notes.
    push @retVal, $cgi->Tr($cgi->th("Description"),
                           $cgi->td($cgi->textarea(-name => 'notes',
                                                   -rows => 6,
                                                   -columns => 80))
    # Allow the user to specify a new field name. This is required if the
    # user has selected one of the "(new)" markers.
    push @retVal, $cgi->Tr($cgi->th("New Field Name"),
                           $cgi->td($cgi->textfield(-name => 'newName',
                                                    -size => 30)),
    # If the user wants to upload new values for the field, then we have
    # an upload file name and column indicators.
    push @retVal, $cgi->Tr($cgi->th("Upload Values"),
                           $cgi->td($cgi->filefield(-name => 'newValueFile',
                                                    -size => 20) .
                                    " Key&nbsp;" .
                                    $cgi->textfield(-name => 'keyCol',
                                                    -size => 3,
                                                    -default => 0) .
                                    " Value&nbsp;" .
                                    $cgi->textfield(-name => 'valueCol',
                                                    -size => 3,
                                                    -default => 1)
    # Now the two buttons: UPDATE and DELETE.
    push @retVal, $cgi->Tr($cgi->th("&nbsp;"),
                           $cgi->td({align => 'center'},
                                    $cgi->submit(-name => 'Delete', -value => 'DELETE') . " " .
                                    $cgi->submit(-name => 'Store',  -value => 'STORE')
    # Close the table and the form.
    push @retVal, $cgi->end_table();
    push @retVal, $cgi->end_form();
    # Return the assembled HTML.
    return join("\n", @retVal, "");

=head3 FieldMenu

C<< my $menuHtml = $attrDB->FieldMenu($cgi, $height, $name, $newFlag, $noteControl, $typeControl); >>

Return the HTML for a menu to select an attribute field. The menu will
be a standard SELECT/OPTION thing which is called "popup menu" in the
CGI package, but actually looks like a list. The list will contain
one selectable row per field, grouped by entity.

=over 4

=item cgi

CGI query object used to generate HTML.

=item height

Number of lines to display in the list.

=item name

Name to give to the menu. This is the name under which the value will
appear when the form is submitted.

=item newFlag (optional)

If TRUE, then extra rows will be provided to allow the user to select
a new attribute. In other words, the user can select an existing
attribute, or can choose a C<(new)> marker to indicate a field to
be created in the parent entity.

=item noteControl (optional)

If specified, the name of a variable for displaying the notes attached
to the field. This must be in Javascript form ready for assignment.
So, for example, if you have a variable called C<notes> that
represents a paragraph element, you should code C<notes.innerHTML>.
If it actually represents a form field you should code C<notes.value>.
If an C<innerHTML> coding is used, the text will be HTML-escaped before
it is copied in. Specifying this parameter generates Javascript for
displaying the field description when a field is selected.

=item typeControl (optional)

If specified, the name of a variable for displaying the field's
data type. Data types are a much more controlled vocabulary than
notes, so there is no worry about HTML translation. Instead, the
raw value is put into the specified variable. Otherwise, the same
rules apply to this value that apply to I<$noteControl>.

=item RETURN

Returns the HTML to create a form field that can be used to select an
attribute from the custom attributes system.



sub FieldMenu {
    # Get the parameters.
    my ($self, $cgi, $height, $name, $newFlag, $noteControl, $typeControl) = @_;
    # These next two hashes make everything happen. "entities"
    # maps each entity name to the list of values to be put into its
    # option group. "labels" maps each entity name to a map from values
    # to labels.
    my @entityNames = sort ($self->GetEntityTypes());
    my %entities = map { $_ => [] } @entityNames;
    my %labels = map { $_ => { }} @entityNames;
    # Loop through the entities, adding the existing attributes.
    for my $entity (@entityNames) {
        # Get this entity's field table.
        my $fieldHash = $self->GetFieldTable($entity);
        # Get its field list in our local hashes.
        my $fieldList = $entities{$entity};
        my $labelList = $labels{$entity};
        # Add the NEW fields if we want them.
        if ($newFlag) {
            push @{$fieldList}, $entity;
            $labelList->{$entity} = "(new)";
        # Loop through the fields in the hash. We only keep the ones with a
        # secondary relation name. (In other words, the name of the relation
        # in which the field appears cannot be the same as the entity name.)
        for my $fieldName (sort keys %{$fieldHash}) {
            if ($fieldHash->{$fieldName}->{relation} ne $entity) {
                my $value = "$entity/$fieldName";
                push @{$fieldList}, $value;
                $labelList->{$value} = $fieldName;
    # Now we have a hash and a list for each entity, and they correspond
    # exactly to what the $cgi->optgroup function expects.
    # The last step is to create the name for the onChange function. This function
    # may not do anything, but we need to know the name to generate the HTML
    # for the menu.
    my $changeName = "${name}_setNotes";
    my $retVal = $cgi->popup_menu({name => $name,
                                   size => $height,
                                   onChange => "$changeName(this.value)",
                                   values => [map { $cgi->optgroup(-name => $_,
                                                                   -values => $entities{$_},
                                                                   -labels => $labels{$_})
                                                  } @entityNames]}
    # Create the change function.
    $retVal .= "\n<script language=\"javascript\">\n";
    $retVal .= "    function $changeName(fieldValue) {\n";
    # The function only has a body if we have a notes control to store the description.
    if ($noteControl || $typeControl) {
        # Check to see if we're storing HTML or text into the note control.
        my $htmlMode = ($noteControl && $noteControl =~ /innerHTML$/);
        # We use a CASE statement based on the newly-selected field value. The
        # field description will be stored in the JavaScript variable "myText"
        # and the data type in "myType". Note the default data type is a normal
        # string, but the default notes is an empty string.
        $retVal .= "        var myText = \"\";\n";
        $retVal .= "        var myType = \"string\";\n";
        $retVal .= "        switch (fieldValue) {\n";
        # Loop through the entities.
        for my $entity (@entityNames) {
            # Get the entity's field hash. This has the notes in it.
            my $fieldHash = $self->GetFieldTable($entity);
            # Loop through the values we might see for this entity's fields.
            my $fields = $entities{$entity};
            for my $value (@{$fields}) {
                # Only proceed if we have an existing field.
                if ($value =~ m!/(.+)$!) {
                    # Get the field's hash element.
                    my $element = $fieldHash->{$1};
                    # Generate this case.
                    $retVal .= "        case \"$value\" :\n";
                    # Here we either want to update the note display, the
                    # type display, or both.
                    if ($noteControl) {
                        # Here we want the notes updated.
                        my $notes = $element->{Notes}->{content};
                        # Insure it's in the proper form.
                        if ($htmlMode) {
                            $notes = ERDB::HTMLNote($notes);
                        # Escape it for use as a string literal.
                        $notes =~ s/\n/\\n/g;
                        $notes =~ s/"/\\"/g;
                        $retVal .= "           myText = \"$notes\";\n";
                    if ($typeControl) {
                        # Here we want the type updated.
                        my $type = $element->{type};
                        $retVal .= "           myType = \"$type\";\n";
                    # Close this case.
                    $retVal .= "           break;\n";
        # Close the CASE statement and make the appropriate assignments.
        $retVal .= "        }\n";
        if ($noteControl) {
            $retVal .= "        $noteControl = myText;\n";
        if ($typeControl) {
            $retVal .= "        $typeControl = myType;\n";
    # Terminate the change function.
    $retVal .= "    }\n";
    $retVal .= "</script>\n";
    # Return the result.
    return $retVal;


MCS Webmaster
ViewVC Help
Powered by ViewVC 1.0.3