View of /Sprout/MarkUbiquitous.pl

Revision 1.1
Thu May 12 20:01:19 2011 UTC by parrello
Branch: MAIN
CVS Tags: mgrast_dev_08112011, mgrast_dev_06072011, mgrast_dev_08022011, rast_rel_2014_0912, rast_rel_2014_0729, mgrast_release_3_1_2, mgrast_release_3_1_1, rast_rel_2011_0928, mgrast_version_3_2, mgrast_dev_12152011, mgrast_dev_10262011, mgrast_release_3_1_0, HEAD
New script to update compounds in the Sapling.

#!/usr/bin/perl -w

=head1 Mark Compounds Ubiquitous

This script runs through the Sapling database and marks as ubiquitous any
compound that participates in at least a specified number of reactions.
Only links not flagged as cofactors are counted. This procedure ensures
that when processing connections between reactions we do not create an
unmanageable explosion of data.

The single positional parameter is the minimum number of reactions required
for a compound to be marked. If it is omitted or zero, a threshold of 20 is
used.

The currently-supported command-line options are as follows.

=over 4

=item user

Name suffix to be used for log files. If omitted, the PID is used.

=item trace

Numeric trace level. A higher trace level causes more messages to appear. The
default trace level is 2. Trace messages are written directly to the standard
output as well as to a C<trace>I<User>C<.log> file in the FIG temporary
directory, where I<User> is the value of the B<user> option above.

=item sql

If specified, turns on tracing of SQL activity.

=item background

Save the standard and error output to files. The files will be created
in the FIG temporary directory and will be named C<out>I<User>C<.log> and
C<err>I<User>C<.log>, respectively, where I<User> is the value of the
B<user> option above.

=item h

Display this command's parameters and options.

=back

=cut

use strict;
use Tracer;
use Sapling;
use SeedUtils;
use Stats;

# Get the command-line options and parameters.
my ($options, @parameters) = StandardSetup([], { }, "<count>", @ARGV);
# Create the statistics object.
my $stats = Stats->new();
# Get the Sapling database.
my $sap = Sapling->new();
# Get the count from the parameter list.
my $uCount = $parameters[0] || 20;
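# Create the query for looping through the reaction/compound relationships.
# Only non-cofactor links are selected, and they are ordered by compound so
# that all the records for a given compound arrive together.
my $q = $sap->Get('Involves', "Involves(cofactor) = ? ORDER BY Involves(to-link)", [0]);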
# We need to track the current compound and the count of occurrences found.
my ($compound, $count) = ("", 0);
# Loop through the data records.
while (my $record = $q->Fetch()) {
    # Get the new compound ID.
    my $newCompound = $record->PrimaryValue('to-link');
    if ($compound ne $newCompound) {
        # Here we have a new compound. Process the old one.
        if ($compound) {
            CheckCompound($stats, $sap, $compound, $count, $uCount);
        }
        # Initialize for the new compound.
        $compound = $newCompound;
        $count = 0;
        $stats->Add(compounds => 1);
    }
    # Count this record.
    $count++;
    $stats->Add(records => 1);
}
# Check the last compound, if any records were found at all.
CheckCompound($stats, $sap, $compound, $count, $uCount) if $compound;
# All done.
Trace("All done:\n" . $stats->Show()) if T(2);

# Check a compound to see if it should be marked ubiquitous.
sub CheckCompound {
    # Get the parameters.
    my ($stats, $sap, $compound, $count, $uCount) = @_;
    # Is the count big enough for the compound to be ubiquitous?
    if ($count >= $uCount) {
        # Get the compound's label.
        my ($label) = $sap->GetFlat('Compound', "Compound(id) = ?", [$compound],
                                    'label');
        # Mark this compound ubiquitous.
        Trace("Marking $compound ($label) ubiquitous: $count reactions found.") if T(3);
        $sap->UpdateEntity('Compound', $compound, ubiquitous => 1);
        $stats->Add(updates => 1);
    }
}
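
The counting loop in the script is a control-break pass over records sorted by compound ID. The following is a minimal, self-contained sketch of that pattern. It does not touch the Sapling database: the C<@rows> data, the reaction and compound IDs, and the C<find_ubiquitous> helper are all hypothetical and exist only for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-in for the rows of the Involves relationship:
# each row is [reaction ID, compound ID]. The IDs are invented.
my @rows = (
    ['rxn1', 'cpdA'], ['rxn2', 'cpdA'], ['rxn3', 'cpdA'],
    ['rxn1', 'cpdB'],
    ['rxn2', 'cpdC'], ['rxn3', 'cpdC'],
);

# Return the compounds that occur in at least $threshold rows.
# The rows are sorted by compound first, mirroring the ORDER BY in the
# query, so that a change in compound ID marks the end of a group.
sub find_ubiquitous {
    my ($rows, $threshold) = @_;
    my @sorted = sort { $a->[1] cmp $b->[1] } @$rows;
    my @marked;
    my ($compound, $count) = ("", 0);
    for my $row (@sorted) {
        my $newCompound = $row->[1];
        if ($compound ne $newCompound) {
            # Group boundary: evaluate the finished group, then reset.
            push @marked, $compound if $compound ne "" && $count >= $threshold;
            ($compound, $count) = ($newCompound, 0);
        }
        # Count this row toward the current compound.
        $count++;
    }
    # The final group has no following row to trigger the check.
    push @marked, $compound if $compound ne "" && $count >= $threshold;
    return @marked;
}

my @ubiquitous = find_ubiquitous(\@rows, 2);
print join(", ", @ubiquitous), "\n";    # prints "cpdA, cpdC"
```

Because the rows arrive grouped, the pass needs only one running counter instead of a hash keyed by compound. The price is the extra check after the loop for the last group.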

