package Tracer;
require Exporter;
@ISA = ('Exporter');
@EXPORT = qw(Trace T TSetup QTrace Confess Cluck Min Max Assert Open OpenDir TICK);
@EXPORT_OK = qw(GetFile GetOptions Merge MergeOptions ParseCommand ParseRecord UnEscape Escape);
use strict;
use Carp qw(longmess croak);
use CGI;
use FIG_Config;
use PageBuilder;
=head1 Tracing and Debugging Helpers
=head2 Introduction
This package provides simple tracing for debugging and reporting purposes. To use it, simply call the
L</TSetup> method to set the options and call L</Trace> to write out trace messages. Each trace
message has a I<trace level> and I<category> associated with it. In addition, the tracing package itself
has a list of categories and a single trace level set by the B<TSetup> method. Only messages whose trace
level is less than or equal to this package's trace level and whose category is activated will
be written. Thus, a higher trace level on a message indicates that the message
is less likely to be seen. A higher trace level passed to B<TSetup> means more trace messages will
appear. To generate a trace message, use the following syntax.
C<< Trace($message) if T(errors => 4); >>
This statement will produce a trace message if the trace level is 4 or more and the C<errors>
category is active. Note that the special category C<main> is always active, so
C<< Trace($message) if T(main => 4); >>
will trace if the trace level is 4 or more.
If the category name is the same as the package name, all you need is the number. So, if the
following call is made in a package named, for example, B<Sprout>, it will appear if the C<Sprout>
category is active and the trace level is 2 or more.
C<< Trace($message) if T(2); >>
To set up tracing, you call the L</TSetup> method. The method takes as input a trace level, a list
of category names, and a set of options. The trace level and list of category names are
specified as a space-delimited string. Thus
C<< TSetup('3 errors Sprout ERDB', 'HTML'); >>
sets the trace level to 3, activates the C<errors>, C<Sprout>, and C<ERDB> categories, and
specifies that messages should be output as HTML paragraphs.
To turn on tracing for ALL categories, use an asterisk. The call below sets every category to
level 3 and writes the output to the standard error output. This sort of thing might be
useful in a CGI environment.
C<< TSetup('3 *', 'WARN'); >>
In addition to HTML and file output for trace messages, you can specify that the trace messages
be queued. The messages can then be retrieved by calling the L</QTrace> method. This approach
is useful if you are building a web page. Instead of having the trace messages interspersed with
the page output, they can be gathered together and displayed at the end of the page. This makes
it easier to debug page formatting problems.
Finally, you can specify that all trace messages be emitted as warnings.
The flexibility of tracing makes it superior to simple use of directives like C<die> and C<warn>.
Tracer calls can be left in the code with minimal overhead and then turned on only when needed.
Thus, debugging information is available and easily retrieved even when the application is
being used out in the field.
There is no hard and fast rule on how to use trace levels. The following is therefore only
a suggestion.
=over 4
=item 0 Error
Message indicates an error that may lead to incorrect results or that has stopped the
application entirely.
=item 1 Warning
Message indicates something that is unexpected but that probably did not interfere
with program execution.
=item 2 Notice
Message indicates the beginning or end of a major task.
=item 3 Information
Message indicates a subtask. In the FIG system, a subtask generally relates to a single
genome. This would be a big loop that is not expected to execute more than 500 times or so.
=item 4 Detail
Message indicates a low-level loop iteration.
=back
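The gating rule described in the introduction can be illustrated with a small standalone
sketch. It does not load this package: the names C<$level>, C<%active>, and C<would_trace>
are hypothetical stand-ins for the package state and for the check a trace call performs.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for the state left by TSetup('3 errors', ...).
my $level  = 3;
my %active = (main => 1, errors => 1);

# A message appears only if its level is at or below the package level
# and its category is active (category names are compared in lower case).
sub would_trace {
    my ($category, $messageLevel) = @_;
    return ($messageLevel <= $level && exists $active{lc $category}) ? 1 : 0;
}

print would_trace(errors => 2), "\n";   # level 2 is within 3, category active
print would_trace(errors => 4), "\n";   # level 4 exceeds 3, suppressed
print would_trace(Sprout => 1), "\n";   # category not active, suppressed
```

Raising C<$level> to 4 in this sketch would let the second message through, which is the
sense in which a higher package trace level produces more output.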
=cut
# Declare the configuration variables.
my $Destination = "NONE"; # Description of where to send the trace output.
my $TeeFlag = 0; # TRUE if output is going to a file and to the
# standard output
my %Categories = ( main => 1 );
# hash of active category names
my $TraceLevel = 0; # trace level; a higher trace level produces more
# messages
my @Queue = (); # queued list of trace messages.
my $LastCategory = "main"; # name of the last category interrogated
my $SetupCount = 0; # number of times TSetup called
my $AllTrace = 0; # TRUE if we are tracing all categories.
=head2 Public Methods
=head3 TSetup
C<< TSetup($categoryList, $target); >>
This method is used to specify the trace options. The options are stored as package data
and interrogated by the L</Trace> and L</T> methods.
=over 4
=item categoryList
A string specifying the trace level and the categories to be traced, separated by spaces.
The trace level must come first.
=item target
The destination for the trace output. To send the trace output to a file, specify the file
name preceded by a ">" symbol. If a double symbol is used (">>"), then the data is appended
to the file. Otherwise the file is cleared before tracing begins. Precede the first ">"
symbol with a C<+> to echo output to a file AND to the standard output. In addition to
sending the trace messages to a file, you can specify a special destination. C<HTML> will
cause tracing to the standard output with each line formatted as an HTML paragraph. C<TEXT>
will cause tracing to the standard output as ordinary text. C<ERROR> will cause trace
messages to be sent to the standard error output as ordinary text. C<QUEUE> will cause trace
messages to be stored in a queue for later retrieval by the L</QTrace> method. C<WARN> will
cause trace messages to be emitted as warnings using the B<warn> directive. C<NONE> will
cause tracing to be suppressed.
=back
=cut
sub TSetup {
# Get the parameters.
my ($categoryList, $target) = @_;
# Parse the category list.
my @categoryData = split /\s+/, $categoryList;
# Extract the trace level.
$TraceLevel = shift @categoryData;
# Presume category-based tracing until we learn otherwise.
$AllTrace = 0;
# Build the category hash. Note that if we find a "*", we turn on non-category
# tracing.
for my $category (@categoryData) {
if ($category eq '*') {
$AllTrace = 1;
} else {
$Categories{lc $category} = 1;
}
}
# Now we need to process the destination information. The most important special
# cases are the single ">", which requires we clear the file first, and the
# "+" prefix which indicates a double echo.
if ($target =~ m/^\+?>>?/) {
if ($target =~ m/^\+/) {
$TeeFlag = 1;
$target = substr($target, 1);
}
if ($target =~ m/^>[^>]/) {
(open TRACEFILE, $target) || die "Tracing open for \"$target\" failed: $!";
print TRACEFILE Now() . " Tracing initialized.\n";
close TRACEFILE;
$Destination = ">$target";
} else {
$Destination = $target;
}
} else {
$Destination = uc($target);
}
# Increment the setup counter.
$SetupCount++;
}
=head3 Setups
C<< my $count = Tracer::Setups(); >>
Return the number of times L</TSetup> has been called.
This method allows for the creation of conditional tracing setups where, for example, we
may want to set up tracing if nobody else has done it before us.
=cut
sub Setups {
return $SetupCount;
}
=head3 Open
C<< my $handle = Open($fileHandle, $fileSpec, $message); >>
Open a file.
The I<$fileSpec> is essentially the second argument of the PERL C<open>
function. The mode is specified using Unix-like shell information. So, for
example,
    Open(\*LOGFILE, '>>/usr/spool/news/twitlog', "Could not open twit log.");
would open for output appended to the specified file, and
    Open(\*DATASTREAM, "| sort -u >$outputFile", "Could not open $outputFile.");
would open a pipe that sorts the records written and removes duplicates. Note
the use of file handle syntax in the Open call. To use anonymous file handles,
code as follows.
    my $logFile = Open(undef, '>>/usr/spool/news/twitlog', "Could not open twit log.");
The I<$message> parameter is used if the open fails. If it is set to C<0>, then
the open returns TRUE if successful and FALSE if an error occurred. Otherwise, a
failed open will throw an exception and the third parameter will be used to construct
an error message. If the parameter is omitted, a standard message is constructed
using the file spec.
    Could not open "/usr/spool/news/twitlog"
Note that the mode characters are automatically cleaned from the file name.
The actual error message from the file system will be captured and appended to the
message in any case.
    Could not open "/usr/spool/news/twitlog": file not found.
In some versions of PERL the only error message we get is a number, which
corresponds to the C++ C<errno> value.
    Could not open "/usr/spool/news/twitlog": 6.
=over 4
=item fileHandle
File handle. If this parameter is C<undef>, a file handle will be generated
and returned as the value of this method.
=item fileSpec
File name and mode, as per the PERL C<open> function.
=item message (optional)
Error message to use if the open fails. If omitted, a standard error message
will be generated. In either case, the error information from the file system
is appended to the message. To specify a conditional open that does not throw
an error if it fails, use C<0>.
=item RETURN
Returns the name of the file handle assigned to the file, or C<undef> if the
open failed.
=back
=cut
sub Open {
# Get the parameters.
my ($fileHandle, $fileSpec, $message) = @_;
# Attempt to open the file.
my $rv = open $fileHandle, $fileSpec;
# If the open failed, generate an error message.
if (! $rv) {
# Save the system error message before any other call can overwrite it.
my $sysMessage = $!;
# A message of 0 requests a conditional open: report the failure to the
# caller instead of throwing an exception.
if (defined $message && $message eq '0') {
return undef;
}
# See if we need a default message.
if (!$message) {
# Clean any obvious mode characters from the filename.
my ($fileName) = FindNamePart($fileSpec);
$message = "Could not open \"$fileName\"";
}
# Terminate with an error using the supplied message and the
# saved error message from the file system.
Confess("$message: $sysMessage");
}
# Return the file handle.
return $fileHandle;
}
=head3 FindNamePart
C<< my ($fileName, $start, $len) = Tracer::FindNamePart($fileSpec); >>
Extract the portion of a file specification that contains the file name.
A file specification is the string passed to an C<open> call. It specifies the file
mode and name. In a truly complex situation, it can specify a pipe sequence. This
method assumes that the file name is whatever follows the first angle bracket
sequence. So, for example, in the following strings the file name is
C</usr/fig/myfile.txt>.
    >>/usr/fig/myfile.txt
    /usr/fig/myfile.txt
If the method cannot find a file name using its normal methods, it will return the
whole incoming string.
=over 4
=item fileSpec
File specification string from which the file name is to be extracted.
=item RETURN
Returns a three-element list. The first element contains the file name portion of
the specified string, or the whole string if a file name cannot be found via normal
methods. The second element contains the start position of the file name portion and
the third element contains the length.
=back
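The returned position and length are meant to be used together, for example with C<substr>.
The following is a minimal standalone sketch of that relationship. The function name
C<find_name_part> is hypothetical; it is a copy of the parsing rule made only so the
example can run without loading this package.

```perl
use strict;
use warnings;

# Standalone copy of the parsing rule, for illustration only.
sub find_name_part {
    my ($fileSpec) = @_;
    # Default to the whole input string.
    my ($name, $pos, $len) = ($fileSpec, 0, length $fileSpec);
    if ($fileSpec =~ m/(<|>>?)(.+?)(\s*)$/) {
        $name = $2;
        $len  = length $name;
        $pos  = (length $fileSpec) - (length $3) - $len;
    }
    return ($name, $pos, $len);
}

for my $spec ('>>/usr/fig/myfile.txt', '/usr/fig/myfile.txt') {
    my ($name, $pos, $len) = find_name_part($spec);
    # substr with the returned position and length recovers the name.
    print substr($spec, $pos, $len), "\n";
}
```

The second string contains no angle bracket, so the sketch falls back to returning the
whole string with a start position of 0.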
=cut
#: Return Type $;
sub FindNamePart {
# Get the parameters.
my ($fileSpec) = @_;
# Default to the whole input string.
my ($retVal, $pos, $len) = ($fileSpec, 0, length $fileSpec);
# Parse out the file name if we can.
if ($fileSpec =~ m/(<|>>?)(.+?)(\s*)$/) {
$retVal = $2;
$len = length $retVal;
$pos = (length $fileSpec) - (length $3) - $len;
}
# Return the result.
return ($retVal, $pos, $len);
}
=head3 OpenDir
C<< my @files = OpenDir($dirName, $filtered); >>
Open a directory and return all the file names. This function essentially performs
the functions of an C<opendir> and C<readdir>. If the I<$filtered> parameter is
set to TRUE, all filenames beginning with a period (C<.>) will be filtered out of
the return list. If the directory does not open, an exception is thrown. So,
for example,
    my @files = OpenDir("/Volumes/fig/contigs", 1);
is effectively the same as
    opendir(TMP, "/Volumes/fig/contigs") || Confess("Could not open /Volumes/fig/contigs.");
    my @files = grep { $_ !~ /^\./ } readdir(TMP);
Similarly, the following code
    my @files = grep { $_ =~ /^\d/ } OpenDir("/Volumes/fig/orgs");
returns the names of all files in C</Volumes/fig/orgs> that begin with digits and
automatically throws an error if the directory fails to open.
=over 4
=item dirName
Name of the directory to open.
=item filtered
TRUE if files whose names begin with a period (C<.>) should be automatically removed
from the list, else FALSE.
=back
=cut
#: Return Type @;
sub OpenDir {
# Get the parameters.
my ($dirName, $filtered) = @_;
# Declare the return variable.
my @retVal;
# Open the directory.
if (opendir(my $dirHandle, $dirName)) {
# The directory opened successfully. Get the appropriate list according to the
# strictures of the filter parameter.
if ($filtered) {
@retVal = grep { $_ !~ /^\./ } readdir $dirHandle;
} else {
@retVal = readdir $dirHandle;
}
} else {
# Here the directory would not open.
Confess("Could not open directory $dirName.");
}
# Return the result.
return @retVal;
}
=head3 SetLevel
C<< Tracer::SetLevel($newLevel); >>
Modify the trace level. A higher trace level will cause more messages to appear.
=over 4
=item newLevel
Proposed new trace level.
=back
=cut
sub SetLevel {
$TraceLevel = $_[0];
}
=head3 Now
C<< my $string = Tracer::Now(); >>
Return a displayable time stamp containing the local time.
=cut
sub Now {
my ($sec,$min,$hour,$mday,$mon,$year,$wday,$yday,$isdst) = localtime(time);
my $retVal = _p2($mon+1) . "/" . _p2($mday) . "/" . ($year + 1900) . " " .
_p2($hour) . ":" . _p2($min) . ":" . _p2($sec);
return $retVal;
}
# Pad a number to 2 digits.
sub _p2 {
my ($value) = @_;
$value = "0$value" if ($value < 10);
return $value;
}
=head3 LogErrors
C<< Tracer::LogErrors($fileName); >>
Route the standard error output to a log file.
=over 4
=item fileName
Name of the file to receive the error output.
=back
=cut
sub LogErrors {
# Get the file name.
my ($fileName) = @_;
# Open the file as the standard error output.
open STDERR, '>', $fileName;
}
=head3 ReadOptions
C<< my %options = Tracer::ReadOptions($fileName); >>
Read a set of options from a file. Each option is encoded in a line of text that has the
format
I<optionName>C<=>I<optionValue>C<; >I<comment>
The option name must consist entirely of letters, digits, and the punctuation characters
C<.> and C<_>, and is case sensitive. Blank lines and lines in which the first nonblank
character is a semi-colon will be ignored. The return hash will map each option name to
the corresponding option value.
=over 4
=item fileName
Name of the file containing the option data.
=item RETURN
Returns a hash mapping the option names specified in the file to their corresponding option
value.
=back
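As an illustration of the line format, the sketch below applies the same option-assignment
pattern to a few in-memory lines. The option names and values are hypothetical sample data,
and the loop is a simplified stand-in for this method rather than a call to it.

```perl
use strict;
use warnings;

# Hypothetical file content; a real caller would pass a file name instead.
my @lines = (
    "; database settings\n",
    "\n",
    "dbType=mySQL; engine to use\n",
    "trace.level=2;\n",
);

my %options;
for my $line (@lines) {
    # Same pattern as the option-assignment case: name, "=", value, ";".
    if ($line =~ /^\s*([A-Za-z0-9_\.]+)=([^;]*);/) {
        $options{$1} = $2;
    }
}

print "$_ => $options{$_}\n" for sort keys %options;
```

Note that the value is everything between the equal sign and the first semi-colon, so the
terminating semi-colon is required even when no comment follows.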
=cut
sub ReadOptions {
# Get the parameters.
my ($fileName) = @_;
# Open the file.
(open CONFIGFILE, "<$fileName") || Confess("Could not open option file $fileName.");
# Count the number of records and comments read.
my ($records, $comments) = (0, 0);
# Create the return hash.
my %retVal = ();
# Loop through the file, accumulating key-value pairs.
while (my $line = <CONFIGFILE>) {
# Denote we've read a line.
$records++;
# Determine the line type.
if ($line =~ /^\s*[\n\r]/) {
# A blank line is a comment.
$comments++;
} elsif ($line =~ /^\s*([A-Za-z0-9_\.]+)=([^;]*);/) {
# Here we have an option assignment.
$retVal{$1} = $2;
} elsif ($line =~ /^\s*;/) {
# Here we have a text comment.
$comments++;
} else {
# Here we have an invalid line.
Trace("Invalid option statement in record $records.") if T(0);
}
}
# Close the option file.
close CONFIGFILE;
# Return the hash created.
return %retVal;
}
=head3 GetOptions
C<< Tracer::GetOptions(\%defaults, \%options); >>
Merge a specified set of options into a table of defaults. This method takes two hash references
as input and uses the data from the second to update the first. If the second does not exist,
there will be no effect. An error will be thrown if one of the entries in the second hash does not
exist in the first.
Consider the following example.
C<< my $optionTable = GetOptions({ dbType => 'mySQL', trace => 0 }, $options); >>
In this example, the variable B<$options> is expected to contain at most two options: B<dbType> and
B<trace>. The default database type is C<mySQL> and the default trace level is C<0>. If the value of
B<$options> is C<< {dbType => 'Oracle'} >>, then the database type will be changed to C<Oracle> and
the trace level will remain at 0. If B<$options> is undefined, then the database type and trace level
will remain C<mySQL> and C<0>. If, on the other hand, B<$options> is defined as
C<< {databaseType => 'Oracle'} >>
an error will occur because the B<databaseType> option does not exist.
=over 4
=item defaults
Table of default option values.
=item options
Table of overrides, if any.
=item RETURN
Returns a reference to the default table passed in as the first parameter.
=back
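The merge rule can be tried out with the standalone sketch below. The function name
C<apply_overrides> is hypothetical; it restates the rule described above so the example
runs without loading this package, and it is not the method itself.

```perl
use strict;
use warnings;
use Carp qw(croak);

# Hypothetical standalone version of the merge rule described above.
sub apply_overrides {
    my ($defaults, $options) = @_;
    for my $option (keys %{ $options || {} }) {
        croak "Unrecognized option $option encountered."
            unless exists $defaults->{$option};
        $defaults->{$option} = $options->{$option};
    }
    return $defaults;
}

my $table = apply_overrides({ dbType => 'mySQL', trace => 0 }, { dbType => 'Oracle' });
print "$table->{dbType} $table->{trace}\n";

# An unknown key is rejected rather than silently added.
my $ok = eval { apply_overrides({ dbType => 'mySQL' }, { databaseType => 'Oracle' }); 1 };
print "rejected\n" unless $ok;
```

Because the defaults table is updated in place and returned, callers that need to keep the
original defaults would have to pass in a copy.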
=cut
sub GetOptions {
# Get the parameters.
my ($defaults, $options) = @_;
# Check for overrides.
if ($options) {
# Loop through the overrides.
while (my ($option, $setting) = each %{$options}) {
# Ensure this override exists.
if (!exists $defaults->{$option}) {
croak "Unrecognized option $option encountered.";
} else {
# Apply the override.
$defaults->{$option} = $setting;
}
}
}
# Return the merged table.
return $defaults;
}
=head3 MergeOptions
C<< Tracer::MergeOptions(\%table, \%defaults); >>
Merge default values into a hash table. This method looks at the key-value pairs in the
second (default) hash, and if a matching key is not found in the first hash, the default
pair is copied in. The process is similar to L</GetOptions>, but there is no
error-checking and no return value.
=over 4
=item table
Hash table to be updated with the default values.
=item defaults
Default values to be merged into the first hash table if they are not already present.
=back
=cut
sub MergeOptions {
# Get the parameters.
my ($table, $defaults) = @_;
# Loop through the defaults.
while (my ($key, $value) = each %{$defaults}) {
if (!exists $table->{$key}) {
$table->{$key} = $value;
}
}
}
=head3 Trace
C<< Trace($message); >>
Write a trace message to the target location specified in L</TSetup>. If there has not been
any prior call to B<TSetup>, the destination defaults to C<NONE> and the message is discarded.
=over 4
=item message
Message to write.
=back
=cut
sub Trace {
# Get the parameters.
my ($message) = @_;
# Get the timestamp.
my $timeStamp = Now();
# Format the message. Note we strip off any line terminators at the end.
my $formatted = "$timeStamp <$LastCategory>: " . Strip($message);
# Process according to the destination.
if ($Destination eq "TEXT") {
# Write the message to the standard output.
print "$formatted\n";
} elsif ($Destination eq "ERROR") {
# Write the message to the error output.
print STDERR "$formatted\n";
} elsif ($Destination eq "QUEUE") {
# Push the message into the queue.
push @Queue, "$formatted";
} elsif ($Destination eq "HTML") {
# Convert the formatted message to HTML and write it to the standard output
# as a paragraph.
my $escapedMessage = CGI::escapeHTML($formatted);
print "<p>$escapedMessage</p>\n";
} elsif ($Destination eq "WARN") {
# Emit the message as a warning.
warn $message;
} elsif ($Destination =~ m/^>>/) {
# Write the trace message to an output file.
(open TRACING, $Destination) || die "Tracing open for \"$Destination\" failed: $!";
print TRACING "$formatted\n";
close TRACING;
# If the Tee flag is on, echo it to the standard output.
if ($TeeFlag) {
print "$formatted\n";
}
}
}
=head3 T
C<< my $switch = T($category, $traceLevel); >>
or
C<< my $switch = T($traceLevel); >>
Return TRUE if the trace level is at or above a specified value and the specified category
is active, else FALSE. If no category is specified, the caller's package name is used.
=over 4
=item category
Category to which the message belongs. If not specified, the caller's package name is
used.
=item traceLevel
Relevant tracing level.
=item RETURN
TRUE if a message at the specified trace level would appear in the trace, else FALSE.
=back
=cut
sub T {
# Declare the return variable.
my $retVal = 0;
# Only proceed if tracing is turned on.
if ($Destination ne "NONE") {
# Get the parameters.
my ($category, $traceLevel) = @_;
if (!defined $traceLevel) {
# Here we have no category, so we need to get the calling package.
# The calling package is normally the first parameter. If it is
# omitted, the first parameter will be the tracelevel. So, the
# first thing we do is shift the so-called category into the
# $traceLevel variable where it belongs.
$traceLevel = $category;
my ($package, $fileName, $line) = caller;
# If there is no calling package, we default to "main".
if (!$package) {
$category = "main";
} else {
$category = $package;
}
}
# Save the category name.
$LastCategory = $category;
# Convert it to lower case before we hash it.
$category = lc $category;
# Use the category and tracelevel to compute the result.
$retVal = ($traceLevel <= $TraceLevel && ($AllTrace || exists $Categories{$category}));
}
# Return the computed result.
return $retVal;
}
=head3 ParseCommand
C<< my ($options, @arguments) = Tracer::ParseCommand(\%optionTable, @inputList); >>
Parse a command line consisting of a list of parameters. The initial parameters may be option
specifiers of the form C<->I