View of /FigKernelScripts/svr_fids_for_md5.pl


Revision 1.1
Tue May 17 20:54:50 2011 UTC by parrello
Branch: MAIN
CVS Tags: mgrast_dev_08112011, mgrast_dev_06072011, mgrast_dev_08022011, rast_rel_2014_0912, myrast_rel40, rast_rel_2014_0729, mgrast_dev_05262011, mgrast_release_3_1_2, mgrast_release_3_1_1, rast_rel_2011_0928, mgrast_version_3_2, mgrast_dev_12152011, mgrast_dev_10262011, mgrast_release_3_1_0, HEAD
New method to return FIG IDs for a protein given its MD5.

use strict;

use Getopt::Long;
use ScriptThing;
use SAPserver;

#
# This is a SAS Component
#


=head1 svr_fids_for_md5

Given a set of md5 protein IDs, compute the FIG IDs of features that produce each
protein. This script takes as input a table containing md5 protein IDs and 
adds a column containing the associated FIG feature IDs.

------

Example:

    svr_fids_for_md5 < md5_file > md5_file_with_fids

------

=head2 Command-Line Options

=over 4

=item -c Column

The number of the input column that contains the md5 protein IDs. Specify
this option only if that column is not the last one; by default the last
column is used.

=back
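
The column rule above can be illustrated with a small standalone sketch. It
is not the code this script uses (that work is delegated to
ScriptThing::GetBatch, whose internals are not shown here). The helper name
id_from_line is hypothetical, and the sketch assumes column numbers are
1-based, which this page does not confirm.

```perl
use strict;
use warnings;

# Hypothetical helper: return the ID field of a tab-delimited line.
# Assumption: $column is 1-based; undef means "use the last column".
sub id_from_line {
    my ($line, $column) = @_;
    chomp $line;
    # A limit of -1 keeps trailing empty fields.
    my @fields = split /\t/, $line, -1;
    return defined $column ? $fields[$column - 1] : $fields[-1];
}

# Made-up sample line; the values are placeholders, not real IDs.
my $line = "geneX\tmd5A\tsome note\n";
print id_from_line($line, 2), "\n";   # explicit column 2
print id_from_line($line), "\n";      # no column given: last field
```

With the sample line, the first call selects the second field and the second
call falls back to the last field.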

=head2 Output Format

The standard output is a tab-delimited file. Each output line consists of
an input line with one extra column appended: the ID of a feature that
produces the specified protein. Because one protein can be produced by
several features, a single input line often yields multiple output lines,
one per feature. An input line whose md5 ID matches no feature yields no
output.
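
The one-to-many expansion described above can be sketched without a server
connection. The helper name expand_lines and the sample IDs below are made up
for illustration; the hash stands in for the result that the script obtains
from proteins_to_fids.

```perl
use strict;
use warnings;

# Hypothetical helper: expand [md5, line text] pairs into output lines,
# one per FIG ID found in the md5-to-FIDs hash.
sub expand_lines {
    my ($pairs, $fids_for) = @_;
    my @out;
    for my $pair (@$pairs) {
        my ($md5, $text) = @$pair;
        # An md5 with no known features contributes no output lines.
        my $fids = $fids_for->{$md5} || [];
        push @out, "$text\t$_" for @$fids;
    }
    return @out;
}

# Made-up sample data; these are not real md5 values or FIG IDs.
my %fids_for = (
    md5A => ['fig|1.1.peg.1', 'fig|2.1.peg.7'],
    md5B => ['fig|3.1.peg.4'],
);
my @pairs = (
    ['md5A', "row1\tmd5A"],
    ['md5B', "row2\tmd5B"],
    ['md5C', "row3\tmd5C"],
);
print "$_\n" for expand_lines(\@pairs, \%fids_for);
```

In this sample, the md5A row is emitted twice (once per feature), the md5B
row once, and the md5C row not at all.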

=cut

use SeedUtils;

my $sapObject = SAPserver->new();

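my $usage = "usage: svr_fids_for_md5 [-c column] [input_file]";

my $column;
my $rc  = GetOptions('c=i' => \$column);
# Report bad options on STDERR and exit with a failure status.
if (! $rc) { print STDERR "$usage\n"; exit 1 }
# Read from the named file, or from standard input if none is given.
my $inFile = $ARGV[0] || '-';
my $ih;
if ($inFile eq '-') {
    $ih = \*STDIN;
} else {
    open($ih, '<', $inFile) or die "Cannot open $inFile: $!\n";
}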

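# Read the input in batches of 1000 lines. Each batch entry is an
# [md5 ID, original line text] pair.
while (my @lines = ScriptThing::GetBatch($ih, 1000, $column)) {
    # Ask the SAP server for the FIG IDs behind every md5 in the batch.
    my $md5H = $sapObject->proteins_to_fids(-prots => [map { $_->[0] } @lines] );
    for my $line (@lines) {
        my ($md5, $text) = @$line;
        # Use an empty list when the server returned nothing for this md5.
        my $fids = $md5H->{$md5} || [];
        # Emit one copy of the input line per feature.
        for my $fid (@$fids) {
            print "$text\t$fid\n";
        }
    }
}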

