View of /FigKernelScripts/svr_all_features.pl

Revision 1.7
Tue Apr 26 19:56:42 2011 UTC by olson
Branch: MAIN
CVS Tags: mgrast_dev_08112011, mgrast_dev_08022011, rast_rel_2014_0912, myrast_rel40, mgrast_dev_05262011, mgrast_version_3_2, mgrast_dev_12152011, mgrast_dev_06072011, rast_rel_2014_0729, mgrast_release_3_1_2, mgrast_release_3_1_1, mgrast_release_3_1_0, rast_rel_2011_0928, mgrast_dev_10262011, HEAD
Changes since 1.6: +55 -11 lines
Deobfuscate the command line logic. Submit requests in chunks so as to not
be part of an easy denial of service attack.

use strict;
use Data::Dumper;
use Carp;

#
# This is a SAS Component
#


=head1 svr_all_features [Genome] Type

Get a list of Feature IDs for all features of a given type in a given genome
or a list of genomes.

If a genome ID is specified on the command line, no input is read. Otherwise,
this script reads a tab-delimited file from the standard input and takes the
genome ID from the last column of each line.

The output is a file of feature IDs, one ID per line.

Example:

    svr_all_features 3702.1 peg | svr_function_of

would produce a 2-column table. The first column would contain
PEG IDs for genes occurring in genome 3702.1, and the second
would contain the functions of those genes.

=head2 Command-Line Options

=over 4

=item Genome

A genome that is in the SEED. The ID must be a valid SEED genome ID of the
form /^\d+\.\d+$/ (i.e., of the form xxxx.yyy). If omitted, genome IDs are
read from the standard input.

=item Type

The type of the features sought (e.g., peg or rna).

=back

=head2 Output Format

The standard output contains one feature ID per line. Within each genome,
the IDs are sorted using SeedUtils::by_fig_id.

=cut
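
#
# Illustrative sketch, not called by this script: the Genome option above is
# documented as matching /^\d+\.\d+$/, but the script itself does not check
# it. A check along these lines could reject malformed IDs before they are
# sent to the server. The name is_valid_genome_id is hypothetical.
#
sub is_valid_genome_id
{
    my($id) = @_;
    return (defined($id) && $id =~ /^\d+\.\d+$/) ? 1 : 0;
}
# For example, is_valid_genome_id("3702.1") is true, while
# is_valid_genome_id("3702") and is_valid_genome_id("fig|3702.1") are false.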


use SeedEnv;
my $sapObject = SAPserver->new();

my $usage = "usage: svr_all_features [Genome] Type\n";

my($genome, $type);

if (@ARGV == 1)
{
    $type = shift;
}
elsif (@ARGV == 2)
{
    $genome = shift;
    $type = shift;

    process_genomes($type, [$genome]);
    exit;
}
else
{
    die $usage;
}

#
# If we get here, we are reading genomes from STDIN. Process in batches
# to make this code friendlier on the servers for large input files.
#

my $batch_size = 10;

my @genomes;

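while (<STDIN>)
{
    chomp;
    my @cols = split(/\t/);
    my $genome = $cols[-1];

    # Skip blank lines, which would otherwise yield an undefined genome ID.
    next unless defined($genome) && $genome ne '';

    push(@genomes, $genome);
    if (@genomes >= $batch_size)
    {
        process_genomes($type, \@genomes);
        @genomes = ();
    }
}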
if (@genomes)
{
    process_genomes($type, \@genomes);
}
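
#
# Illustrative sketch, not called by this script: the same chunking applied
# to a list that is already in memory. The name for_each_batch is
# hypothetical. Unlike the loop above, it needs the whole list up front, so
# the streaming loop remains the better fit for large input files.
#
sub for_each_batch
{
    my($size, $items, $callback) = @_;

    my @pending = @$items;
    while (@pending)
    {
        # splice removes up to $size items from the front of the list.
        my @batch = splice(@pending, 0, $size);
        $callback->(\@batch);
    }
}
# For example, for_each_batch(10, [1 .. 25], sub { ... }) invokes the
# callback three times, with batches of 10, 10, and 5 items.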
   
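sub process_genomes
{
    my($type, $genomes) = @_;

    # print STDERR "Request @$genomes\n";
    my $fidHash = $sapObject->all_features(-ids => $genomes, -type => $type);

    foreach my $gid (@$genomes)
    {
        # If the result has no entry for a genome, print nothing for it
        # rather than dying on an undefined array reference under strict.
        my $fids = $fidHash->{$gid} || [];
        foreach my $fid (sort { &SeedUtils::by_fig_id($a, $b) } @$fids)
        {
            print "$fid\n";
        }
    }
}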
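
#
# Illustrative sketch, not called by this script: SeedUtils::by_fig_id is
# defined elsewhere and is not shown here, so the real routine may differ.
# This sketch assumes feature IDs of the form fig|<genome>.<type>.<number>,
# e.g. fig|3702.1.peg.10, and shows why a numeric-aware comparison matters:
# a plain string sort would place peg.10 before peg.2. The name
# fig_id_cmp_sketch is hypothetical.
#
sub fig_id_cmp_sketch
{
    my($x, $y) = @_;

    my @px = $x =~ /^fig\|(\d+)\.(\d+)\.([^.]+)\.(\d+)$/;
    my @py = $y =~ /^fig\|(\d+)\.(\d+)\.([^.]+)\.(\d+)$/;

    # Fall back to a string comparison if either ID has another shape.
    return $x cmp $y unless @px && @py;

    # Compare genome (both parts), then feature type, then feature number.
    return $px[0] <=> $py[0] || $px[1] <=> $py[1]
        || $px[2] cmp $py[2] || $px[3] <=> $py[3];
}
# Usage: sort { fig_id_cmp_sketch($a, $b) } @fids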

