
View of /FigKernelScripts/svr_upstream.pl


Revision 1.8
Wed Jul 28 16:21:29 2010 UTC by parrello
Branch: MAIN
CVS Tags: mgrast_dev_08112011, mgrast_dev_08022011, myrast_rel40, mgrast_dev_05262011, mgrast_dev_04082011, rast_rel_2010_0928, mgrast_version_3_2, mgrast_dev_12152011, mgrast_dev_06072011, mgrast_dev_02212011, rast_rel_2010_1206, mgrast_release_3_0, mgrast_dev_03252011, rast_rel_2011_0119, mgrast_release_3_0_4, mgrast_release_3_0_2, mgrast_release_3_0_3, mgrast_release_3_0_1, mgrast_dev_03312011, mgrast_release_3_1_2, mgrast_release_3_1_1, mgrast_release_3_1_0, mgrast_dev_04132011, mgrast_dev_04012011, rast_rel_2010_0827, myrast_33, rast_rel_2011_0928, mgrast_dev_04052011, mgrast_dev_02222011, mgrast_dev_10262011
Changes since 1.7: +10 -3 lines
More server script stuff. Thought I'd already installed these last week...

#!/usr/bin/perl -w
use strict;

use Getopt::Long;
use SAPserver;
use ScriptThing;

#	This is a SAS Component.

=head1 svr_upstream

    svr_upstream <gene_ids.tbl >upstream_dna.fasta

Retrieve upstream regions from the Sapling Server.

This script takes as input a tab-delimited file with feature IDs at the end. For
each feature ID, the upstream DNA is computed and written to the output file in
FASTA format. Sections of DNA that occur inside a feature are shown in upper
case. DNA between known features is shown in lower case.

This is a pipe command. Input is from the standard input and output is to the
standard output.
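
=head2 Example: Selecting the ID Column

The following is a minimal sketch of how a feature ID might be picked out of
each input line: the last column by default, or a 1-based column when one is
given. It is not the actual C<ScriptThing::GetBatch> implementation. The helper
name C<pick_id> and the sample feature ID are assumptions made for illustration
only.

    use strict;
    use warnings;

    # Hypothetical helper: return the feature ID from one tab-delimited line.
    # A column of 0 means "use the last column"; otherwise columns count from 1.
    sub pick_id {
        my ($line, $column) = @_;
        chomp $line;
        # The -1 limit keeps trailing empty fields instead of dropping them.
        my @fields = split /\t/, $line, -1;
        return $column ? $fields[$column - 1] : $fields[-1];
    }

    print pick_id("yeast\tfig|4932.3.peg.1\n", 0), "\n";
    print pick_id("fig|4932.3.peg.1\tyeast\n", 1), "\n";

Both calls print the same ID, because the first reads the last column and the
second reads column 1.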

=head2 Command-Line Options

=over 4

=item skipGene

If specified, then only the upstream region is output. Otherwise, the upstream
region and the feature interior are output together.

=item size

Number of base pairs to show in the upstream region. The default is C<200>.

=item url

The URL for the Sapling server, if it is to be different from the default.

=item c

Column index. If specified, indicates that the input IDs should be taken from the
indicated column instead of the last column. The first column is column 1.

=back

=cut

# Parse the command-line options.
my $skipGene = '';
my $size = 200;
my $url = '';
my $column = 0;
my $opted =  GetOptions('skipGene' => \$skipGene, 'size=i' => \$size, 'url=s' => \$url,
                        'c=i' => \$column);
if (! $opted) {
    print STDERR "usage: svr_upstream [--skipGene] [--c=N] [--size=200] [--url=http://...] <input >output\n";
} else {
    # Fix STDIN.
    # Get the server object.
    my $sapServer = SAPserver->new(url => $url);
    # The main loop processes chunks of input, 1000 lines at a time.
    while (my @tuples = ScriptThing::GetBatch(\*STDIN, undef, $column)) {
        # Compute the comment strings.
        my %comments = ScriptThing::CommentHash(\@tuples);
        # Ask the server for results.
        my $document = $sapServer->upstream(-ids => [ map { $_->[0] } @tuples],
                                            -size => $size,
                                            -fasta => 1,
                                            -comments => \%comments,
                                            -skipGene => $skipGene);
        # Loop through the tuples, producing output.
        for my $tuple (@tuples) {
            # Get the feature ID.
            my $fid = $tuple->[0];
            # Get this feature's FASTA.
            my $fasta = $document->{$fid};
            # Did we get something?
            if (! $fasta) {
                # No. Write an error notification.
                print STDERR "Not found: $fid\n";
            } else {
                # Yes. Output the FASTA.
                print $fasta;
            }
        }
    }
}
