
View of /FigKernelScripts/svr_by_taxonomy.pl



Revision 1.3
Wed Oct 6 18:34:55 2010 UTC (9 years, 1 month ago) by overbeek
Branch: MAIN
CVS Tags: mgrast_dev_08112011, mgrast_dev_08022011, rast_rel_2014_0912, myrast_rel40, mgrast_dev_05262011, mgrast_dev_04082011, mgrast_version_3_2, mgrast_dev_12152011, mgrast_dev_06072011, rast_rel_2014_0729, mgrast_dev_02212011, rast_rel_2010_1206, mgrast_release_3_0, mgrast_dev_03252011, rast_rel_2011_0119, mgrast_release_3_0_4, mgrast_release_3_0_2, mgrast_release_3_0_3, mgrast_release_3_0_1, mgrast_dev_03312011, mgrast_release_3_1_2, mgrast_release_3_1_1, mgrast_release_3_1_0, mgrast_dev_04132011, mgrast_dev_04012011, myrast_33, rast_rel_2011_0928, mgrast_dev_04052011, mgrast_dev_02222011, mgrast_dev_10262011, HEAD
Changes since 1.2: +3 -6 lines
fix parameter passing

use strict;
use Data::Dumper;
use Carp;

#
# This is a SAS Component
#


=head1 svr_by_taxonomy

Separate a list by taxonomy

------

Example:

    svr_by_taxonomy taxonomy < file.of.genome.taxonomies > has.taxonomy 2> does.not

would split the incoming file into the lines that contain the given taxonomy
and the lines that do not.

------

The standard input should be a tab-separated table (i.e., each line
is a tab-separated set of fields).  Normally, the last field in each
line contains the taxonomy being tested.  When no column is given, the
number of fields in the first input line determines which column is used.
If some other column contains the taxonomy, use

    -c N

where N is the column (from 1) that contains the taxonomy in each case.

This is a pipe command. The input is taken from the standard input, and the
output is to the standard output.

=head2 Command-Line Options

=over 4

=item -c Column

This is used only if the column containing the taxonomy is not the last.

=back

=head2 Output Format

The standard output is an echo of the lines in the incoming file that have the given taxonomy.
A line is written there only if the given taxonomy is an exact, case-insensitive match to one
of the components of the line's taxonomy field (components are separated by "; ").
The standard error file is an echo of the lines in the incoming file that do not have the given taxonomy.

=cut

use SeedUtils;
use SAPserver;
my $sapObject = SAPserver->new();

my $usage = "usage: svr_by_taxonomy [-c column] taxonomy < input > matches 2> non_matches\n";
use Getopt::Long;

my $column;
my $rc  = GetOptions('c=i' => \$column);
if (! $rc) { print STDERR $usage; exit 1 }

# The taxonomy to test for is the first remaining argument.
my $tax = shift @ARGV;
if (!$tax) { die "No taxonomy specified\n$usage" }


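# Read the whole table; each line becomes an array of its tab-separated fields.
my @lines = map { chomp; [split(/\t/,$_)] } <STDIN>;
# Default to the last column, counted on the first line (skipped if the input is empty).
if (! $column && @lines)  { $column = @{$lines[0]} }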

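foreach my $line (@lines)
{
    # Taxonomy components are separated by "; ".
    my @tax = split("; ", $line->[$column-1]);
    # \Q...\E quotes regex metacharacters so the match is exact (case-insensitive).
    if (grep(/^\Q$tax\E$/i, @tax)) {
        print join("\t",@$line), "\n";
    } else {
        print STDERR join("\t",@$line), "\n";
    }
}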
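
The matching rule described under Output Format can be tried without the SEED modules. The sketch below is a standalone illustration, not part of svr_by_taxonomy.pl: the helper name split_by_taxonomy and the sample rows are made up, and when no column is given it takes the last field of each line rather than the width of the first line.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of svr_by_taxonomy.pl): partition tab-separated
# lines by whether the taxonomy field has a component equal to $tax.
# $column is 1-based; when undefined, the last field of each line is used.
sub split_by_taxonomy {
    my ($tax, $column, @lines) = @_;
    my (@with, @without);
    foreach my $line (@lines) {
        my @fields = split /\t/, $line;
        my $field  = defined $column ? $fields[$column - 1] : $fields[-1];
        # Taxonomy components are separated by "; ".
        my @parts  = defined $field ? split(/; /, $field) : ();
        # Exact, case-insensitive comparison of whole components.
        if (grep { lc($_) eq lc($tax) } @parts) {
            push @with, $line;
        } else {
            push @without, $line;
        }
    }
    return (\@with, \@without);
}

# Made-up sample rows: an identifier, then a taxonomy string.
my @rows = (
    "genomeA\tBacteria; Proteobacteria; Gammaproteobacteria",
    "genomeB\tArchaea; Euryarchaeota",
);

my ($with, $without) = split_by_taxonomy('proteobacteria', undef, @rows);
print "match:    $_\n" for @$with;
print "no match: $_\n" for @$without;
```

Comparing lowercased strings with eq avoids any regex quoting concerns. A component such as "Gammaproteobacteria" does not count as a match for "proteobacteria", because only whole components are compared.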
