View of /Sprout/ContigDNA.pm

Revision 1.1
Sat Sep 10 07:40:17 2005 UTC by parrello
Branch: MAIN
Created to enable location testing for contigs not yet in the database.

#!/usr/bin/perl -w

package ContigDNA;

    use strict;
    use Tracer;
    use FIG;
    use Genome;

=head1 Contig DNA manipulation object

=head2 Introduction

This package provides a pseudo-FIG object for use by the B<FullLocation> object. It
supports only the L</contig_ln> and L</dna_seq> methods.

Note that it keeps all the data structures in memory, so it should not be used except for
a few genomes with small number of gen

=cut

=head2 Public Methods

=head3 new

C<< my $cdna = ContigDNA->new($genID1 => $file1, $genID2 => $file2, ... $genIDN => $fileN); >>

Construct a new pseudo-FIG object from a set of FASTA files. The incoming parameters come
in pairs. The first of each pair is the ID of a genome; the second is the name of a FASTA
file containing the contig data.
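
The pairing convention can be illustrated with a minimal, self-contained sketch.
The C<PairDemo> package, the genome IDs, and the file names below are hypothetical
stand-ins: the real constructor hands each file to C<< Genome->new >>, whereas this
sketch only records the file name, so it runs without the Sprout libraries.

```perl
    use strict;
    use warnings;

    # Hypothetical stand-in for ContigDNA. It keeps the file name under each
    # genome ID instead of building a Genome object from the FASTA file.
    package PairDemo;

    sub new {
        # The ID/file pairs are slurped straight into a hash.
        my ($class, %files) = @_;
        my $self = { };
        for my $genID (keys %files) {
            $self->{$genID} = $files{$genID};
        }
        return bless $self, $class;
    }

    package main;

    # Hypothetical genome IDs and FASTA file names.
    my $demo = PairDemo->new('100.1' => 'first.fasta', '200.1' => 'second.fasta');
    for my $genID (sort keys %$demo) {
        print "$genID => $demo->{$genID}\n";
    }
```

Because the pairs are read into a hash, passing the same genome ID twice keeps only
the last file given for it.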

=over 4

=item genID1, genID2, ... genIDN

IDs of the genomes whose contigs are to be read into memory.

=item file1, file2, ... fileN

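Names of the FASTA files containing the genome data.

=item RETURN

Returns a new B<ContigDNA> object holding one B<Genome> object for each genome ID.

=back

=cut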

sub new {
    # Get the parameters.
    my ($class, %files) = @_;
    # Create the object.
    my $retVal = { };
    # Loop through the genomes.
    for my $genID (keys %files) {
        # Get the name of this genome's FASTA file.
        my $file = $files{$genID};
        # Read it in using the Genome object.
        my $genomeData = Genome->new($genID, $file, "");
        # Put it in the hash.
        $retVal->{$genID} = $genomeData;
    }
    # Bless and return it.
    bless $retVal, $class;
    return $retVal;
}

=head3 dna_seq

C<< my $sequence = $cdna->dna_seq($genomeID, @locations); >>

Return the sequence represented by a list of locations. The locations
should be in the standard form I<contigID>C<_>I<begin>I<dir>I<end>.

=over 4

=item genomeID

ID of the relevant genome.

=item location1, location2, ... locationN

List of locations to be included in the DNA sequence.

=item RETURN

Returns a string specifying the DNA nucleotides in the specified locations.

=back
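
The concatenation behavior can be sketched without the Sprout libraries. In the
example below, C<StubGenome> and C<concat_locations> are hypothetical names, and the
stub's C<GetLocation> simply looks up canned strings: it treats each location as an
opaque key and does not parse the real location format.

```perl
    use strict;
    use warnings;

    # Hypothetical stub standing in for the Genome object. It returns canned
    # DNA for each location key and does no real location parsing.
    package StubGenome;

    sub new {
        my ($class, %dna) = @_;
        return bless { %dna }, $class;
    }

    sub GetLocation {
        my ($self, $location) = @_;
        return $self->{$location};
    }

    package main;

    # Same loop shape as dna_seq: append the DNA for each location in order.
    sub concat_locations {
        my ($genome, @locations) = @_;
        my $retVal = "";
        for my $location (@locations) {
            $retVal .= $genome->GetLocation($location);
        }
        return $retVal;
    }

    my $stub = StubGenome->new(locA => "ATGC", locB => "GGTA");
    my $sequence = concat_locations($stub, "locA", "locB");
    print "$sequence\n";    # prints ATGCGGTA
```

The pieces are joined in the order the locations are passed, so reversing the list
changes the result.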

=cut
#: Return Type $;
sub dna_seq {
    # Get the parameters.
    my ($self, $genomeID, @locations) = @_;
    # Get the Genome object.
    my $genome = $self->{$genomeID};
    # Start the return string.
    my $retVal = "";
    # Loop through the locations.
    for my $location (@locations) {
        # Get the DNA.
        my $dna = $genome->GetLocation($location);
        # Add it to the return string.
        $retVal .= $dna;
    }
    # Return the result.
    return $retVal;
}

=head3 contig_ln

C<< my $length = $cdna->contig_ln($genomeID, $contigID); >>

Return the length of the specified contig.

=over 4

=item genomeID

ID of the genome to which the contig belongs.

=item contigID

ID of the contig whose length is desired.

=item RETURN

Returns the length (in base pairs) of the specified contig in the specified
genome.

=back
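
Each genome is stored under its ID in the object's hash, so an ID that was never
passed to C<new> has no entry, and calling a method on the resulting undefined value
is a fatal error in Perl. A caller that cannot guarantee the genome was loaded might
guard the lookup along these lines. This is only a sketch: C<StubGenome> and
C<checked_contig_ln> are hypothetical names, and the stub returns a canned length
rather than reading a FASTA file.

```perl
    use strict;
    use warnings;
    use Carp;

    # Hypothetical stub standing in for the Genome object.
    package StubGenome;

    sub new {
        my ($class, %lengths) = @_;
        return bless { %lengths }, $class;
    }

    sub ContigLen {
        my ($self, $contigID) = @_;
        return $self->{$contigID};
    }

    package main;

    # Hypothetical guarded lookup: fail with a clear message when the genome
    # was never loaded, instead of calling a method on undef.
    sub checked_contig_ln {
        my ($cdna, $genomeID, $contigID) = @_;
        my $genome = $cdna->{$genomeID};
        croak "Genome $genomeID was not loaded" unless defined $genome;
        return $genome->ContigLen($contigID);
    }

    # A plain hash keyed by genome ID mimics the object's internal layout.
    my $cdna = { 'g1' => StubGenome->new(contigA => 1200) };
    my $length = checked_contig_ln($cdna, 'g1', 'contigA');
    my $ok = eval { checked_contig_ln($cdna, 'g2', 'contigA'); 1 };
    print "length: $length\n";
    print "missing genome rejected\n" unless $ok;
```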

=cut
#: Return Type $;
sub contig_ln {
    # Get the parameters.
    my ($self, $genomeID, $contigID) = @_;
    # Get the Genome object.
    my $genome = $self->{$genomeID};
    # Ask it for the contig's length.
    return $genome->ContigLen($contigID);
}

1;

