[Bio] / FigKernelPackages / BasicLocation.pm Repository:
ViewVC logotype

Annotation of /FigKernelPackages/BasicLocation.pm

Parent Directory Parent Directory | Revision Log Revision Log


Revision 1.10 - (view) (download) (as text)

1 : olson 1.5 #
2 :     # Copyright (c) 2003-2006 University of Chicago and Fellowship
3 :     # for Interpretations of Genomes. All Rights Reserved.
4 :     #
5 :     # This file is part of the SEED Toolkit.
6 :     #
7 :     # The SEED Toolkit is free software. You can redistribute
8 :     # it and/or modify it under the terms of the SEED Toolkit
9 :     # Public License.
10 :     #
11 :     # You should have received a copy of the SEED Toolkit Public License
12 :     # along with this program; if not write to the University of Chicago
13 :     # at info@ci.uchicago.edu or the Fellowship for Interpretation of
14 :     # Genomes at veronika@thefig.info or download a copy from
15 :     # http://www.theseed.org/LICENSE.TXT.
16 :     #
17 :    
18 : parrello 1.1 package BasicLocation;
19 :    
20 :     use strict;
21 :     use Tracer;
22 :    
23 :     =head1 Basic Location Object
24 :    
25 :     =head2 Introduction
26 :    
27 :     A I<basic location> defines a region of a contig. Because of the way genetic features
28 :     work, a basic location has a direction associated with it. The traditional method for encoding
29 :     a basic location is to specify the contig ID, the starting offset, and the ending offset. If the
30 :     start is before the end, we have a forward basic location, and if the start is after the end, we
31 :     have a backward basic location.
32 :    
33 :     =over 4
34 :    
35 :     =item C<RED_1_400>
36 :    
37 :     is the first 400 nucleotides of the RED contig, processed forward.
38 :    
39 :     =item C<NC_000913_500_401>
40 :    
41 :     is a 100-nucleotide region in NC_000913 that is processed backward from the 500th nucleotide.
42 :    
43 :     =back
44 :    
45 :     Note that even though they are called "offsets", location indices are 1-based.
46 :     Note also that the possibility of an underscore in the contig ID makes the parsing a little
47 :     tricky.
48 :    
49 :     The Sprout uses a slightly different format designed to allow for the possibility of
50 :     zero-length regions. Instead of a starting and ending position, we specify the start position,
51 :     the direction (C<+> or C<->), and the length. The Sprout versions of the two example locations
52 :     above are
53 :    
54 :     =over 4
55 :    
56 :     =item C<RED_1+400>
57 :    
58 :     corresponds to C<RED_1_400>
59 :    
60 :     =item C<NC_000913_500-100>
61 :    
62 :     corresponds to C<NC_000913_500_401>
63 :    
64 :     =back
65 :    
66 :     Working with the raw location string is difficult because it can have one of two formats
67 :     and it is constantly necessary to ask if the location is forward or backward. The basic location
68 :     object seeks to resolve these differences by providing a single interface that can be
69 :     used regardless of the format or direction.
70 :    
71 :     It is frequently useful to keep additional data about a basic location while it is being passed
72 :     around. The basic location object is a PERL hash, and this additional data is kept in the object
73 :     by adding hash keys. The internal values used by the object have keys preceded by an
74 :     underscore, so any keys not beginning with underscores are considered to be additional
75 :     values. The additional values are called I<augments>.
76 :    
77 :     When a basic location is in its string form, the augments can be tacked on using parentheses
78 :     enclosing a comma-delimited list of assignments. For example, say we want to describe
79 :     the first 400 base pairs in the contig B<RED>, and include the fact that it is the second
80 :     segment of feature B<fig|12345.1.peg.2>. We could use the key C<fid> for the feature ID and
81 :     C<idx> for the segment index (0-based), in which case the location string would be
82 :    
83 :     RED_1+400(fid=fig|12345.1.peg.2,idx=1)
84 :    
85 :     When this location string is converted to a location object in the variable C<$loc>, we
86 :     would have
87 :    
88 :     $loc->{fig} eq 'fig|12345.1.peg.2'
89 :     $loc->{idx} == 1
90 :    
91 :     Spaces can be added for readability. The above augmented location string can also be
92 :     coded as
93 :    
94 :     RED_1+400(fid = fig|12345.1.peg.2, idx = 1)
95 :    
96 :     A basic location is frequently part of a full location. Full locations are described by the
97 :     B<FullLocation> object. A full location is a list of basic locations associated with a genome
98 :     and a FIG-like object. If the parent full location is known, we can access the basic location's
99 :     raw DNA. To construct a basic location that is part of a full location, we add the parent full
100 :     location and the basic location's index to the constructor. In the constructor below,
101 :     C<$parent> points to the parent full location.
102 :    
103 :     my $secondLocation = BasicLocation->new("RED_450+100", $parent, 1);
104 :    
105 :     =cut
106 :    
107 :     #: Constructor BasicLocation->new('NC_000913_499_400');
108 :    
109 :     =head2 Public Methods
110 :    
111 :     =head3 new
112 :    
113 :     C<< my $loc = BasicLocation->new($locString, $parentLocation, $idx); >>
114 :    
115 :     Construct a basic location from a location string. A location string has the form
116 :     I<contigID>C<_>I<begin>I<dir>I<len> where I<begin> is the starting position,
117 :     I<dir> is C<+> for a forward transcription or C<-> for a backward transcription,
118 :     and I<len> is the length. So, for example, C<1999.1_NC123_4000+200> describes a
119 :     location beginning at position 4000 of contig C<1999.1_NC123> and ending at
120 :     position 4199. Similarly, C<1999.1_NC123_2000-400> describes a location in the
121 :     same contig starting at position 2000 and ending at position 1601.
122 :    
123 :     Augments can be specified as part of the location string using parentheses and
124 :     a comma-delimited list of assignments. For example, the following constructor
125 :     creates a location augmented by a feature ID called C<fid> and an index value
126 :     called C<idx>.
127 :    
128 :     my $loc = BasicLocation->new("NC_000913_499_400(fid = fig|83333.1.peg.10, idx => 2)");
129 :    
130 :     All fields internal to the location object have names beginning with an
131 :     underscore (C<_>), so as long as the value name begins with a letter,
132 :     there should be no conflict.
133 :    
134 :     =over 4
135 :    
136 :     =item locString
137 :    
138 :     Location string, as described above.
139 :    
140 :     =item parentLocation (optional)
141 :    
142 :     Full location that this basic location is part of (if any).
143 :    
144 :     =item idx (optional)
145 :    
146 :     Index of this basic location in the parent full location.
147 :    
148 :     =back
149 :    
150 :     C<< my $loc = BasicLocation->new($location, $contigID); >>
151 :    
152 :     Construct a location by copying another location and plugging in a new contig ID.
153 :    
154 :     =over 4
155 :    
156 :     =item location
157 :    
158 :     Location whose data is to be copied.
159 :    
160 :     =item contigID (optional)
161 :    
162 :     ID of the new contig to be plugged in.
163 :    
164 :     =back
165 :    
166 :     C<< my $loc = BasicLocation->new($contigID, $beg, $dir, $len, $augments, $parentLocation, $idx); >>
167 :    
168 :     Construct a location from specific data elements, in particular the contig ID, the starting
169 :     offset, the direction, and the length.
170 :    
171 :     =over 4
172 :    
173 :     =item contigID
174 :    
175 :     ID of the contig on which the location occurs.
176 :    
177 :     =item beg
178 :    
179 :     Starting offset of the location.
180 :    
181 :     =item dir
182 :    
183 :     Direction of the location: C<+> for a forward location and C<-> for a backward location. If
184 :     C<_> is specified instead, it will be presumed that the fourth argument is an endpoint and not
185 :     a length.
186 :    
187 :     =item len
188 :    
189 :     Length of the location. If the direction is an underscore (C<_>), it will be the endpoint
190 :     instead of the length.
191 :    
192 :     =item augments (optional)
193 :    
194 :     Reference to a hash containing any augment values for the location.
195 :    
196 :     =item parentLocation (optional)
197 :    
198 :     Full location that this basic location is part of (if any).
199 :    
200 :     =item idx (optional)
201 :    
202 :     Index of this basic location in the parent full location.
203 :    
204 :     =back
205 :    
206 :     =cut
207 :    
208 :     sub new {
209 :     # Get the parameters.
210 :     my ($class, @p) = @_;
211 :     # Declare the data variables.
212 :     my ($contigID, $beg, $dir, $len, $end, $parent, $idx, $augments, $augmentString);
213 :     # Determine the signature type.
214 :     if (@p >= 4) {
215 :     # Here we have specific incoming data.
216 :     ($contigID, $beg, $dir, $len, $augments, $parent, $idx) = @p;
217 :     } elsif (UNIVERSAL::isa($p[0],__PACKAGE__)) {
218 :     # Here we have a source location and possibly a new contig ID.
219 :     $contigID = (defined $p[1] ? $p[1] : $p[0]->{_contigID});
220 :     ($beg, $dir, $len) = ($p[0]->{_beg}, $p[0]->{_dir}, $p[0]->{_len});
221 :     if (exists $p[0]->{_parent}) {
222 :     ($parent, $idx) = ($p[0]->{_parent}, $p[0]->{_idx});
223 :     }
224 :     # Get the augments (if any) from the source location. We want these
225 :     # copied to the new location.
226 :     $augments = { };
227 :     for my $key (keys %{$p[0]}) {
228 :     if (substr($key, 0, 1) ne '_') {
229 :     $augments->{$key} = $p[0]->{$key};
230 :     }
231 :     }
232 :     } else {
233 :     # Here we have a source string and possibly augments. We first parse
234 :     # the source string.
235 :     $p[0] =~ /^(.+)_(\d+)(\+|\-|_)(\d+)($|\(.*\)$)/;
236 :     ($contigID, $beg, $dir, $len, $augmentString) = ($1, $2, $3, $4, $5);
237 :     # Check for augments.
238 :     if ($augmentString) {
239 :     # Here we have an augment string. First, we strip the enclosing
240 :     # parentheses.
241 :     $augmentString = substr $augmentString, 1, length($augmentString) - 2;
242 :     # Now we parse out the assignments and put them in a hash.
243 :     my %augmentHash = map { split /\s*,\s*/, $_ } split /\s*=\s*/, $augmentString;
244 :     $augments = \%augmentHash;
245 :     }
246 :     # Pull in the parent location and index, if applicable.
247 :     ($parent, $idx) = ($p[1], $p[2]);
248 :     }
249 :     # Determine the format.
250 :     if ($dir eq '_') {
251 :     # Here we have the old format. The endpoint was parsed as the length.
252 :     $end = $len;
253 :     # Compare the start and end to get the direction and compute the true length.
254 :     if ($beg > $end) {
255 :     ($dir, $len) = ('-', $beg - $end + 1);
256 :     } else {
257 :     ($dir, $len) = ('+', $end - $beg + 1);
258 :     }
259 :     } else {
260 :     # Here we have the new format. We compute the endpoint
261 :     # from the direction.
262 :     $end = (($dir eq '+') ? $beg + $len - 1 : $beg - $len + 1);
263 :     }
264 :     # Create the return structure.
265 :     my $retVal = { _contigID => $contigID, _beg => $beg, _dir => $dir,
266 :     _end => $end, _len => $len, _parent => $parent,
267 :     _idx => $idx };
268 :     # Add the augments.
269 :     if ($augments) {
270 :     for my $key (keys %{$augments}) {
271 :     $retVal->{$key} = $augments->{$key};
272 :     }
273 :     }
274 :     # Bless the location with the appropriate package name.
275 :     if ($dir eq '+') {
276 :     bless $retVal, "FBasicLocation";
277 :     } else {
278 :     bless $retVal, "BBasicLocation";
279 :     }
280 :     # Return the blessed object.
281 :     return $retVal;
282 :     }
283 :    
284 :     =head3 Contig
285 :    
286 :     C<< my $contigID = $loc->Contig; >>
287 :    
288 :     Return the location's contig ID.
289 :    
290 :     =cut
291 :     #: Return Type $;
292 :     sub Contig {
293 :     return $_[0]->{_contigID};
294 :     }
295 :    
296 :     =head3 Begin
297 :    
298 :     C<< my $beg = $loc->Begin; >>
299 :    
300 :     Return the location's starting offset.
301 :    
302 :     =cut
303 :     #: Return Type $;
304 :     sub Begin {
305 :     return $_[0]->{_beg};
306 :     }
307 :    
308 :     =head3 Dir
309 :    
310 :     C<< my $dirChar = $loc->Dir; >>
311 :    
312 :     Return the location's direction: C<+> for a forward location and C<-> for a backward one.
313 :    
314 :     =cut
315 :     #: Return Type $;
316 :     sub Dir {
317 :     return $_[0]->{_dir};
318 :     }
319 :    
320 :     =head3 Length
321 :    
322 :     C<< my $len = $loc->Length; >>
323 :    
324 :     Return the location's length (in nucleotides).
325 :    
326 :     =cut
327 :     #: Return Type $;
328 :     sub Length {
329 :     return $_[0]->{_len};
330 :     }
331 :    
332 :     =head3 EndPoint
333 :    
334 :     C<< my $offset = $loc->EndPoint; >>
335 :    
336 :     Return the location's ending offset.
337 :    
338 :     =cut
339 :     #: Return Type $;
340 :     sub EndPoint {
341 :     return $_[0]->{_end};
342 :     }
343 :    
344 :     =head3 Parent
345 :    
346 :     C<< my $parentLocation = $loc->Parent; >>
347 :    
348 :     Return the full location containing this basic location (if any).
349 :    
350 :     =cut
351 :     #: Return Type %;
352 :     sub Parent {
353 :     return $_[0]->{_parent};
354 :     }
355 :    
356 :     =head3 Index
357 :    
358 :     C<< my $idx = $loc->Index; >>
359 :    
360 :     Return the index of this basic location inside the parent location (if any).
361 :    
362 :     =cut
363 :     #: Return Type $;
364 :     sub Index {
365 :     return $_[0]->{_idx};
366 :     }
367 :    
368 :     =head3 String
369 :    
370 :     C<< my $string = $loc->String; >>
371 :    
372 :     Return a Sprout-format string representation of this location.
373 :    
374 :     =cut
375 :     #: Return Type $;
376 :     sub String {
377 :     my ($self) = @_;
378 :     return $self->{_contigID} . "_" . $self->{_beg} . $self->{_dir} . $self->{_len};
379 :     }
380 :    
381 :     =head3 SeedString
382 :    
383 :     C<< my $string = $loc->SeedString; >>
384 :    
385 :     Return a SEED-format string representation of this location.
386 :    
387 :     =cut
388 :     #: Return Type $;
389 :     sub SeedString {
390 :     my ($self) = @_;
391 :     return $self->{_contigID} . "_" . $self->{_beg} . "_" . $self->{_end};
392 :     }
393 :    
394 :     =head3 AugmentString
395 :    
396 :     C<< my $string = $loc->AugmentString; >>
397 :    
398 :     Return a Sprout-format string representation of this location with augment data
399 :     included. The augment data will be appended as a comma-delimited list of assignments
400 :     enclosed in parentheses, the exact format expected by the single-argument location object
401 :     constructor L</new>.
402 :    
403 :     =cut
404 :     #: Return Type $;
405 :     sub AugmentString {
406 :     # Get this instance.
407 :     my ($self) = @_;
408 :     # Get the pure location string.
409 :     my $retVal = $self->String;
410 :     # Create the augment string. We build it from all the key-value pairs in the hash
411 :     # for which the key does not being with an underscore.
412 :     my @augmentStrings = ();
413 :     for my $key (sort keys %{$self}) {
414 :     if (substr($key,0,1) ne "_") {
415 :     push @augmentStrings, "$key = " . $self->{$key};
416 :     }
417 :     }
418 :     # If any augments were found, we concatenate them to the result string.
419 :     if (@augmentStrings > 0) {
420 :     $retVal .= "(" . join(", ", @augmentStrings) . ")";
421 :     }
422 :     # Return the result.
423 :     return $retVal;
424 :     }
425 :    
426 :     =head3 IfValid
427 :    
428 :     C<< my $distance = IfValid($distance); >>
429 :    
430 :     Return a distance if it is a valid offset inside this location, and an undefined value otherwise.
431 :    
432 :     =over 4
433 :    
434 :     =item distance
435 :    
436 :     Relevant distance inside this location.
437 :    
438 :     =item RETURN
439 :    
440 :     Returns the incoming distance if it is non-negative and less than the location length, and an
441 :     undefined value otherwise.
442 :    
443 :     =back
444 :    
445 :     =cut
446 :     #: Return Type $;
447 :     sub IfValid {
448 :     # Get the parameters.
449 :     my ($self, $distance) = @_;
450 :     # Return the appropriate result.
451 :     return (($distance >= 0 && $distance < $self->{_len}) ? $distance : undef);
452 :     }
453 :    
454 :     =head3 Cmp
455 :    
456 :     C<< my $compare = BasicLocation::Cmp($a, $b); >>
457 :    
458 :     Compare two locations.
459 :    
460 :     The ordering principle for locations is that they are sorted first by contig ID, then by
461 :     leftmost position, in reverse order by length, and then by direction. The effect is that
462 :     within a contig, the locations are ordered first and foremost in the way they would
463 :     appear when displayed in a picture of the contig and second in such a way that embedded
464 :     locations come after the locations in which they are embedded. In the case of two
465 : parrello 1.3 locations that represent the exact same base pairs, the forward (C<+>) location is
466 :     arbitrarily placed first.
467 : parrello 1.1
468 :     =over 4
469 :    
470 :     =item a
471 :    
472 :     First location to compare.
473 :    
474 :     =item b
475 :    
476 :     Second location to compare.
477 :    
478 :     =item RETURN
479 :    
480 :     Returns a negative number if the B<a> location sorts first, a positive number if the
481 :     B<b> location sorts first, and zero if the two locations are the same.
482 :    
483 :     =back
484 :    
485 :     =cut
486 :     #: Return Type $;
487 :     sub Cmp {
488 :     # Get the parameters.
489 :     my ($a, $b) = @_;
490 :     # Compare the locations.
491 :     my $retVal = ($a->Contig cmp $b->Contig);
492 :     if ($retVal == 0) {
493 :     $retVal = ($a->Left <=> $b->Left);
494 :     if ($retVal == 0) {
495 :     $retVal = ($b->Length <=> $a->Length);
496 :     if ($retVal == 0) {
497 :     $retVal = ($a->Begin <=> $b->Begin);
498 :     }
499 :     }
500 :     }
501 :     # Return the result.
502 :     return $retVal;
503 :     }
504 :    
505 :     =head3 Matches
506 :    
507 :     C<< my $flag = BasicLocation::Matches($locA, $locB); >>
508 :    
509 :     Return TRUE if the two locations contain the same data, else FALSE. Augment data is included
510 :     in the comparison.
511 :    
512 :     =over 4
513 :    
514 :     =item locA, locB
515 :    
516 :     Locations to compare.
517 :    
518 :     =item RETURN
519 :    
520 :     Returns TRUE if the two locations contain the same data, else FALSE.
521 :    
522 :     =back
523 :    
524 :     =cut
525 :     #: Return Type $;
526 :     sub Matches {
527 :     # Get the parameters.
528 :     my ($locA, $locB) = @_;
529 :     # Declare the return variable.
530 :     my $retVal = 0;
531 :     # Verify that the major data items are the same.
532 :     if ($locA->Contig eq $locB->Contig && $locA->Begin eq $locB->Begin &&
533 :     $locA->Dir eq $locB->Dir && $locA->Length == $locB->Length) {
534 :     # Here the locations are the same, so we need to check augment data.
535 :     # First, we loop through all the augment keys in the A location.
536 :     my @aKeys = grep { /^[^_]/ } keys %{$locA};
537 :     # Assume we have a match until we find a mis-match.
538 :     $retVal = 1;
539 :     for (my $i = 0; $i <= $#aKeys && $retVal; $i++) {
540 :     my $aKey = $aKeys[$i];
541 :     $retVal = ((exists $locB->{$aKey}) && ($locA->{$aKey} eq $locB->{$aKey}));
542 :     }
543 :     # If we're still matching, verify that B doesn't have any
544 :     # keys not in A.
545 :     my @bKeys = keys %{$locB};
546 :     for (my $i = 0; $i <= $#bKeys && $retVal; $i++) {
547 :     $retVal = exists $locA->{$bKeys[$i]};
548 :     }
549 :     }
550 :     # Return the result.
551 :     return $retVal;
552 :     }
553 :    
554 :     =head3 Attach
555 :    
556 :     C<< my = $loc->Attach($parent, $idx); >>
557 :    
558 :     Point this basic location to a parent full location. The basic location will B<not> be
559 :     inserted into the full location's data structures.
560 :    
561 :     =over 4
562 :    
563 :     =item parent
564 :    
565 :     Parent full location to which this location should be attached.
566 :    
567 :     =item idx
568 :    
569 :     Index of this location in the full location.
570 :    
571 :     =back
572 :    
573 :     =cut
574 :     #: Return Type ;
575 :     sub Attach {
576 :     # Get the parameters.
577 :     my ($self, $parent, $idx) = @_;
578 :     # Save the parent location and index in our data structures.
579 :     $self->{_idx} = $idx;
580 :     $self->{_parent} = $parent;
581 :     }
582 :    
583 : parrello 1.7 =head3 FixContig
584 :    
585 :     C<< $loc->FixContig($genomeID); >>
586 :    
587 :     Insure the genome ID is included in the Contig string. Some portions of the system
588 :     store the contig ID in the form I<genome>C<:>I<contig>, while some use only the contig ID.
589 :     If this location's contig ID includes a genome ID, nothing will happen, but if it does
590 :     note, the caller-specified genome ID will be prefixed to the contig string.
591 :    
592 :     =over 4
593 :    
594 :     =item genomeID
595 :    
596 :     ID of the genome for this location's contig.
597 :    
598 :     =back
599 :    
600 :     =cut
601 :     #: Return Type ;
602 :     sub FixContig {
603 :     # Get the parameters.
604 :     my ($self, $genomeID) = @_;
605 :     # Check the contig string for the presence of a genome ID.
606 :     my $contigID = $self->{_contigID};
607 :     if ($contigID !~ /:/) {
608 :     # There's no colon, so we have to prefix the genome ID.
609 :     $self->{_contigID} = "$genomeID:$contigID";
610 :     }
611 :     }
612 :    
613 : parrello 1.8 =head3 Parse
614 :    
615 :     C<< my ($contig, $beg, $end) = BasicLocation::Parse($locString); >>
616 :    
617 :     Parse a location string and return the contig ID, start position, and end position.
618 :    
619 :     =over 4
620 :    
621 :     =item locString
622 :    
623 :     Location string to parse. It may be either Sprout-style or SEED-style.
624 :    
625 :     =item RETURN
626 :    
627 :     Returns the contig ID, start position, and end position as a three-element list.
628 :    
629 :     =back
630 :    
631 :     =cut
632 :     #: Return Type @;
633 :     sub Parse {
634 :     # Get the parameters.
635 :     my ($locString) = @_;
636 :     # Create a location object from the string.
637 :     my $loc = BasicLocation->new($locString);
638 :     # Return the desired data.
639 :     return ($loc->Contig, $loc->Begin, $loc->EndPoint);
640 :     }
641 :    
642 : parrello 1.9 =head3 Overlap
643 :    
644 :     C<< my $len = $loc->Overlap($b,$e); >>
645 :    
646 :     Determine how many positions in this location overlap the specified region. The region is defined
647 :     by its leftmost and rightmost positions.
648 :    
649 :     =over 4
650 :    
651 :     =item b
652 :    
653 :     Leftmost position in the region to check.
654 :    
655 :     =item e
656 :    
657 :     Rightmost position in the region to check.
658 :    
659 :     =item RETURN
660 :    
661 :     Returns the number of overlapping positions, or 0 if there is no overlap.
662 :    
663 :     =back
664 :    
665 :     =cut
666 :     #: Return Type $%;
667 :     sub Overlap {
668 :     # Get the parameters.
669 :     my ($self, $b, $e) = @_;
670 :     # Declare the return variable.
671 :     my $retVal;
672 :     # Process according to the type of overlap.
673 :     if ($b < $self->Left) {
674 :     # Here the other region extends to our left.
675 :     if ($e >= $self->Left) {
676 :     # The other region's right is past our left, so we have overlap. The overlap length
677 :     # is determined by whether or not we are wholly inside the region.
678 :     if ($e < $self->Right) {
679 :     $retVal = $e - $self->Left + 1;
680 :     } else {
681 :     $retVal = $self->Length;
682 :     }
683 :     } else {
684 :     # The other region ends before we start, so no overlap.
685 :     $retVal = 0;
686 :     }
687 :     } elsif ($b > $self->Right) {
688 :     # The other region starts after we end, so no overlap.
689 :     $retVal = 0;
690 :     } else {
691 :     # The other region starts inside us.
692 :     $retVal = $self->Right - $b + 1;
693 :     }
694 :     # Return the result.
695 :     return $retVal;
696 :     }
697 :    
698 : parrello 1.10 =head3 Merge
699 :    
700 :     C<< $loc->Merge($loc2); >>
701 :    
702 :     Merge another location into this one. The result will include all bases in both
703 :     locations and will have the same direction as this location. It is assumed both
704 :     locations share the same contig.
705 :    
706 :     =over 4
707 :    
708 :     =item loc2
709 :    
710 :     Location to merge into this one.
711 :    
712 :     =back
713 :    
714 :     =cut
715 :    
716 :     sub Merge {
717 :     # Get the parameters.
718 :     my ($self, $loc2) = @_;
719 :     # Get a copy of the other location.
720 :     my $other = BasicLocation->new($loc2);
721 :     # Fix the direction so it matches.
722 :     if ($self->Dir != $other->Dir) {
723 :     $other->Reverse;
724 :     }
725 :     # Combine the other location with this one.
726 :     $self->Combine($other);
727 :     }
728 :    
729 : parrello 1.1 =head2 Virtual Methods
730 :    
731 :     These methods are implemented by the subclasses. They are included here for documentation
732 :     purposes.
733 :    
734 :     =head3 Left
735 :    
736 :     C<< my $leftPoint = $loc->Left; >>
737 :    
738 :     Return the offset of the leftmost point of the location.
739 :    
740 :     =cut
741 :     #: Return Type $;
742 :    
743 :     =head3 Right
744 :    
745 :     C<< my $rightPoint = $loc->Right; >>
746 :    
747 :     Return the offset of the rightmost point of the location.
748 :    
749 :     =cut
750 :     #: Return Type $;
751 :    
752 :     =head3 Compare
753 :    
754 :     C<< my ($distance, $cmp) = $loc->Compare($point); >>
755 :    
756 :     Determine the relative location of the specified point on the contig. Returns a distance,
757 :     which indicates the location relative to the leftmost point of the contig, and a comparison
758 :     number, which is negative if the point is to the left of the location, zero if the point is
759 :     inside the location, and positive if the point is to the right of the location.
760 :    
761 :     =cut
762 :     #: Return Type @;
763 :    
764 :     =head3 Split
765 :    
766 :     C<< my $newLocation = $loc->Split($offset); >>
767 :    
768 :     Split this location into two smaller ones at the specified offset from the left endpoint. The
769 :     new location split off of it will be returned.
770 :    
771 :     =over 4
772 :    
773 :     =item offset
774 :    
775 :     Offset into the location from the left endpoint of the point at which it should be split.
776 :    
777 :     =item RETURN
778 :    
779 :     The new location split off of this one.
780 :    
781 :     =back
782 :    
783 :     =cut
784 :     #: Return Type $%;
785 :    
786 :     =head3 Reverse
787 :    
788 :     C<< $loc->Reverse; >>
789 :    
790 :     Change the polarity of the location. The location will have the same nucleotide range, but
791 :     the direction will be changed.
792 :    
793 :     =cut
794 :     #: Return Type ;
795 :    
796 :     =head3 Index
797 :    
798 :     C<< my $index = $loc->Index($point); >>
799 :    
800 :     Return the index of the specified point in this location. The value returned is the distance
801 :     from the beginning. If the specified point is not in the location, an undefined value is returned.
802 :    
803 :     =over 4
804 :    
805 :     =item point
806 :    
807 :     Offset into the contig of the point in question.
808 :    
809 :     =item RETURN
810 :    
811 :     Returns the distance of the point from the beginning of the location, or an undefined value if the
812 :     point is outside the location.
813 :    
814 :     =back
815 :    
816 :     =cut
817 :     #: Return Type $;
818 :    
819 :     =head3 PointOffset
820 :    
821 :     C<< my $offset = $loc->PointOffset($index); >>
822 :    
823 :     Return the offset into the contig of the point at the specified position in the location. A position
824 :     of 0 will return the beginning point, a position of 1 returns the point next to that, and a position
825 :     1 less than the length will return the ending point.
826 :    
827 :     =over 4
828 :    
829 :     =item index
830 :    
831 :     Index into the location of the relevant point.
832 :    
833 :     =item RETURN
834 :    
835 :     Returns an offset into the contig of the specified point in the location.
836 :    
837 :     =back
838 :    
839 :     =cut
840 :     #: Return Type $;
841 :    
842 :     =head3 Peel
843 :    
844 :     C<< my $peel = $loc->Peel($length); >>
845 :    
846 :     Peel a specified number of positions off the beginning of the location. Peeling splits
847 :     a location at a specified offset from the beginning, while splitting takes it at a
848 :     specified offset from the left point. If the specified length is equal to or longer
849 :     than the location's length, an undefined value will be returned.
850 :    
851 :     =over 4
852 :    
853 :     =item length
854 :    
855 :     Number of positions to split from the location.
856 :    
857 :     =item RETURN
858 :    
859 :     Returns a new location formed by splitting positions off of the existing location, which is
860 :     shortened accordingly. If the specified length is longer than the location's length, an
861 :     undefined value is returned and the location is not modified.
862 :    
863 :     =back
864 :    
865 :     =cut
866 :     #: Return Type $%;
867 :    
868 : parrello 1.2 =head3 SetBegin
869 :    
870 :     C<< $loc->SetBegin($newBegin); >>
871 :    
872 :     Change the begin point of this location without changing the endpoint.
873 :    
874 :     =over 4
875 :    
876 :     =item newBegin
877 :    
878 :     Proposed new beginning point.
879 :    
880 :     =back
881 :    
882 :     =cut
883 :     #: Return Type ;
884 :    
885 :     =head3 SetEnd
886 :    
887 :     C<< $loc->SetEnd($newEnd); >>
888 :    
889 :     Change the endpoint of this location without changing the begin point.
890 :    
891 :     =over 4
892 :    
893 :     =item newEnd
894 :    
895 :     Proposed new ending point.
896 :    
897 :     =back
898 :    
899 :     =cut
900 :     #: Return Type ;
901 :    
902 : parrello 1.6 =head3 Widen
903 :    
904 :     C<< my = $loc->Widen($distance, $max); >>
905 :    
906 :     Add the specified distance to each end of the location, taking care not to
907 :     extend past either end of the contig. The contig length must be provided
908 :     to insure we don't fall off the far end; otherwise, only the leftward
909 :     expansion is limited.
910 :    
911 :     =over 4
912 :    
913 :     =item distance
914 :    
915 :     Number of positions to add to both ends of the location.
916 :    
917 :     =item max (optional)
918 :    
919 :     Maximum possible value for the right end of the location.
920 :    
921 :     =back
922 :    
923 :     =cut
924 :     #: Return Type ;
925 :    
926 : parrello 1.9 =head3 Upstream
927 :    
928 :     C<< my $newLoc = $loc->Upstream($distance, $max); >>
929 :    
930 :     Return a new location upstream of the given location, taking care not to
931 :     extend past either end of the contig.
932 :    
933 :     =over 4
934 :    
935 :     =item distance
936 :    
937 :     Number of positions to add to the front (upstream) of the location.
938 :    
939 :     =item max (optional)
940 :    
941 :     Maximum possible value for the right end of the location.
942 :    
943 :     =item RETURN
944 :    
945 :     Returns a new location object whose last position is next to the first
946 :     position of this location.
947 :    
948 :     =back
949 :    
950 :     =cut
951 :     #: Return Type $%;
952 :    
953 :     =head3 Truncate
954 :    
955 :     C<< $loc->Truncate($len); >>
956 :    
957 :     Truncate the location to a new length. If the length is larger than the location length, then
958 :     the location is not changed.
959 :    
960 :     =over 4
961 :    
962 :     =item len
963 :    
964 :     Proposed new length for the location.
965 :    
966 :     =back
967 :    
968 :     =cut
969 :     #: Return Type $%;
970 :    
971 : parrello 1.10 =head3 Combine
972 :    
973 :     C<< $loc->Combine($other); >>
974 :    
975 :     Combine another location with this one. The result will contain all bases in both
976 :     original locations. Both locations must have the same contig ID and direction.
977 :    
978 :     =over 4
979 :    
980 :     =item other
981 :    
982 :     Other location to combine with this one.
983 :    
984 :     =back
985 :    
986 :     =cut
987 :    
988 : parrello 1.1 1;
989 :    
990 :     package FBasicLocation;
991 :    
992 :     @FBasicLocation::ISA = qw(BasicLocation);
993 :    
994 :     =head1 Forward Basic Location Object
995 :    
996 :     =head2 Introduction
997 :    
998 :     A I<forward location object> is a location in a contig that is transcribed from left to right.
999 :     It is a subclass of the B<BasicLocation> object, and contains methods that require different
1000 :     implementation for a forward location than a backward location.
1001 :    
1002 :     =head2 Override Methods
1003 :    
1004 :     =head3 Left
1005 :    
1006 :     C<< my $leftPoint = $loc->Left; >>
1007 :    
1008 :     Return the offset of the leftmost point of the location.
1009 :    
1010 :     =cut
1011 :    
1012 :     sub Left {
1013 :     return $_[0]->{_beg};
1014 :     }
1015 :    
1016 :     =head3 Right
1017 :    
1018 :     C<< my $rightPoint = $loc->Right; >>
1019 :    
1020 :     Return the offset of the rightmost point of the location.
1021 :    
1022 :     =cut
1023 :    
1024 :     sub Right {
1025 :     return $_[0]->{_end};
1026 :     }
1027 :    
1028 :     =head3 Compare
1029 :    
1030 :     C<< my ($distance, $cmp) = $loc->Compare($point); >>
1031 :    
1032 :     Determine the relative location of the specified point on the contig. Returns a distance,
1033 :     which indicates the location relative to the leftmost point of the contig, and a comparison
1034 :     number, which is negative if the point is to the left of the location, zero if the point is
1035 :     inside the location, and positive if the point is to the right of the location.
1036 :    
1037 :     =cut
1038 :    
1039 :     sub Compare {
1040 :     # Get the parameters.
1041 :     my ($self, $point) = @_;
1042 :     # Compute the distance from the begin (leftmost) point.
1043 :     my $distance = $point - $self->{_beg};
1044 :     # Set the comparison value. The distance works unless it is positive and less than
1045 :     # the length. In that case, it's inside the location so we want to return 0.
1046 :     my $cmp = (defined $self->IfValid($distance) ? 0 : $distance);
1047 :     # Return the results.
1048 :     return ($distance, $cmp);
1049 :     }
1050 :    
1051 :     =head3 Split
1052 :    
1053 :     C<< my $newLocation = $loc->Split($offset); >>
1054 :    
1055 :     Split this location into two smaller ones at the specified offset from the left endpoint. The
1056 :     new location split off of it will be returned. If the offset is at either end of the location,
1057 :     no split will occur and an underfined value will be returned.
1058 :    
1059 :     =over 4
1060 :    
1061 :     =item offset
1062 :    
1063 :     Offset into the location from the left endpoint of the point at which it should be split.
1064 :    
1065 :     =item RETURN
1066 :    
1067 :     The new location split off of this one, or an undefined value if no split was necessary.
1068 :    
1069 :     =back
1070 :    
1071 :     =cut
1072 :    
1073 :     sub Split {
1074 :     # Get the parameters.
1075 :     my ($self, $offset) = @_;
1076 :     # Declare the return variable.
1077 :     my $retVal;
1078 :     # Only proceed if a split is necessary.
1079 :     if ($offset > 0 && $offset < $self->{_len}) {
1080 :     # Save the current starting point.
1081 :     my $oldBegin = $self->{_beg};
1082 :     # Update this location's starting point and length.
1083 :     $self->{_beg} += $offset;
1084 :     $self->{_len} -= $offset;
1085 :     # Create the new location.
1086 :     $retVal = BasicLocation->new($self->{_contigID}, $oldBegin, '+', $offset);
1087 :     }
1088 :     # Return the new location object.
1089 :     return $retVal;
1090 :     }
1091 :    
1092 :    
1093 :     =head3 Peel
1094 :    
1095 :     C<< my $peel = $loc->Peel($length); >>
1096 :    
1097 :     Peel a specified number of positions off the beginning of the location. Peeling splits
1098 :     a location at a specified offset from the beginning, while splitting takes it at a
1099 :     specified offset from the left point. If the specified length is equal to or longer
1100 :     than the location's length, an undefined value will be returned.
1101 :    
1102 :     =over 4
1103 :    
1104 :     =item length
1105 :    
1106 :     Number of positions to split from the location.
1107 :    
1108 :     =item RETURN
1109 :    
1110 :     Returns a new location formed by splitting positions off of the existing location, which is
1111 :     shortened accordingly. If the specified length is longer than the location's length, an
1112 :     undefined value is returned and the location is not modified.
1113 :    
1114 :     =back
1115 :    
1116 :     =cut
1117 :     #: Return Type $%;
1118 :     sub Peel {
1119 :     # Get the parameters.
1120 :     my ($self, $length) = @_;
1121 :     # Declare the return variable.
1122 :     my $retVal;
1123 :     # Only proceed if a split is necessary.
1124 :     if ($length < $self->{_len}) {
1125 :     # Save the current begin point.
1126 :     my $oldBegpoint = $self->{_beg};
1127 :     # Update this location's begin point and length. We're peeling from the beginning,
1128 :     # which for this type of location means the segment is chopped off the left end.
1129 :     $self->{_beg} += $length;
1130 :     $self->{_len} -= $length;
1131 :     # Create the new location.
1132 :     $retVal = BasicLocation->new($self->{_contigID}, $oldBegpoint, '+', $length);
1133 :     }
1134 :     # Return the new location object.
1135 :     return $retVal;
1136 :     }
1137 :    
1138 :    
1139 :     =head3 Reverse
1140 :    
1141 :     C<< $loc->Reverse; >>
1142 :    
1143 :     Change the polarity of the location. The location will have the same nucleotide range, but
1144 :     the direction will be changed.
1145 :    
1146 :     =cut
1147 :    
1148 :     sub Reverse {
1149 :     # Get the parameters.
1150 :     my ($self) = @_;
1151 :     # Swap the beginning and end, then update the direction.
1152 :     ($self->{_beg}, $self->{_end}) = ($self->{_end}, $self->{_beg});
1153 :     $self->{_dir} = '-';
1154 :     # Re-bless us as a forward location.
1155 :     bless $self, "BBasicLocation";
1156 :     }
1157 :    
1158 :     =head3 Index
1159 :    
1160 :     C<< my $index = $loc->Index($point); >>
1161 :    
1162 :     Return the index of the specified point in this location. The value returned is the distance
1163 :     from the beginning. If the specified point is not in the location, an undefined value is returned.
1164 :    
1165 :     =over 4
1166 :    
1167 :     =item point
1168 :    
1169 :     Offset into the contig of the point in question.
1170 :    
1171 :     =item RETURN
1172 :    
1173 :     Returns the distance of the point from the beginning of the location, or an undefined value if the
1174 :     point is outside the location.
1175 :    
1176 :     =back
1177 :    
1178 :     =cut
1179 :    
1180 :     sub Index {
1181 :     # Get the parameters.
1182 :     my ($self, $point) = @_;
1183 :     # Compute the distance from the beginning. Because we are in a forward location, this
1184 :     # means subtracting the beginning's offset from the point's offset.
1185 :     my $retVal = $self->IfValid($point - $self->{_beg});
1186 :     # Return the result.
1187 :     return $retVal;
1188 :     }
1189 :    
1190 :     =head3 PointOffset
1191 :    
1192 :     C<< my $offset = $loc->PointOffset($index); >>
1193 :    
1194 :     Return the offset into the contig of the point at the specified position in the location. A position
1195 :     of 0 will return the beginning point, a position of 1 returns the point next to that, and a position
1196 :     1 less than the length will return the ending point.
1197 :    
1198 :     =over 4
1199 :    
1200 :     =item index
1201 :    
1202 :     Index into the location of the relevant point.
1203 :    
1204 :     =item RETURN
1205 :    
1206 :     Returns an offset into the contig of the specified point in the location.
1207 :    
1208 :     =back
1209 :    
1210 :     =cut
1211 :    
1212 :     sub PointOffset {
1213 :     # Get the parameters.
1214 :     my ($self, $index) = @_;
1215 :     # Return the offset. This is a forward location, so we add it to the begin point.
1216 :     return $self->{_beg} + $index;
1217 :     }
1218 :    
1219 : parrello 1.2
1220 :     =head3 SetBegin
1221 :    
1222 :     C<< $loc->SetBegin($newBegin); >>
1223 :    
1224 :     Change the begin point of this location without changing the endpoint.
1225 :    
1226 :     =over 4
1227 :    
1228 :     =item newBegin
1229 :    
1230 :     Proposed new beginning point.
1231 :    
1232 :     =back
1233 :    
1234 :     =cut
1235 :     #: Return Type ;
1236 :     sub SetBegin {
1237 :     # Get the parameters.
1238 :     my ($self, $newBegin) = @_;
1239 :     # Update the begin point.
1240 :     $self->{_beg} = $newBegin;
1241 :     # Adjust the length.
1242 :     $self->{_len} = $self->{_end} - $self->{_beg} + 1;
1243 :     }
1244 :    
1245 :     =head3 SetEnd
1246 :    
1247 :     C<< $loc->SetEnd($newEnd); >>
1248 :    
1249 :     Change the endpoint of this location without changing the begin point.
1250 :    
1251 :     =over 4
1252 :    
1253 :     =item newEnd
1254 :    
1255 :     Proposed new ending point.
1256 :    
1257 :     =back
1258 :    
1259 :     =cut
1260 :     #: Return Type ;
1261 :     sub SetEnd {
1262 :     # Get the parameters.
1263 :     my ($self, $newEnd) = @_;
1264 :     # Update the end point.
1265 :     $self->{_end} = $newEnd;
1266 :     # Adjust the length.
1267 :     $self->{_len} = $self->{_end} - $self->{_beg} + 1;
1268 :     }
1269 :    
1270 : parrello 1.6 =head3 Widen
1271 :    
1272 :     C<< my = $loc->Widen($distance, $max); >>
1273 :    
1274 :     Add the specified distance to each end of the location, taking care not to
1275 :     extend past either end of the contig. The contig length must be provided
1276 :     to insure we don't fall off the far end; otherwise, only the leftward
1277 :     expansion is limited.
1278 :    
1279 :     =over 4
1280 :    
1281 :     =item distance
1282 :    
1283 :     Number of positions to add to both ends of the location.
1284 :    
1285 :     =item max (optional)
1286 :    
1287 :     Maximum possible value for the right end of the location.
1288 :    
1289 :     =back
1290 :    
1291 :     =cut
1292 :     #: Return Type ;
1293 :     sub Widen {
1294 :     # Get the parameters.
1295 :     my ($self, $distance, $max) = @_;
1296 :     # Subtract the distance from the begin point.
1297 :     my $newBegin = FIG::max(1, $self->Begin - $distance);
1298 :     $self->SetBegin($newBegin);
1299 :     # Add the distance to the end point, keeping track of the maximum.
1300 :     my $newEnd = $self->EndPoint + $distance;
1301 :     if ($max && $newEnd > $max) {
1302 :     $newEnd = $max;
1303 :     }
1304 :     $self->SetEnd($newEnd);
1305 :     }
1306 : parrello 1.2
1307 : parrello 1.9 =head3 Upstream
1308 :    
1309 :     C<< my $newLoc = $loc->Upstream($distance, $max); >>
1310 :    
1311 :     Return a new location upstream of the given location, taking care not to
1312 :     extend past either end of the contig.
1313 :    
1314 :     =over 4
1315 :    
1316 :     =item distance
1317 :    
1318 :     Number of positions to add to the front (upstream) of the location.
1319 :    
1320 :     =item max (optional)
1321 :    
1322 :     Maximum possible value for the right end of the location.
1323 :    
1324 :     =item RETURN
1325 :    
1326 :     Returns a new location object whose last position is next to the first
1327 :     position of this location.
1328 :    
1329 :     =back
1330 :    
1331 :     =cut
1332 :     #: Return Type $%;
1333 :    
1334 :     sub Upstream {
1335 :     # Get the parameters.
1336 :     my ($self, $distance, $max) = @_;
1337 :     # Subtract the distance from the begin point, keeping the position safe.
1338 :     my $newBegin = $self->Begin - $distance;
1339 :     if ($newBegin <= 0) {
1340 :     $newBegin = 1;
1341 :     }
1342 :     # Compute the new length. It may be zero.
1343 :     my $len = $self->Begin - $newBegin;
1344 :     # Return the result.
1345 :     return BasicLocation->new($self->Contig, $newBegin, "+", $len);
1346 :     }
1347 :    
1348 :     =head3 Truncate
1349 :    
1350 :     C<< $loc->Truncate($len); >>
1351 :    
1352 :     Truncate the location to a new length. If the length is larger than the location length, then
1353 :     the location is not changed.
1354 :    
1355 :     =over 4
1356 :    
1357 :     =item len
1358 :    
1359 :     Proposed new length for the location.
1360 :    
1361 :     =back
1362 :    
1363 :     =cut
1364 :     #: Return Type $%;
1365 :     sub Truncate {
1366 :     # Get the parameters.
1367 :     my ($self, $len) = @_;
1368 :     # Only proceed if the new length would be shorter.
1369 :     if ($len < $self->Length) {
1370 :     $self->SetEnd($self->Begin + $len - 1);
1371 :     }
1372 :     }
1373 :    
1374 : parrello 1.10 =head3 Combine
1375 :    
1376 :     C<< $loc->Combine($other); >>
1377 :    
1378 :     Combine another location with this one. The result will contain all bases in both
1379 :     original locations. Both locations must have the same contig ID and direction.
1380 :    
1381 :     =over 4
1382 :    
1383 :     =item other
1384 :    
1385 :     Other location to combine with this one.
1386 :    
1387 :     =back
1388 :    
1389 :     =cut
1390 :    
1391 :     sub Combine {
1392 :     # Get the parameters.
1393 :     my ($self, $other) = @_;
1394 :     # If the other location ends past our end, move the endpoint.
1395 :     if ($other->EndPoint > $self->EndPoint) {
1396 :     $self->SetEnd($other->EndPoint);
1397 :     }
1398 :     # If the other location starts before our begin, move the begin point.
1399 :     if ($other->Begin < $self->Begin) {
1400 :     $self->SetBegin($other->Begin);
1401 :     }
1402 :     }
1403 :    
1404 : parrello 1.1 1;
1405 :    
1406 :     package BBasicLocation;
1407 :    
1408 :     @BBasicLocation::ISA = qw(BasicLocation);
1409 :    
1410 :     =head1 Backward Basic BasicLocation Object
1411 :    
1412 :     =head2 Introduction
1413 :    
1414 :     A I<backward location object> is a location in a contig that is transcribed from right to left.
1415 :     It is a subclass of the B<BasicLocation> object, and contains methods that require different
1416 :     implementation for a forward location than a backward location.
1417 :    
1418 :     =head2 Override Methods
1419 :    
1420 :     =head3 Left
1421 :    
1422 :     C<< my $leftPoint = $loc->Left; >>
1423 :    
1424 :     Return the offset of the leftmost point of the location.
1425 :    
1426 :     =cut
1427 :    
1428 :     sub Left {
1429 :     return $_[0]->{_end};
1430 :     }
1431 :    
1432 :     =head3 Right
1433 :    
1434 :     C<< my $rightPoint = $loc->Right; >>
1435 :    
1436 :     Return the offset of the rightmost point of the location.
1437 :    
1438 :     =cut
1439 :    
1440 :     sub Right {
1441 :     return $_[0]->{_beg};
1442 :     }
1443 :    
1444 :     =head3 Compare
1445 :    
1446 :     C<< my ($distance, $cmp) = $loc->Compare($point); >>
1447 :    
1448 :     Determine the relative location of the specified point on the contig. Returns a distance,
1449 :     which indicates the location relative to the leftmost point of the contig, and a comparison
1450 :     number, which is negative if the point is to the left of the location, zero if the point is
1451 :     inside the location, and positive if the point is to the right of the location.
1452 :    
1453 :     =cut
1454 :    
1455 :     sub Compare {
1456 :     # Get the parameters.
1457 :     my ($self, $point) = @_;
1458 :     # Compute the distance from the end (leftmost) point.
1459 :     my $distance = $point - $self->{_end};
1460 :     # Set the comparison value. The distance works unless it is positive and less than
1461 :     # the length. In that case, it's inside the location so we want to return 0.
1462 :     my $cmp = (defined $self->IfValid($distance) ? 0 : $distance);
1463 :     # Return the results.
1464 :     return ($distance, $cmp);
1465 :     }
1466 :    
1467 :     =head3 Split
1468 :    
1469 :     C<< my $newLocation = $loc->Split($offset); >>
1470 :    
1471 :     Split this location into two smaller ones at the specified offset from the left endpoint. The
1472 :     new location split off of it will be returned. If the offset is at either end of the location,
1473 :     no split will occur and an underfined value will be returned.
1474 :    
1475 :     =over 4
1476 :    
1477 :     =item offset
1478 :    
1479 :     Offset into the location from the left endpoint of the point at which it should be split.
1480 :    
1481 :     =item RETURN
1482 :    
1483 :     The new location split off of this one, or an undefined value if no split was necessary.
1484 :    
1485 :     =back
1486 :    
1487 :     =cut
1488 :    
1489 :     sub Split {
1490 :     # Get the parameters.
1491 :     my ($self, $offset) = @_;
1492 :     # Declare the return variable.
1493 :     my $retVal;
1494 :     # Only proceed if a split is necessary.
1495 :     if ($offset > 0 && $offset < $self->{_len}) {
1496 :     # Save the current ending point.
1497 :     my $oldEndpoint = $self->{_end};
1498 :     # Update this location's ending point and length.
1499 :     $self->{_end} += $offset;
1500 :     $self->{_len} -= $offset;
1501 :     # Create the new location.
1502 :     $retVal = BasicLocation->new($self->{_contigID}, $oldEndpoint + $offset - 1, '-', $offset);
1503 :     }
1504 :     # Return the new location object.
1505 :     return $retVal;
1506 :     }
1507 :    
1508 :     =head3 Peel
1509 :    
1510 :     C<< my $peel = $loc->Peel($length); >>
1511 :    
1512 :     Peel a specified number of positions off the beginning of the location. Peeling splits
1513 :     a location at a specified offset from the beginning, while splitting takes it at a
1514 :     specified offset from the left point. If the specified length is equal to or longer
1515 :     than the location's length, an undefined value will be returned.
1516 :    
1517 :     =over 4
1518 :    
1519 :     =item length
1520 :    
1521 :     Number of positions to split from the location.
1522 :    
1523 :     =item RETURN
1524 :    
1525 :     Returns a new location formed by splitting positions off of the existing location, which is
1526 :     shortened accordingly. If the specified length is longer than the location's length, an
1527 :     undefined value is returned and the location is not modified.
1528 :    
1529 :     =back
1530 :    
1531 :     =cut
1532 :     #: Return Type $%;
1533 :     sub Peel {
1534 :     # Get the parameters.
1535 :     my ($self, $length) = @_;
1536 :     # Declare the return variable.
1537 :     my $retVal;
1538 :     # Only proceed if a split is necessary.
1539 :     if ($length < $self->{_len}) {
1540 :     # Save the current begin point.
1541 :     my $oldBegpoint = $self->{_beg};
1542 :     # Update this location's begin point and length. We're peeling from the beginning,
1543 :     # which for this type of location means the segment is chopped off the right end.
1544 :     $self->{_beg} -= $length;
1545 :     $self->{_len} -= $length;
1546 :     # Create the new location.
1547 : parrello 1.4 $retVal = BasicLocation->new($self->{_contigID}, $oldBegpoint, '-', $length);
1548 : parrello 1.1 }
1549 :     # Return the new location object.
1550 :     return $retVal;
1551 :     }
1552 :    
1553 :     =head3 Reverse
1554 :    
1555 :     C<< $loc->Reverse; >>
1556 :    
1557 :     Change the polarity of the location. The location will have the same nucleotide range, but
1558 :     the direction will be changed.
1559 :    
1560 :     =cut
1561 :    
1562 :     sub Reverse {
1563 :     # Get the parameters.
1564 :     my ($self) = @_;
1565 :     # Swap the beginning and end, then update the direction.
1566 :     ($self->{_beg}, $self->{_end}) = ($self->{_end}, $self->{_beg});
1567 :     $self->{_dir} = '+';
1568 :     # Re-bless us as a forward location.
1569 :     bless $self, "FBasicLocation";
1570 :     }
1571 :    
1572 :     =head3 Index
1573 :    
1574 :     C<< my $index = $loc->Index($point); >>
1575 :    
1576 :     Return the index of the specified point in this location. The value returned is the distance
1577 :     from the beginning. If the specified point is not in the location, an undefined value is returned.
1578 :    
1579 :     =over 4
1580 :    
1581 :     =item point
1582 :    
1583 :     Offset into the contig of the point in question.
1584 :    
1585 :     =item RETURN
1586 :    
1587 :     Returns the distance of the point from the beginning of the location, or an undefined value if the
1588 :     point is outside the location.
1589 :    
1590 :     =back
1591 :    
1592 :     =cut
1593 :    
1594 :     sub Index {
1595 :     # Get the parameters.
1596 :     my ($self, $point) = @_;
1597 :     # Compute the distance from the beginning. Because we are in a backward location, this
1598 :     # means subtracting the point's offset from the beginning's offset.
1599 :     my $retVal = $self->IfValid($self->{_beg} - $point);
1600 :     # Return the result.
1601 :     return $retVal;
1602 :     }
1603 :    
1604 :     =head3 PointOffset
1605 :    
1606 :     C<< my $offset = $loc->PointOffset($index); >>
1607 :    
1608 :     Return the offset into the contig of the point at the specified position in the location. A position
1609 :     of 0 will return the beginning point, a position of 1 returns the point next to that, and a position
1610 :     1 less than the length will return the ending point.
1611 :    
1612 :     =over 4
1613 :    
1614 :     =item index
1615 :    
1616 :     Index into the location of the relevant point.
1617 :    
1618 :     =item RETURN
1619 :    
1620 :     Returns an offset into the contig of the specified point in the location.
1621 :    
1622 :     =back
1623 :    
1624 :     =cut
1625 :    
1626 :     sub PointOffset {
1627 :     # Get the parameters.
1628 :     my ($self, $index) = @_;
1629 :     # Return the offset. This is a backward location, so we subtract it from the begin point.
1630 :     return $self->{_beg} - $index;
1631 :     }
1632 :    
1633 : parrello 1.2 =head3 SetBegin
1634 :    
1635 :     C<< $loc->SetBegin($newBegin); >>
1636 :    
1637 :     Change the begin point of this location without changing the endpoint.
1638 :    
1639 :     =over 4
1640 :    
1641 :     =item newBegin
1642 :    
1643 :     Proposed new beginning point.
1644 :    
1645 :     =back
1646 :    
1647 :     =cut
1648 :     #: Return Type ;
1649 :     sub SetBegin {
1650 :     # Get the parameters.
1651 :     my ($self, $newBegin) = @_;
1652 :     # Update the begin point.
1653 :     $self->{_beg} = $newBegin;
1654 :     # Adjust the length.
1655 :     $self->{_len} = $self->{_beg} - $self->{_end} + 1;
1656 :     }
1657 :    
1658 :     =head3 SetEnd
1659 :    
1660 :     C<< $loc->SetEnd($newEnd); >>
1661 :    
1662 :     Change the endpoint of this location without changing the begin point.
1663 :    
1664 :     =over 4
1665 :    
1666 :     =item newEnd
1667 :    
1668 :     Proposed new ending point.
1669 :    
1670 :     =back
1671 :    
1672 :     =cut
1673 :     #: Return Type ;
1674 :     sub SetEnd {
1675 :     # Get the parameters.
1676 :     my ($self, $newEnd) = @_;
1677 :     # Update the end point.
1678 :     $self->{_end} = $newEnd;
1679 :     # Adjust the length.
1680 :     $self->{_len} = $self->{_beg} - $self->{_end} + 1;
1681 :     }
1682 :    
1683 : parrello 1.6 =head3 Widen
1684 :    
1685 :     C<< my = $loc->Widen($distance, $max); >>
1686 :    
1687 :     Add the specified distance to each end of the location, taking care not to
1688 :     extend past either end of the contig. The contig length must be provided
1689 :     to insure we don't fall off the far end; otherwise, only the leftward
1690 :     expansion is limited.
1691 :    
1692 :     =over 4
1693 : parrello 1.2
1694 : parrello 1.6 =item distance
1695 :    
1696 :     Number of positions to add to both ends of the location.
1697 :    
1698 :     =item max (optional)
1699 :    
1700 :     Maximum possible value for the right end of the location.
1701 :    
1702 :     =back
1703 :    
1704 :     =cut
1705 :     #: Return Type ;
1706 :     sub Widen {
1707 :     # Get the parameters.
1708 :     my ($self, $distance, $max) = @_;
1709 :     # Subtract the distance from the end point.
1710 :     my $newEnd = FIG::max(1, $self->End - $distance);
1711 :     $self->SetEnd($newEnd);
1712 :     # Add the distance to the begin point, keeping track of the maximum.
1713 :     my $newBegin = $self->Begin + $distance;
1714 :     if ($max && $newBegin > $max) {
1715 :     $newBegin = $max;
1716 :     }
1717 :     $self->SetBegin($newEnd);
1718 :     }
1719 : parrello 1.2
1720 : parrello 1.9 =head3 Upstream
1721 :    
1722 :     C<< my $newLoc = $loc->Upstream($distance, $max); >>
1723 :    
1724 :     Return a new location upstream of the given location, taking care not to
1725 :     extend past either end of the contig.
1726 :    
1727 :     =over 4
1728 :    
1729 :     =item distance
1730 :    
1731 :     Number of positions to add to the front (upstream) of the location.
1732 :    
1733 :     =item max (optional)
1734 :    
1735 :     Maximum possible value for the right end of the location.
1736 :    
1737 :     =item RETURN
1738 :    
1739 :     Returns a new location object whose last position is next to the first
1740 :     position of this location.
1741 :    
1742 :     =back
1743 :    
1744 :     =cut
1745 :     #: Return Type $%;
1746 :    
1747 :     sub Upstream {
1748 :     # Get the parameters.
1749 :     my ($self, $distance, $max) = @_;
1750 :     # Add the distance to the begin point, keeping the position safe.
1751 :     my $newBegin = $self->Begin + $distance;
1752 :     if ($max && $newBegin > $max) {
1753 :     $newBegin = $max;
1754 :     }
1755 :     # Compute the new length. It may be zero.
1756 :     my $len = $newBegin - $self->Begin;
1757 :     # Return the result.
1758 :     return BasicLocation->new($self->Contig, $newBegin, "-", $len);
1759 :     }
1760 :    
1761 :     =head3 Truncate
1762 :    
1763 :     C<< $loc->Truncate($len); >>
1764 :    
1765 :     Truncate the location to a new length. If the length is larger than the location length, then
1766 :     the location is not changed.
1767 :    
1768 :     =over 4
1769 :    
1770 :     =item len
1771 :    
1772 :     Proposed new length for the location.
1773 :    
1774 :     =back
1775 :    
1776 :     =cut
1777 :     #: Return Type $%;
1778 :     sub Truncate {
1779 :     # Get the parameters.
1780 :     my ($self, $len) = @_;
1781 :     # Only proceed if the new length would be shorter.
1782 :     if ($len < $self->Length) {
1783 :     $self->SetEnd($self->Begin - $len + 1);
1784 :     }
1785 :     }
1786 :    
1787 : parrello 1.10 =head3 Combine
1788 :    
1789 :     C<< $loc->Combine($other); >>
1790 :    
1791 :     Combine another location with this one. The result will contain all bases in both
1792 :     original locations. Both locations must have the same contig ID and direction.
1793 :    
1794 :     =over 4
1795 :    
1796 :     =item other
1797 :    
1798 :     Other location to combine with this one.
1799 :    
1800 :     =back
1801 :    
1802 :     =cut
1803 :    
1804 :     sub Combine {
1805 :     # Get the parameters.
1806 :     my ($self, $other) = @_;
1807 :     # If the other location ends past our end, move the endpoint.
1808 :     if ($other->EndPoint < $self->EndPoint) {
1809 :     $self->SetEnd($other->EndPoint);
1810 :     }
1811 :     # If the other location starts before our begin, move the begin point.
1812 :     if ($other->Begin > $self->Begin) {
1813 :     $self->SetBegin($other->Begin);
1814 :     }
1815 :     }
1816 :    
1817 : parrello 1.1 1;

MCS Webmaster
ViewVC Help
Powered by ViewVC 1.0.3