Annotation of /FigKernelScripts/TransactFeatures.pl


Revision 1.2

1 : parrello 1.1 #!/usr/bin/perl -w
2 :    
3 :     =head1 Add / Delete / Change Features
4 :    
5 :     This script runs through a set of transaction files, adding, deleting, and changing
6 :     features in the FIG data store. The command takes three input parameters. The first is
7 :     a command. The second specifies a directory full of transaction files. The third
8 :     specifies a file that tells us which feature IDs are available for each organism.
9 :    
10 :     C<TransactFeatures> I<[options]> I<command> I<transactionDirectory> I<idFile>
11 :    
12 :     The supported commands are
13 :    
14 :     =over 4
15 :    
16 :     =item count
17 :    
18 :     Count the number of IDs needed to process the ADD and CHANGE transactions. This
19 :     will produce a listing of the number of feature IDs needed for each
20 :     organism and feature type. This command is mostly a sanity check: it provides
21 :     useful statistics without changing anything.
22 :    
23 :     =item register
24 :    
25 :     Create an ID file by requesting IDs from the clearinghouse. This performs the
26 :     same function as B<count>, but takes the additional step of creating an ID
27 :     file that can be used to process the transactions.
28 :    
29 :     =item process
30 :    
31 :     Process the transactions and update the FIG data store. This will also create
32 :     a copy of each transaction file in which the pseudo-IDs have been replaced by
33 :     real IDs.
34 :    
35 :     =back
36 :    
37 :     =head2 The Transaction File
38 :    
39 :     Each transaction file is a standard tab-delimited file, one transaction per line. The
40 :     name of the file is C<tbl_diff_>I<org> where I<org> is an organism ID. All records in
41 :     the transaction file refer to transactions against the organism encoded in the file
42 :     name.
43 :    
44 :     The file must specify IDs for new features, but the real IDs cannot be known until
45 :     they are requested from the SEED clearing house. Therefore, each new ID is specified
46 :     in a special format consisting of the feature type (C<peg>, C<rna>, and so forth)
47 :     followed by a dot and the 0-based ordinal number of the new ID within that
48 :     feature type. So, for example, if the transaction file consists of a delete,
49 :     a change, and two adds, it might look like this
50 :    
51 :     delete fig|83333.1.peg.2
52 :     change fig|83333.1.peg.6 peg.0 ...
53 :     add peg.1 ...
54 :     add rna.0 ...
55 :    
56 :     Note that the old feature IDs do not participate in the numbering process, and the RNA
57 :     numbering is independent of the PEG numbering. In the discussion below of transaction
58 :     types, a field named I<newID> will always indicate one of these type/number pairs.
59 :     So, the field setup for the B<change> command is
60 :    
61 :     change fid newID locations aliases translation
62 :    
63 :     And the I<newID> corresponds to the C<peg.0> in the example above.
64 :    
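As a rough illustration of the numbering rule, the following sketch (not part of this script; the sub name C<CountPseudoIDs> is made up for the example) tallies how many new IDs each feature type needs in the sample transactions above. It assumes tab-delimited records in which the pseudo-ID is the second field of an add and the third field of a change, as described in this section.

```perl
use strict;
use warnings;

# Hypothetical helper: count the new IDs needed per feature type.
sub CountPseudoIDs {
    my @records = @_;
    my %needed;
    for my $record (@records) {
        my ($command, @fields) = split /\t/, $record;
        $command = lc $command;
        # The pseudo-ID follows the command in an add, and follows the
        # old feature ID in a change. Deletes need no new ID.
        my $newID = $command eq 'add'    ? $fields[0]
                  : $command eq 'change' ? $fields[1]
                  :                        undef;
        next unless defined $newID && $newID =~ /^([a-z]+)\.\d+$/;
        $needed{$1}++;
    }
    return %needed;
}

# The sample transactions from the text, minus the elided fields.
my %needed = CountPseudoIDs(
    "delete\tfig|83333.1.peg.2",
    "change\tfig|83333.1.peg.6\tpeg.0",
    "add\tpeg.1",
    "add\trna.0",
);
print "$_\t$needed{$_}\n" for sort keys %needed;
```

The old feature IDs in the delete and change records are never counted, and the C<peg> and C<rna> tallies are kept separately.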
65 :     The first field of each record is the transaction type. The list of subsequent fields
66 :     depends on this type.
67 :    
68 :     =over 4
69 :    
70 :     =item DELETE fid
71 :    
72 :     Deletes a feature. The feature is marked as deleted in the FIG database, which
73 :     causes it to be skipped or ignored by most of the SEED software. The ID of the
74 :     feature to be deleted is the second field (I<fid>).
75 :    
76 :     =item ADD newID locations translation
77 :    
78 :     Adds a new feature. The I<newID> indicates the feature type and its ordinal number.
79 :     The location is a comma-separated list of location strings. The translation is the
80 :     protein translation for the location. If the translation is omitted, then it will
81 :     be generated from the location information in the normal way.
82 :    
83 :     =item CHANGE fid newID locations aliases translation
84 :    
85 :     Changes an existing feature. The current copy of the feature is marked as deleted,
86 :     and a new feature is created with a new ID. All annotations and assignments are
87 :     transferred from the deleted feature to the new one. The location is a
88 :     comma-separated list of location strings. The aliases are specified as a comma-delimited
89 :     list of alternate names for the feature. These replace any existing aliases for the
90 :     old feature. If the alias list is omitted, no aliases will be assigned to the new
91 :     feature. The translation is the protein translation for the location. If the
92 :     translation is omitted, then it will be generated from the location information in the
93 :     normal way.
94 :    
95 :     =back
96 :    
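One practical consequence for the optional fields: Perl's C<split> drops trailing empty fields, so an omitted translation arrives as an undefined value rather than an empty string. A small sketch of this behavior follows; the sub name C<SplitTransaction> is hypothetical and the location string is a placeholder, not real data.

```perl
use strict;
use warnings;

# Hypothetical helper: split a transaction record into its
# lower-cased command and the remaining fields.
sub SplitTransaction {
    my ($record) = @_;
    chomp $record;
    my ($command, @fields) = split /\t/, $record;
    return (lc $command, @fields);
}

# An ADD record with no translation field. "LOCATION" is a placeholder.
my ($command, $newID, $locations, $translation) =
    SplitTransaction("ADD\tpeg.1\tLOCATION\n");
print "$command $newID: translation ",
      (defined $translation ? "supplied" : "omitted"), "\n";
```

The same holds when the record ends with a trailing tab, so a check with C<defined> treats both spellings of "omitted" alike.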
97 :     =head2 The ID File
98 :    
99 :     The ID file is a tab-delimited file containing one record for each feature type
100 :     of each organism that has a transaction file. Each record consists of three
101 :     fields.
102 :    
103 :     =over 4
104 :    
105 :     =item orgID
106 :    
107 :     The ID of the organism being updated.
108 :    
109 :     =item ftype
110 :    
111 :     The relevant feature type.
112 :    
113 :     =item firstNumber
114 :    
115 :     The first available ID number for the organism and feature type.
116 :    
117 :     =back
118 :    
119 :     The primary purpose of this file is to tell us how to create the feature IDs
120 :     for features we'll be adding to the data store, whether via a straight
121 :     B<add> or a B<change> that deletes an old ID and recreates the feature with a
122 :     new ID.
123 :    
124 :     If we need new IDs for an organism and feature type not listed in this ID file,
125 :     an error will be thrown.
126 :    
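To make the role of I<firstNumber> concrete, here is a minimal sketch of the ID arithmetic, mirroring the C<GetRealID> method in this script: the real feature number is the pseudo-ID's ordinal plus the first available number for that organism and feature type. The sub name C<RealIDFor> and the starting number 4000 are invented for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: turn a pseudo-ID into a full feature ID.
# $idHash maps "orgID.ftype" to the first available ID number.
sub RealIDFor {
    my ($idHash, $orgID, $newID) = @_;
    my ($ftype, $ordinal) = $newID =~ /^([a-z]+)\.(\d+)$/
        or die "Invalid pseudo-ID $newID\n";
    my $key = "$orgID.$ftype";
    my $base = $idHash->{$key};
    die "No ID range found for $key\n" unless defined $base;
    return "fig|$key." . ($base + $ordinal);
}

# One ID-file record: orgID, ftype, firstNumber (4000 is made up).
my %idHash;
my ($orgID, $ftype, $firstNumber) = split /\t/, "83333.1\tpeg\t4000";
$idHash{"$orgID.$ftype"} = $firstNumber;

print RealIDFor(\%idHash, "83333.1", "peg.1"), "\n";
```

With this record, the pseudo-ID C<peg.1> maps to C<fig|83333.1.peg.4001>, while a request for C<rna.0> dies because no range was supplied for that feature type.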
127 :     =head2 Command-Line Options
128 :    
129 :     The command-line options for this script are as follows.
130 :    
131 :     =over 4
132 :    
133 :     =item trace
134 :    
135 :     Numeric trace level. A higher trace level causes more messages to appear. The
136 :     default trace level is 3.
137 :    
138 : parrello 1.2 =item safe
139 :    
140 :     Wrap each organism's processing in a database transaction. This makes the process
141 :     slightly more restartable than it would be otherwise.
142 :    
143 : parrello 1.1 =cut
144 :    
145 :     use strict;
146 :     use Tracer;
147 :     use DocUtils;
148 :     use TestUtils;
149 :     use Cwd;
150 :     use File::Copy;
151 :     use File::Path;
152 :     use FIG;
153 :     use Stats;
154 :    
155 :     # Get the command-line options.
156 : parrello 1.2 my ($options, @parameters) = Tracer::ParseCommand({ trace => 3, safe => 0 }, @ARGV);
157 : parrello 1.1 # Set up tracing.
158 :     my $traceLevel = $options->{trace};
159 :     TSetup("$traceLevel Tracer DocUtils FIG", "TEXT");
160 :     # Get the FIG object.
161 :     my $fig = FIG->new();
162 : parrello 1.2 # Get its database handle.
163 :     my $dbh = $fig->db_handle;
164 : parrello 1.1 # Get the command.
165 :     my $mainCommand = lc shift @parameters;
166 :     Trace("$mainCommand command specified.") if T(2);
167 :    
168 :     # Create the ID table. This maps each organism/ftype pair to the currently-
169 :     # available ID number. If we're counting, we leave it empty. If we're not
170 :     # counting, we need to read it in.
171 :     my %idHash = ();
172 :     if ($mainCommand eq 'process') {
173 :     my $inCount = 0;
174 :     Open(\*IDFILE, "<$parameters[1]");
175 :     while (my $idRecord = <IDFILE>) {
176 :     chomp $idRecord;
177 :     my ($orgID, $ftype, $firstNumber) = split /\t/, $idRecord;
178 :     $idHash{"$orgID.$ftype"} = $firstNumber;
179 :     $inCount++;
180 :     }
181 :     Trace("$inCount ID ranges read in from $parameters[1].") if T(2);
182 :     }
183 :    
184 :     # Create some counters we can use for statistical purposes.
185 :     my $stats = Stats->new("genomes", "add", "change", "delete");
186 :     # Verify that the organism directory exists.
187 :     if (! -d $parameters[0]) {
188 :     Confess("Directory of genome files \"$parameters[0]\" not found.");
189 :     } else {
190 :     # Here we have a valid directory, so we need the list of transaction
191 :     # files in it.
192 :     my $orgsFound = 0;
193 :     my %transFiles = ();
194 :     my @transDirectory = OpenDir($parameters[0], 1);
195 :     # The next step is to create a hash of organism IDs to file names. This
196 :     # saves us some painful parsing later.
197 :     for my $transFileName (@transDirectory) {
198 :     if ($transFileName =~ /^tbl_diff_(\d+\.\d+)$/) {
199 :     $transFiles{$1} = "$parameters[0]/$transFileName";
200 :     $orgsFound++;
201 :     }
202 :     }
203 :     Trace("$orgsFound genome transaction files found in directory $parameters[0].") if T(2);
204 :     if (! $orgsFound) {
205 :     Confess("No \"tbl_diff\" files found in directory $parameters[0].");
206 :     } else {
207 :     # Loop through the organisms.
208 :     for my $genomeID (sort keys %transFiles) {
209 :     Trace("Processing changes for $genomeID.") if T(3);
210 :     # Create a statistics object for this organism.
211 :     my $orgStats = Stats->new("add", "change", "delete");
212 :     # Create a control block for passing around our key data.
213 :     my $controlBlock = { stats => $orgStats, genomeID => $genomeID,
214 :     idHash => \%idHash, options => $options,
215 :     fig => $fig, command => $mainCommand };
216 :     # Open the organism file.
217 :     my $orgFileName = $transFiles{$genomeID};
218 :     Open(\*TRANS, "<$orgFileName");
219 :     my $tranCount = 0;
220 :     # If we're processing rather than counting, open a file for
221 : parrello 1.2 # writing out corrected transactions and optionally start a
222 :     # database transaction.
223 : parrello 1.1 if ($mainCommand eq 'process') {
224 :     Open(\*TRANSOUT, ">$orgFileName.tbl");
225 : parrello 1.2 if ($options->{safe}) {
226 :     $dbh->begin_tran();
227 :     }
228 : parrello 1.1 }
229 :     # Loop through the organism's data.
230 :     while (my $transaction = <TRANS>) {
231 :     # Parse the record.
232 :     chomp $transaction;
233 :     my @fields = split /\t/, $transaction;
234 :     $tranCount++;
235 :     # Save the record number in the control block.
236 :     $controlBlock->{line} = $tranCount;
237 :     # Process according to the transaction type.
238 :     my $command = lc shift @fields;
239 :     if ($command eq 'add') {
240 :     Add($controlBlock, @fields);
241 :     } elsif ($command eq 'delete') {
242 :     Delete($controlBlock, @fields);
243 :     } elsif ($command eq 'change') {
244 :     Change($controlBlock, @fields);
245 :     } else {
246 :     $orgStats->AddMessage("Invalid command $command in line $tranCount for genome $genomeID");
247 :     }
248 :     $orgStats->Add($command, 1);
249 :     }
250 :     Trace("Statistics for $genomeID\n\n" . $orgStats->Show()) if T(3);
251 :     # Merge the statistics for this run into the global statistics object.
252 :     $stats->Accumulate($orgStats);
253 :     $stats->Add("genomes", 1);
254 : parrello 1.2 # Close the transaction input file.
255 : parrello 1.1 close TRANS;
256 : parrello 1.2 # If we're processing, close the transaction output file
257 :     # and optionally end the database transaction.
258 : parrello 1.1 if ($mainCommand eq 'process') {
259 :     close TRANSOUT;
260 : parrello 1.2 if ($options->{safe}) {
261 :     $dbh->commit_tran();
262 :     }
263 : parrello 1.1 }
264 :     }
265 :     }
266 :     Trace("Statistics for this run\n\n" . $stats->Show()) if T(1);
267 :     # If we're counting, we need to write out the counts file or allocate IDs
268 :     # from the clearinghouse.
269 :     if ($mainCommand ne "process") {
270 :     # Loop through the ID hash, printing the counts. We will also write them
271 :     # to a file called "counts.tbl".
272 :     my $countfile = "$parameters[0]/counts.tbl";
273 :     Open(\*COUNTFILE, ">$countfile");
274 :     print "\nTable of Counts\n";
275 :     for my $idKey (keys %idHash) {
276 :     $idKey =~ /^(\d+\.\d+)\.([a-z]+)$/;
277 :     my ($org, $ftype) = ($1, $2);
278 :     my $count = $idHash{$idKey};
279 :     print "$idKey\t$count\n";
280 :     print COUNTFILE "$org\t$ftype\t$count\n";
281 :     }
282 :     close COUNTFILE;
283 :     if ($mainCommand eq "register") {
284 :     # Here we are registering as well as counting. This process also produces
285 :     # the ID file.
286 :     Trace("Submitting ID file to clearing house.") if T(2);
287 :     system("register_features_batch <$countfile >$parameters[1]");
288 :     Trace("Clearing house request complete.") if T(2);
289 :     }
290 :     }
291 :     Trace("Processing complete.") if T(1);
292 :     }
293 :    
294 :     =head2 Utility Methods
295 :    
296 :     =head3 Add
297 :    
298 :     C<< Add($controlBlock, $newID, $locations, $translation); >>
299 :    
300 :     Add a new feature to the data store.
301 :    
302 :     =over 4
303 :    
304 :     =item controlBlock
305 :    
306 :     Reference to a hash containing the data structures required to manage feature
307 :     transactions.
308 :    
309 :     =item newID
310 :    
311 :     ID to give to the new feature.
312 :    
313 :     =item locations
314 :    
315 :     Location of the new feature, in the form of a comma-separated list of location
316 :     strings in SEED format.
317 :    
318 :     =item translation (optional)
319 :    
320 :     Protein translation string for the new feature. If this field is omitted and
321 :     the feature is a peg, the translation will be generated by normal means.
322 :    
323 :     =back
324 :    
325 :     =cut
326 :    
327 :     sub Add {
328 :     my ($controlBlock, $newID, $locations, $translation) = @_;
329 :     my $fig = $controlBlock->{fig};
330 :     # Extract the feature type and ordinal number from the new ID.
331 :     my ($ftype, $ordinal, $key) = ParseNewID($controlBlock, $newID);
332 :     # If we're counting, we need to count the ID. Otherwise, we need to
333 :     # add the new feature.
334 :     if ($controlBlock->{command} ne 'process') {
335 :     $controlBlock->{idHash}->{$key}++;
336 :     } else {
337 :     # Here we need to add the new feature.
338 :     my $realID = AddFeature($controlBlock, $ordinal, $key, $ftype,
339 :     $locations, "", $translation);
340 :     Trace("Feature $realID added for pseudo-ID $newID.") if T(4);
341 :     # Write a corrected transaction to the transaction output file.
342 :     print TRANSOUT "add\t$realID\t$locations\t$translation\n";
343 :     }
344 :     }
345 :    
346 :     =head3 Change
347 :    
348 :     C<< Change($controlBlock, $fid, $newID, $locations, $aliases, $translation); >>
349 :    
350 :     Replace a feature in the data store. The feature will be marked for deletion and
351 :     a new feature will be put in its place.
352 :    
353 :     This is a much more complicated process than adding a feature. In addition to
354 :     the add, we have to create new aliases and transfer across the assignment and
355 :     the annotations.
356 :    
357 :     =over 4
358 :    
359 :     =item controlBlock
360 :    
361 :     Reference to a hash containing the data structures required to manage feature
362 :     transactions.
363 :    
364 :     =item fid
365 :    
366 :     ID of the feature being changed.
367 :    
368 :     =item newID
369 :    
370 :     New ID to give to the feature.
371 :    
372 :     =item locations
373 :    
374 :     New location to give to the feature, in the form of a comma-separated list of location
375 :     strings in SEED format.
376 :    
377 :     =item aliases (optional)
378 :    
379 :     A new list of alias names for the feature.
380 :    
381 :     =item translation (optional)
382 :    
383 :     New protein translation string for the feature. If this field is omitted and
384 :     the feature is a peg, the translation will be generated by normal means.
385 :    
386 :     =back
387 :    
388 :     =cut
389 :    
390 :     sub Change {
391 :     my ($controlBlock, $fid, $newID, $locations, $aliases, $translation) = @_;
392 :     my $fig = $controlBlock->{fig};
393 :     # Extract the feature type and ordinal number from the new ID.
394 :     my ($ftype, $ordinal, $key) = ParseNewID($controlBlock, $newID);
395 :     # If we're counting, we need to count the ID. Otherwise, we need to
396 :     # replace the feature.
397 :     if ($controlBlock->{command} ne 'process') {
398 :     $controlBlock->{idHash}->{$key}++;
399 :     } else {
400 :     # Here we can go ahead and change the feature. First, we must
401 :     # get the old feature's assignment and annotations. Note that
402 :     # for the annotations we ask for the time in its raw format.
403 :     my @functions = $fig->function_of($fid);
404 :     my @annotations = $fig->feature_annotations($fid, 1);
405 :     # Create some counters.
406 :     my ($assignCount, $annotateCount) = (0, 0);
407 :     # Add the new version of the feature and get its ID.
408 :     my $realID = AddFeature($controlBlock, $ordinal, $key, $ftype, $locations,
409 :     $aliases, $translation);
410 :     # Copy over the assignments.
411 :     for my $assignment (@functions) {
412 :     my ($user, $function) = @{$assignment};
413 :     $fig->assign_function($realID, $user, $function);
414 :     $assignCount++;
415 :     }
416 :     # Copy over the annotations.
417 :     for my $annotation (@annotations) {
418 :     my ($oldID, $timestamp, $user, $text) = @{$annotation};
419 :     $fig->add_annotation($realID, $user, $text, $timestamp);
420 :     $annotateCount++;
421 :     }
422 :     # Mark the old feature for deletion.
423 :     $fig->delete_feature($fid);
424 :     # Tell the user what we did.
425 : parrello 1.2 $controlBlock->{stats}->Add("assignments", $assignCount);
426 :     $controlBlock->{stats}->Add("annotations", $annotateCount);
427 : parrello 1.1 Trace("Feature $realID created from $fid. $assignCount assignments and $annotateCount annotations copied.") if T(4);
428 :     # Write a corrected transaction to the transaction output file.
429 :     print TRANSOUT "change\t$fid\t$realID\t$locations\t$aliases\t$translation\n";
430 :     }
431 :     }
432 :    
433 :     =head3 Delete
434 :    
435 :     C<< Delete($controlBlock, $fid); >>
436 :    
437 :     Delete a feature from the data store. The feature will be marked as deleted,
438 :     which will remove it from consideration by most FIG methods. A garbage
439 :     collection job will be run later to permanently delete the feature.
440 :    
441 :     =over 4
442 :    
443 :     =item controlBlock
444 :    
445 :     Reference to a hash containing the data structures required to manage feature
446 :     transactions.
447 :    
448 :     =item fid
449 :    
450 :     ID of the feature to delete.
451 :    
452 :     =back
453 :    
454 :     =cut
455 :    
456 :     sub Delete {
457 :     my ($controlBlock, $fid) = @_;
458 :     my $fig = $controlBlock->{fig};
459 :     # Extract the feature type and count it.
460 :     my $ftype = FIG::ftype($fid);
461 :     $controlBlock->{stats}->Add($ftype, 1);
462 :     # If we're not counting, delete the feature.
463 :     if ($controlBlock->{command} eq 'process') {
464 :     # Mark the feature for deletion.
465 :     $fig->delete_feature($fid);
466 :     # Echo the transaction to the transaction output file.
467 :     print TRANSOUT "delete\t$fid\n";
468 :     }
469 :     }
470 :    
471 :     =head3 ParseNewID
472 :    
473 :     C<< my ($ftype, $ordinal, $key) = ParseNewID($controlBlock, $newID); >>
474 :    
475 :     Extract the feature type and ordinal number from an incoming new ID.
476 :    
477 :     =over 4
478 :    
479 :     =item controlBlock
480 :    
481 :     Reference to a hash containing the data structures needed to manage transactions.
482 :    
483 :     =item newID
484 :    
485 :     New ID specification taken from a transaction input record. This contains the
486 :     feature type followed by a period and then the ordinal number of the ID.
487 :    
488 :     =item RETURN
489 :    
490 :     Returns a three-element list. If successful, the list will contain the feature
491 :     type followed by the ordinal number and the key to use in the ID hash to find
492 :     the feature's true ID number. If the incoming ID is invalid, the list
493 :     will contain three C<undef>s.
494 :    
495 :     =back
496 :    
497 :     =cut
498 :    
499 :     sub ParseNewID {
500 :     # Get the parameters.
501 :     my ($controlBlock, $newID) = @_;
502 :     my ($ftype, $ordinal, $key);
503 :     # Parse the ID.
504 :     if ($newID =~ /^([a-z]+)\.(\d+)$/) {
505 :     # Here we have a valid ID.
506 :     ($ftype, $ordinal) = ($1, $2);
507 :     $key = $controlBlock->{genomeID} . ".$ftype";
508 :     # Update the feature type count in the statistics.
509 :     $controlBlock->{stats}->Add($ftype, 1);
510 :     } else {
511 :     # Here we have an invalid ID.
512 :     $controlBlock->{stats}->AddMessage("Invalid ID $newID found in line " .
513 :     $controlBlock->{line} . " for genome " .
514 :     $controlBlock->{genomeID} . ".");
515 :     }
516 :     # Return the result.
517 :     return ($ftype, $ordinal, $key);
518 :     }
519 :    
520 :     =head3 GetRealID
521 :    
522 :     C<< my $realID = GetRealID($controlBlock, $ftype, $ordinal, $key); >>
523 :    
524 :     Compute the real ID of a new feature. This involves interrogating the ID hash and
525 :     formatting a full-blown ID out of little bits of information.
526 :    
527 :     =over 4
528 :    
529 :     =item controlBlock
530 :    
531 :     Reference to a hash containing data used to manage the transaction process.
532 :    
533 :     =item ordinal
534 :    
535 :     Zero-based ordinal number of this feature. The ordinal number is added to the value
536 :     stored in the control block's ID hash to compute the real feature number.
537 :    
538 :     =item key
539 :    
540 :     Key in the ID hash relevant to this feature.
541 :    
542 :     =item RETURN
543 :    
544 :     Returns a fully-formatted FIG ID for the new feature.
545 :    
546 :     =back
547 :    
548 :     =cut
549 :    
550 :     sub GetRealID {
551 :     # Get the parameters.
552 :     my ($controlBlock, $ordinal, $key) = @_;
553 :     # Declare the return value.
554 :     my $retVal;
555 :     # Get the base value for the feature ID number.
556 :     my $base = $controlBlock->{idHash}->{$key};
557 :     # If it didn't exist, we have an error.
558 :     if (! defined $base) {
559 :     Confess("No ID range found for genome ID and feature type $key.");
560 :     } else {
561 :     # Now we have enough data to format the ID.
562 :     my $num = $base + $ordinal;
563 :     $retVal = "fig|$key.$num";
564 :     }
565 :     # Return the result.
566 :     return $retVal;
567 :     }
568 :    
569 :     =head3 CheckTranslation
570 :    
571 :     C<< my $actualTranslation = CheckTranslation($controlBlock, $ftype, $locations, $translation); >>
572 :    
573 :     If we are processing a PEG, ensure we have a translation for the peg's locations.
574 :    
575 :     This method checks the feature type and the incoming translation string. If the
576 :     translation string is empty and the feature type is C<peg>, it will generate
577 :     a translation string using the specified locations for the genome currently
578 :     being processed.
579 :    
580 :     =over 4
581 :    
582 :     =item controlBlock
583 :    
584 :     Reference to a hash containing data used to manage the transaction process.
585 :    
586 :     =item ftype
587 :    
588 :     Feature type (C<peg>, C<rna>, etc.)
589 :    
590 :     =item locations
591 :    
592 :     Comma-delimited list of location strings for the feature in question.
593 :    
594 :     =item translation (optional)
595 :    
596 :     If specified, will be returned to the caller as the result.
597 :    
598 :     =item RETURN
599 :    
600 :     Returns the protein translation string for the specified locations, or C<undef>
601 :     if no translation is warranted.
602 :    
603 :     =back
604 :    
605 :     =cut
606 :    
607 :     sub CheckTranslation {
608 :     # Get the parameters.
609 :     my ($controlBlock, $ftype, $locations, $translation) = @_;
610 :     my $fig = $controlBlock->{fig};
611 :     # Declare the return variable.
612 :     my $retVal;
613 :     if ($ftype eq 'peg') {
614 :     # Here we have a protein encoding gene. Check to see if we already have
615 :     # a translation.
616 :     if (defined $translation) {
617 :     # Pass it back unmodified.
618 :     $retVal = $translation;
619 :     } else {
620 :     # Here we need to compute the translation.
621 :     my $dna = $fig->dna_seq($controlBlock->{genomeID}, $locations);
622 :     $retVal = FIG::translate($dna);
623 :     }
624 :     }
625 :     # Return the result.
626 :     return $retVal;
627 :     }
628 :    
629 :     =head3 AddFeature
630 :    
631 :     C<< my $realID = AddFeature($controlBlock, $ordinal, $key, $ftype, $locations, $aliases, $translation); >>
632 :    
633 :     Add the specified feature to the FIG data store. This involves generating the new feature's
634 :     ID, creating the translation (if needed), adding the feature to the data store, and
635 :     queueing a request to update the similarities. The generated ID will be returned to the
636 :     caller.
637 :    
638 :     =over 4
639 :    
640 :     =item controlBlock
641 :    
642 :     Reference to a hash containing the data structures required to manage feature
643 :     transactions.
644 :    
645 :     =item ordinal
646 :    
647 :     Zero-based ordinal number of the proposed feature in the ID space. This is added to the
648 :     base ID number to get the real ID number.
649 :    
650 :     =item key
651 :    
652 :     Key to use for getting the base ID number from the ID hash.
653 :    
654 :     =item ftype
655 :    
656 :     Proposed feature type (C<peg>, C<rna>, etc.)
657 :    
658 :     =item locations
659 :    
660 :     Location of the new feature, in the form of a comma-separated list of location
661 :     strings in SEED format.
662 :    
663 :     =item aliases (optional)
664 :    
665 :     A new list of alias names for the feature.
666 :    
667 :     =item translation (optional)
668 :    
669 :     Protein translation string for the new feature. If this field is omitted and
670 :     the feature is a peg, the translation will be generated by normal means.
671 :    
672 :     =back
673 :    
674 :     =cut
675 :    
676 :     sub AddFeature {
677 :     # Get the parameters.
678 :     my ($controlBlock, $ordinal, $key, $ftype, $locations, $aliases, $translation) = @_;
679 :     my $fig = $controlBlock->{fig};
680 :     # We want to add a new feature using the information provided. First, we
681 :     # generate its ID.
682 :     my $retVal = GetRealID($controlBlock, $ordinal, $key);
683 :     # Next, we ensure that we have a translation.
684 :     my $actualTranslation = CheckTranslation($controlBlock, $ftype,
685 :     $locations, $translation);
686 :     # Now we add it to FIG.
687 :     $fig->add_feature($controlBlock->{genomeID}, $ftype, $locations, ($aliases || ""),
688 :     $actualTranslation, $retVal);
689 :     # Tell FIG to recompute the similarities.
690 :     $fig->enqueue_similarities([$retVal]);
691 :     # Return the ID we generated.
692 :     return $retVal;
693 :     }
694 :    
695 :     1;
