Diff of /FigKernelPackages/SAPtutorial.pm

revision 1.7, Wed Aug 19 17:04:38 2009 UTC revision 1.8, Thu Oct 1 23:43:30 2009 UTC
# Line 22  Line 22 
22  where C<$document> is usually a hash reference and C<$args> is B<always> a hash  where C<$document> is usually a hash reference and C<$args> is B<always> a hash
23  reference. The method description includes a section called I<Parameter Hash  reference. The method description includes a section called I<Parameter Hash
24  Fields> that describes the fields in C<$args>. For example, L<SAP/taxonomy_of>  Fields> that describes the fields in C<$args>. For example, L<SAP/taxonomy_of>
25  has a field called C<ids> that is to be a list of genome IDs and an optional  has a field called C<-ids> that is to be a list of genome IDs and an optional
26  field called C<format> that indicates whether you want taxonomy groups  field called C<-format> that indicates whether you want taxonomy groups
27  represented by numbers, names, or both. To call the I<taxonomy_of> service,  represented by numbers, names, or both. To call the I<taxonomy_of> service,
28  you create a B<SAPserver> object and call a method with the same name as the  you create a B<SAPserver> object and call a method with the same name as the
29  service.  service.
# Line 87  Line 87 
87    
88      my $sapServer = SAPserver->new();      my $sapServer = SAPserver->new();
89    
90  Now we use I<all_genomes> to get a list of the IDs for complete genomes.  Now we use I<all_genomes> to get a list of the complete genomes.
91  I<all_genomes> will normally return B<all> genome IDs, but we use the  I<all_genomes> will normally return B<all> genomes, but we use the
92  C<complete> option to restrict the output to complete genomes.  C<-complete> option to restrict the output to those that are complete.
93    
94      my $genomeIDs = $sapServer->all_genomes(complete => 1);      my $genomeIDs = $sapServer->all_genomes(-complete => 1);
95    
96  All we want are the genome IDs, so we use a PERL trick to convert the  All we want are the genome IDs, so we use a PERL trick to convert the
97  hash reference to a list reference.  hash reference in C<$genomeIDs> to a list reference.
98    
99      $genomeIDs = [ keys %$genomeIDs ];      $genomeIDs = [ keys %$genomeIDs ];
100    
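The hash-to-list conversion above can be tried without a server connection. The following is a minimal sketch, assuming the returned hash is keyed by genome ID; the hash contents here are a hypothetical stand-in built from the sample genomes in this tutorial, and the values (genome names) are an assumption about what the service returns. The `sort` is added only to make the output order predictable.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the hash reference returned by all_genomes.
# Keys are genome IDs; the values (genome names) are an assumption.
my $genomes = {
    '221109.1' => 'Oceanobacillus iheyensis HTE831',
    '204722.1' => 'Brucella suis 1330',
};

# "keys %$genomes" dereferences the hash and lists its keys.
# The square brackets wrap that list in a new list reference.
# "sort" is not in the original line; it only fixes the order.
my $genomeIDs = [ sort keys %$genomes ];

# Print one genome ID per line.
print "$_\n" for @$genomeIDs;
```

Without the `sort`, the IDs come back in whatever order PERL stores the hash keys, which is fine when the list is only used as service input.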
# Line 119  Line 119 
119  An excerpt from the output of this script is shown below. The first column contains  An excerpt from the output of this script is shown below. The first column contains
120  a genome ID, the second contains the representative genome's ID, and the third is  a genome ID, the second contains the representative genome's ID, and the third is
121  the full taxonomy. Note that the two genomes with very close taxonomies have the  the full taxonomy. Note that the two genomes with very close taxonomies have the
122  same representative genome: this is the expected vehavior.  same representative genome: this is the expected behavior.
123    
124      221109.1    221109.1    Bacteria Firmicutes Bacilli Bacillales Bacillaceae Oceanobacillus Oceanobacillus iheyensis Oceanobacillus iheyensis HTE831      221109.1    221109.1    Bacteria Firmicutes Bacilli Bacillales Bacillaceae Oceanobacillus Oceanobacillus iheyensis Oceanobacillus iheyensis HTE831
125      204722.1    204722.1    Bacteria Proteobacteria Alphaproteobacteria Rhizobiales Brucellaceae Brucella Brucella suis Brucella suis 1330      204722.1    204722.1    Bacteria Proteobacteria Alphaproteobacteria Rhizobiales Brucellaceae Brucella Brucella suis Brucella suis 1330
# Line 168  Line 168 
168  =head2 Specifying Gene IDs  =head2 Specifying Gene IDs
169    
170  Many of the Sapling Server services return data on genes (a term we use rather  Many of the Sapling Server services return data on genes (a term we use rather
171  loosely to include any kind of genetic I<locus> or C<feature>). The standard  loosely to include any kind of genetic I<locus> or I<feature>). The standard
172  method for identifying a gene is the I<FIG ID>, an identifying string that  method for identifying a gene is the I<FIG ID>, an identifying string that
173  begins with the characters C<fig|> and includes the genome ID, the gene type,  begins with the characters C<fig|> and includes the genome ID, the gene type,
174  and an additional number for uniqueness. For example, the FIG ID  and an additional number for uniqueness. For example, the FIG ID
# Line 176  Line 176 
176  Bacillus halodurans C-125 (I<272558.1>).  Bacillus halodurans C-125 (I<272558.1>).
177    
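To make the structure of a FIG ID concrete, here is a minimal sketch that splits one into its parts. The helper name `parse_fig_id` is hypothetical, the dotted layout after C<fig|> is an assumption based on the description above, and the ID in the example is made up for illustration around the genome ID I<272558.1>.

```perl
use strict;
use warnings;

# parse_fig_id is a hypothetical helper, not part of the Sapling Server API.
# Assumed layout: fig|<genome ID>.<gene type>.<number>
sub parse_fig_id {
    my ($id) = @_;
    if ($id =~ /^fig\|(\d+\.\d+)\.([^.]+)\.(\d+)$/) {
        return { genome => $1, type => $2, number => $3 };
    }
    # Not a FIG ID under the assumed layout.
    return;
}

# Illustrative ID only; it is not taken from the tutorial.
my $parts = parse_fig_id('fig|272558.1.peg.1');
if ($parts) {
    print "genome=$parts->{genome} type=$parts->{type} number=$parts->{number}\n";
}
```

An ID from another database fails the match and yields an undefined value, which is one simple way to decide whether a source needs to be specified.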
178  Frequently, however, you will have a list of gene IDs from some other  Frequently, however, you will have a list of gene IDs from some other
179  database (e.g. I<NCBI> (L<http://www.ncbi.nlm.nih.gov>), I<UniProt> (L<http://www.uniprot.org>))  database (e.g. I<NCBI>, I<UniProt>) or in a community format such as Locus Tags
180  or in a community format such as Locus Tags or gene names. Most services that  or gene names. Most services that take gene IDs as input allow you to specify a
181  take gene IDs as input allow you to specify a C<source> option that indicates  C<-source> option that indicates the type of IDs being used. The acceptable
182  the type of IDs being used. The acceptable formats are as follows.  formats are as follows.
183    
184  =over 4  =over 4
185    
# Line 247  Line 247 
247  also a risk that the server request might time out. If this happens, you may  also a risk that the server request might time out. If this happens, you may
248  want to consider breaking the input into smaller batches. At some point, the  want to consider breaking the input into smaller batches. At some point, the
249  server system will perform sophisticated flow control to reduce the risk of  server system will perform sophisticated flow control to reduce the risk of
250  timeout errors, but we are not yet at that point.  timeout errors, but we are not yet there.
251    
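Breaking the input into smaller batches can be sketched as follows. The helper name `batch_ids` and the batch size are hypothetical; nothing here calls the server, and the right batch size for avoiding timeouts is not stated in this tutorial.

```perl
use strict;
use warnings;

# batch_ids is a hypothetical helper, not part of the Sapling Server API.
# It splits a list reference into batches of at most $size elements.
sub batch_ids {
    my ($ids, $size) = @_;
    die "batch size must be positive\n" if $size < 1;
    my @pending = @$ids;    # copy, so the caller's list is left untouched
    my @batches;
    while (@pending) {
        # splice removes up to $size elements from the front of @pending.
        push @batches, [ splice(@pending, 0, $size) ];
    }
    return \@batches;
}

# Seven placeholder IDs split into batches of three.
my @ids = map { "id$_" } 1 .. 7;
my $batches = batch_ids(\@ids, 3);

my $n = 0;
for my $batch (@$batches) {
    $n++;
    print "Batch $n: " . scalar(@$batch) . " IDs\n";
}
```

In a real script, each batch would be passed to the service in place of the full list and the results merged afterward.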
252  =head3 Retrieving Functional Roles  =head3 Retrieving Functional Roles
253    
# Line 296  Line 296 
296          }          }
297      }      }
298    
299  Sample output from this script is shown below. Note that one of the input IDs was  Sample output from this script is shown below. Note that one of the input IDs
300  not found.  was not found.
301    
302      HYPA_ECO57      [NiFe] hydrogenase nickel incorporation protein HypA      HYPA_ECO57      [NiFe] hydrogenase nickel incorporation protein HypA
303      17KD_RICBR      rickettsial 17 kDa surface antigen precursor      17KD_RICBR      rickettsial 17 kDa surface antigen precursor
# Line 311  Line 311 
311      Q8YY27_ANASP was not found.      Q8YY27_ANASP was not found.
312    
313  B<ids_to_subsystems> returns roles in subsystems. Roles in subsystems have  B<ids_to_subsystems> returns roles in subsystems. Roles in subsystems have
314  several differences from general functional roles. A single gene may be in  several differences from general functional roles. Only half of the genes in the
315  multiple subsystems and may have multiple roles in a subsystem. In addition,  database are currently associated with subsystems. In addition, a single gene may
316  only half of the genes in the database are currently associated with subsystems.  be in multiple subsystems and may have multiple roles in a subsystem.
317    
318  As a result, instead of a single string per incoming gene, B<ids_to_subsystems>  As a result, instead of a single string per incoming gene, B<ids_to_subsystems>
319  returns a list. Each element of the list consists of the role name followed by  returns a list. Each element of the list consists of the role name followed by
320  the subsystem name. This makes the processing of the results a little more  the subsystem name. This makes the processing of the results a little more
