
Diff of /FigWebPages/Attributes.html


revision 1.4, Tue Jul 12 14:34:15 2005 UTC revision 1.8, Tue Jun 13 23:32:15 2006 UTC
# Line 1  Line 1 
1  <h1 style="text-align: center">Attributes</h1>  <h1 style="text-align: center">Attributes</h1>
2    
3  <h2 style="text-align: center">Updated July 11th, 2005. Rob Edwards</h2>  <h2 style="text-align: center">Updated July 11th, 2005. Rob Edwards</h2>
4    
5    
# Line 6  Line 7 
7                  <h3 style="text-align: center">Contents</h3>                  <h3 style="text-align: center">Contents</h3>
8                          <li><a href="#overview">Overview</a></li>                          <li><a href="#overview">Overview</a></li>
9                          <li><a href="#definitions">Definitions</a></li>                          <li><a href="#definitions">Definitions</a></li>
10                            <li><a href="#filelocations">File Locations</a></li>
11                            <li><a href="#scripts">Scripts for working with attributes</a></li>
12                          <li><a href="#methods">Methods for accessing attributes</a></li>                          <li><a href="#methods">Methods for accessing attributes</a></li>
13                            <ul>
14                          <li><a href="#get_attributes">get_attributes</a></li>                          <li><a href="#get_attributes">get_attributes</a></li>
15                          <li><a href="#add_attribute">add_attribute</a></li>                          <li><a href="#add_attribute">add_attribute</a></li>
16                          <li><a href="#delete_attribute">delete_attribute</a></li>                          <li><a href="#delete_attribute">delete_attribute</a></li>
# Line 19  Line 23 
23                          <li><a href="#guess_value_format">guess_value_format</a></li>                          <li><a href="#guess_value_format">guess_value_format</a></li>
24                          <li><a href="#attribute_location">attribute_location</a></li>                          <li><a href="#attribute_location">attribute_location</a></li>
25                  </ul>                  </ul>
26                    </ul>
27    
28  <p>I have added attributes to the database in a more significant way. This page is to document those attributes and ways to access/modify them. The page has two sections, a non-technical section for general discussion and overview, and a technical section for behind-the-scenes type information.</p>  <p>I have added attributes to the database in a more significant way. This page is to document those attributes and ways to access/modify them. The page has two sections, a non-technical section for general discussion and overview, and a technical section for behind-the-scenes type information.</p>
29    
# Line 42  Line 47 
47    
48  <li>Now choose WIDTH from the same pull down menu, and click show spreadsheet. Because width is a numeric variable, I grouped these key/value pairs in 1/10ths of the maximum. If you look at the Color Descriptions box you will see ranges (this is not perfect at the moment, but it is on the way).</li>  <li>Now choose WIDTH from the same pull down menu, and click show spreadsheet. Because width is a numeric variable, I grouped these key/value pairs in 1/10ths of the maximum. If you look at the Color Descriptions box you will see ranges (this is not perfect at the moment, but it is on the way).</li>
49    
50  <li>Now reset the WIDTH pull-down menu to empty (the first option in the list), and choose PIRSF from the menu labelled "color columns by each PEGs attribute" and click show spreadsheet. This is the same as before, but hopefully we can add more keys here and color other things.</li>  <li>Now reset the WIDTH pull-down menu to empty (the first option in the list), and choose structure from the menu labelled "color columns by each PEGs attribute" and click show spreadsheet. This is the same as before, but hopefully we can add more keys here and color other things.</li>
51    
52  <li>From one of the PEGs that is colored as having a PIRSF link click on the link to get to the protein page. There is the attributes box (as before), and a new "Edit Attributes" button. When you click this, you will get three fields, key, value, and URL. If you go to a protein that does not have any attributes yet, you still get the edit box to let you add some attributes.</li>  <li>From one of the PEGs that is colored as having a structure link click on the link to get to the protein page. There is the attributes box (as before), and a new "Edit Attributes" button. When you click this, you will get three fields, key, value, and URL. If you go to a protein that does not have any attributes yet, you still get the edit box to let you add some attributes.</li>
53    
54    
55  </ul>  </ul>
# Line 65  Line 70 
70                          <li>Keys are case sensitive</li>                          <li>Keys are case sensitive</li>
71                          <li>An optional mapping is provided between a key and an explanation of what the key means (see below)</li>                          <li>An optional mapping is provided between a key and an explanation of what the key means (see below)</li>
72                          <li>By default, any key can have multiple values. If a key is to have only one value then a boolean can be set (see below) to limit this behavior</li>                          <li>By default, any key can have multiple values. If a key is to have only one value then a boolean can be set (see below) to limit this behavior</li>
73                            <li>keys cannot contain the following characters: space, tab or newline or any of @$!#%^&*()`~{}[]|\:;"'<>?,./
74    
75    
76                  </ul>                  </ul>
77                  <li><em>Value</em>. The value is free form and there are no limitations on what is contained in the value.                  <li><em>Value</em>. The value is free form and there are no limitations on what is contained in the value.
78                  <li><em>URL</em>. The URL is optional, and not required for any data set.                  <li><em>URL</em>. The URL is optional, and not required for any data set.
79          </ul>          </ul>
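The key restrictions listed above can be checked with a single character class. This is a minimal sketch, not the FIG implementation: the function name is_valid_key is hypothetical, and rejecting an empty or undefined key is an assumption that the page does not state.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of FIG.pm): returns 1 if $key contains none of
# the characters the page lists as forbidden, 0 otherwise.
sub is_valid_key {
    my ($key) = @_;

    # Assumption: an undefined or empty key is treated as invalid.
    return 0 unless defined $key && length $key;

    # Forbidden: space, tab, newline, or any of @$!#%^&*()`~{}[]|\:;"'<>?,./
    return $key !~ /[ \t\n\@\$!#%^&*()`~{}\[\]|\\:;"'<>?,.\/]/ ? 1 : 0;
}

for my $key ('structure', 'PIRSF', 'my key', 'a.b') {
    printf "%-12s %s\n", $key, is_valid_key($key) ? 'ok' : 'rejected';
}
```

Underscores and hyphens are not in the forbidden list, so under this reading a key such as assigned_attributes passes the check.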
80          <br>          <br>
81          <li style="font-weight: 700">File Locations</li>          <li style="font-weight: 700"><a name="filelocations">File Locations</a></li>
82          <ul>          <ul>
83                  <li><em>General Attributes</em> Attributes are stored in the following locations:</li>                  <li><em>General Attributes</em> Attributes are stored in the following locations:</li>
84                  <ul>                  <ul>
85                          <li>$FIG_Config::organisms/xxxxx/Attributes contains the genome and organism attributes</li>                          <li>$FIG_Config::organisms/xxxxx/Attributes contains the genome and organism attributes</li>
86                          <li>$FIG_Config::organisms/xxxxx/Features/peg/Attributes contains the attributes for pegs</li>                          <li>$FIG_Config::organisms/xxxxx/Features/peg/Attributes contains the attributes for pegs</li>
87                          <li>$FIG_Config::organisms/xxxxx/Features/rna/Attributes contains the attributes for rnas... etc</li>                          <li>$FIG_Config::organisms/xxxxx/Features/rna/Attributes contains the attributes for rnas... etc</li>
88                          <li>Note that no general attributes should be stored in $FIG_Config::global (see below)</li>                          <li>Note that general attributes should not normally be stored in $FIG_Config::global (see below)</li>
89                  </ul>                  </ul>
90                    <li>All attributes files can hold comments as long as the line begins with a pound sign. Blank lines are also ignored.
91                  <br>                  <br>
92                  <li><em>Deleted Attributes</em>                  <li><em>Modified attributes</em></li>
93                  <ul>                  <ul>
94                          <li>Deleted attributes are stored in the text file $FIG_Config::global/Attributes/deleted_attributes. The only information that is stored here is the ID and the key. Note that this will currently delete all occurences of this key from this ID (hence with multiple values, all will be deleted).</li>                          <li>Modified attributes are stored in the files transaction_log</li>
95                            <li>There are separate transaction_logs in each of the locations where attributes are stored (e.g. the Features/peg/Attributes, Organism/nnnn.nn/Attributes, and Global/Attributes directories)</li>
96                            <li>The transaction_log file has the following format:
97                            <ol>
98                                    <li>Method. This must be one of ADD/CHANGE/DELETE</li>
99                                    <li>Feature ID (e.g. peg, genome, or RNA number)</li>
100                                    <li>Key</li>
101                                    <li>Old value</li>
102                                    <li>Old URL</li>
103                                    <li>New value</li>
104                                    <li>New URL</li>
105                            </ol>
106                            <li>The old value, old URL, new value, and new URL are optional depending on the method. For example, the old value and old URL can be null if the method is ADD, and the new value and new URL can be null if the method is DELETE.</li>
107                            <li>If the old value and old URL are omitted and the method is DELETE, all attributes that match the key will be deleted from the feature</li>
108    
109                  </ul>                  </ul>
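The seven transaction_log fields described above could be read along these lines. This is a sketch under stated assumptions rather than the loader's actual code: the page gives the field order but not the delimiter, so tab-separated fields are assumed, and parse_transaction_line and the hash key names are hypothetical. The feature ID in the usage example is illustrative only.

```perl
use strict;
use warnings;

# Methods the page allows in the first field.
my %VALID_METHOD = map { $_ => 1 } qw(ADD CHANGE DELETE);

# Field order as listed on the page; the hash key names are hypothetical.
my @FIELDS = qw(method fid key old_value old_url new_value new_url);

# Parse one transaction_log line into a hash reference.
# Assumption: fields are tab-separated (the page does not name the delimiter).
# Returns nothing if the method is not ADD/CHANGE/DELETE or the ID or key is missing.
sub parse_transaction_line {
    my ($line) = @_;
    chomp $line;

    # A limit of -1 keeps trailing empty fields instead of dropping them.
    my @parts = split /\t/, $line, -1;

    my %rec;
    for my $i (0 .. $#FIELDS) {
        my $v = $parts[$i];
        # Missing or empty optional fields are stored as undef.
        $rec{ $FIELDS[$i] } = (defined $v && length $v) ? $v : undef;
    }

    return unless defined $rec{method} && $VALID_METHOD{ $rec{method} };
    return unless defined $rec{fid} && defined $rec{key};
    return \%rec;
}

# Usage: a DELETE with no old value or old URL removes every value of the key.
my $rec = parse_transaction_line("DELETE\tfig|83333.1.peg.1\tstructure");
my $delete_all = $rec->{method} eq 'DELETE'
    && !defined $rec->{old_value}
    && !defined $rec->{old_url};
print "delete all values of '$rec->{key}'\n" if $delete_all;
```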
110                  <br>                  <br>
111                  <li><em>Metadata</em></li>                  <li><em>Metadata</em></li>
# Line 96  Line 119 
119                          </ol>                          </ol>
120                  </ul>                  </ul>
121          </ul>          </ul>
122            <li style="font-weight: 700"><a name="scripts">Scripts for working with attributes</a></li>
123            <li>Here are a few common scripts that you may want to use:
124            <ol>
125                    <li>load_attributes</li>
126                    <p>This will delete the current attributes database, look through all the potential places that attributes are stored, and add those attributes into the database. Both genome-specific and global attributes will be added. Finally, each of the transaction_logs is processed and the data added back into the database. This is used to add new data to a database, and to rebuild an existing database.</p>
127                    <li>gather_attributes</li>
128                    <p>Attributes are stored in disparate locations (global, genome, etc.), and this script looks through all of those locations and prints out any attributes that are found. gather_attributes can take an optional -d on the command line, in which case it will "delete" any attributes file that it finds. It doesn't actually delete the file; rather, it moves it to FIG_Config::temp/Attributes/deleted_attributes, and you can delete it from there.</p>
129                    <li>distribute_attributes</li>
130                    <p>This script will take any attributes on STDIN and write them to their appropriate locations.</p>
131    <p><b>Recommended</b> The recommended way to run these two commands is to first run gather attributes to collate the information and delete it:
132    <br><tt>
133    $gather_attributes -d > gathered_attributes.txt
134    </tt>
135    </br>
136    <br>And then to run the distribute command:</br>
137    <br><tt>
138    $sort -u gathered_attributes.txt | distribute_attributes
139    </tt></br>
140    
141    <p>This will recreate the attributes files, and overcome any potential problems of writing files that are being moved.</p>
142    
143                    <li>dump_attributes</li>
144                    <p>Dumps the current value of each attribute from the database, so these have all the changes in transaction_log already enacted.</p>
145            </ol>
146    
147    
148    
149  </ol>  </ol>
150    
151    
# Line 127  Line 177 
177  [fid, key, value, url]</p>  [fid, key, value, url]</p>
178  <p>You can request an E. coli key like this  <p>You can request an E. coli key like this
179  $fig-&gt;get_attributes('83333.1');</p>  $fig-&gt;get_attributes('83333.1');</p>
180  <p>You can request any PIRSF key like this  <p>You can request any "structure" key like this
181  $fig-&gt;get_attributes('', 'PIRSF');</p>  $fig-&gt;get_attributes('', 'structure');</p>
182  <p>You can request any google url like this  <p>You can request any google url like this
183  $fig-&gt;get_attributes('', '', '', 'http://www.google.com');</p>  $fig-&gt;get_attributes('', '', '', 'http://www.google.com');</p>
184  <p>NOTE: If there are no attributes an empty array will be returned. You need to check for this and not assume that it will be undef.</p>  <p>NOTE: If there are no attributes an empty array will be returned. You need to check for this and not assume that it will be undef.</p>
# Line 143  Line 193 
193          value          value
194          optional URL to add          optional URL to add
195          optional file to store the attributes in.</pre>          optional file to store the attributes in.</pre>
196  <p>A note on file names. At the moment the file assigned_attributes is used to store new attributes by default, and load_attributes loads that file last so any changes will overwrite existing keys. However this is not quite true since we can now have multiple key/values for a single peg. Using this method you can define a filename to store the attributes in. The directory structure will be figured out for you, so you can use something like ``pirsf'' as the file name.</p>  <p>A note on file names. At the moment the file assigned_attributes is used to store new attributes by default, and load_attributes loads that file last so any changes will overwrite existing keys. However this is not quite true since we can now have multiple key/values for a single peg. Using this method you can define a filename to store the attributes in. The directory structure will be figured out for you, so you can use something like ``structure'' as the file name.</p>
197  <p>  <p>
198  </p>  </p>
199  <h3><a name="delete_attribute">delete_attribute</a></h3>  <h3><a name="delete_attribute">delete_attribute</a></h3>
# Line 173  Line 223 
223  <h3><a name="erase_attribute_entirely">erase_attribute_entirely</a></h3>  <h3><a name="erase_attribute_entirely">erase_attribute_entirely</a></h3>
224  <p>This method will remove any notion of the attribute that you give it. It is different from delete as that just removes a single attribute associated with a peg. This will remove the files and uninstall the attributes from the database so there is no memory of that type of attribute. All of the attribute files are moved to FIG_Tmp/Attributes/deleted_attributes, and so you can recover the data for a while. Still, you should probably use this carefully!</p>  <p>This method will remove any notion of the attribute that you give it. It is different from delete as that just removes a single attribute associated with a peg. This will remove the files and uninstall the attributes from the database so there is no memory of that type of attribute. All of the attribute files are moved to FIG_Tmp/Attributes/deleted_attributes, and so you can recover the data for a while. Still, you should probably use this carefully!</p>
225  <p>I use this to clean out old PIR superfamily attributes immediately before installing the new correspondence table.</p>  <p>I use this to clean out old PIR superfamily attributes immediately before installing the new correspondence table.</p>
226  <p>e.g. my $status=$fig-&gt;erase_attribute_entirely(``pirsf'');</p>  <p>e.g. my $status=$fig-&gt;erase_attribute_entirely(``structure'');</p>
227  <p>This will return the number of files that were moved to the new location</p>  <p>This will return the number of files that were moved to the new location</p>
228  <p>  <p>
229  </p>  </p>
# Line 182  Line 232 
232  <p>Without any arguments:</p>  <p>Without any arguments:</p>
233  <p>Returns a reference to a hash, where the key is the type of feature (peg, genome, rna, prophage, etc), and the value is a reference to a hash where the key is the key name and the value is a reference to an array of all features with that id.</p>  <p>Returns a reference to a hash, where the key is the type of feature (peg, genome, rna, prophage, etc), and the value is a reference to a hash where the key is the key name and the value is a reference to an array of all features with that id.</p>
234  <p>e.g.</p>  <p>e.g.</p>
235  <p>print ``There are  '' , scalar @{{$fig-&gt;get_keys}-&gt;{'peg'}-&gt;{'PIRSF'}}, `` PIRSF keys in the database\n'';</p>  <p>print ``There are  '' , scalar @{{$fig-&gt;get_keys}-&gt;{'peg'}-&gt;{'structure'}}, `` Structure keys in the database\n'';</p>
236  <p>my $keys=$fig-&gt;get_keys;  <p>my $keys=$fig-&gt;get_keys;
237  foreach my $type (keys %$keys)  foreach my $type (keys %$keys)
238  {  {
# Line 215  Line 265 
265  <pre>  <pre>
266          $fig-&gt;get_values('peg'); # will get all values for pegs</pre>          $fig-&gt;get_values('peg'); # will get all values for pegs</pre>
267  <pre>  <pre>
268          $fig-&gt;get_values('peg', 'pirsf'); # will get all values for pegs with attribute pirsf</pre>          $fig-&gt;get_values('peg', 'structure'); # will get all values for pegs with attribute structure</pre>
269  <pre>  <pre>
270          $fig-&gt;get_values(undef, 'pirsf'); # will get all values for anything with that attribute</pre>          $fig-&gt;get_values(undef, 'structure'); # will get all values for anything with that attribute</pre>
271  <p>  <p>
272  </p>  </p>
273  <h3><a name="key_info">key_info</a></h3>  <h3><a name="key_info">key_info</a></h3>
274  <p>Access a reference to an array of [single, explanation]</p>  <p>Access a hash of key information. The data that are returned are:</p>
275    <table>
276    <tr><td>hash key name</td><td>what is it</td><td>data type</td></tr>
277    <tr><td>single</td><td>Whether the attribute can handle only a single data point</td><td>[boolean]</td></tr>
278    <tr><td>description</td><td>Explanation of key</td><td>[free text]</td></tr>
279    <tr><td>readonly</td><td>whether to allow read/write</td><td>[boolean]</td></tr>
280    <tr><td>is_cv</td><td>attribute is a cv term</td><td>[boolean]</td></tr>
281    </table>
282    
283  <p>Single is a boolean; if it is true, only the last value returned should be used. Note that the other methods will still return all the values; it is up to the implementer to ensure that only the last value is used.</p>  <p>Single is a boolean; if it is true, only the last value returned should be used. Note that the other methods will still return all the values; it is up to the implementer to ensure that only the last value is used.</p>
284  <p>Explanation is a user-derived explanation that can be defined.</p>  
285  <p>if a reference to an array is provided, along with the key, those values will be set.</p>  <p>Explanation is a user-derived explanation that can be free text</p>
286  <p>e.g.  
287  $fig-&gt;key_info($key, \@data); # set the data  <p>If a reference to a hash is provided, along with the key, those values will be set to the attribute_keys file</p>
288  $data=$fig-&gt;key_info($key); # get the data</p>  
289    <p>Returns an empty hash if the key is not provided or doesn't exist</p>
290    
291    <p>e.g.<br />
292    $fig->key_info($key, \%data); # set the data<br />
293    $data=$fig->key_info($key); # get the data<br />
294    </p>
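Because the other methods still return every value even when "single" is set, the caller has to apply that flag itself. The sketch below shows one way to do so. The helper name values_to_use is hypothetical, rows are assumed to have the [fid, key, value, url] shape described for get_attributes, and key_info is assumed to hand back a hash reference (the page shows it being assigned to a scalar). The feature ID and values are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of FIG.pm).
# $info: hash reference shaped like the key_info table (single, description, readonly, is_cv)
# @rows: attribute rows, each an array reference [fid, key, value, url]
# Returns only the last row when "single" is true, otherwise every row.
sub values_to_use {
    my ($info, @rows) = @_;
    return () unless @rows;    # no attributes: an empty list, never undef
    return $info->{single} ? ($rows[-1]) : @rows;
}

# Illustrative data only.
my @rows = (
    [ 'fig|83333.1.peg.1', 'structure', '1abc', '' ],
    [ 'fig|83333.1.peg.1', 'structure', '2xyz', '' ],
);

my @used = values_to_use({ single => 1 }, @rows);
print scalar(@used), " row(s) used, value: $used[0][2]\n";
```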
295    
296    
297    
298  <p>  <p>
299  </p>  </p>
300  <h3><a name="get_key_value">get_key_value</a></h3>  <h3><a name="get_key_value">get_key_value</a></h3>

