Revision 1.5
Sun Sep 3 03:26:23 2006 UTC by redwards
Branch: MAIN
Changes since 1.4: +38 -0 lines
adding metagenomics help

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
        "http://www.w3.org/TR/1999/REC-html401-19991224/loose.dtd">
<html>
<head>
	<title>Various Help Pieces</title>
<style><!--
	li{margin-top: 1em}
	h1{
		text-align: center; 
		font-size: 200%; 
		background: lightblue; 
		font-family: Arial, Helvetica, sans-serif;
		}
		
	h2{
		text-align: center; 
		font-weight: bolder; 
		background: #D3D3D3;
		font-family: Arial, Helvetica, sans-serif;
		}
	
	img{
		border: solid 1px #FF0000;
		margin: 5px;
	}
	
	div > img {border: none; display: block; margin: 0}
	
	div.multipleimages {
		border: solid 1px #FF0000;
		margin: 5px;
	}
	
	
	
	//-->
</style>
</head>
<body>
<h1>Various Random Pieces of Help</h1>

<p>Written by RobE, March 2005.</p>

<p><a href="#metagenomics">metagenomics help</a></p>


<a name="tagvalue" />
<h2>Tag Value Pairs</h2>



<ol type="I">
<li>Please see:</li>
<p>Please see the dedicated page on <a href="Attributes.html">attributes</a>, which now contains significantly more detail.</p>

<li>Overview</li>

<p>
I have tried to add tag/value pairs comprehensively throughout the SEED, following the lead of, and discussions with, Mike, Ross, Rick, and others. The key points are:
</p>

<ol type="1">
	<li>Organisms</li>
	
	<p>An organism can have tag/value pairs associated with it. To view the data associated with any organism choose that organism from the <a href="/FIG/index.cgi">FIG Search Page</a> and click on the <b>Statistics</b> button.</p>
	<p>From this same page, you can <b>Edit tag/value pairs</b>. There are three options:</p>
	<ol type="i">
	<li>Tag/Values that the organism already has.</li>
	<p>You can change the value of an existing tag, or delete the value, in which case the tag is removed.</p>
	<li>Tag/Values that are in the database but that the organism doesn't have</li>
	<p>There are pull down menus for you to select tags that the database is cognizant of, but for which we don't yet have data for the given organism. The value entry field is free-form and you can enter whatever you like.</p>
	<li>New tag/value pairs</li>
	<p>There are some empty fields where you can enter your own data sets for any organism.</p>

	</ol>
	
	<p>In addition, any of the programmers can add tag/value pairs in batch to the database. This is the most efficient way to get data in for parsing later.</p>

<li>Features</li>
<p>Any and every feature (including PEGs, RNAs, phage, IS elements and so on) can have tag/value pairs associated with it. Currently they are mostly used with PEGs.</p>
<p>The current rule of thumb (that may change) is that if a tag/value pair has a URL associated with it, it will be displayed on the protein page. Other tag/value pairs are not displayed by default, but once we have more data there we may decide to support selective display somehow.</p>
<p>You can edit the tag/value data associated with a particular feature from its page. There is a list of tags that have URLs associated with them, and a button to edit or add more data.</p>
<p>When you edit tag/value pairs you not only see the data that is displayed on the protein page, but currently you also see the tag/value pairs that are not displayed, allowing you to edit that data too.</p>
</ol>


 
<li>Uses of Tag/Value Pairs</li>

<p>The most comprehensive use of tag/value pairs is currently on the <b>subsystems spreadsheet page</b>. There are two pull down menus. The "<b>color rows by each organism's attribute</b>" option will color the spreadsheet rows based on each <b>organism's</b> value for the chosen tag. The "<b>color columns by each PEGs attribute</b>" option will color the columns based on <b>PEG</b> tag/value pairs.</p>

<p>The other place where tag/value pairs are being used extensively is in the <a href="#entry">selective SEED entry points</a>.</p>


<li>Exercises with tag/value pairs</li>
<ol type="1" start="1">
	<li>Choose an organism from the FIG search page and select statistics to see the list. There is an option at the bottom of the page to edit the key/value pairs. You have seen this before, I expect. This list still needs editing and cleaning up, to make it more sensible. I'll try and get that done.</li>
	<li>Open Rick's Flagellum subsystem, and scroll to the checkboxes/buttons at the bottom. Notice that the structure checkbox is now two pulldown lists. From the first one (labeled "color rows by each organism's attribute") choose MOTILE, and click show spreadsheet. The sheet is now highlighted with motile and non-motile organisms that have flagella (hmmm, interesting, huh? I think there might be a subsystem story here with pathogens like Shigella and motility; see PMID: 8596461 and PMID: 8682772). This view is also helped by pressing "apple -" (the apple key between the space bar and option or enter, and the minus key), which in Safari and Firefox will decrease the font size and put the whole table in one view. (Use "apple +" to make the font bigger, or "apple 0" to get it back to where you started.) There is a key at the bottom, just above the "show spreadsheet" button, so you know which color is which; in this case there are only motile and non-motile.</li>
	<li>Now choose WIDTH from the same pull down menu, and click show spreadsheet. Because width is a numeric variable, I grouped these key/value pairs in 1/10ths of the maximum. If you look at the Color Descriptions box you will see ranges (this is not perfect at the moment, but it is on the way).</li>
	<li>Now reset the WIDTH pull-down menu to empty (the first option in the list), and choose structure from the menu labelled "color columns by each PEGs attribute" and click show spreadsheet. This is the same as before, but hopefully we can add more tags here and color other things.</li>
	<li>From one of the PEGs that is colored as having a structure link, click on the link to get to the protein page. There is the attributes box (as before), and a new "Edit Attributes" button. When you click this, you will get three fields: key, value, and URL. If you go to a protein that does not have any attributes yet, you still get the edit box to let you add some attributes.</li>



<br />
The rules that apply here are:

<ol type="i">
	<li>the text is free form and can be whatever you like.</li>
	<li>the key is case-insensitive (at the moment generally uppercase, but I may change this to Sentence Case)</li>
	<li>if the URL is a webpage, the key/value pair will be visible on the protein page. The URL doesn't have to be a webpage, and as I mentioned before, will probably become a flag for many other things.</li>
	<li>you can add, edit, or delete individual key/value pairs here.</li>
	<li>if you have a lot of key/value pairs, you can send them to me and I'll load them in batch.</li>
</ol>


</ol>




</ol>
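<p>The grouping of a numeric attribute such as WIDTH into tenths of the maximum, mentioned in the exercises above, can be sketched roughly as follows. This is an illustrative Python sketch, not the SEED's Perl code: the function name <tt>bin_by_tenths</tt>, the sample values, and the handling of the maximum itself (placed in the last bin) are all assumptions.</p>

<pre>
```python
def bin_by_tenths(values):
    """Assign each non-negative number to one of ten bins (0 to 9),
    where each bin spans one tenth of the largest value.

    Hypothetical helper: the actual SEED grouping code is not shown
    in this page, so edge-case behavior here is an assumption.
    """
    maximum = max(values)
    if maximum == 0:
        # Every value is zero, so everything falls in the first bin.
        return [0 for _ in values]
    # value * 10 / maximum lies between 0 and 10; the maximum itself
    # would land on 10, so it is clamped into the last bin (9).
    return [min(int(value * 10 / maximum), 9) for value in values]


# Made-up widths, purely for illustration.
widths = [1, 5, 10, 20]
print(bin_by_tenths(widths))
```
</pre>

<p>Each bin index would then map to one color and one range in the Color Descriptions box.</p>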





<a name="entry" />
<h2>Selective SEED entry points</h2>

<p>I have written a CGI interface that mimics cyanoseed in an interactive way.</p>

<div class="technical">
<p>There are a couple of fairly major technical changes associated with this. First, I amended Subsystem.pm to allow active_subsetR to be selected by tag/value pairs. This required a new method, load_row_subsets_by_kv. This was a natural extension of the coloring that I had previously, and in theory you should be able to limit a spreadsheet to any key/value pair. For example, if you do the motile example described above, you will see that the motile/non-motile entries in the color key are now active links that will limit display of the spreadsheet to those organisms. This only limits display and does not affect other subsystem operations.</p>

<p>Second, I added support for CSS to HTML.pm. This is silent: if you don't do anything to call it, two default CSS sheets are defined and used. One is essentially blank, and keeps all the SEED interfaces looking like they do now. This is the default behavior unless you specifically provide a CSS sheet. HTML.pm now has options for default and alternate CSS sheets for any page displayed using show_page. I already have several CSS sheets to add, so I created a new directory, FigCSS, to hold them and their associated files (background images). During make this is converted to FIG/CGI/Html/css/.</p>

<p>I made a <a href="/FIG/organisms.cgi">new interface</a>. Using this interface we can interactively design specific websites for specific people or uses. I have a couple of primary examples that I am working on at the moment:</p>

<ol type="i">
	<li>Cyanobacterial SEED entry point (aka CyanoSEED)</li>
	<p>The basic site is <a href="http://cyanoseed.thefig.info">here</a><br />
	The new version is <a href="/FIG/organisms.cgi?show=cyano">here</a><br />
	Essentially all the sites are based on this template (although I massaged it a little to use &lt;div&gt; rather than &lt;iframe&gt; calls).</p>

	<li>Marine SEED entry point</li>
	<p>This <a href="/FIG/organisms.cgi?show=marine">site</a> is designed to support researchers interested in the marine organisms that we have in the SEED.</p>


	<li>My SEED</li>
	<p>This is another name that can easily be changed. In <a href="/FIG/organisms.cgi">your SEED</a> you can select whichever organisms interest you from the database and generate an entry point into the SEED. This demonstrates the versatility of the approach, and allows us to showcase specific SEED entry points whenever we want.</p>
	
</ol>


</div>

<a name="cookies" />
<h2>Cookies</h2>

<h3>The basics of Cookies</h3>

<p>I have created a cookie called FIG that contains tag/value pairs. The tag/value pairs are unlimited, but before you create your own you should take a look at the existing ones to make sure you are not overwriting someone else's!</p>

<h3>Getting the value of a cookie</h3>

<p>The easiest way to get and set the values of cookies is to use the method raelib->cookie. This method returns a tuple of a hash of the cookie contents and the cookie itself. So you can call it like this to get the hash:</p>

 <tt>
 my $cookie=[$raelib->cookie($cgi)]->[1];
 </tt>

<p>$cookie is a reference to a hash with all the cookie information, e.g. $cookie->{'user'} will return the username.</p>
<p>You can also get a specific value out like this:</p>

 <tt>
 my $user=[$raelib->cookie($cgi)]->[1]->{'user'};
 </tt>

<p>and now $user contains the username.</p>

<h3>Setting the value of a cookie</h3>

<p>Pass raelib->cookie the $cgi reference, and a reference to a hash with the data that you wish to be in the cookie. The only thing to note is that any value you pass in will replace the existing value of the cookie on the user's machine. The name part of the cookie will be modified according to FIG->clean_attribute_key, which removes reserved characters, but the value will not be changed. So you should probably use a "clean" name if you want to get the data back out! Existing cookie data will not be touched unless you change the values.</p>
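<p>The update rules just described (new values replace old ones, names are cleaned, values are left alone, and other existing entries are untouched) can be modeled with a small sketch. This is Python rather than the Perl used by raelib, and it does not call any SEED code: <tt>update_cookie</tt> and <tt>clean_key</tt> are hypothetical names, and the cleaning rule shown (keep only letters, digits, and underscores) is an assumption standing in for whatever FIG->clean_attribute_key actually does.</p>

<pre>
```python
import re


def clean_key(key):
    # Hypothetical stand-in for FIG->clean_attribute_key. The real set
    # of reserved characters is not given in this page; here anything
    # other than letters, digits, and underscores is dropped.
    return re.sub(r"[^A-Za-z0-9_]", "", key)


def update_cookie(existing, new_data):
    """Return the cookie contents after merging new_data into existing.

    Names are cleaned, values are stored unchanged, and entries that
    are not mentioned in new_data are carried over as they were.
    """
    merged = dict(existing)  # copy, so the caller's dict is not modified
    for key, value in new_data.items():
        merged[clean_key(key)] = value
    return merged


# Made-up cookie contents, purely for illustration.
current = {"user": "rob", "css": "default"}
print(update_cookie(current, {"css": "marine", "my tag!": "some value"}))
```
</pre>

<p>In this model a name such as "my tag!" would be stored under a different, cleaned name, which is why using a "clean" name from the start makes the data easier to retrieve.</p>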

<p><b>Note: </b>At the moment the cookie method resides in raelib, although it should be moved somewhere else. However, it maintains the state of the cookie using $self while the script is running (so you can add/edit the cookies at any time). To my knowledge HTML.pm (the most reasonable choice for cookies) does not maintain state this way (and is always called &HTML::foo). Perhaps we should make a module like HTMLHelper.pm or something?</p>


<h1><a name="metagenomics">Metagenomics Help</a></h1>



<h1>Some tricks and tips for finding things in metagenomes on the SEED</h1>

<h2>Rob Edwards, August, 2006</h2>

<p>These are some things that people have found very useful for finding things on the SEED. It is by no means an exhaustive list of what can be accomplished with the metagenomes that we have installed, but it should give you some starting points. Please email Rob with any tips that you find useful and would like to share.</p>


<h2>Getting Started</h2>
<p>Most of these start from the <a href="http://seed.sdsu.edu/FIG/index.cgi">SEED</a> index page.</p>
<ol>
<li>Genome IDs</li>
<p>Each genome has a static ID. For regular genomes these are generally the taxonomic ID of the organism followed by a version number, so 83333.1 is the first version of the <i>E. coli</i> K-12 genome. For the metagenomes these are either "seven-9's" or "seven-4's" like 4444444.23 or 9999999.2. In general I have been using seven-4's to indicate 454 sequences and seven-9's to indicate Sanger sequences.</p>
<li>Simple statistics</li>
<p>Scroll down until the genomes show. If you have not yet chosen the metagenomes, please do so by choosing <em>Environmental samples</em> and <em>All</em> from the Domains and Completeness checkboxes, and then clicking update. This will show you a list of the environmental genomes.</p>
<p>To find some statistics about a genome, choose it in the list, and click Statistics.</p>
<li>Searching in a single genome</li>
<p>From the statistics page (above), you can search a single genome. You can also enter its genome number (something like 4444444.45) in the search box, and then some text of your choice. That will search only your genome.</p>
<li>Metabolic Overview of a Metagenome</li>
<p>Choose a metagenome from the list, and then choose a metabolic map from the menu marked <em>Metabolic Overviews and Subsystem Maps (via KEGG & SEED) - Choose Map</em>. Clicking the button labeled <em>Metabolic Overview</em> will show you a map for that sample with all genes in the sample highlighted.</p>
</ol>
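<p>The genome ID convention described above can be illustrated with a short sketch. This is Python written only for this help page, not part of the SEED: the function name <tt>describe_genome_id</tt> is hypothetical, and because the seven-4's and seven-9's rule is only a general habit, the labels it returns should be read as a guess rather than a guarantee.</p>

<pre>
```python
def describe_genome_id(genome_id):
    """Split a genome ID such as 83333.1 into its prefix and version,
    and guess what kind of entry it is from the prefix.

    Assumes the "prefix.version" layout described in the text; the
    454/Sanger labels follow the general convention only.
    """
    prefix, version = genome_id.split(".", 1)
    if prefix == "4444444":
        kind = "metagenome (454 sequences)"
    elif prefix == "9999999":
        kind = "metagenome (Sanger sequences)"
    else:
        kind = "genome (taxonomy ID " + prefix + ")"
    return prefix, int(version), kind


# The example IDs are the ones used in the text above.
for example in ["83333.1", "4444444.23", "9999999.2"]:
    print(example, describe_genome_id(example))
```
</pre>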

<h2>Similarities</h2>
<p>In general I have installed similarities and annotated the metagenomes based on the data in the SEED. For any given protein you can find the information that was used to generate that annotation. Go to the protein page, by clicking on a protein ID, or searching for something in your genome. If you are just exploring, I suggest searching for something like Histidine along with your sample's ID number to get a reasonable number of hits. Then, from the protein page, near the bottom, you can retrieve the precomputed similarities (sims). Increase the E-value cutoff to 10, because shorter sequences tend to have worse E-values, and click Similarities to see all the similarities, highlighted by organism.</p>

<h2>Heat Maps</h2>
<p><a href="http://seed.sdsu.edu/FIG/heat_map.cgi">Heat maps</a> are a SEED-centric way of looking at the metagenomes. I have precomputed the connections between a single metagenome and each of the subsystems in the SEED. This allows you to paint and color the metagenomes. There is more help on the heat map page.</p>




</body>

</html>

