push @$html, ( 
$cgi->start_form(), 
$cgi->h2("Heat Map NQ"), 
$cgi->p( 
Heat Map NQ is designed to show relationships between subsystems in different environmental samples. Each subsystem that is present in a sample gets a score. The score is calculated by counting the number of sequences that are similar to a protein in each subsystem. This number is divided by the total number of sequences from the sample that are similar to any protein in a subsystem, so it is the fraction of sequences in subsystems. Therefore the size of the sample should not necessarily affect the number that you see. Please note that these numbers are only approximate and "for entertainment purposes only". We will integrate our statistical comparison package xipetotec into this analysis so that you can identify those subsystems that are present at unlikely levels. 
Heat map NQ is designed to show relationships between subsystems in different genomes. This is the prototype 
The raw numbers mean that if 10 sequences hit any subsystem in total, then a subsystem hit by two of those sequences gets a score of 0.2 (2/10). In practice, however, the counts tend to be more like 2 and 100000, so the score is very small in most cases. The multiplier therefore lets you multiply all scores by a number, so that you see 2 instead of 0.0000002. The nonquantitative analysis gets biased by one or two outliers, so you can also overcome the outlier effect by trimming off the maximums: anything above your chosen value is set to the maximum. Note that the maximum value refers to the unmodified raw score. 
My recommendation is that you display different areas of metabolism, with nonquantitative differences grouped in either 5 or 10 groups. 
The raw scores may not mean that 2 is twice as much as 1, just that 2 is more than 1. Because of that, and because it is easier to visualize groups of data, you can aggregate all the data into chunks. This takes all the scores and splits them into however many groups you ask for. That is the nonquantitative analysis. 
You can also see the raw data by using the quantitative analysis. For this version, you cannot select genomes (that will be coming), and so 
You can also see the raw data by using the quantitative analysis checkbox, but I am not certain how much you can infer from these numbers: does 2 mean twice as much as 1? 

If you compare all of metabolism, the nonquantitative analysis will be biased by one or two outliers. You can also overcome the outlier effect by trimming off the maximums, so that anything above that value is set as the maximum. Note that the maximum value is from the unmodified raw score. 


), 

$cgi->h2("Dataset"), 
$cgi->p( 
Please choose some genomes: 
push @$tab, \@row; 
} 
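@$tab = sort { $a->[0]->[0] cmp $b->[0]->[0] } @$tab; 
# sort the table by column 1, then column 2, then column 3 
# (each || falls through to the next column only when the previous comparison is a tie) 
@$tab = sort { $a->[0]->[0] cmp $b->[0]->[0] || $a->[1]->[0] cmp $b->[1]->[0] || $a->[2]->[0] cmp $b->[2]->[0] } @$tab; 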
# merge the table 
# skip the data columns 