Revision 1.1, Thu Dec 30 14:25:07 2004 UTC, by overbeek

<h1>Tools to Support Phylogenetic Analysis</h1>
Over the coming year FIG and its friends plan on developing and releasing a number of tools
to support phylogenetic analysis.  At this point, we are making
available a tool for inserting taxa into a tree, assuming that one has
an alignment of SSU rRNA and a tree that includes some subset of the
taxa in the alignment.  At one time this technology was used to extend
the tree distributed by the Ribosomal Database Project.  We have
revived it and are making it available now as a first step in supporting the
development of a large SSU-based phylogenetic tree.  By itself, it is
not adequate for many tasks.  A few key tools are needed to complement
the set we are making available, and we plan to release these additional
tools over the coming year.

<h2>Extending an Existing Tree by Insertion of One Sequence at a Time</h2>

Assuming that you have
<li> a new alignment (call it ssu_alignment.fasta),
<li> a table giving a correspondence between IDs and organisms (call
it ssu.names), and
<li> an old tree (call it old.ssu.tree),

you should follow these steps to extend the tree:
<li> First, you need to verify that all of the IDs in the tree are still
in the alignment.   To do this, run
   compare_tree_and_alignment ssu_alignment.fasta old.ssu.tree tmp.tree.only tmp.ali.only tmp.both
If tmp.tree.only is not empty, run
	mv old.ssu.tree old.ssu.tree.BAK
	subtree_of old.ssu.tree.BAK < tmp.both > old.ssu.tree
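
The check above amounts to a set comparison between the tip IDs in the tree and the sequence IDs in the alignment. The following is only a rough Python sketch of that idea, not the implementation of compare_tree_and_alignment. It assumes a FASTA alignment whose ID is the first word of each header line and a Newick tree with unquoted tip labels; the function names are hypothetical.

```python
import re

def fasta_ids(fasta_text):
    """Return the set of sequence IDs (first word after '>') in FASTA text."""
    ids = set()
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            fields = line[1:].split()
            if fields:
                ids.add(fields[0])
    return ids

def newick_tip_ids(newick_text):
    """Return the set of tip labels in a Newick string.

    Assumption: labels are unquoted, so a tip label is any name that
    directly follows '(' or ','. Labels after ')' (internal nodes) are skipped.
    """
    return set(re.findall(r"[(,]\s*([^\s(),:;]+)", newick_text))

def compare(fasta_text, newick_text):
    """Return (tree-only IDs, alignment-only IDs, shared IDs), each sorted."""
    ali = fasta_ids(fasta_text)
    tree = newick_tip_ids(newick_text)
    return sorted(tree - ali), sorted(ali - tree), sorted(tree & ali)

# Toy data: C is only in the tree, D is only in the alignment.
alignment = ">A\nACGT\n>B\nAC-T\n>D\nA-GT\n"
tree = "((A:0.1,B:0.2):0.05,C:0.3);"
tree_only, ali_only, both = compare(alignment, tree)
print("tree only:", tree_only)
print("alignment only:", ali_only)
print("both:", both)
```

In this toy case the tree-only list is not empty, which corresponds to the situation where the tree has to be cut down to the shared IDs before any insertions.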

   If you are inserting into a tree that may contain fragments, you should
   probably consider removing short sequences and then inserting the remaining
   sequences in descending order of length (i.e., number of nonambiguous
   characters).  To get these counts for the sequences in the alignment, run
	count_bases < ssu_alignment.fasta | sort -n -r -k 2 > nonambiguous

   We then recommend running

	initial_set tmp.both nonambiguous > initial.ids
	subtree_of old.ssu.tree < initial.ids > initial.tree
	mv initial.tree old.ssu.tree
	to_insert nonambiguous initial.ids > tmp.ali.only
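
To make the ordering idea concrete, here is a small Python sketch that counts nonambiguous characters per sequence, drops short sequences, and lists the rest from longest to shortest. It is not a description of how count_bases, initial_set, or to_insert work internally. The alphabet (A, C, G, T, U) and the `min_count` cutoff are assumptions chosen for illustration.

```python
def count_unambiguous(fasta_text, alphabet="ACGTU"):
    """Map each sequence ID to its number of unambiguous bases.

    Gap characters and ambiguity codes such as 'N' are not counted.
    """
    allowed = set(alphabet)
    counts = {}
    current = None
    for line in fasta_text.splitlines():
        line = line.strip()
        if line.startswith(">"):
            fields = line[1:].split()
            current = fields[0] if fields else None
            if current is not None:
                counts.setdefault(current, 0)
        elif current is not None:
            counts[current] += sum(1 for ch in line.upper() if ch in allowed)
    return counts

def insertion_order(counts, min_count):
    """Drop sequences below min_count (a hypothetical cutoff) and sort the
    rest by descending count, breaking ties by ID."""
    kept = [(seq_id, n) for seq_id, n in counts.items() if n >= min_count]
    return sorted(kept, key=lambda item: (-item[1], item[0]))

# Toy alignment: s3 is a short fragment and falls below the cutoff.
alignment = ">s1\nACGU--NN\n>s2\nACGUACGU\n>s3\nac------\n"
counts = count_unambiguous(alignment)
order = insertion_order(counts, min_count=3)
for seq_id, n in order:
    print(seq_id, n)
```

Inserting the longest sequences first means that the sequences with the most information are the first to be placed in the tree.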
<li> Now you need to get weights and rates for each column of the
alignment.  To do this, run

	make_rates ssu_alignment.fasta old.ssu.tree > weights_and_rates

<li> Now you are ready to do insertions.  To accomplish this, run

   insert_all ssu_alignment.fasta old.ssu.tree weights_and_rates < tmp.ali.only > new.ssu.tree

   As the insertion runs, it updates "old.ssu.tree".  This means that if
   the run is terminated, you can remove from the start of tmp.ali.only
   the IDs that have already been inserted, and then just restart it.
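
One way to work out which IDs to take off the front of tmp.ali.only before restarting is to compare the queue with the tips already in the updated tree. The Python sketch below illustrates that bookkeeping only. It assumes the tree on disk is Newick with unquoted tip labels and that inserted sequences appear as tips under the same IDs; `remaining_ids` is a hypothetical helper, not one of the supplied tools.

```python
import re

def remaining_ids(ids_to_insert, newick_text):
    """Return the queued IDs that are not yet tips of the tree, in queue order.

    Assumption: tip labels are unquoted and directly follow '(' or ','.
    """
    in_tree = set(re.findall(r"[(,]\s*([^\s(),:;]+)", newick_text))
    return [seq_id for seq_id in ids_to_insert if seq_id not in in_tree]

# Toy case: s5 and s6 were inserted before the run stopped.
queue = ["s5", "s6", "s7", "s8"]
partial_tree = "((s1:0.1,s5:0.2):0.05,(s2:0.1,s6:0.1):0.02,s3:0.3);"
todo = remaining_ids(queue, partial_tree)
print(todo)
```

The resulting list would then take the place of tmp.ali.only for the restarted run.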

You can display your tree (very crudely) using 
	display_tree new.ssu.tree ssu.names

Note that the tree is unrooted.  We supply a command for rooting it,
but you do need to understand exactly where you wish to place the
root.  To root it, use

	root_at Node1 Node2 FractionBetween < UnrootedTree > RootedTree

where the nodes are specified as either
<li> a tip id, or
<li> three ids separated by commas (which gives a unique point in the
tree).

You can extract a representative tree by using 
	representative_tree big.tree N 
where <i>big.tree</i> is the file containing a Newick tree and
<i>N</i> is the number of nodes desired in the representative tree.
