Phylogenies, i.e., the evolutionary histories of groups of organisms,
play a major role in representing the interrelationships among
biological entities. Their pervasiveness has led biologists,
mathematicians, and computer scientists to design a variety of methods
for modeling, comparing, and reconstructing them. We address two
problems with existing methods for phylogeny reconstruction, and
present our solutions.
First, we address the inaccuracy of phylogenetic tree reconstruction
methods. We present a new method, called DCM-NJ+ML, that is both fast
and accurate, and outperforms all methods in its class. The method is
based on a divide-and-conquer approach.
The second problem that we address is reticulate evolution. Almost
all existing phylogenetic methods assume that the underlying
evolutionary history of a given set of entities can be represented by
a tree. While this model gives a satisfactory first-order
approximation for many families of organisms, other families exhibit
evolutionary mechanisms that cannot be represented by trees. In
particular, processes such as hybrid speciation (e.g., in groups of
plants) and horizontal gene transfer (e.g., in bacteria) result in
"networks" of relationships rather than trees of relationships.
Although this problem is widely appreciated, there has been
comparatively little work on computational methods for estimating and
studying evolutionary networks. I will describe a mathematical model
of phylogenetic networks, and the simulation tools we have developed
based on this model. Then, I will discuss our new measure of distance
between a pair of networks; this is the first metric that allows one to
assess the topological accuracy of phylogenetic networks. This
suite of tools and measures makes it possible to run simulation studies of
the performance of network reconstruction methods. Finally, I will
describe our new method for reconstructing phylogenetic networks.
This method, called SpNet (for "Species Networks"), takes a
separate-analysis approach to the dataset: individual gene trees are
first reconstructed, and then the resulting trees are reconciled into
a network. Our experimental studies show that SpNet significantly
outperforms existing methods. Central to our method are efficient
algorithms that we have designed to solve a special case of a
long-standing open problem.
Joint work with:
Tandy Warnow (CS, UT Austin), Randy Linder (Biology, UT Austin), and Bernard Moret (CS, UNM).
Luay Nakhleh is a faculty candidate.
Monday, April 5, 2004 at 3:00 p.m. in DH 1070
Reception preceding the talk at 2:30 p.m. in DH 3092