Finding Optimal Phylogenetic Trees

16 June 2015
Katherine St. John

Phylogenies, or evolutionary histories, play a central role in modern biology, illustrating the interrelationships between species and aiding the prediction of structural, physiological, and biochemical properties. Reconstructing the underlying evolutionary history from a set of morphological characters or biomolecular sequences is difficult because finding an optimal tree under the criteria favored by biologists is NP-hard, and the space of possible answers is huge. Phylogenies are often modeled as trees with $n$ leaves, and the number of possible unrooted binary phylogenetic trees on $n$ labeled leaves is $(2n-5)!!$. Because of this hardness and the large number of candidate trees, clever search techniques and heuristics are used to estimate the underlying tree.
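To give a sense of how quickly the search space grows, the following minimal sketch evaluates the double factorial $(2n-5)!! = 1 \cdot 3 \cdot 5 \cdots (2n-5)$. It assumes the count refers to unrooted binary trees on $n \ge 3$ labeled leaves; the function name is illustrative and not part of the talk.

```python
def num_unrooted_binary_trees(n: int) -> int:
    """Return (2n-5)!! = 1 * 3 * 5 * ... * (2n-5).

    Assumption: trees are unrooted, binary, with n >= 3 labeled leaves.
    """
    if n < 3:
        raise ValueError("need at least 3 leaves")
    count = 1
    # Multiply the odd numbers 3, 5, ..., 2n-5 (empty product when n == 3).
    for k in range(3, 2 * n - 4, 2):
        count *= k
    return count


if __name__ == "__main__":
    for n in (4, 5, 10, 20):
        print(n, num_unrooted_binary_trees(n))
```

Even at 10 leaves there are already more than two million candidate trees, which is why exhaustive enumeration is not practical for data sets of realistic size.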

We explore the combinatorial structure of the underlying space of trees under different metrics, in particular the nearest-neighbor-interchange (NNI), subtree-prune-and-regraft (SPR), tree-bisection-and-reconnection (TBR), and Robinson-Foulds (RF) distances. Further, we examine the interplay between the metric chosen and the difficulty of the search for the optimal tree.
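Of the four metrics named above, the RF distance is the simplest to compute: it counts the nontrivial leaf bipartitions (splits) found in one tree but not the other. The sketch below is one way to do this under stated assumptions: trees are written as nested tuples of string leaf labels, both trees share the same leaf set, and the unnormalized symmetric-difference convention is used (some authors divide this count by 2). All names are hypothetical.

```python
def leaf_set(tree):
    """Collect the leaf labels of a nested-tuple tree."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaf_set(child) for child in tree))


def splits(tree):
    """Return the nontrivial splits of the tree, viewed as unrooted.

    Each split is stored as the side that does NOT contain a fixed
    reference leaf, so the same bipartition always has the same key.
    """
    all_leaves = leaf_set(tree)
    ref = min(all_leaves)
    found = set()

    def visit(node):
        if isinstance(node, str):
            return frozenset([node])
        below = frozenset().union(*(visit(child) for child in node))
        side = all_leaves - below if ref in below else below
        # Keep only splits with at least two leaves on each side.
        if 1 < len(side) < len(all_leaves) - 1:
            found.add(side)
        return below

    visit(tree)
    return found


def rf_distance(t1, t2):
    """Size of the symmetric difference of the two split sets."""
    return len(splits(t1) ^ splits(t2))


if __name__ == "__main__":
    t1 = (("A", "B"), ("C", "D"))
    t2 = (("A", "C"), ("B", "D"))
    print(rf_distance(t1, t1), rf_distance(t1, t2))
```

The NNI, SPR, and TBR distances are instead defined as the minimum number of rearrangement moves between two trees, and are not covered by this sketch.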

  • Combinatorial Theory Seminar