Reconstructing trees when sequence sites evolve at variable rates

Type of content
Discussion / Working Papers
Degree name
Research Report
Publisher
University of Canterbury. Dept. of Mathematics
Date
1994
Authors
Steel, Mike A.
Székely, L. A.
Hendy, M. D.
Abstract

For a sequence of colors independently evolving on a tree under a simple Markov model, we consider conditions under which the tree can be uniquely recovered from the "sequence spectrum", the expected frequencies of the various leaf colorations. This is relevant for phylogenetic analysis (where colors represent nucleotides or amino acids, and leaves represent extant taxa) as the sequence spectrum is estimated directly from a collection of aligned sequences. Allowing the rate of the evolutionary process to vary across sites is an important extension over most previous studies: we show that, given suitable restrictions on the rate distribution, the true tree (up to the placement of its root) is uniquely identified by its sequence spectrum. However, if the rate distribution is unknown and arbitrary, then, for simple models, it is possible for every tree to produce the same sequence spectrum. Hence there is a logical barrier to accurate, consistent phylogenetic inference for these models when assumptions about the rate distribution are not made. This result exploits a novel theorem on the action of polynomials with non-negative coefficients on sequences.
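
As an illustration only, the following sketch shows what a sequence spectrum looks like in the smallest interesting case. It is not taken from the report: it assumes a two-color symmetric Markov model, a four-leaf unrooted tree ab|cd, and a discrete rate distribution given as a list of rates with weights. The names EDGES, change_prob and spectrum are hypothetical.

```python
from itertools import product
from math import exp

# Assumed example tree: unrooted quartet ab|cd.
# Leaves are nodes 0..3 (a, b, c, d); internal nodes are 4 and 5.
EDGES = [(0, 4), (1, 4), (2, 5), (3, 5), (4, 5)]


def change_prob(length, rate):
    """Probability that the two ends of an edge differ under the
    assumed two-color symmetric model, at a site evolving at `rate`."""
    return 0.5 * (1.0 - exp(-2.0 * rate * length))


def spectrum(lengths, rates, weights):
    """Expected frequency of each of the 16 leaf colorations.

    `lengths` gives one length per entry of EDGES; `rates` and `weights`
    describe an assumed discrete rate distribution (weights sum to 1).
    """
    spec = {}
    for leaves in product((0, 1), repeat=4):
        total = 0.0
        for rate, weight in zip(rates, weights):
            # Sum over the unobserved colors of the two internal nodes.
            for internal in product((0, 1), repeat=2):
                state = leaves + internal
                prob = 0.5  # uniform color at an arbitrary root node
                for (u, v), length in zip(EDGES, lengths):
                    p = change_prob(length, rate)
                    prob *= p if state[u] != state[v] else 1.0 - p
                total += weight * prob
        spec[leaves] = total
    return spec


# Two equally likely site rates, all edge lengths equal (made-up values).
spec = spectrum([0.1] * 5, rates=[0.5, 2.0], weights=[0.5, 0.5])
print(spec[(0, 0, 1, 1)], spec[(0, 1, 0, 1)])
```

In this toy setting the coloration (0, 0, 1, 1), which groups a with b and c with d, is more frequent than (0, 1, 0, 1), so the spectrum carries a signal for the tree. The abstract's negative result concerns what happens when the rate distribution is left unknown and arbitrary.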

ANZSRC fields of research
Field of Research::01 - Mathematical Sciences
Rights
Copyright Mike A. Steel