Computing the minimum number of hybridization events for a consistent evolutionary history
It is now well-documented that the structure of evolutionary relationships between a set of present-day species is not necessarily tree-like. The reason for this is that reticulation events such as hybridizations mean that species are a mixture of genes from different ancestors. Since such events are relatively rare, a fundamental problem for biologists is to determine the smallest number of hybridization events required to explain a given (input) set of data in a single (hybrid) phylogeny. The main results of this paper show that computing this smallest number is APX-hard, and thus NP-hard, in the case the input is a collection of phylogenetic trees on sets of present-day species. This answers a problem which was raised at a recent conference (Phylogenetic Combinatorics and Applications, Uppsala University, 2004). As a consequence of these results, we also correct a previously published NP-hardness proof in the case the input is a collection of binary sequences, where each sequence represents the attributes of a particular present-day species. The APX-hardness of these problems means that it is unlikely that there is an efficient algorithm for either computing the result exactly or approximating it to any arbitrary degree of accuracy.