Distributions of gene tree branch lengths under coalescence
In Bayesian phylogenetic inference, commonly used prior distributions for branch lengths are the uniform, exponential, and gamma distributions. We derive the exact distributions of gene tree branch lengths under a fixed species tree using the coalescent model. These distributions depend on both the shape and the branch lengths of the species tree, which are in turn determined by the population genetic parameters of ancestral population sizes and divergence times. Each branch length distribution is a mixture whose components correspond to the ancestral populations in which the coalescent events occur (coalescent histories). For some sets of moderately short species tree branches, these mixtures yield branch length distributions that are not well approximated by uniform, exponential, or gamma distributions, suggesting that a prior based on a mixture of distributions might be more appropriate for inferring branch lengths on some gene trees.
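The mixture structure described above can be illustrated with a small simulation. The sketch below is not the derivation from the paper. It assumes the simplest nontrivial case: a three-taxon species tree ((A,B),C) with internal branch length t in coalescent units, one sampled lineage per species, and constant population sizes. The function names (`simulate_internal_branch`, `matching_topology_mixture`, `analytic_weight_ab`) are hypothetical and introduced only for this example. Conditional on the gene tree matching the species tree, the internal gene tree branch length is a mixture over two coalescent histories: either lineages a and b coalesce in the ancestral population AB, or all three lineages coalesce in the root population.

```python
import math
import random


def simulate_internal_branch(t, rng):
    """Simulate one gene tree on the species tree ((A,B):t,C).

    Assumes one lineage per species and time in coalescent units.
    Returns (history, matches_species_tree, internal_branch_length).
    """
    # Waiting time for lineages a and b to coalesce in population AB (rate 1).
    u = rng.expovariate(1.0)
    if u < t:
        # History "AB": a and b coalesce inside the internal species branch.
        # The (ab) lineage waits t - u to reach the root population, then an
        # Exp(1) time to coalesce with c.
        return "AB", True, (t - u) + rng.expovariate(1.0)
    # History "root": all three lineages enter the root population.
    # The first coalescence joins a uniformly chosen pair, so the gene tree
    # matches the species tree with probability 1/3.
    matches = rng.random() < 1.0 / 3.0
    # The internal branch is the wait between the first and second
    # coalescence, which is Exp(1) for the two remaining lineages.
    return "root", matches, rng.expovariate(1.0)


def matching_topology_mixture(t, n=100_000, seed=1):
    """Estimate mixture weights and component means of the internal branch
    length, conditional on the gene tree matching the species tree."""
    rng = random.Random(seed)
    lengths = {"AB": [], "root": []}
    for _ in range(n):
        history, matches, length = simulate_internal_branch(t, rng)
        if matches:
            lengths[history].append(length)
    total = sum(len(v) for v in lengths.values())
    weights = {h: len(v) / total for h, v in lengths.items()}
    means = {h: sum(v) / len(v) for h, v in lengths.items()}
    return weights, means


def analytic_weight_ab(t):
    """Probability of history "AB" given a matching gene tree topology."""
    p_ab = 1.0 - math.exp(-t)             # a and b coalesce within branch AB
    p_match = 1.0 - (2.0 / 3.0) * math.exp(-t)  # gene tree matches species tree
    return p_ab / p_match


if __name__ == "__main__":
    t = 0.5
    weights, means = matching_topology_mixture(t)
    print("simulated weights:", weights)
    print("analytic weight of history AB:", analytic_weight_ab(t))
    print("component means:", means)
```

For t = 0.5 the analytic weight of the "AB" history is about 0.66, so roughly a third of matching gene trees draw their internal branch length from the pure Exp(1) component and the rest from the shifted component. Under these assumptions, the resulting density need not resemble a single exponential or gamma distribution.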