Coalescent experiments I: Unlabeled n-coalescent and the site frequency spectrum

Type of content
Reports
Publisher
Department of Mathematics & Statistics
University of Canterbury. Mathematics and Statistics
Date
2009
Authors
Sainudiin, R.
Thornton, K.
Griffiths, R.
McVean, G.
Donnelly, P.
Abstract

We derive the transition structure of a Markovian lumping of Kingman’s n-coalescent [1, 2]. Lumping a Markov chain is meant in the sense of [3, Def. 6.3.1]. The lumped Markov process, referred to as the unlabeled n-coalescent, is a continuous-time Markov chain on the set of all integer partitions of the sample size n. We derive the backward-transition, forward-transition, state-specific, and sequence-specific probabilities of this chain. We show that the likelihood of any given site frequency spectrum (SFS), a commonly used statistic in genome scans, from a locus free of intra-locus recombination, can be directly obtained by integrating conditional realizations of the unlabeled n-coalescent. We develop a controlled Markov chain for importance sampling such integrals from an augmented unlabeled n-coalescent forward in time. We apply the methods to population-genetic data to conduct demographic inference at the empirical resolution of the site frequency spectrum. We also extend a family of classical hypothesis tests of standard neutrality at a non-recombining locus, based on any statistic of the SFS, to a more powerful version that conditions on the topological information contained in the SFS. We formalize a graph of coalescent experiments to set a decision-theoretic stage for population-genetic inference across different empirical resolutions.
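
To make the state space named in the abstract concrete, the following minimal Python sketch enumerates the integer partitions of a small sample size n, maps a labeled set partition to its unlabeled counterpart (the lumping step), and lists the states reachable by one coalescence event. It is an illustration only, not the report's implementation: the function names (`integer_partitions`, `lump`, `coalesce_once`) and the representation of a state as a non-increasing tuple of block sizes are assumptions made here, and no transition rates or probabilities are computed.

```python
def integer_partitions(n, max_part=None):
    """Yield every integer partition of n as a non-increasing tuple of parts."""
    if max_part is None or max_part > n:
        max_part = n
    if n == 0:
        yield ()
        return
    for first in range(max_part, 0, -1):
        # The remaining parts may not exceed the part just chosen.
        for rest in integer_partitions(n - first, first):
            yield (first,) + rest


def lump(set_partition):
    """Forget the labels: keep only the block sizes, sorted non-increasing."""
    return tuple(sorted((len(block) for block in set_partition), reverse=True))


def coalesce_once(partition):
    """Return the set of integer partitions obtained by merging two blocks."""
    reachable = set()
    for i in range(len(partition)):
        for j in range(i + 1, len(partition)):
            merged = [p for k, p in enumerate(partition) if k not in (i, j)]
            merged.append(partition[i] + partition[j])
            reachable.add(tuple(sorted(merged, reverse=True)))
    return reachable


if __name__ == "__main__":
    n = 4  # hypothetical sample size, chosen only to keep the output short
    for state in integer_partitions(n):
        print(state, "->", sorted(coalesce_once(state), reverse=True))
```

For a sample of size 4, the labeled partition {1,2},{3},{4} and the labeled partition {1},{2,3},{4} both lump to the same unlabeled state (2, 1, 1), which is why the lumped chain has far fewer states than the labeled one.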

Citation
Sainudiin, R., Thornton, K., Griffiths, R., McVean, G., Donnelly, P. (2009) Coalescent experiments I: Unlabeled n-coalescent and the site frequency spectrum. UCDMS Research Report 2009/7. 29 pp.
Keywords
Statistical decision theory of population genetic experiments, partially ordered n-coalescent experiments graph, controlled Markov chain for importance sampling