Statistics > Methodology
Title: Confidence sets for phylogenetic trees
(Submitted on 28 Jul 2016 (this version), latest version 13 Oct 2017 (v2))
Abstract: Inferring evolutionary histories (phylogenetic trees) has important applications in biology, criminology and public health. However, phylogenetic trees are complex mathematical objects that reside in a non-Euclidean space, which complicates their analysis. While our mathematical, algorithmic, and probabilistic understanding of the behavior of phylogenies in their metric space is relatively mature, rigorous inferential infrastructure is as yet undeveloped. In this manuscript we unify recent computational and probabilistic advancements to propose a method for constructing tree-valued confidence sets. The procedure accounts for both the centre and multiple directions of tree-valued variability, offering advantages over existing methods, which are exploratory and address only a single direction of variability. We demonstrate fast identification of splits with weak and strong support, eliminating problems of simultaneous inference that arise when only bootstrap cladal support is considered. We draw on statistical concepts of block replicates for improved testing, investigating the hypothesis of isotropy in a turtle evolution dataset, identifying the best-supported most recent ancestor of the Zika virus (Pacific strains, contrary to media releases in Latin America), and formally testing the hypothesis that a Floridian dentist with AIDS infected two of his patients with HIV. The method illustrates connections between variability in Euclidean and tree space, opening phylogenetic tree analysis to techniques available in the multivariate Euclidean setting.
Submission history
From: Amy Willis
[v1] Thu, 28 Jul 2016 00:30:25 GMT (140kb,D)
[v2] Fri, 13 Oct 2017 01:18:13 GMT (532kb,D)