Title: Identifiability of species network topologies from genomic sequences using the logDet distance

Abstract: Inference of network-like evolutionary relationships between species from genomic data must address the interwoven signals from both gene flow and incomplete lineage sorting. The heavy computational demands of standard approaches to this problem severely limit the size of datasets that may be analyzed, in both the number of species and the number of genetic loci. Here we provide a theoretical pointer to more efficient methods by showing that logDet distances computed from genomic-scale sequences retain sufficient information to recover network relationships in the level-1 ultrametric case. This result is obtained under the Network Multispecies Coalescent model combined with a mixture of General Time-Reversible sequence evolution models across individual gene trees, but it does not depend on partitioning sequences by genes. Thus, under standard stochastic models, statistically justifiable inference of network relationships from sequences can be accomplished without consideration of individual genes or gene trees.
Comments: 25 pages
Subjects: Populations and Evolution (q-bio.PE); Statistics Theory (math.ST)
MSC classes: 92D15, 92D20
Cite as: arXiv:2108.01765 [q-bio.PE]
  (or arXiv:2108.01765v1 [q-bio.PE] for this version)
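
For readers unfamiliar with the logDet distance named in the abstract, the following is a minimal sketch of how such a distance could be computed for one pair of aligned sequences. It assumes one common normalization, d = -(1/4) [ ln det F - (1/2) ln(g_a g_b) ], where F is the 4x4 matrix of joint site-pattern frequencies and g_a, g_b are the products of its row and column sums. The abstract does not state which normalization the paper uses, so the formula, the function names `determinant` and `logdet_distance`, and the handling of non-ACGT characters are all illustrative assumptions.

```python
import math

BASES = "ACGT"


def determinant(matrix):
    """Determinant by cofactor expansion along the first row (adequate for 4x4)."""
    size = len(matrix)
    if size == 1:
        return matrix[0][0]
    total = 0.0
    for col in range(size):
        # Minor: drop the first row and the current column.
        minor = [row[:col] + row[col + 1:] for row in matrix[1:]]
        total += (-1) ** col * matrix[0][col] * determinant(minor)
    return total


def logdet_distance(seq_a, seq_b):
    """LogDet distance between two aligned DNA sequences (assumed normalization).

    Sites where either sequence has a character outside A, C, G, T are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")

    index = {base: i for i, base in enumerate(BASES)}
    counts = [[0] * 4 for _ in range(4)]
    sites = 0
    for x, y in zip(seq_a.upper(), seq_b.upper()):
        if x in index and y in index:
            counts[index[x]][index[y]] += 1
            sites += 1
    if sites == 0:
        raise ValueError("no comparable sites")

    # Joint frequency matrix F and its marginal base frequencies.
    freq = [[c / sites for c in row] for row in counts]
    row_sums = [sum(row) for row in freq]
    col_sums = [sum(freq[i][j] for i in range(4)) for j in range(4)]

    det = determinant(freq)
    marginal_product = math.prod(row_sums) * math.prod(col_sums)
    if det <= 0 or marginal_product <= 0:
        raise ValueError("logDet distance is undefined for these sequences")

    return -0.25 * (math.log(det) - 0.5 * math.log(marginal_product))


if __name__ == "__main__":
    a = "ACGT" * 10
    b = "CGTA" + a[4:]  # same sequence with four substitutions
    print(logdet_distance(a, a))
    print(logdet_distance(a, b))
```

In the setting the abstract describes, a function like this would be applied to every pair of taxa on the full concatenated alignment, with no partitioning by gene, and the resulting distance matrix would be the input to network inference.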

Submission history

From: John Rhodes
[v1] Tue, 3 Aug 2021 21:58:19 GMT
