 [1] arXiv:1712.04000 [pdf, other]

Title: Physical epistatic landscape of antibody binding affinity
Subjects: Populations and Evolution (q-bio.PE); Biomolecules (q-bio.BM)
Affinity maturation produces antibodies that bind antigens with high specificity by accumulating mutations in the antibody sequence. Mapping out the antibody-antigen affinity landscape can give us insight into the accessible paths during this rapid evolutionary process. By developing a carefully controlled null model for non-interacting mutations, we characterized epistasis in affinity measurements of a large library of antibody variants obtained by Tite-Seq, a recently introduced Deep Mutational Scan method yielding physical values of the binding constant. We show that representing affinity as the binding free energy minimizes epistasis. Yet, we find that epistatically interacting sites contribute substantially to binding. In addition to negative epistasis, we report a large amount of beneficial epistasis, enlarging the space of high-affinity antibodies as well as their mutational accessibility. These properties suggest that the degeneracy of antibody sequences that can bind a given antigen is enhanced by epistasis, an important property for vaccine design.
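As a rough illustration of what measuring epistasis on the free-energy scale can mean, the sketch below converts dissociation constants to binding free energies and computes the deviation of a double mutant from additivity. This is a simple additive null, not the carefully controlled null model of the paper; the function names, the 1 M reference concentration, and the default temperature are assumptions made for the example.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)


def binding_free_energy(k_d, temperature=298.15):
    """Return Delta G = R*T*ln(K_D / 1 M) in kcal/mol, with K_D given in molar.

    Tighter binding (smaller K_D) gives a more negative Delta G.
    """
    return R_KCAL * temperature * math.log(k_d)


def pairwise_epistasis(k_wt, k_a, k_b, k_ab, temperature=298.15):
    """Deviation of the double mutant from additivity on the free-energy scale.

    Zero means the two mutations combine additively in Delta G; a negative
    value means the double mutant binds more tightly than the additive
    expectation (hypothetical sign convention for this sketch).
    """
    g_wt = binding_free_energy(k_wt, temperature)
    g_a = binding_free_energy(k_a, temperature)
    g_b = binding_free_energy(k_b, temperature)
    g_ab = binding_free_energy(k_ab, temperature)
    return g_ab - g_a - g_b + g_wt


# Mutations that each multiply K_D by a fixed factor are additive in Delta G.
additive_case = pairwise_epistasis(1e-9, 1e-8, 1e-7, 1e-6)
# A double mutant that binds ten times more tightly than that expectation.
beneficial_case = pairwise_epistasis(1e-9, 1e-8, 1e-7, 1e-7)
```

Because independent effects multiply in K_D, they add in the logarithm, which is one intuition for why a free-energy representation would reduce apparent epistasis.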
 [2] arXiv:1712.04112 [pdf]

Title: Theoretical Model and Characteristics of Brown Adipocyte Thermogenesis
Authors: Jian-Sheng Kang
Comments: 10 pages, 1 figure
Subjects: Subcellular Processes (q-bio.SC)
Mitochondria of brown adipocytes (BA) are the main intracellular sites of thermogenesis and have been targeted for therapy to reduce obesity. However, there has been long-standing criticism and debate about whether endogenous thermogenesis can raise cellular temperature. The current, incorrect theoretical model gives values about five orders of magnitude lower than observed, which has become a major obstacle for thermogenesis studies in the BA field. Here, based on the first law of thermodynamics and the thermal diffusion equation, we deduce a thermal physical model of thermogenesis for BA. We find that the mitochondrial thermogenesis of brown adipocytes is a special case of the thermal diffusion equation. The model settles the long-standing question of whether endogenous thermogenesis can raise cellular temperature, and explains the thermogenic characteristics of brown adipocytes. The model and calculations also suggest that the number of freely available protons is the major limiting factor for endogenous thermogenesis and its speed.
 [3] arXiv:1712.04127 [pdf, other]

Title: On the existence of a cherry-picking sequence
Comments: Accepted for publication in Theoretical Computer Science
Subjects: Populations and Evolution (q-bio.PE); Data Structures and Algorithms (cs.DS)
Recently, the minimum number of reticulation events that is required to simultaneously embed a collection P of rooted binary phylogenetic trees into a so-called temporal network has been characterized in terms of cherry-picking sequences. Such a sequence is a particular ordering on the leaves of the trees in P. However, it is well-known that not all collections of phylogenetic trees have a cherry-picking sequence. In this paper, we show that the problem of deciding whether or not P has a cherry-picking sequence is NP-complete when P contains at least eight rooted binary phylogenetic trees. Moreover, we use automata theory to show that the problem can be solved in polynomial time if the number of trees in P and the number of cherries in each such tree are bounded by a constant.
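The abstract does not spell out the definition, so the sketch below assumes the one commonly used in the literature: an ordering (x1, ..., xn) of the common leaf set is a cherry-picking sequence if each xi with i < n is a leaf of a cherry in every tree once x1, ..., x(i-1) have been deleted. It only verifies a given ordering; it does not address the hardness of deciding whether one exists. The nested-tuple tree encoding and all function names are hypothetical.

```python
def remove_leaf(tree, leaf):
    """Delete `leaf` from a binary tree and suppress the resulting degree-2 vertex.

    A tree is either a leaf label (str) or a pair (left, right).
    Returns None if the tree consisted only of `leaf`.
    """
    if isinstance(tree, str):
        return None if tree == leaf else tree
    left, right = (remove_leaf(child, leaf) for child in tree)
    if left is None:
        return right
    if right is None:
        return left
    return (left, right)


def in_cherry(tree, leaf):
    """Return True if `leaf` and its sibling are both leaves (i.e. form a cherry)."""
    if isinstance(tree, str):
        return False
    left, right = tree
    if isinstance(left, str) and isinstance(right, str):
        return leaf in (left, right)
    return in_cherry(left, leaf) or in_cherry(right, leaf)


def is_cherry_picking_sequence(trees, order):
    """Check `order` against the assumed definition.

    Assumes all trees share one leaf set and `order` lists each leaf exactly once.
    """
    current = list(trees)
    for leaf in order[:-1]:  # the last leaf carries no condition
        if not all(in_cherry(tree, leaf) for tree in current):
            return False
        current = [remove_leaf(tree, leaf) for tree in current]
    return True


trees = [(("a", "b"), "c"), (("a", "c"), "b")]
good = is_cherry_picking_sequence(trees, ["a", "b", "c"])
bad = is_cherry_picking_sequence(trees, ["c", "a", "b"])
```

In the example, "a" sits in a cherry in both trees, so it can be picked first; "c" does not sit in a cherry in the first tree, so an ordering that starts with "c" fails.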
 [4] arXiv:1712.04131 [pdf, other]

Title: Attaching leaves and picking cherries to characterise the hybridisation number for a set of phylogenies
Subjects: Populations and Evolution (q-bio.PE); Data Structures and Algorithms (cs.DS); Combinatorics (math.CO)
Throughout the last decade, we have seen much progress towards characterising and computing the minimum hybridisation number for a set P of rooted phylogenetic trees. Roughly speaking, this minimum quantifies the number of hybridisation events needed to explain a set of phylogenetic trees by simultaneously embedding them into a phylogenetic network. From a mathematical viewpoint, the notion of agreement forests is the underpinning concept for almost all results that are related to calculating the minimum hybridisation number when |P|=2. However, despite various attempts, characterising this number in terms of agreement forests for |P|>2 remains elusive. In this paper, we characterise the minimum hybridisation number when P is of arbitrary size and consists of not necessarily binary trees. Building on our previous work on cherry-picking sequences, we first establish a new characterisation to compute the minimum hybridisation number in the space of tree-child networks. Subsequently, we show how this characterisation extends to the space of all rooted phylogenetic networks. Moreover, we establish a particular hardness result that gives new insight into some of the limitations of agreement forests.
 [5] arXiv:1712.04223 [pdf, other]

Title: Identifiability of tree-child phylogenetic networks under a probabilistic recombination-mutation model of evolution
Comments: 18 pages, 4 figures
Subjects: Populations and Evolution (q-bio.PE); Probability (math.PR)
Phylogenetic networks are an extension of phylogenetic trees which are used to represent evolutionary histories in which reticulation events (such as recombination and hybridization) have occurred. A central question for such networks is that of identifiability, which essentially asks under what circumstances we can reliably identify the phylogenetic network that gave rise to the observed data. Recently, identifiability results have appeared for networks relative to a model of sequence evolution that generalizes the standard Markov models used for phylogenetic trees. However, these results are quite limited in terms of the complexity of the networks that are considered. In this paper, by introducing an alternative probabilistic model for evolution along a network that is based on some groundbreaking work by Thatte for pedigrees, we are able to obtain an identifiability result for a much larger class of phylogenetic networks (essentially the class of so-called tree-child networks). To prove our main theorem, we derive some new results for identifying tree-child networks combinatorially, and then adapt some techniques developed by Thatte for pedigrees to show that our combinatorial results imply identifiability in the probabilistic setting. We hope that the introduction of our new model for networks could lead to new approaches to reliably construct phylogenetic networks.
 [6] arXiv:1712.04329 [pdf, other]

Title: A geometric model of multiscale orientation preference maps via Gabor functions
Subjects: Neurons and Cognition (q-bio.NC)
In this paper we present a new model for the generation of orientation preference maps in the primary visual cortex (V1), considering both orientation and scale features. First we undertake to model the functional architecture of V1 by interpreting it as a principal fiber bundle over the 2-dimensional retinal plane by introducing the intrinsic variables orientation and scale. The intrinsic variables constitute a fiber on each point of the retinal plane and the set of receptive profiles of simple cells is located on the fiber. Each receptive profile on the fiber is mathematically interpreted as a rotated Gabor function derived from an uncertainty principle. The visual stimulus is lifted in a 4-dimensional space, characterized by the coordinate variables position, orientation and scale, through a linear filtering of the stimulus with Gabor functions. Orientation preference maps are then obtained by mapping the orientation value found from the lifting of a noise stimulus onto the 2-dimensional retinal plane. This corresponds to a Bargmann transform in the reducible representation of the $\text{SE}(2)=\mathbb{R}^2\times S^1$ group. A comparison will be provided with a previous model based on the Bargmann transform in the irreducible representation of the $\text{SE}(2)$ group, outlining that the new model is more physiologically motivated. Then we present simulation results related to the construction of the orientation preference map by using Gabor filters with different scales and compare those results to the relevant neurophysiological findings in the literature.
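For readers unfamiliar with rotated Gabor functions, the following is a minimal sketch of one common parameterization: a Gaussian envelope of width sigma multiplied by a complex plane wave along the preferred orientation theta. The exact form used in the paper (including how scale ties the wavelength to sigma) is not given in the abstract, so the function name and all parameters here are assumptions.

```python
import cmath
import math


def gabor(x, y, theta, sigma, wavelength):
    """Complex Gabor function centred at the origin.

    theta      -- preferred orientation in radians
    sigma      -- width of the isotropic Gaussian envelope (the 'scale')
    wavelength -- period of the carrier wave along the rotated x-axis
    """
    # Rotate coordinates so the carrier oscillates along direction theta.
    x_rot = x * math.cos(theta) + y * math.sin(theta)
    y_rot = -x * math.sin(theta) + y * math.cos(theta)
    envelope = math.exp(-(x_rot ** 2 + y_rot ** 2) / (2.0 * sigma ** 2))
    carrier = cmath.exp(2j * math.pi * x_rot / wavelength)
    return envelope * carrier


# The response at the centre is 1, and rotating the filter by 90 degrees
# maps the value at (1, 0) onto the value at (0, 1).
centre = gabor(0.0, 0.0, 0.3, 1.0, 2.0)
horizontal = gabor(1.0, 0.0, 0.0, 1.0, 2.0)
vertical = gabor(0.0, 1.0, math.pi / 2, 1.0, 2.0)
```

Filtering a stimulus with a bank of such functions over a grid of (theta, sigma) values gives one response per position, orientation and scale, which is the sense in which the stimulus is lifted to a 4-dimensional space.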
 [7] arXiv:1712.04339 [pdf, other]

Title: Quantitative toxicity prediction using topology based multitask deep neural networks
Comments: arXiv admin note: substantial text overlap with arXiv:1703.10951
Subjects: Quantitative Methods (q-bio.QM)
The understanding of toxicity is of paramount importance to human health and environmental protection. Quantitative toxicity analysis has become a new standard in the field. This work introduces element specific persistent homology (ESPH), an algebraic topology approach, for quantitative toxicity prediction. ESPH retains crucial chemical information during the topological abstraction of geometric complexity and provides a representation of small molecules that cannot be obtained by any other method. To investigate the representability and predictive power of ESPH for small molecules, ancillary descriptors have also been developed based on physical models. Topological and physical descriptors are paired with advanced machine learning algorithms, such as deep neural network (DNN), random forest (RF) and gradient boosting decision tree (GBDT), to facilitate their applications to quantitative toxicity predictions. A topology based multi-task strategy is proposed to take advantage of the availability of large data sets while dealing with small data sets. Four benchmark toxicity data sets that involve quantitative measurements are used to validate the proposed approaches. Extensive numerical studies indicate that the proposed topological learning methods are able to outperform the state-of-the-art methods in the literature for quantitative toxicity analysis. Our online server for computing element-specific topological descriptors (ESTDs) is available at this http URL
 [8] arXiv:1712.04377 [pdf]

Title: A review of our current knowledge of clouded leopards (Neofelis nebulosa)
Comments: 8 pages, 1 figure, 3 tables
Journal-ref: International Journal of Avian & Wildlife Biology 2017, 2(5): 00032
Subjects: Populations and Evolution (q-bio.PE)
Little is known about clouded leopards (Neofelis nebulosa), which have a vulnerable population that extends across southern Asia. We reviewed the literature and synthesized what is known about their ecology and behavior. Much of the published literature either notes detections within and on the edges of their range, or consists of anecdotal observations, many of which are decades if not over a century old. Clouded leopards are a medium-sized felid, with distinctive cloud-shaped markings, and notably long canines relative to skull size. Estimates for population densities range from 0.58 to 6.53 individuals per 100 km². Only 7 clouded leopards have been tracked via radio-collars, and home range estimates range from 33.6-39.7 km² for females and 35.5-43.5 km² for males. Most accounts describe clouded leopards as nocturnal, but radio telemetry studies showed that clouded leopards have arrhythmic activity patterns, with highest activity in the morning followed by evening crepuscular hours. There has never been a targeted study of clouded leopard diet, but observations show that they consume a variety of animals, including ungulates, primates, and rodents. We encourage future study of their population density and range to inform conservation efforts, and ecological studies in order to understand the species and its ecological niche.
 [9] arXiv:1712.04386 [pdf, other]

Title: Hawkes Processes for Invasive Species Modeling and Management
Subjects: Populations and Evolution (q-bio.PE); Artificial Intelligence (cs.AI); Computational Engineering, Finance, and Science (cs.CE); Social and Information Networks (cs.SI); Physics and Society (physics.soc-ph)
The spread of invasive species to new areas threatens the stability of ecosystems and causes major economic losses in agriculture and forestry. We propose a novel approach to minimizing the spread of an invasive species given a limited intervention budget. We first model invasive species propagation using Hawkes processes, and then derive closed-form expressions for characterizing the effect of an intervention action on the invasion process. We use this to obtain an optimal intervention plan based on an integer programming formulation, and compare the optimal plan against several ecologically motivated heuristic strategies used in practice. We present an empirical study of two variants of the invasive control problem: minimizing the final rate of invasions, and minimizing the number of invasions at the end of a given time horizon. Our results show that the optimized intervention achieves nearly the same level of control that would be attained by completely eradicating the species, with a 20% cost saving. Additionally, we design a heuristic intervention strategy based on a combination of the density and life stage of the invasive individuals, and find that it comes surprisingly close to the optimized strategy, suggesting that this could serve as a good rule of thumb in invasive species management.
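To make the self-exciting idea behind a Hawkes process concrete, here is a minimal sketch of the conditional intensity for the textbook univariate case with an exponential kernel. The abstract does not state which kernel or which spatial structure the authors use, so the kernel choice and the names mu, alpha and beta are assumptions for illustration only.

```python
import math


def hawkes_intensity(t, history, mu, alpha, beta):
    """Conditional intensity lambda(t) = mu + sum over past events of alpha*exp(-beta*(t - s)).

    mu      -- background rate of new invasions
    alpha   -- jump in intensity caused by each past event
    beta    -- decay rate of that excitation
    history -- times of earlier events; only events strictly before t contribute
    """
    excitation = sum(alpha * math.exp(-beta * (t - s)) for s in history if s < t)
    return mu + excitation


def branching_ratio(alpha, beta):
    """Expected number of events directly triggered by one event (integral of the kernel)."""
    return alpha / beta


# With no past events the intensity equals the background rate.
baseline = hawkes_intensity(1.0, [], 0.5, 1.0, math.log(2))
# One event at time 1 has decayed to half its initial excitation by time 2.
after_one_event = hawkes_intensity(2.0, [1.0], 0.5, 1.0, math.log(2))
```

Under this parameterization, removing an event from the history lowers the intensity at every later time, which gives some intuition for why the effect of an intervention can be expressed in closed form.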
 [10] arXiv:1712.04412 [pdf]

Title: Missing and spurious interaction in additive, multiplicative and odds ratio models
Authors: Jorge Fernandez-de-Cossio (1), Jorge Fernandez-de-Cossio-Diaz (2), Toshifumi Takao (3), Yasser Perera (1) ((1) Center for Genetic Engineering and Biotechnology, (2) Center of Molecular Immunology, (3) Institute for Protein Research)
Comments: 7 pages, 3 figures
Subjects: Other Quantitative Biology (q-bio.OT)
Additive, multiplicative, and odds ratio neutral models for interactions have long been both advocated and contested in epidemiology. We show here that these commonly advocated models are biased, leading to spurious interactions and to missed true interactions.
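As background, the three neutral models can disagree about whether the same data show an interaction. The sketch below computes the standard textbook interaction measures on each scale from four risks (neither exposure, first only, second only, both); it illustrates the scale dependence only and is not the authors' analysis. The function names and the example risks are made up for illustration.

```python
def odds(p):
    """Convert a probability to odds."""
    return p / (1.0 - p)


def interaction_measures(r00, r10, r01, r11):
    """Return (additive contrast, risk-ratio interaction, odds-ratio interaction).

    No interaction corresponds to 0 on the additive scale and to 1 on the
    multiplicative and odds ratio scales.
    """
    additive = r11 - r10 - r01 + r00
    multiplicative = (r11 * r00) / (r10 * r01)
    odds_ratio = (odds(r11) * odds(r00)) / (odds(r10) * odds(r01))
    return additive, multiplicative, odds_ratio


# Risks that are exactly additive still deviate from 1 on the other two scales.
add_a, mult_a, or_a = interaction_measures(0.1, 0.2, 0.3, 0.4)
# Risks that are exactly multiplicative show a non-zero additive contrast.
add_m, mult_m, or_m = interaction_measures(0.1, 0.2, 0.3, 0.6)
```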
 [11] arXiv:1712.04195 (cross-list from stat.ML) [pdf, other]

Title: Concept Formation and Dynamics of Repeated Inference in Deep Generative Models
Comments: 20 pages, 9 figures
Subjects: Machine Learning (stat.ML); Learning (cs.LG); Neural and Evolutionary Computing (cs.NE); Neurons and Cognition (q-bio.NC)
Deep generative models are reported to be useful in broad applications including image generation. Repeated inference between data space and latent space in these models can denoise cluttered images and improve the quality of inferred results. However, previous studies only qualitatively evaluated image outputs in data space, and the mechanism behind the inference has not been investigated. The purpose of the current study is to numerically analyze changes in activity patterns of neurons in the latent space of a deep generative model called a "variational auto-encoder" (VAE). We identify the kinds of inference dynamics that the VAE demonstrates when noise is added to the input data. The VAE embeds a dataset with clear cluster structures in the latent space, and the center of each cluster of multiple correlated data points (memories) is referred to as the concept. Our study demonstrated that the transient dynamics of inference first approaches a concept and then moves close to a memory. Moreover, the VAE revealed that the inference dynamics approaches a more abstract concept as the uncertainty of the input data increases due to noise. It was demonstrated that by increasing the number of latent variables, the tendency of the inference dynamics to approach a concept can be enhanced, and the generalization ability of the VAE can be improved.
 [12] arXiv:1510.01197 (replaced) [pdf]

Title: Channel Capacity of Coding System on Tsallis Entropy and q-Statistics
Authors: Tatsuaki Tsuruyama
Subjects: Molecular Networks (q-bio.MN)
 [13] arXiv:1702.00632 (replaced) [pdf, other]

Title: Gene length as a regulator for ribosome recruitment and protein synthesis: theoretical insights
Journal-ref: Scientific Reports 7, Article number: 17409 (2017)
Subjects: Subcellular Processes (q-bio.SC)
 [14] arXiv:1704.05264 (replaced) [pdf, other]

Title: Universal entrainment mechanism governs contact times with motile cells
Comments: New analytical entrainment theory; includes Supplementary Information as Appendix; Supplementary movies available upon request
Subjects: Biological Physics (physics.bio-ph); Fluid Dynamics (physics.flu-dyn); Cell Behavior (q-bio.CB)
 [15] arXiv:1705.04312 (replaced) [pdf, other]

Title: FDR-Corrected Sparse Canonical Correlation Analysis with Applications to Imaging Genomics
Comments: (1) The methodology has been improved by using an independent subset of the data to estimate the variances utilized in the hypotheses tests of Step 5 of Procedure 1. (2) The proposed sparse CCA procedure is now compared to other widely used sparse CCA methods in an extensive simulation study (Section V-A). (3) The real data application example has been substantially improved
Subjects: Methodology (stat.ME); Quantitative Methods (q-bio.QM); Applications (stat.AP); Machine Learning (stat.ML)
 [16] arXiv:1706.03014 (replaced) [pdf]

Title: A machine learning approach to drug repositioning based on drug expression profiles: Applications to schizophrenia and depression/anxiety disorders
Subjects: Genomics (q-bio.GN); Quantitative Methods (q-bio.QM)
 [17] arXiv:1708.03475 (replaced) [pdf, ps, other]

Title: Stochastic spatial models in ecology: a statistical physics approach
Comments: 18 pages, 12 figures. Included in the special edition "Statistical Theory of Biological Evolution" of Journal of Statistical Physics
Subjects: Populations and Evolution (q-bio.PE); Statistical Mechanics (cond-mat.stat-mech)
 [18] arXiv:1708.05453 (replaced) [pdf, other]

Title: Innovation rather than improvement: a solvable high-dimensional model highlights the limitations of scalar fitness
Comments: 8 pages, 4 figures + Supplementary Material
Subjects: Populations and Evolution (q-bio.PE); Disordered Systems and Neural Networks (cond-mat.dis-nn); Statistical Mechanics (cond-mat.stat-mech)
 [19] arXiv:1710.07989 (replaced) [pdf, other]

Title: Practical Identifiability and Uncertainty Quantification of a Pulsatile Cardiovascular Model
Comments: 47 pages, 9 figures, 3 tables
Subjects: Quantitative Methods (q-bio.QM)
 [20] arXiv:1711.07425 (replaced) [pdf, other]

Title: Modular Continual Learning in a Unified Visual Environment
Subjects: Learning (cs.LG); Artificial Intelligence (cs.AI); Neurons and Cognition (q-bio.NC); Machine Learning (stat.ML)
 [21] arXiv:1712.03926 (replaced) [pdf]

Title: The impact of hydrodynamic interactions on protein folding rates depends on temperature
Comments: 11 figures
Subjects: Biomolecules (q-bio.BM); Biological Physics (physics.bio-ph)
