Quantitative Biology

New submissions

[ total of 22 entries: 1-22 ]

New submissions for Mon, 6 Jul 20

[1]  arXiv:2007.01340 [pdf, other]
Title: Lack of evidence for a substantial rate of templated mutagenesis in B cell diversification
Subjects: Populations and Evolution (q-bio.PE); Applications (stat.AP)

B cell receptor sequences diversify through mutations introduced by purpose-built cellular machinery. A recent paper has concluded that a "templated mutagenesis" process is a major contributor to somatic hypermutation, and therefore immunoglobulin diversification, in mice and humans. In this proposed process, mutations in the immunoglobulin locus are introduced by copying short segments from other immunoglobulin genes. If true, this would overturn decades of research on B cell diversification, and would require a complete re-write of computational methods to analyze B cell data for these species.
In this paper, we re-evaluate the templated mutagenesis hypothesis. By applying the original inferential method using potential donor templates absent from B cell genomes, we obtain estimates of the method's false positive rates. We find false positive rates of templated mutagenesis in murine and human immunoglobulin loci that are similar to or even higher than the original rate inferences, and by considering the bases used in substitution we find evidence that if templated mutagenesis occurs, it is at a low rate. We also show that the statistically significant results in the original paper can easily result from a slight misspecification of the null model.

[2]  arXiv:2007.01344 [pdf, other]
Title: Decoding asymptomatic COVID-19 infection and transmission
Comments: 18 pages, 5 figures
Subjects: Populations and Evolution (q-bio.PE); Biomolecules (q-bio.BM)

Coronavirus disease 2019 (COVID-19) continues to devastate public health and the world economy. One of the major challenges in controlling the COVID-19 outbreak is its asymptomatic infection and transmission, which are elusive and hard to defend against in most situations. The pathogenicity and virulence of asymptomatic COVID-19 remain mysterious. Based on the genotyping of 20656 Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) genome isolates, we reveal that asymptomatic infection is linked to the SARS-CoV-2 11083G>T mutation, i.e., a leucine (L) to phenylalanine (F) substitution at residue 37 (L37F) of non-structural protein 6 (NSP6). By analyzing the distribution of 11083G>T in various countries, we unveil that 11083G>T may correlate with the hypotoxicity of SARS-CoV-2. Moreover, we show a global decaying tendency of the 11083G>T mutation ratio, indicating that 11083G>T hinders SARS-CoV-2 transmission capacity. Sequence alignment shows that both NSP6 and the neighborhood of residue 37 are relatively conserved across a few coronaviral species, indicating their importance in regulating host cell autophagy to undermine innate cellular defense against viral infection. Using machine learning and topological data analysis, we demonstrate that mutation L37F has made NSP6 energetically less stable. The rigidity and flexibility index and several network models suggest that mutation L37F may have compromised the NSP6 function, leading to a relatively weak SARS-CoV subtype. This assessment is in good agreement with our genotyping of SARS-CoV-2 evolution and transmission across various countries and regions over the past few months.

[3]  arXiv:2007.01363 [pdf, other]
Title: Optimal evolutionary decision-making to store immune memory
Subjects: Populations and Evolution (q-bio.PE); Biological Physics (physics.bio-ph); Quantitative Methods (q-bio.QM)

The adaptive immune system in vertebrates consists of highly diverse immune receptors to mount specific responses against a multitude of pathogens. A central feature of the adaptive immune system is the ability to form a memory to act more efficiently in future encounters with similar pathogens. However, memory formation especially in B-cells is one of the least understood cell fate decisions in the immune system. Here, we present a framework to characterize optimal strategies to store memory in order to maximize the utility of immune response to counter evolving pathogens throughout an organism's lifetime. To do so, we have incorporated the kinetics and energetics of memory response as ingredients of non-equilibrium decision-making between an adaptive exploration to mount a specific and novel response or exploitation of existing memory that can be activated rapidly yet with a reduced specificity against evolved pathogens. To achieve a long-term benefit for the host, we show that memory generation should be actively regulated and dependent on immune receptors' affinity, with a preference for cross-reactive receptors with a moderate affinity against pathogens as opposed to high affinity receptors -- a recipe that is consistent with recent experimental findings [1,2]. Moreover, we show that the specificity of memory should depend on the organism's lifespan, and shorter-lived organisms with fewer pathogenic encounters throughout their lifetime should store more cross-reactive memory. Overall, our framework provides a baseline to gauge the efficacy of immune memory formation in light of an organism's coevolutionary history with pathogens.

[4]  arXiv:2007.01378 [pdf, other]
Title: Generative embeddings of brain collective dynamics using variational autoencoders
Subjects: Neurons and Cognition (q-bio.NC)

We consider the problem of encoding pairwise correlations between coupled dynamical systems in a low-dimensional latent space based on few distinct observations. We used variational autoencoders (VAE) to embed temporal correlations between coupled nonlinear oscillators that model brain states in the wake-sleep cycle into a two-dimensional manifold. Training a VAE with samples generated using two different parameter combinations resulted in an embedding that represented the whole repertoire of collective dynamics, as well as the topology of the underlying connectivity network. We first followed this approach to infer the trajectory of brain states measured from wakefulness to deep sleep from the two endpoints of this trajectory; next, we showed that the same architecture was capable of representing the pairwise correlations of generic Landau-Stuart oscillators coupled by complex network topology.

[5]  arXiv:2007.01411 [pdf, other]
Title: Hospitalization dynamics during the first COVID-19 pandemic wave: SIR modelling compared to Belgium, France, Italy, Switzerland and New York City data
Authors: Gregory Kozyreff
Subjects: Populations and Evolution (q-bio.PE); Physics and Society (physics.soc-ph)

Using the classical Susceptible-Infected-Recovered epidemiological model, an analytical formula is derived for the number of beds occupied by Covid-19 patients. The analytical curve is fitted to data in Belgium, France, New York City and Switzerland, with a correlation coefficient exceeding 98.8%, suggesting that finer models are unnecessary with such macroscopic data. The fitting is used to extract estimates of the doubling time in the ascending phase of the epidemic, the mean recovery time and, for those who require medical intervention, the mean hospitalization time. Large variations can be observed among different outbreaks.
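
As a rough illustration of the model class named in this abstract, the sketch below integrates the classical SIR equations numerically and computes the early-phase doubling time. It is not the paper's analytical bed-occupancy formula, which is not reproduced in this listing. The function names, the forward-Euler scheme, and all parameter values are assumptions chosen for illustration only.

```python
import math


def simulate_sir(beta, gamma, n, i0, days, dt=0.01):
    """Integrate the classical SIR model with forward Euler steps.

    beta  : transmission rate per day (illustrative value, not fitted)
    gamma : recovery rate per day, so 1 / gamma is the mean recovery time
    n     : total population size
    i0    : number of infected individuals at day 0
    Returns a list of (day, S, I, R) tuples sampled once per day.
    """
    s, i, r = n - i0, float(i0), 0.0
    history = [(0.0, s, i, r)]
    steps_per_day = round(1 / dt)
    for day in range(1, days + 1):
        for _ in range(steps_per_day):
            new_infections = beta * s * i / n * dt
            new_recoveries = gamma * i * dt
            s -= new_infections
            i += new_infections - new_recoveries
            r += new_recoveries
        history.append((float(day), s, i, r))
    return history


def doubling_time(beta, gamma):
    """Doubling time in the ascending phase, where S is close to N.

    There dI/dt is approximately (beta - gamma) * I, so I grows
    exponentially and doubles every ln(2) / (beta - gamma) days.
    """
    return math.log(2) / (beta - gamma)


if __name__ == "__main__":
    # Hypothetical parameters: basic reproduction number beta / gamma = 2.
    run = simulate_sir(beta=0.4, gamma=0.2, n=1_000_000, i0=10, days=100)
    peak_day, _, peak_infected, _ = max(run, key=lambda row: row[2])
    print(f"peak of about {peak_infected:.0f} infected on day {peak_day:.0f}")
    print(f"early doubling time: {doubling_time(0.4, 0.2):.2f} days")
```

Fitting such a curve to hospital data would additionally require a model of who is hospitalized and for how long, which this sketch deliberately leaves out.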

[6]  arXiv:2007.01436 [pdf, other]
Title: Attribution Methods Reveal Flaws in Fingerprint-Based Virtual Screening
Authors: Vikram Sundar (1), Lucy Colwell (1 and 2) ((1) Google Research, (2) Department of Chemistry, University of Cambridge)
Comments: 4 pages, 5 figures. In proceedings for the 2020 ICML workshop on Machine Learning Interpretability for Scientific Discovery
Subjects: Biomolecules (q-bio.BM); Quantitative Methods (q-bio.QM)

Fingerprint-based models for protein-ligand binding have demonstrated outstanding success on benchmark datasets; however, these models may not learn the correct binding rules. To assess this concern, we use in silico datasets with known binding rules to develop a general framework for evaluating model attribution. This framework identifies fragments that a model considers necessary to achieve a particular score, sidestepping the need for a model to be differentiable. Our results confirm that high-performing models may not learn the correct binding rule, and suggest concrete steps that can remedy this situation. We show that adding fragment-matched inactive molecules (decoys) to the data reduces attribution false negatives, while attribution false positives largely arise from the background correlation structure of molecular data. Normalizing for these background correlations helps to reveal the true binding logic. Our work highlights the danger of trusting attributions from high-performing models and suggests that a closer examination of fingerprint correlation structure and better decoy selection may help reduce misattributions.

[7]  arXiv:2007.01508 [pdf]
Title: Method to monitor the evolution of an epidemic in real time
Comments: 9 pages, 4 figures
Subjects: Populations and Evolution (q-bio.PE); Physics and Society (physics.soc-ph)

The emergence of an epidemic evokes the need to monitor its spread and assess and validate any mitigation measures enacted by governments and administrative bodies in real time. We present here a method to observe and quantify this spread and the response of affected populations and governing bodies and apply it to COVID-19 as a case study. This method provides means to simultaneously track in real time quantities such as the mortality and the recovery rates as well as the number of new infections caused by an infected person. With sufficient data, this method enables thorough monitoring and assessment of an epidemic without assumptions regarding the evolution of the pandemic in the future.

[8]  arXiv:2007.01585 [pdf, other]
Title: Mechanism underlying dynamic scaling properties observed in the contour of spreading epithelial monolayer
Comments: 11 pages, 6 figures, and supplemental materials
Subjects: Cell Behavior (q-bio.CB); Pattern Formation and Solitons (nlin.PS); Biological Physics (physics.bio-ph)

We found evidence of dynamic scaling in the spreading of an MDCK monolayer, which can be characterized by the Hurst exponent ${\alpha} = 0.86$ and the growth exponent ${\beta} = 0.73$, and theoretically and experimentally clarified the mechanism that governs the contour shape dynamics. During the spreading of the monolayer, it is known that so-called "leader cells" generate the driving force and lead the other cells. Our time-lapse observations of cell behavior showed that these leader cells appeared at the early stage of the spreading, and formed the monolayer protrusion. Informed by these observations, we developed a simple mathematical model that included differences in cell motility, cell-cell adhesion, and random cell movement. The model reproduced the quantitative characteristics obtained from the experiment, such as the spreading speed, the distribution of the increment, and the dynamic scaling law. Analysis of the model equation revealed that the model could reproduce different scaling laws from ${\alpha} = 0.5, {\beta} = 0.25$ to ${\alpha} = 0.9, {\beta} = 0.75$, and that the exponents ${\alpha}, {\beta}$ were determined by two indices: $\rho t$ and $c$. Based on the analytical result, parameter estimation from the experimental results was achieved. The monolayer on the collagen-coated dishes showed a different scaling law, ${\alpha} = 0.74, {\beta} = 0.68$, suggesting that cell motility increased 9-fold. This result was consistent with the assay of the single-cell motility. Our study demonstrated that the dynamics of the contour of the monolayer were explained by the simple model, and proposed a new mechanism that exhibits the dynamic scaling property.

Cross-lists for Mon, 6 Jul 20

[9]  arXiv:2007.01337 (cross-list from math.CO) [pdf, other]
Title: Metric Dimension of Hamming Graphs and Applications to Computational Biology
Authors: Lucas Laird
Subjects: Combinatorics (math.CO); Quantitative Methods (q-bio.QM)

Genetic sequencing has become an increasingly affordable and accessible source of genomic data in computational biology. This data is often represented as $k$-mers, i.e., strings of some fixed length $k$ with symbols chosen from a reference alphabet. In contrast, some of the most effective and well-studied machine learning algorithms require numerical representations of the data. The concept of metric dimension of the so-called Hamming graphs presents a promising way to address this issue. A subset of vertices in a graph is said to be resolving when the distances to those vertices uniquely characterize every vertex in the graph. The metric dimension of a graph is the size of a smallest resolving subset of vertices. Finding the metric dimension of a general graph is a challenging problem, NP-complete in fact. Recently, an efficient algorithm for finding resolving sets in Hamming graphs has been proposed, which suffices to uniquely embed $k$-mers into a real vector space. Since the dimension of the embedding is the cardinality of the associated resolving set, determining whether or not a node can be removed from a resolving set while keeping it resolving is of great interest. This can be quite challenging for large graphs since only a brute-force approach is known for checking whether a set is a resolving set or not. In this thesis, we characterize resolvability of Hamming graphs in terms of a linear system over a finite domain: a set of nodes is resolving if and only if the linear system has only a trivial solution over said domain. We can represent the domain as the roots of a polynomial system, so the apparatus of Gröbner bases comes in handy to determine whether or not a set of nodes is resolving. As proof of concept, we study the resolvability of Hamming graphs associated with octapeptides, i.e., protein sequences of length eight.
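
The brute-force resolvability check that the abstract mentions can be sketched as follows. This is only the naive baseline, not the linear-system or Gröbner-basis characterization developed in the thesis. The names `hamming_distance` and `is_resolving` are hypothetical, and vertices are assumed to be all strings of length $k$ over the given alphabet, with graph distance equal to Hamming distance.

```python
from itertools import product


def hamming_distance(u, v):
    """Number of positions at which two equal-length sequences differ."""
    return sum(a != b for a, b in zip(u, v))


def is_resolving(candidates, alphabet, k):
    """Brute-force test of whether `candidates` resolves the Hamming graph.

    Every vertex (each of the len(alphabet) ** k strings of length k) is
    mapped to its tuple of distances to the candidate vertices. The set is
    resolving exactly when no two vertices share the same tuple.
    """
    seen = set()
    for vertex in product(alphabet, repeat=k):
        signature = tuple(hamming_distance(vertex, c) for c in candidates)
        if signature in seen:
            return False  # two vertices are indistinguishable
        seen.add(signature)
    return True


if __name__ == "__main__":
    # Binary strings of length 2 form a 4-cycle: 00 - 01 - 11 - 10 - 00.
    print(is_resolving(["00", "01"], "01", 2))  # adjacent vertices
    print(is_resolving(["00", "11"], "01", 2))  # opposite vertices
```

The distance tuple of a vertex is also its numerical embedding, which is why a smaller resolving set gives a lower-dimensional representation. The loop visits every vertex, so its cost grows as the alphabet size raised to the power $k$, which is what makes the brute-force check impractical for large graphs.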

[10]  arXiv:2007.01383 (cross-list from eess.IV) [pdf, other]
Title: Deep Interactive Learning: An Efficient Labeling Approach for Deep Learning-Based Osteosarcoma Treatment Response Assessment
Comments: Accepted at MICCAI 2020
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Quantitative Methods (q-bio.QM)

Osteosarcoma is the most common malignant primary bone tumor. Standard treatment includes pre-operative chemotherapy followed by surgical resection. The response to treatment, as measured by the ratio of necrotic tumor area to overall tumor area, is a known prognostic factor for overall survival. This assessment is currently done manually by pathologists looking at glass slides under the microscope, which may not be reproducible due to its subjective nature. Convolutional neural networks (CNNs) can be used for automated segmentation of viable and necrotic tumor on osteosarcoma whole slide images. One bottleneck for supervised learning is that large amounts of accurate annotations are required for training, which is a time-consuming and expensive process. In this paper, we describe Deep Interactive Learning (DIaL) as an efficient labeling approach for training CNNs. After an initial labeling step is done, annotators only need to correct mislabeled regions from previous segmentation predictions to improve the CNN model until satisfactory predictions are achieved. Our experiments show that our CNN model trained by only 7 hours of annotation using DIaL can successfully estimate ratios of necrosis within the expected inter-observer variation rate for a non-standardized manual surgical pathology task.
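
The response measure described above, the ratio of necrotic tumor area to overall tumor area, can be computed from a segmentation mask along the following lines. This is a minimal sketch, not the authors' pipeline: the label values, the function name `necrosis_ratio`, and the assumption that overall tumor area equals viable plus necrotic pixels are all hypothetical.

```python
from collections import Counter

# Hypothetical label convention for a per-pixel segmentation mask.
NON_TUMOR, VIABLE, NECROTIC = 0, 1, 2


def necrosis_ratio(label_mask):
    """Ratio of necrotic tumor area to overall tumor area.

    `label_mask` is a 2D list of integer labels, one per pixel. Overall
    tumor area is assumed to be viable plus necrotic pixels.
    """
    counts = Counter(label for row in label_mask for label in row)
    tumor_pixels = counts[VIABLE] + counts[NECROTIC]
    if tumor_pixels == 0:
        raise ValueError("mask contains no tumor pixels")
    return counts[NECROTIC] / tumor_pixels


if __name__ == "__main__":
    # Toy mask: 3 necrotic pixels and 2 viable pixels.
    mask = [
        [NON_TUMOR, VIABLE, NECROTIC],
        [NECROTIC, NECROTIC, VIABLE],
    ]
    print(necrosis_ratio(mask))
```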

[11]  arXiv:2007.01424 (cross-list from physics.soc-ph) [pdf, ps, other]
Title: Active Control and Sustained Oscillations in actSIS Epidemic Dynamics
Subjects: Physics and Society (physics.soc-ph); Dynamical Systems (math.DS); Populations and Evolution (q-bio.PE)

An actively controlled Susceptible-Infected-Susceptible (actSIS) contagion model is presented for studying epidemic dynamics with continuous-time feedback control of infection rates. Our work is inspired by the observation that epidemics can be controlled through decentralized disease-control strategies such as quarantining, sheltering in place, social distancing, etc., where individuals actively modify their contact rates with others in response to observations of infection levels in the population. Accounting for a time lag in observations and categorizing individuals into distinct sub-populations based on their risk profiles, we show that the actSIS model manifests qualitatively different features as compared with the SIS model. In a homogeneous population of risk-averters, the endemic equilibrium is always reduced, although the transient infection level can exhibit overshoot or undershoot. In a homogeneous population of risk-tolerating individuals, the system exhibits bistability, which can also lead to reduced infection. For a heterogeneous population comprised of risk-tolerators and risk-averters, we prove conditions on model parameters for the existence of a Hopf bifurcation and sustained oscillations in the infected population.

[12]  arXiv:2007.01516 (cross-list from cs.LG) [pdf, other]
Title: Deep interpretability for GWAS
Comments: Accepted at ICML 2020 workshop on ML Interpretability for Scientific Discovery
Subjects: Machine Learning (cs.LG); Genomics (q-bio.GN); Applications (stat.AP); Machine Learning (stat.ML)

Genome-Wide Association Studies are typically conducted using linear models to find genetic variants associated with common diseases. In these studies, association testing is done on a variant-by-variant basis, possibly missing out on non-linear interaction effects between variants. Deep networks can be used to model these interactions, but they are difficult to train and interpret on large genetic datasets. We propose a method that uses the gradient-based deep interpretability technique named DeepLIFT to show that known diabetes genetic risk factors can be identified using deep models, along with possibly novel associations.

[13]  arXiv:2007.01583 (cross-list from physics.soc-ph) [pdf, other]
Title: COVID-19 lockdown induces structural changes in mobility networks -- Implication for mitigating disease dynamics
Comments: 16 pages, 6 figures
Subjects: Physics and Society (physics.soc-ph); Populations and Evolution (q-bio.PE)

In the wake of the COVID-19 pandemic many countries implemented containment measures to reduce disease transmission. Studies using digital data sources show that the mobility of individuals was effectively reduced in multiple countries. However, it remains unclear whether these reductions caused deeper structural changes in mobility networks, and how such changes may affect dynamic processes on the network. Here we use movement data of mobile phone users to show that mobility in Germany has not only been reduced considerably: Lockdown measures caused substantial and lasting structural changes in the mobility network. We find that long-distance travel was reduced disproportionately strongly. The trimming of long-range network connectivity leads to a more local, clustered network and a moderation of the "small-world" effect. We demonstrate that these structural changes have a considerable effect on epidemic spreading processes by "flattening" the epidemic curve and delaying the spread to geographically distant regions.

Replacements for Mon, 6 Jul 20

[14]  arXiv:1708.00909 (replaced) [pdf]
Title: Machine learning for neural decoding
Subjects: Neurons and Cognition (q-bio.NC); Machine Learning (cs.LG); Machine Learning (stat.ML)
[15]  arXiv:1908.06180 (replaced) [pdf, ps, other]
Title: Multi-View Broad Learning System for Primate Oculomotor Decision Decoding
Journal-ref: IEEE Transactions on Neural Systems and Rehabilitation Engineering, 2020
Subjects: Neurons and Cognition (q-bio.NC); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[16]  arXiv:1911.04218 (replaced) [pdf, other]
Title: Dynamical Heart Beat Correlations during Running
Comments: 19 pages, 10 figures
Subjects: Data Analysis, Statistics and Probability (physics.data-an); Medical Physics (physics.med-ph); Quantitative Methods (q-bio.QM)
[17]  arXiv:2001.10313 (replaced) [pdf, other]
Title: Evolutionary dynamics of higher-order interactions in social networks
Comments: 38 pages, 10 figures, 1 table -- Was "Evolutionary Dynamics of Higher-Order Interactions"
Subjects: Physics and Society (physics.soc-ph); Social and Information Networks (cs.SI); Populations and Evolution (q-bio.PE)
[18]  arXiv:2002.06060 (replaced) [pdf, other]
Title: Causality in cognitive neuroscience: concepts, challenges, and distributional robustness
Subjects: Neurons and Cognition (q-bio.NC); Applications (stat.AP); Methodology (stat.ME)
[19]  arXiv:2006.05581 (replaced) [pdf, other]
Title: Semiparametric Bayesian Inference for the Transmission Dynamics of COVID-19 with a State-Space Model
Subjects: Methodology (stat.ME); Populations and Evolution (q-bio.PE); Applications (stat.AP)
[20]  arXiv:2006.06987 (replaced) [pdf, other]
Title: Improved estimations of stochastic chemical kinetics by finite state expansion
Comments: 33 pages, 9 figures
Subjects: Molecular Networks (q-bio.MN); Computational Engineering, Finance, and Science (cs.CE)
[21]  arXiv:2006.15336 (replaced) [pdf, other]
Title: Spatio-temporal predictive modeling framework for infectious disease spread
Comments: 9 pages, 4 figures
Subjects: Populations and Evolution (q-bio.PE); Dynamical Systems (math.DS); Physics and Society (physics.soc-ph)
[22]  arXiv:2007.00577 (replaced) [pdf, other]
Title: Incorporating age and delay into models for biophysical systems
Comments: 21 pages, 4 figures. Under review for publication
Subjects: Populations and Evolution (q-bio.PE)