New submissions for Mon, 6 Jul 20
 [1] arXiv:2007.01340 [pdf, other]

Title: Lack of evidence for a substantial rate of templated mutagenesis in B cell diversification
Subjects: Populations and Evolution (q-bio.PE); Applications (stat.AP)
B cell receptor sequences diversify through mutations introduced by purpose-built cellular machinery. A recent paper has concluded that a "templated mutagenesis" process is a major contributor to somatic hypermutation, and therefore immunoglobulin diversification, in mice and humans. In this proposed process, mutations in the immunoglobulin locus are introduced by copying short segments from other immunoglobulin genes. If true, this would overturn decades of research on B cell diversification, and would require a complete rewrite of computational methods to analyze B cell data for these species.
In this paper, we reevaluate the templated mutagenesis hypothesis. By applying the original inferential method using potential donor templates absent from B cell genomes, we obtain estimates of the method's false positive rates. We find false positive rates of templated mutagenesis in murine and human immunoglobulin loci that are similar to or even higher than the original rate inferences, and by considering the bases used in substitution we find evidence that if templated mutagenesis occurs, it is at a low rate. We also show that the statistically significant results in the original paper can easily result from a slight misspecification of the null model.
 [2] arXiv:2007.01344 [pdf, other]

Title: Decoding asymptomatic COVID-19 infection and transmission
Comments: 18 pages, 5 figures
Subjects: Populations and Evolution (q-bio.PE); Biomolecules (q-bio.BM)
Coronavirus disease 2019 (COVID-19) continues to devastate public health and the world economy. One of the major challenges in controlling the COVID-19 outbreak is its asymptomatic infection and transmission, which are elusive and hard to defend against in most situations. The pathogenicity and virulence of asymptomatic COVID-19 remain mysterious. Based on the genotyping of 20656 Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) genome isolates, we reveal that asymptomatic infection is linked to the SARS-CoV-2 11083G>T mutation, i.e., a leucine (L) to phenylalanine (F) substitution at residue 37 (L37F) of non-structural protein 6 (NSP6). By analyzing the distribution of 11083G>T in various countries, we unveil that 11083G>T may correlate with the hypotoxicity of SARS-CoV-2. Moreover, we show a global decaying tendency of the 11083G>T mutation ratio, indicating that 11083G>T hinders SARS-CoV-2 transmission capacity. Sequence alignment found that both NSP6 and the residue 37 neighborhood are relatively conserved across several coronaviral species, indicating their importance in regulating host cell autophagy to undermine innate cellular defense against viral infection. Using machine learning and topological data analysis, we demonstrate that mutation L37F has made NSP6 energetically less stable. The rigidity and flexibility index and several network models suggest that mutation L37F may have compromised the NSP6 function, leading to a relatively weak SARS-CoV subtype. This assessment is in good agreement with our genotyping of SARS-CoV-2 evolution and transmission across various countries and regions over the past few months.
 [3] arXiv:2007.01363 [pdf, other]

Title: Optimal evolutionary decision-making to store immune memory
Subjects: Populations and Evolution (q-bio.PE); Biological Physics (physics.bio-ph); Quantitative Methods (q-bio.QM)
The adaptive immune system in vertebrates consists of highly diverse immune receptors to mount specific responses against a multitude of pathogens. A central feature of the adaptive immune system is the ability to form a memory to act more efficiently in future encounters with similar pathogens. However, memory formation, especially in B-cells, is one of the least understood cell fate decisions in the immune system. Here, we present a framework to characterize optimal strategies to store memory in order to maximize the utility of immune response to counter evolving pathogens throughout an organism's lifetime. To do so, we have incorporated the kinetics and energetics of memory response as ingredients of non-equilibrium decision-making between an adaptive exploration to mount a specific and novel response or exploitation of existing memory that can be activated rapidly yet with a reduced specificity against evolved pathogens. To achieve a long-term benefit for the host, we show that memory generation should be actively regulated and dependent on immune receptors' affinity, with a preference for cross-reactive receptors with a moderate affinity against pathogens as opposed to high-affinity receptors, a recipe that is consistent with recent experimental findings [1,2]. Moreover, we show that the specificity of memory should depend on the organism's lifespan, and shorter-lived organisms with fewer pathogenic encounters throughout their lifetime should store more cross-reactive memory. Overall, our framework provides a baseline to gauge the efficacy of immune memory formation in light of an organism's coevolutionary history with pathogens.
 [4] arXiv:2007.01378 [pdf, other]

Title: Generative embeddings of brain collective dynamics using variational autoencoders
Authors: Yonatan Sanz Perl, Hernán Boccacio, Ignacio Pérez-Ipiña, Federico Zamberlán, Helmut Laufs, Morten Kringelbach, Gustavo Deco, Enzo Tagliazucchi
Subjects: Neurons and Cognition (q-bio.NC)
We consider the problem of encoding pairwise correlations between coupled dynamical systems in a low-dimensional latent space based on few distinct observations. We used variational autoencoders (VAE) to embed temporal correlations between coupled nonlinear oscillators that model brain states in the wake-sleep cycle into a two-dimensional manifold. Training a VAE with samples generated using two different parameter combinations resulted in an embedding that represented the whole repertoire of collective dynamics, as well as the topology of the underlying connectivity network. We first followed this approach to infer the trajectory of brain states measured from wakefulness to deep sleep from the two endpoints of this trajectory; next, we showed that the same architecture was capable of representing the pairwise correlations of generic Landau-Stuart oscillators coupled by complex network topology.
 [5] arXiv:2007.01411 [pdf, other]

Title: Hospitalization dynamics during the first COVID-19 pandemic wave: SIR modelling compared to Belgium, France, Italy, Switzerland and New York City data
Authors: Gregory Kozyreff
Subjects: Populations and Evolution (q-bio.PE); Physics and Society (physics.soc-ph)
Using the classical Susceptible-Infected-Recovered epidemiological model, an analytical formula is derived for the number of beds occupied by COVID-19 patients. The analytical curve is fitted to data in Belgium, France, New York City and Switzerland, with a correlation coefficient exceeding 98.8%, suggesting that finer models are unnecessary with such macroscopic data. The fitting is used to extract estimates of the doubling time in the ascending phase of the epidemic, the mean recovery time and, for those who require medical intervention, the mean hospitalization time. Large variations can be observed among different outbreaks.
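As a rough illustration of the classical SIR model this abstract builds on, the sketch below integrates the standard equations numerically and computes the early-phase doubling time. It is not the paper's analytical bed-occupancy formula, which the abstract does not state. The function names, the forward-Euler scheme, and the example rates (`beta = 0.3`, `gamma = 0.1` per day) are assumptions chosen for illustration only.

```python
import math


def simulate_sir(beta, gamma, i0=1e-4, days=200.0, dt=0.01):
    """Integrate ds/dt = -beta*s*i, di/dt = beta*s*i - gamma*i, dr/dt = gamma*i.

    s, i, r are population fractions. Forward Euler is used for simplicity
    (an assumption of this sketch, not something taken from the paper).
    Returns a list of (t, s, i, r) tuples.
    """
    s, i, r = 1.0 - i0, i0, 0.0
    steps = int(round(days / dt))
    trajectory = [(0.0, s, i, r)]
    for k in range(1, steps + 1):
        new_infections = beta * s * i * dt
        new_recoveries = gamma * i * dt
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        trajectory.append((k * dt, s, i, r))
    return trajectory


def doubling_time(beta, gamma):
    """Doubling time in the ascending phase, where s is close to 1 and
    the infected fraction grows roughly as exp((beta - gamma) * t)."""
    return math.log(2.0) / (beta - gamma)


if __name__ == "__main__":
    beta, gamma = 0.3, 0.1  # hypothetical rates per day
    traj = simulate_sir(beta, gamma)
    t_peak, _, i_peak, _ = max(traj, key=lambda row: row[2])
    print("doubling time (days):", doubling_time(beta, gamma))
    print("mean recovery time (days):", 1.0 / gamma)
    print("peak infected fraction:", i_peak, "at day", t_peak)
```

In this parameterization the mean recovery time is `1 / gamma`; fitting `beta` and `gamma` to hospital data would require a separate model of hospital admission and stay, which is not attempted here.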
 [6] arXiv:2007.01436 [pdf, other]

Title: Attribution Methods Reveal Flaws in Fingerprint-Based Virtual Screening
Authors: Vikram Sundar (1), Lucy Colwell (1 and 2) ((1) Google Research, (2) Department of Chemistry, University of Cambridge)
Comments: 4 pages, 5 figures. In proceedings for the 2020 ICML workshop on Machine Learning Interpretability for Scientific Discovery
Subjects: Biomolecules (q-bio.BM); Quantitative Methods (q-bio.QM)
Fingerprint-based models for protein-ligand binding have demonstrated outstanding success on benchmark datasets; however, these models may not learn the correct binding rules. To assess this concern, we use in silico datasets with known binding rules to develop a general framework for evaluating model attribution. This framework identifies fragments that a model considers necessary to achieve a particular score, sidestepping the need for a model to be differentiable. Our results confirm that high-performing models may not learn the correct binding rule, and suggest concrete steps that can remedy this situation. We show that adding fragment-matched inactive molecules (decoys) to the data reduces attribution false negatives, while attribution false positives largely arise from the background correlation structure of molecular data. Normalizing for these background correlations helps to reveal the true binding logic. Our work highlights the danger of trusting attributions from high-performing models and suggests that a closer examination of fingerprint correlation structure and better decoy selection may help reduce misattributions.
 [7] arXiv:2007.01508 [pdf]

Title: Method to monitor the evolution of an epidemic in real time
Comments: 9 pages, 4 figures
Subjects: Populations and Evolution (q-bio.PE); Physics and Society (physics.soc-ph)
The emergence of an epidemic evokes the need to monitor its spread and assess and validate any mitigation measures enacted by governments and administrative bodies in real time. We present here a method to observe and quantify this spread and the response of affected populations and governing bodies and apply it to COVID-19 as a case study. This method provides means to simultaneously track in real time quantities such as the mortality and the recovery rates as well as the number of new infections caused by an infected person. With sufficient data, this method enables thorough monitoring and assessment of an epidemic without assumptions regarding the evolution of the pandemic in the future.
 [8] arXiv:2007.01585 [pdf, other]

Title: Mechanism underlying dynamic scaling properties observed in the contour of spreading epithelial monolayer
Comments: 11 pages, 6 figures, and supplemental materials
Subjects: Cell Behavior (q-bio.CB); Pattern Formation and Solitons (nlin.PS); Biological Physics (physics.bio-ph)
We found evidence of dynamic scaling in the spreading of an MDCK monolayer, which can be characterized by the Hurst exponent ${\alpha} = 0.86$ and the growth exponent ${\beta} = 0.73$, and theoretically and experimentally clarified the mechanism that governs the contour shape dynamics. During the spreading of the monolayer, it is known that so-called "leader cells" generate the driving force and lead the other cells. Our time-lapse observations of cell behavior showed that these leader cells appeared at the early stage of the spreading, and formed the monolayer protrusion. Informed by these observations, we developed a simple mathematical model that included differences in cell motility, cell-cell adhesion, and random cell movement. The model reproduced the quantitative characteristics obtained from the experiment, such as the spreading speed, the distribution of the increment, and the dynamic scaling law. Analysis of the model equation revealed that the model could reproduce different scaling laws, from ${\alpha} = 0.5, {\beta} = 0.25$ to ${\alpha} = 0.9, {\beta} = 0.75$, and that the exponents ${\alpha}, {\beta}$ were determined by two indices: $\rho t$ and $c$. Based on the analytical result, parameter estimation from the experimental results was achieved. The monolayer on the collagen-coated dishes showed a different scaling law, ${\alpha} = 0.74, {\beta} = 0.68$, suggesting that cell motility increased 9-fold. This result was consistent with the assay of the single-cell motility. Our study demonstrated that the dynamics of the contour of the monolayer were explained by the simple model, and proposed a new mechanism that exhibits the dynamic scaling property.
Cross-lists for Mon, 6 Jul 20
 [9] arXiv:2007.01337 (cross-list from math.CO) [pdf, other]

Title: Metric Dimension of Hamming Graphs and Applications to Computational Biology
Authors: Lucas Laird
Subjects: Combinatorics (math.CO); Quantitative Methods (q-bio.QM)
Genetic sequencing has become an increasingly affordable and accessible source of genomic data in computational biology. This data is often represented as $k$-mers, i.e., strings of some fixed length $k$ with symbols chosen from a reference alphabet. In contrast, some of the most effective and well-studied machine learning algorithms require numerical representations of the data. The concept of metric dimension of the so-called Hamming graphs presents a promising way to address this issue. A subset of vertices in a graph is said to be resolving when the distances to those vertices uniquely characterize every vertex in the graph. The metric dimension of a graph is the size of a smallest resolving subset of vertices. Finding the metric dimension of a general graph is a challenging problem, NP-complete in fact. Recently, an efficient algorithm for finding resolving sets in Hamming graphs has been proposed, which suffices to uniquely embed $k$-mers into a real vector space. Since the dimension of the embedding is the cardinality of the associated resolving set, determining whether or not a node can be removed from a resolving set while keeping it resolving is of great interest. This can be quite challenging for large graphs since only a brute-force approach is known for checking whether a set is a resolving set or not. In this thesis, we characterize resolvability of Hamming graphs in terms of a linear system over a finite domain: a set of nodes is resolving if and only if the linear system has only a trivial solution over said domain. We can represent the domain as the roots of a polynomial system, so the apparatus of Gr\"obner bases comes in handy to determine whether or not a set of nodes is resolving. As proof of concept, we study the resolvability of Hamming graphs associated with octapeptides, i.e., protein sequences of length eight.
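The definition of a resolving set given in this abstract can be checked directly by enumeration. The sketch below is the brute-force check the abstract mentions, not the linear-system or Gröbner-basis characterization the thesis develops. The function names `hamming`, `embed`, and `is_resolving` are hypothetical, and the approach is only practical for small alphabets and short $k$-mers, since it visits every vertex of the Hamming graph.

```python
from itertools import product


def hamming(u, v):
    """Hamming distance between two equal-length sequences."""
    return sum(x != y for x, y in zip(u, v))


def embed(kmer, resolving_set):
    """Map a k-mer to its vector of distances to the chosen vertices."""
    return tuple(hamming(kmer, r) for r in resolving_set)


def is_resolving(candidates, alphabet, k):
    """Brute-force test: the candidate set is resolving if and only if
    no two vertices of the Hamming graph share a distance vector."""
    seen = set()
    for vertex in product(alphabet, repeat=k):
        signature = embed(vertex, candidates)
        if signature in seen:
            return False  # two vertices are indistinguishable
        seen.add(signature)
    return True


if __name__ == "__main__":
    # Binary strings of length 2: a single vertex cannot separate 01 from 10.
    print(is_resolving(["00"], "01", 2))
    print(is_resolving(["00", "01"], "01", 2))
```

The enumeration costs on the order of `len(alphabet) ** k` distance vectors, which is why a characterization that avoids visiting every vertex is of interest for octapeptides.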
 [10] arXiv:2007.01383 (cross-list from eess.IV) [pdf, other]

Title: Deep Interactive Learning: An Efficient Labeling Approach for Deep Learning-Based Osteosarcoma Treatment Response Assessment
Authors: David Joon Ho, Narasimhan P. Agaram, Peter J. Schueffler, Chad M. Vanderbilt, Marc-Henri Jean, Meera R. Hameed, Thomas J. Fuchs
Comments: Accepted at MICCAI 2020
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Quantitative Methods (q-bio.QM)
Osteosarcoma is the most common malignant primary bone tumor. Standard treatment includes pre-operative chemotherapy followed by surgical resection. The response to treatment, as measured by the ratio of necrotic tumor area to overall tumor area, is a known prognostic factor for overall survival. This assessment is currently done manually by pathologists looking at glass slides under the microscope, which may not be reproducible due to its subjective nature. Convolutional neural networks (CNNs) can be used for automated segmentation of viable and necrotic tumor on osteosarcoma whole slide images. One bottleneck for supervised learning is that large amounts of accurate annotations are required for training, which is a time-consuming and expensive process. In this paper, we describe Deep Interactive Learning (DIaL) as an efficient labeling approach for training CNNs. After an initial labeling step is done, annotators only need to correct mislabeled regions from previous segmentation predictions to improve the CNN model until satisfactory predictions are achieved. Our experiments show that our CNN model trained by only 7 hours of annotation using DIaL can successfully estimate ratios of necrosis within the expected inter-observer variation rate for a non-standardized manual surgical pathology task.
 [11] arXiv:2007.01424 (cross-list from physics.soc-ph) [pdf, ps, other]

Title: Active Control and Sustained Oscillations in actSIS Epidemic Dynamics
Subjects: Physics and Society (physics.soc-ph); Dynamical Systems (math.DS); Populations and Evolution (q-bio.PE)
An actively controlled Susceptible-Infected-Susceptible (actSIS) contagion model is presented for studying epidemic dynamics with continuous-time feedback control of infection rates. Our work is inspired by the observation that epidemics can be controlled through decentralized disease-control strategies such as quarantining, sheltering in place, social distancing, etc., where individuals actively modify their contact rates with others in response to observations of infection levels in the population. Accounting for a time lag in observations and categorizing individuals into distinct sub-populations based on their risk profiles, we show that the actSIS model manifests qualitatively different features as compared with the SIS model. In a homogeneous population of risk-averters, the endemic equilibrium is always reduced, although the transient infection level can exhibit overshoot or undershoot. In a homogeneous population of risk-tolerating individuals, the system exhibits bistability, which can also lead to reduced infection. For a heterogeneous population composed of risk-tolerators and risk-averters, we prove conditions on model parameters for the existence of a Hopf bifurcation and sustained oscillations in the infected population.
 [12] arXiv:2007.01516 (cross-list from cs.LG) [pdf, other]

Title: Deep interpretability for GWAS
Authors: Deepak Sharma, Audrey Durand, Marc-André Legault, Louis-Philippe Lemieux Perreault, Audrey Lemaçon, Marie-Pierre Dubé, Joelle Pineau
Comments: Accepted at ICML 2020 workshop on ML Interpretability for Scientific Discovery
Subjects: Machine Learning (cs.LG); Genomics (q-bio.GN); Applications (stat.AP); Machine Learning (stat.ML)
Genome-Wide Association Studies are typically conducted using linear models to find genetic variants associated with common diseases. In these studies, association testing is done on a variant-by-variant basis, possibly missing out on nonlinear interaction effects between variants. Deep networks can be used to model these interactions, but they are difficult to train and interpret on large genetic datasets. We propose a method that uses the gradient-based deep interpretability technique named DeepLIFT to show that known diabetes genetic risk factors can be identified using deep models along with possibly novel associations.
 [13] arXiv:2007.01583 (cross-list from physics.soc-ph) [pdf, other]

Title: COVID-19 lockdown induces structural changes in mobility networks - Implication for mitigating disease dynamics
Comments: 16 pages, 6 figures
Subjects: Physics and Society (physics.soc-ph); Populations and Evolution (q-bio.PE)
In the wake of the COVID-19 pandemic many countries implemented containment measures to reduce disease transmission. Studies using digital data sources show that the mobility of individuals was effectively reduced in multiple countries. However, it remains unclear whether these reductions caused deeper structural changes in mobility networks, and how such changes may affect dynamic processes on the network. Here we use movement data of mobile phone users to show that mobility in Germany has not only been reduced considerably: Lockdown measures caused substantial and lasting structural changes in the mobility network. We find that long-distance travel was reduced disproportionately strongly. The trimming of long-range network connectivity leads to a more local, clustered network and a moderation of the "small-world" effect. We demonstrate that these structural changes have a considerable effect on epidemic spreading processes by "flattening" the epidemic curve and delaying the spread to geographically distant regions.
Replacements for Mon, 6 Jul 20
 [14] arXiv:1708.00909 (replaced) [pdf]

Title: Machine learning for neural decoding
Authors: Joshua I. Glaser, Ari S. Benjamin, Raeed H. Chowdhury, Matthew G. Perich, Lee E. Miller, Konrad P. Kording
Subjects: Neurons and Cognition (q-bio.NC); Machine Learning (cs.LG); Machine Learning (stat.ML)
 [15] arXiv:1908.06180 (replaced) [pdf, ps, other]

Title: Multi-View Broad Learning System for Primate Oculomotor Decision Decoding
Journal-ref: IEEE Transactions on Neural Systems and Rehabilitation Engineering, 2020
Subjects: Neurons and Cognition (q-bio.NC); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
 [16] arXiv:1911.04218 (replaced) [pdf, other]

Title: Dynamical Heart Beat Correlations during Running
Comments: 19 pages, 10 figures
Subjects: Data Analysis, Statistics and Probability (physics.data-an); Medical Physics (physics.med-ph); Quantitative Methods (q-bio.QM)
 [17] arXiv:2001.10313 (replaced) [pdf, other]

Title: Evolutionary dynamics of higher-order interactions in social networks
Authors: Unai Alvarez-Rodriguez, Federico Battiston, Guilherme Ferraz de Arruda, Yamir Moreno, Matjaz Perc, Vito Latora
Comments: 38 pages, 10 figures, 1 table - Was "Evolutionary Dynamics of Higher-Order Interactions"
Subjects: Physics and Society (physics.soc-ph); Social and Information Networks (cs.SI); Populations and Evolution (q-bio.PE)
 [18] arXiv:2002.06060 (replaced) [pdf, other]

Title: Causality in cognitive neuroscience: concepts, challenges, and distributional robustness
Subjects: Neurons and Cognition (q-bio.NC); Applications (stat.AP); Methodology (stat.ME)
 [19] arXiv:2006.05581 (replaced) [pdf, other]

Title: Semiparametric Bayesian Inference for the Transmission Dynamics of COVID-19 with a State-Space Model
Subjects: Methodology (stat.ME); Populations and Evolution (q-bio.PE); Applications (stat.AP)
 [20] arXiv:2006.06987 (replaced) [pdf, other]

Title: Improved estimations of stochastic chemical kinetics by finite state expansion
Comments: 33 pages, 9 figures
Subjects: Molecular Networks (q-bio.MN); Computational Engineering, Finance, and Science (cs.CE)
 [21] arXiv:2006.15336 (replaced) [pdf, other]

Title: Spatiotemporal predictive modeling framework for infectious disease spread
Comments: 9 pages, 4 figures
Subjects: Populations and Evolution (q-bio.PE); Dynamical Systems (math.DS); Physics and Society (physics.soc-ph)
 [22] arXiv:2007.00577 (replaced) [pdf, other]

Title: Incorporating age and delay into models for biophysical systems
Comments: 21 pages, 4 figures. Under review for publication
Subjects: Populations and Evolution (q-bio.PE)
