Quantitative Biology
New submissions
New submissions for Wed, 1 Feb 23
 [1] arXiv:2301.13201 [pdf, other]

Title: Numerical Issues for a Nonautonomous Logistic Model
Comments: 9 pages, 3 figures
Subjects: Quantitative Methods (q-bio.QM); Numerical Analysis (math.NA)
The logistic equation has been extensively used to model biological phenomena across a variety of disciplines and has provided valuable insight into how our universe operates. Incorporating time-dependent parameters into the logistic equation allows the modeling of more complex behavior than its autonomous analog, such as a tumor's varying growth rate under treatment, or the expansion of bacterial colonies under varying resource conditions. Some of the most commonly used numerical solvers produce vastly different approximations for a nonautonomous logistic model with a periodically varying growth rate that changes sign. Incorrect, inconsistent, or even unstable approximate solutions for this nonautonomous problem can occur from some of the most frequently used numerical methods, including the lsoda, implicit backwards difference, and Runge-Kutta methods, all of which employ a black-box framework. Meanwhile, a simple, manually programmed Runge-Kutta method is robust enough to accurately capture the analytical solution for biologically reasonable parameters and consistently produce reliable simulations. Consistency and reliability of numerical methods are fundamental for simulating nonautonomous differential equations and dynamical systems, particularly when applications are physically or biologically informed.
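As a rough illustration of the kind of hand-written solver the abstract describes, the sketch below applies a classical fourth-order Runge-Kutta scheme to a nonautonomous logistic equation dN/dt = r(t) N (1 - N/K) and compares it with the closed-form solution. The growth rate r(t) = a cos(ωt), the parameter values, and all function names are assumptions chosen for illustration; the abstract does not specify the authors' model or code.

```python
import math

# Assumed (hypothetical) parameters: amplitude, angular frequency, carrying capacity.
A, OMEGA, K = 1.0, 1.0, 10.0


def growth_rate(t):
    """Assumed periodic growth rate r(t) = A*cos(OMEGA*t); it changes sign."""
    return A * math.cos(OMEGA * t)


def logistic_rhs(t, n):
    """Right-hand side of dN/dt = r(t) * N * (1 - N/K)."""
    return growth_rate(t) * n * (1.0 - n / K)


def rk4(f, t0, y0, t_end, h):
    """Classical fixed-step RK4 integration of y' = f(t, y) from t0 to t_end."""
    steps = round((t_end - t0) / h)
    y = y0
    for i in range(steps):
        t = t0 + i * h  # recompute t to avoid accumulating round-off
        k1 = f(t, y)
        k2 = f(t + h / 2, y + h / 2 * k1)
        k3 = f(t + h / 2, y + h / 2 * k2)
        k4 = f(t + h, y + h * k3)
        y += h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
    return y


def exact(t, n0):
    """Closed form N(t) = K*N0*e^R / (K + N0*(e^R - 1)), R(t) = A*sin(OMEGA*t)/OMEGA."""
    e = math.exp(A * math.sin(OMEGA * t) / OMEGA)
    return K * n0 * e / (K + n0 * (e - 1.0))


if __name__ == "__main__":
    approx = rk4(logistic_rhs, 0.0, 1.0, 20.0, 0.01)
    print(approx, exact(20.0, 1.0))
```

The closed form follows from the substitution u = 1/N, which turns the equation into a linear one in u; it gives a reference against which any solver's output can be checked.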
 [2] arXiv:2301.13387 [pdf, other]

Title: Deep Learning for Reference-Free Geolocation for Poplar Trees
Comments: Accepted at NeurIPS 2022 AI for Science Workshop
Subjects: Genomics (q-bio.GN); Machine Learning (cs.LG)
A core task in precision agriculture is the identification of climatic and ecological conditions that are advantageous for a given crop. The most succinct approach is geolocation, which is concerned with locating the native region of a given sample based on its genetic makeup. Here, we investigate genomic geolocation of Populus trichocarpa, or poplar, which has been identified by the US Department of Energy as a fast-rotation biofuel crop to be harvested nationwide. In particular, we approach geolocation from a reference-free perspective, circumventing the need for compute-intensive processes such as variant calling and alignment. Our model, MashNet, predicts latitude and longitude for poplar trees from randomly sampled, unaligned sequence fragments. We show that our model performs comparably to Locator, a state-of-the-art method based on aligned whole-genome sequence data. MashNet achieves an error of 34.0 km^2 compared to Locator's 22.1 km^2. MashNet allows growers to quickly and efficiently identify natural varieties that will be most productive in their growth environment based on genotype. This paper explores geolocation for precision agriculture while providing a framework and data source for further development by the machine learning community.
Cross-lists for Wed, 1 Feb 23
 [3] arXiv:2301.13245 (cross-list from cs.DS) [pdf, other]

Title: A Safety Framework for Flow Decomposition Problems via Integer Linear Programming
Subjects: Data Structures and Algorithms (cs.DS); Combinatorics (math.CO); Genomics (q-bio.GN)
Many important problems in Bioinformatics (e.g., assembly or multi-assembly) admit multiple solutions, while the final objective is to report only one. A common approach to deal with this uncertainty is finding safe partial solutions (e.g., contigs) which are common to all solutions. Previous research on safety has focused on polynomial-time solvable problems, whereas many successful and natural models are NP-hard to solve, leaving a lack of "safety tools" for such problems. We propose the first method for computing all safe solutions for an NP-hard problem, minimum flow decomposition. We obtain our results by developing a "safety test" for paths based on a general Integer Linear Programming (ILP) formulation. Moreover, we provide implementations with practical optimizations aimed at reducing the total ILP time, the most efficient of these being based on a recursive group-testing procedure.
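To make the notion of a safe partial solution concrete, here is a minimal sketch of the definition only: a candidate path counts as safe if it appears as a contiguous subpath of some path in every solution. It assumes the set of solutions has already been enumerated and is passed in explicitly, which is not how the paper proceeds (the abstract describes an ILP-based test). The function names and the toy graph labels are hypothetical.

```python
def is_subpath(candidate, path):
    """True if `candidate` occurs as a contiguous run of nodes inside `path`."""
    n = len(candidate)
    return any(path[i:i + n] == candidate for i in range(len(path) - n + 1))


def is_safe(candidate, solutions):
    """True if every solution contains at least one path that has
    `candidate` as a contiguous subpath."""
    return all(
        any(is_subpath(candidate, path) for path in solution)
        for solution in solutions
    )


if __name__ == "__main__":
    # Two hypothetical decompositions of the same toy flow network.
    solutions = [
        [["s", "a", "b", "t"], ["s", "a", "c", "t"]],
        [["s", "a", "b", "t"], ["s", "d", "c", "t"]],
    ]
    print(is_safe(["a", "b", "t"], solutions))  # present in both solutions
    print(is_safe(["s", "a", "c"], solutions))  # missing from the second
```

Enumerating all solutions is generally infeasible for an NP-hard problem, which is why a dedicated safety test is needed in practice; the brute-force check above is useful only for tiny examples.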
Results: Experimental results on the transcriptome datasets of Shao and Kingsford (TCBB, 2017) show that all safe paths for minimum flow decompositions correctly recover up to 90% of the full RNA transcripts, which is at least 25% more than previously known safe paths, such as (Caceres et al. TCBB, 2021), (Zheng et al., RECOMB 2021), (Khan et al., RECOMB 2022, ESA 2022). Moreover, despite the NP-hardness of the problem, we can report all safe paths for 99.8% of the over 27,000 nontrivial graphs of this dataset in only 1.5 hours. Our results suggest that, on perfect data, there is less ambiguity than thought in the notoriously hard RNA assembly problem.
Availability: https://github.com/algbio/mfdsafety
 [4] arXiv:2301.13290 (cross-list from cond-mat.stat-mech) [pdf, other]

Title: Resonant noise amplification in a predator-prey model with quasi-discrete generations
Comments: 14 pages, 12 figures
Subjects: Statistical Mechanics (cond-mat.stat-mech); Populations and Evolution (q-bio.PE)
Predator-prey models have been shown to exhibit resonance-like behaviour, in which random fluctuations in the number of organisms (demographic noise) are amplified when their frequency is close to the natural oscillatory frequency of the system. This behaviour has been traditionally studied in models with exponentially distributed replication and death times. Here we consider a biologically more realistic model, in which organisms replicate quasi-synchronously such that the distribution of replication times has a narrow maximum at some $T>0$ corresponding to the mean doubling time. We show that when the frequency of replication $f=1/T$ is tuned to the natural oscillatory frequency of the predator-prey model, the system exhibits oscillations that are much stronger than in the model with Poissonian (non-synchronous) replication and death. The effect can be explained by resonant amplification of coloured noise generated by quasi-synchronous replication events. To show this, we consider a single-species model with quasi-synchronous replication. We calculate the spectrum and the amplitude of demographic noise in this model, and use these results to obtain these quantities for the two-species model.
 [5] arXiv:2301.13644 (cross-list from cs.LG) [pdf, other]

Title: Exploring QSAR Models for Activity-Cliff Prediction
Comments: Submitted to Journal of Cheminformatics
Subjects: Machine Learning (cs.LG); Biomolecules (q-bio.BM); Machine Learning (stat.ML)
Pairs of similar compounds that only differ by a small structural modification but exhibit a large difference in their binding affinity for a given target are known as activity cliffs (ACs). It has been hypothesised that quantitative structure-activity relationship (QSAR) models struggle to predict ACs and that ACs thus form a major source of prediction error. However, a study to explore the AC-prediction power of modern QSAR methods and its relationship to general QSAR-prediction performance is lacking. We systematically construct nine distinct QSAR models by combining three molecular representation methods (extended-connectivity fingerprints, physicochemical-descriptor vectors and graph isomorphism networks) with three regression techniques (random forests, k-nearest neighbours and multilayer perceptrons); we then use each resulting model to classify pairs of similar compounds as ACs or non-ACs and to predict the activities of individual molecules in three case studies: dopamine receptor D2, factor Xa, and SARS-CoV-2 main protease. We observe low AC-sensitivity amongst the tested models when the activities of both compounds are unknown, but a substantial increase in AC-sensitivity when the actual activity of one of the compounds is given. Graph isomorphism features are found to be competitive with or superior to classical molecular representations for AC-classification and can thus be employed as baseline AC-prediction models or simple compound-optimisation tools. For general QSAR-prediction, however, extended-connectivity fingerprints still consistently deliver the best performance. Our results provide strong support for the hypothesis that indeed QSAR methods frequently fail to predict ACs. We propose twin-network training for deep learning models as a potential future pathway to increase AC-sensitivity and thus overall QSAR performance.
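The step of turning a regression model's activity predictions into AC/non-AC labels, and the two evaluation settings the abstract contrasts, can be sketched as follows. This is an illustrative reading, not the authors' code: the threshold of 2.0 activity units, the function names, and the toy activity values are all assumptions made for the example.

```python
def is_activity_cliff(activity_a, activity_b, threshold=2.0):
    """Label a pair as an AC when the activity gap exceeds an assumed threshold."""
    return abs(activity_a - activity_b) > threshold


def ac_sensitivity(true_pairs, estimated_pairs, threshold=2.0):
    """Fraction of true ACs that are also labelled AC from the estimated activities."""
    hits = misses = 0
    for (ta, tb), (ea, eb) in zip(true_pairs, estimated_pairs):
        if is_activity_cliff(ta, tb, threshold):
            if is_activity_cliff(ea, eb, threshold):
                hits += 1
            else:
                misses += 1
    total = hits + misses
    return hits / total if total else float("nan")


if __name__ == "__main__":
    # Toy values for illustration only; they are not data from the paper.
    true_pairs = [(9.0, 6.0), (8.0, 5.5), (7.0, 6.8)]
    predicted_pairs = [(8.0, 6.5), (7.5, 5.0), (7.1, 6.9)]

    # Setting 1: both activities come from the model.
    print(ac_sensitivity(true_pairs, predicted_pairs))

    # Setting 2: the true activity of the first compound is given,
    # and only the second compound's activity is predicted.
    one_known = [(t[0], p[1]) for t, p in zip(true_pairs, predicted_pairs)]
    print(ac_sensitivity(true_pairs, one_known))
```

In the toy data, regression toward the mean shrinks the predicted gap for the first pair, so the cliff is missed when both activities are predicted but recovered once one true activity is supplied.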
 [6] arXiv:2301.13659 (cross-list from cs.CV) [pdf, other]

Title: Spyker: High-performance Library for Spiking Deep Neural Networks
Comments: 11 pages, 6 figures, 6 listings
Subjects: Computer Vision and Pattern Recognition (cs.CV); Neurons and Cognition (q-bio.NC)
Spiking neural networks (SNNs) have been recently brought to light due to their promising capabilities. SNNs simulate the brain with higher biological plausibility compared to previous generations of neural networks. Learning with fewer samples and consuming less power are among the key features of these networks. However, the theoretical advantages of SNNs have not been seen in practice due to the slowness of simulation tools and the impracticality of the proposed network structures. In this work, we implement a high-performance library named Spyker using C++/CUDA from scratch that outperforms its predecessor. Several SNNs are implemented in this work with different learning rules (spike-timing-dependent plasticity and reinforcement learning) using Spyker that achieve significantly better runtimes, to prove the practicality of the library in the simulation of large-scale networks. To our knowledge, no such tools have been developed to simulate large-scale spiking neural networks with high performance using a modular structure. Furthermore, a comparison of the represented stimuli extracted from Spyker to recorded electrophysiology data is performed to demonstrate the applicability of SNNs in describing the underlying neural mechanisms of the brain functions. The aim of this library is to take a significant step toward uncovering the true potential of the brain computations using SNNs.
 [7] arXiv:2301.13787 (cross-list from physics.bio-ph) [pdf]

Title: Designing Covalent Organic Framework-based Light-driven Microswimmers towards Intraocular Theranostic Applications
Authors: Varun Sridhar, Erdost Yildiz, Andrés Rodríguez-Camargo, Xianglong Lyu, Liang Yao, Paul Wrede, Amirreza Aghakhani, Mukrime Birgul Akolpoglu, Filip Podjaski, Bettina V. Lotsch, Metin Sitti
Subjects: Biological Physics (physics.bio-ph); Materials Science (cond-mat.mtrl-sci); Soft Condensed Matter (cond-mat.soft); Quantitative Methods (q-bio.QM)
Although micromachines with tailored functionalities enable targeted therapeutic applications in biological environments, their controlled motion in biological media and drug delivery functions usually require sophisticated designs and complex propulsion apparatuses for practical applications. Covalent organic frameworks (COFs), new chemically versatile and nanoporous materials, offer microscale multi-purpose solutions, which have not been explored in light-driven micromachines. We describe and compare two different types of COFs, uniformly spherical TABP-PDA-COF sub-micron particles and texturally highly nanoporous, irregular, micron-sized TpAzo-COF particles, as light-driven microrobots. They can be used as highly efficient visible-light-driven drug carriers in aqueous ionic and cellular media, even in intraocular fluids. Their absorption ranging down to red light enables phototaxis even in deeper biological media, and the organic nature of COFs enables their biocompatibility. The inherently porous structure, with ~2.5 nm structural pores and large surface areas, allows for targeted and efficient drug loading even for insoluble drugs and peptides, which can be released on demand. Also, indocyanine green (ICG) dye loading in the pores enables photoacoustic imaging or optical coherence tomography and hyperthermia in operando conditions. The real-time visualization of the drug-loaded COF microswimmers enables new insights into the function of porous organic micromachines, which will be useful to solve various drug delivery problems.
Replacements for Wed, 1 Feb 23
 [8] arXiv:2109.05605 (replaced) [pdf, other]

Title: A Perturbative Approach to the Analysis of Many-Compartment Models Characterized by the Presence of Waning Immunity
Authors: Shoshana Elgart
Comments: 24 pages, 9 figures
Subjects: Dynamical Systems (math.DS); Populations and Evolution (q-bio.PE)
 [9] arXiv:2206.02795 (replaced) [pdf]

Title: Forecasting COVID-19 cases using Statistical Models and Ontology-based Semantic Modelling: A real time data analytics approach
Subjects: Populations and Evolution (q-bio.PE); Machine Learning (cs.LG)
 [10] arXiv:2211.16553 (replaced) [pdf, other]

Title: Hierarchically Clustered PCA, LLE, and CCA via a Convex Clustering Penalty
Comments: 11 pages, 4 figures, 3 tables
Subjects: Machine Learning (cs.LG); Quantitative Methods (q-bio.QM); Machine Learning (stat.ML)
 [11] arXiv:2212.08379 (replaced) [pdf, other]

Title: GeneFormer: Learned Gene Compression using Transformer-based Context Modeling
Subjects: Machine Learning (cs.LG); Genomics (q-bio.GN)
 [12] arXiv:2301.12422 (replaced) [pdf, other]

Title: PhaVIP: Phage VIrion Protein classification based on chaos game representation and Vision Transformer
Comments: 15 pages, 13 figures
Subjects: Genomics (q-bio.GN)