Quantitative Biology

New submissions

New submissions for Thu, 25 Apr 24

[1]  arXiv:2404.15300 [pdf, ps, other]
Title: Bioregionalization analyses with the bioregion R-package
Comments: 27 pages, 2 figures
Subjects: Quantitative Methods (q-bio.QM)

Bioregionalization consists of identifying spatial units with similar species composition and is a classical approach in the fields of biogeography and macroecology. The recent emergence of global databases, improvements in computational power, and the development of clustering algorithms from network theory have led to several major updates of the bioregionalizations of many taxa. A typical bioregionalization workflow involves five steps: formatting the input data, computing a (dis)similarity matrix, selecting a clustering algorithm, evaluating the resulting bioregionalization, and mapping and interpreting the bioregions. For most of these steps, there are many options available in the methods and R packages. Here, we present bioregion, a package that includes all the steps of a bioregionalization workflow under a single architecture, with an exhaustive list of the clustering algorithms used in biogeography and macroecology. These algorithms include (non-)hierarchical algorithms as well as community detection algorithms from network theory. Some key methods from the literature, such as Infomap or OSLOM, that were not available in the R language are included in bioregion. By allowing methods from different fields to communicate easily, bioregion will allow a reproducible and complete comparison of the different bioregionalization methods, which is still missing in the literature.

[2]  arXiv:2404.15318 [pdf, ps, other]
Title: VASARI-auto: equitable, efficient, and economical featurisation of glioma MRI
Comments: 28 pages, 6 figures, 1 table
Subjects: Quantitative Methods (q-bio.QM); Computer Vision and Pattern Recognition (cs.CV); Tissues and Organs (q-bio.TO)

The VASARI MRI feature set is a quantitative system designed to standardise glioma imaging descriptions. Though effective, deriving VASARI is time-consuming and seldom used in clinical practice. This is a task that machine learning could plausibly automate. Using glioma data from 1172 patients, we developed VASARI-auto, an automated labelling software applied to both open-source lesion masks and our openly available tumour segmentation model. In parallel, two consultant neuroradiologists independently quantified VASARI features in a subsample of 100 glioblastoma cases. We quantified: 1) agreement across neuroradiologists and VASARI-auto; 2) calibration of performance equity; 3) an economic workforce analysis; and 4) fidelity in predicting patient survival. Tumour segmentation was compatible with the current state of the art and equally performant regardless of age or sex. A modest inter-rater variability between in-house neuroradiologists was comparable to that between neuroradiologists and VASARI-auto, with far higher agreement between VASARI-auto methods. The time taken for neuroradiologists to derive VASARI was substantially higher than for VASARI-auto (mean time per case 317 vs. 3 seconds). A UK hospital workforce analysis forecast that three years of VASARI featurisation would demand 29,777 consultant neuroradiologist workforce hours (£1,574,935), reducible to 332 hours of computing time (and £146 of power) with VASARI-auto. The best-performing survival model utilised VASARI-auto features as opposed to those derived by neuroradiologists. VASARI-auto is a highly efficient automated labelling system with equitable performance across patient age or sex, a favourable economic profile if used as a decision support tool, and with non-inferior fidelity in downstream patient survival prediction. Future work should iterate upon and integrate such tools to enhance patient care.

[3]  arXiv:2404.15355 [pdf, ps, other]
Title: Frailty Assessment in Aortic Stenosis based on Dynamic Interconnection between Cardiac and Motor Systems
Comments: arXiv admin note: substantial text overlap with arXiv:2303.13591
Subjects: Quantitative Methods (q-bio.QM)

Background: Aortic stenosis (AS) is the most common acquired valvar disease and is associated with increased risk for frailty. Frailty as a geriatric syndrome is associated with muscle weakness and a compromised autonomic nervous system (ANS) performance in older adults. The purpose of the current work was to assess differences in both motor and ANS performance, and interaction between them, as symptoms of frailty in community dwelling older adults with and without AS. Results: Eighty-six participants were recruited, including 30 with (age=72$\pm$11, 10 non-frail and 20 pre-frail/frail) and 56 without AS (age=80$\pm$8, 12 non-frail and 44 pre-frail/frail). There was a significant difference in UEF motor score between older adults with and without AS (p<0.01, mean values of 0.57$\pm$0.25 and 0.48$\pm$0.23, respectively). Differences in UEF motor score were also observed between the frailty groups (p=0.02, mean values of 0.55$\pm$0.24 and 0.40$\pm$0.20 for pre-frail/frail and non-frail, respectively). CCM parameters showed significant differences between the frailty groups (p=0.02, mean CCM of 0.69$\pm$0.05 for non-frail and 0.54$\pm$0.03 for pre-frail/frail), but not between the AS groups (p>0.70). No significant interaction was observed between frailty and AS condition (p>0.08). Conclusion: Current findings suggest that ANS measures may be highly associated with frailty regardless of AS condition. Combining motor and HR dynamics parameters in a multimodal model may provide a promising tool for frailty assessment.

[4]  arXiv:2404.15369 [pdf, other]
Title: Can a Machine be Conscious? Towards Universal Criteria for Machine Consciousness
Comments: This work was supported by the UKRI CDT in AI for Healthcare (Grant No. EP/S023283/1)
Subjects: Neurons and Cognition (q-bio.NC); Artificial Intelligence (cs.AI); Computers and Society (cs.CY)

As artificially intelligent systems become more anthropomorphic and pervasive, and their potential impact on humanity more urgent, discussions about the possibility of machine consciousness have significantly intensified, and it is sometimes seen as 'the holy grail'. Many concerns have been voiced about the ramifications of creating an artificial conscious entity. This is compounded by a marked lack of consensus around what constitutes consciousness and by an absence of a universal set of criteria for determining consciousness. By going into depth on the foundations and characteristics of consciousness, we propose five criteria for determining whether a machine is conscious, which can also be applied more generally to any entity. This paper aims to serve as a primer and stepping stone for researchers of consciousness, be they in philosophy, computer science, medicine, or any other field, to further pursue this holy grail of philosophy, neuroscience and artificial intelligence.

[5]  arXiv:2404.15387 [pdf, other]
Title: Machine Learning Applied to the Detection of Mycotoxin in Food: A Review
Comments: 39 pages, 8 figures, review paper
Subjects: Quantitative Methods (q-bio.QM); Machine Learning (cs.LG)

Mycotoxins, toxic secondary metabolites produced by certain fungi, pose significant threats to global food safety and public health. These compounds can contaminate a variety of crops, leading to economic losses and health risks to both humans and animals. Traditional lab analysis methods for mycotoxin detection can be time-consuming and may not always be suitable for large-scale screenings. However, in recent years, machine learning (ML) methods have gained popularity for use in the detection of mycotoxins and in the food safety industry in general, due to their accurate and timely predictions. We provide a systematic review on some of the recent ML applications for detecting/predicting the presence of mycotoxin on a variety of food ingredients, highlighting their advantages, challenges, and potential for future advancements. We address the need for reproducibility and transparency in ML research through open access to data and code. An observation from our findings is the frequent lack of detailed reporting on hyperparameters in many studies as well as a lack of open source code, which raises concerns about the reproducibility and optimisation of the ML models used. The findings reveal that while the majority of studies predominantly utilised neural networks for mycotoxin detection, there was a notable diversity in the types of neural network architectures employed, with convolutional neural networks being the most popular.

[6]  arXiv:2404.15634 [pdf, other]
Title: A Minimal Framework for Optimizing Vaccination Protocols Targeting Highly Mutable Pathogens
Subjects: Populations and Evolution (q-bio.PE); Statistical Mechanics (cond-mat.stat-mech); Biological Physics (physics.bio-ph)

A persistent public health challenge is finding immunization schemes that are effective in combating highly mutable pathogens such as HIV and influenza viruses. To address this, we analyze a simplified model of affinity maturation, the Darwinian evolutionary process B cells undergo during immunization. The vaccination protocol dictates selection forces that steer affinity maturation to generate antibodies. We focus on determining the optimal selection forces exerted by a generic time-dependent vaccination protocol to maximize production of broadly neutralizing antibodies (bnAbs) that can protect against a broad spectrum of pathogen strains. The model lends itself to a path integral representation and operator approximations within a mean-field limit, providing guiding principles for optimizing time-dependent vaccine-induced selection forces to enhance bnAb generation. We compare our analytical mean-field results with the outcomes of stochastic simulations and discuss their similarities and differences.

[7]  arXiv:2404.15776 [pdf, ps, other]
Title: The Past, Present, and Future of Plant Stress Research
Subjects: Quantitative Methods (q-bio.QM)

Life finds a way. For sessile organisms like plants, the need to adapt to changes in the environment is even more poignant. For humanity, the need to develop crops that can grow in diverse environments and feed our growing population is an existential one. The development of fast-growing, high-yielding crop varieties sparked the Green Revolution, and the advent of the genomics era enabled the development of customized transgenic crops enhanced for specific traits or resistances. Today, the proliferation of artificial intelligence (AI) allows scientists to rapidly screen through massive and complex datasets to uncover elusive patterns in the data, enabling us to create more robust and faster models for prediction and hypothesis generation in a bid to develop more stress-resilient plants. This review aims to provide an overview of the evolution of environmental stress research across the plant kingdom over the past fifty years. It will cover historical landmark concepts and discoveries that were seminal in advancing the field, provide a global snapshot of our current scientific progress, and conclude with a discussion on the advent of AI tools that would help accelerate scientific discovery.

[8]  arXiv:2404.15805 [pdf, other]
Title: Beyond ESM2: Graph-Enhanced Protein Sequence Modeling with Efficient Clustering
Subjects: Biomolecules (q-bio.BM); Machine Learning (cs.LG)

Proteins are essential to life's processes, underpinning evolution and diversity. Advances in sequencing technology have revealed millions of proteins, underscoring the need for sophisticated pre-trained protein models for biological analysis and AI development. Facebook's ESM2, the most advanced protein language model to date, leverages a masked prediction task for unsupervised learning, crafting amino acid representations with notable biochemical accuracy. Yet, it falls short in delivering functional protein insights, signaling an opportunity for enhancing representation quality. Our study addresses this gap by incorporating protein family classification into ESM2's training. This approach, augmented with a Community Propagation-Based Clustering Algorithm, improves global protein representations, while a contextual prediction task fine-tunes local amino acid accuracy. Significantly, our model achieved state-of-the-art results in several downstream experiments, demonstrating the power of combining global and local methodologies to substantially boost protein representation quality.

Cross-lists for Thu, 25 Apr 24

[9]  arXiv:2404.15293 (cross-list from eess.IV) [pdf, other]
Title: Interactive Manipulation and Visualization of 3D Brain MRI for Surgical Training
Subjects: Image and Video Processing (eess.IV); Graphics (cs.GR); Neurons and Cognition (q-bio.NC)

In modern medical diagnostics, magnetic resonance imaging (MRI) is an important technique that provides detailed insights into anatomical structures. In this paper, we present a comprehensive methodology focusing on streamlining the segmentation, reconstruction, and visualization process of 3D MRI data. Segmentation involves the extraction of anatomical regions with the help of state-of-the-art deep learning algorithms. Then, 3D reconstruction converts segmented data from the previous step into multiple 3D representations. Finally, the visualization stage provides efficient and interactive presentations of both 2D and 3D MRI data. Integrating these three steps, the proposed system is able to augment the interpretability of the anatomical information from MRI scans according to our interviews with doctors. Even though this system was originally designed and implemented as part of human brain haptic feedback simulation for surgeon training, it can also provide experienced medical practitioners with an effective tool for clinical data analysis, surgical planning and other purposes.

[10]  arXiv:2404.15309 (cross-list from eess.SP) [pdf, other]
Title: Sparse Bayesian Correntropy Learning for Robust Muscle Activity Reconstruction from Noisy Brain Recordings
Subjects: Signal Processing (eess.SP); Machine Learning (cs.LG); Neurons and Cognition (q-bio.NC)

Sparse Bayesian learning has promoted many effective frameworks for brain activity decoding, especially for the reconstruction of muscle activity. However, existing sparse Bayesian learning mainly employs a Gaussian distribution as the error assumption in the reconstruction task, which does not necessarily hold in real-world applications. Moreover, brain recordings are known to be highly noisy and to contain many non-Gaussian noises, which could lead to significant performance degradation for sparse Bayesian learning methods. The goal of this paper is to propose a new robust implementation for sparse Bayesian learning, so that robustness and sparseness can be realized simultaneously. Motivated by the great robustness of the maximum correntropy criterion (MCC), we propose an integration of MCC into the sparse Bayesian learning regime. To be specific, we derived the explicit error assumption inherent in the MCC and then leveraged it for the likelihood function. Meanwhile, we used the automatic relevance determination (ARD) technique for the sparse prior distribution. To fully evaluate the proposed method, a synthetic dataset and a real-world muscle activity reconstruction task with two different brain modalities were employed. Experimental results showed that our proposed sparse Bayesian correntropy learning framework significantly improves robustness in a noisy regression task. The proposed method achieves a higher correlation coefficient and a lower root mean squared error in the real-world muscle activity reconstruction tasks. Sparse Bayesian correntropy learning provides a powerful tool for neural decoding which can promote the development of brain-computer interfaces.

[11]  arXiv:2404.15319 (cross-list from eess.SP) [pdf, other]
Title: The largest EEG-based BCI reproducibility study for open science: the MOABB benchmark
Comments: 43 pages, 13 figures, 5 tables
Subjects: Signal Processing (eess.SP); Artificial Intelligence (cs.AI); Human-Computer Interaction (cs.HC); Machine Learning (cs.LG); Neurons and Cognition (q-bio.NC)

Objective. This study conducts an extensive brain-computer interface (BCI) reproducibility analysis on open electroencephalography datasets, aiming to assess existing solutions and establish open and reproducible benchmarks for effective comparison within the field. The need for such a benchmark lies in the rapid industrial progress that has given rise to undisclosed proprietary solutions. Furthermore, the scientific literature is dense, often featuring challenging-to-reproduce evaluations, making comparisons between existing approaches arduous.
Approach. Within an open framework, 30 machine learning pipelines (separated into raw signal: 11, Riemannian: 13, deep learning: 6) are meticulously re-implemented and evaluated across 36 publicly available datasets, including motor imagery (14), P300 (15), and SSVEP (7). The analysis incorporates statistical meta-analysis techniques for results assessment, encompassing execution time and environmental impact considerations.
Main results. The study yields principled and robust results applicable to various BCI paradigms, emphasizing motor imagery, P300, and SSVEP. Notably, Riemannian approaches utilizing spatial covariance matrices exhibit superior performance, underscoring the necessity for significant data volumes to achieve competitive outcomes with deep learning techniques. The comprehensive results are openly accessible, paving the way for future research to further enhance reproducibility in the BCI domain.
Significance. The significance of this study lies in its contribution to establishing a rigorous and transparent benchmark for BCI research, offering insights into optimal methodologies and highlighting the importance of reproducibility in driving advancements within the field.

[12]  arXiv:2404.15986 (cross-list from cs.DS) [pdf, other]
Title: Seed Selection in the Heterogeneous Moran Process
Comments: Accepted for publication in IJCAI 2024
Subjects: Data Structures and Algorithms (cs.DS); Computational Complexity (cs.CC); Social and Information Networks (cs.SI); Populations and Evolution (q-bio.PE)

The Moran process is a classic stochastic process that models the rise and takeover of novel traits in network-structured populations. In biological terms, a set of mutants, each with fitness $m\in(0,\infty)$, invades a population of residents with fitness $1$. Each agent reproduces at a rate proportional to its fitness and each offspring replaces a random network neighbor. The process ends when the mutants either fixate (take over the whole population) or go extinct. The fixation probability measures the success of the invasion. To account for environmental heterogeneity, we study a generalization of the standard process, called the Heterogeneous Moran process. Here, the fitness of each agent is determined both by its type (resident/mutant) and the node it occupies. We study the natural optimization problem of seed selection: given a budget $k$, which $k$ agents should initiate the mutant invasion to maximize the fixation probability? We show that the problem is strongly inapproximable: it is $\mathbf{NP}$-hard to distinguish between maximum fixation probability 0 and 1. We then focus on mutant-biased networks, where each node exhibits at least as large mutant fitness as resident fitness. We show that the problem remains $\mathbf{NP}$-hard, but the fixation probability becomes submodular, and thus the optimization problem admits a greedy $(1-1/e)$-approximation. An experimental evaluation of the greedy algorithm along with various heuristics on real-world data sets corroborates our results.
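
For illustration only (this is not the authors' code), the process described in this abstract can be sketched in Python. The function names, the adjacency-list encoding of the network, the assumption that all fitness values are strictly positive and the graph is connected, and the Monte Carlo estimate used inside the greedy loop are all choices made for this sketch; the paper's algorithm and experimental setup may differ.

```python
import random


def simulate_moran(neighbors, mutant_fitness, resident_fitness, seeds, rng):
    """Run one heterogeneous Moran process; return True if the mutants fixate.

    neighbors[v] lists the nodes adjacent to v (assumed connected graph).
    mutant_fitness[v] and resident_fitness[v] give the fitness of a mutant or
    resident sitting on node v (assumed strictly positive in this sketch).
    """
    n = len(neighbors)
    is_mutant = [False] * n
    for s in seeds:
        is_mutant[s] = True
    mutants = sum(is_mutant)
    while 0 < mutants < n:
        # Each agent reproduces with probability proportional to its fitness,
        # which depends on both its type and the node it occupies.
        weights = [mutant_fitness[v] if is_mutant[v] else resident_fitness[v]
                   for v in range(n)]
        parent = rng.choices(range(n), weights=weights, k=1)[0]
        # The offspring replaces a uniformly random neighbor of the parent.
        child = rng.choice(neighbors[parent])
        if is_mutant[child] != is_mutant[parent]:
            mutants += 1 if is_mutant[parent] else -1
            is_mutant[child] = is_mutant[parent]
    return mutants == n


def estimate_fixation(neighbors, mutant_fitness, resident_fitness, seeds,
                      trials=2000, seed=0):
    """Monte Carlo estimate of the fixation probability for a seed set."""
    rng = random.Random(seed)
    wins = sum(
        simulate_moran(neighbors, mutant_fitness, resident_fitness, seeds, rng)
        for _ in range(trials)
    )
    return wins / trials


def greedy_seeds(neighbors, mutant_fitness, resident_fitness, k, trials=500):
    """Greedily add the node with the best estimated fixation probability.

    Assumes k does not exceed the number of nodes. Because the estimates are
    noisy, this is only a heuristic stand-in for an exact greedy step.
    """
    chosen = []
    for _ in range(k):
        candidates = [v for v in range(len(neighbors)) if v not in chosen]
        best = max(
            candidates,
            key=lambda v: estimate_fixation(
                neighbors, mutant_fitness, resident_fitness, chosen + [v], trials
            ),
        )
        chosen.append(best)
    return chosen


if __name__ == "__main__":
    # Hypothetical toy input: a complete graph on 4 nodes with equal fitness.
    k4 = [[1, 2, 3], [0, 2, 3], [0, 1, 3], [0, 1, 2]]
    ones = [1.0] * 4
    print(estimate_fixation(k4, ones, ones, [0]))
    print(greedy_seeds(k4, ones, ones, k=2))
```

On the toy input, where every node is equivalent and fitness is neutral, the single-seed estimate should land near 1/4. The greedy loop mirrors the greedy strategy named in the abstract, but the approximation guarantee stated there concerns exact fixation probabilities on mutant-biased networks, not the sampled estimates used here.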

Replacements for Thu, 25 Apr 24

[13]  arXiv:2106.07096 (replaced) [pdf, ps, other]
Title: Tests for partial correlation between repeatedly observed nonstationary nonlinear timeseries
Subjects: Methodology (stat.ME); Neurons and Cognition (q-bio.NC); Applications (stat.AP)
[14]  arXiv:2303.02079 (replaced) [pdf, other]
Title: Insights from number theory into the critical Kauffman model with connectivity one
Comments: 15 pages, 3 figures
Subjects: Molecular Networks (q-bio.MN); Disordered Systems and Neural Networks (cond-mat.dis-nn); Probability (math.PR)
[15]  arXiv:2305.01851 (replaced) [pdf, ps, other]
Title: Complexity and Enumeration in Models of Genome Rearrangement
Comments: Full version of paper that appeared in COCOON 2023
Subjects: Genomics (q-bio.GN); Computational Complexity (cs.CC); Combinatorics (math.CO)
[16]  arXiv:2309.11087 (replaced) [pdf, other]
Title: Embed-Search-Align: DNA Sequence Alignment using Transformer Models
Comments: 13 pages, Tables 7, Figures 6
Subjects: Genomics (q-bio.GN); Artificial Intelligence (cs.AI)
[17]  arXiv:2401.08805 (replaced) [pdf, other]
Title: Quantifying cell cycle regulation by tissue crowding
Subjects: Quantitative Methods (q-bio.QM); Biological Physics (physics.bio-ph)
[18]  arXiv:2401.17231 (replaced) [pdf, other]
Title: Achieving More Human Brain-Like Vision via Human EEG Representational Alignment
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Neural and Evolutionary Computing (cs.NE); Neurons and Cognition (q-bio.NC)
[19]  arXiv:2402.01598 (replaced) [pdf, other]
Title: Learning from Two Decades of Blood Pressure Data: Demography-Specific Patterns Across 75 Million Patient Encounters
Subjects: Quantitative Methods (q-bio.QM); Machine Learning (cs.LG); Applications (stat.AP)
[20]  arXiv:2402.01942 (replaced) [pdf, ps, other]
Title: Pairwise Rearrangement is Fixed-Parameter Tractable in the Single Cut-and-Join Model
Comments: Full version of paper to appear in SWAT 2024; arXiv admin note: text overlap with arXiv:2305.01851
Subjects: Genomics (q-bio.GN); Data Structures and Algorithms (cs.DS); Combinatorics (math.CO)
[21]  arXiv:2403.19011 (replaced) [pdf, other]
Title: Sequential Inference of Hospitalization Electronic Health Records Using Probabilistic Models
Subjects: Quantitative Methods (q-bio.QM); Machine Learning (cs.LG)
[22]  arXiv:2404.14854 (replaced) [pdf, other]
Title: Quenching of stable pulses in slow-fast excitable media
Comments: 16 pages, 11 figures
Subjects: Pattern Formation and Solitons (nlin.PS); Numerical Analysis (math.NA); Quantitative Methods (q-bio.QM)