
Quantitative Methods

New submissions

[ total of 7 entries: 1-7 ]

New submissions for Tue, 25 Jan 22

[1]  arXiv:2201.08941 [pdf]
Title: Uncovering the system vulnerability and criticality of human brain under evolving neuropathological events in Alzheimer's Disease
Comments: 26 pages, 9 figures
Subjects: Quantitative Methods (q-bio.QM); Neurons and Cognition (q-bio.NC)

Despite considerable efforts to investigate the neurobiological factors behind the acquisition of beta-amyloid (A), protein tau (T), and neurodegeneration ([N]) biomarkers, the mechanistic pathways by which AT[N] biomarkers spread throughout the brain remain elusive. In this work, we characterized the interaction of AT[N] biomarkers and their propagation across brain networks using a novel bistable reaction-diffusion model, which allows us to establish a new systems biology underpinning of Alzheimer's disease (AD) progression. We applied our model to large-scale longitudinal neuroimages from the ADNI database and studied the systematic vulnerability and criticality of brains. Our major findings include (i) tau is a stronger indicator of regional risk than amyloid, (ii) the progression of amyloid and tau follows a Braak-like pattern across the brain, (iii) the temporal lobe exhibits higher vulnerability to AD-related pathologies, and (iv) the proposed critical brain regions outperform hub nodes in transmitting disease factors across the brain.
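
As a rough illustration of what a bistable reaction-diffusion process on a network looks like, the sketch below integrates du/dt = -D L u + r u (1 - u)(u - a) on a small path graph with explicit Euler steps. This is a generic textbook form, not the model from the paper: the graph, the cubic reaction term, and every parameter value (`diffusion`, `rate`, `threshold`, `dt`, `steps`) are assumptions chosen only for the example.

```python
import numpy as np

def path_laplacian(n):
    """Graph Laplacian L = D - A of an unweighted path graph with n nodes."""
    A = np.zeros((n, n))
    idx = np.arange(n - 1)
    A[idx, idx + 1] = 1.0
    A[idx + 1, idx] = 1.0
    return np.diag(A.sum(axis=1)) - A

def simulate(u0, L, diffusion=0.5, rate=5.0, threshold=0.3, dt=0.01, steps=2000):
    """Explicit Euler integration of du/dt = -D L u + r u (1 - u)(u - a).

    All parameter values are hypothetical. The cubic term has two stable
    states (u = 0 and u = 1) separated by an unstable threshold a.
    """
    u = np.asarray(u0, dtype=float).copy()
    for _ in range(steps):
        reaction = rate * u * (1.0 - u) * (u - threshold)
        u = u + dt * (-diffusion * (L @ u) + reaction)
    return u

L = path_laplacian(5)
below = simulate(np.full(5, 0.1), L)   # every node starts below the threshold
above = simulate(np.full(5, 0.6), L)   # every node starts above the threshold
seeded = simulate([1.0, 0.0, 0.0, 0.0, 0.0], L)  # burden seeded at one end node

print("start below threshold:", np.round(below, 3))
print("start above threshold:", np.round(above, 3))
print("seeded at node 0:     ", np.round(seeded, 3))
```

Uniform starting states isolate the bistability (decay to 0 below the threshold, saturation at 1 above it). The seeded run shows how diffusion along edges competes with the local reaction term; whether the seed spreads or dies out depends on the assumed parameters.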

[2]  arXiv:2201.09508 [pdf, other]
Title: Multiple Similarity Drug-Target Interaction Prediction with Random Walks and Matrix Factorization
Subjects: Quantitative Methods (q-bio.QM); Machine Learning (cs.LG)

The discovery of drug-target interactions (DTIs) is a promising area of research. In general, the identification of reliable interactions among drugs and proteins can boost the development of effective pharmaceuticals. In this work, we leverage random walks and matrix factorization techniques for DTI prediction. In particular, we take a multi-layered network perspective, where different layers correspond to different similarity metrics between drugs and targets. To take full advantage of the topology information captured in multiple views, we develop an optimization framework, called MDMF, for DTI prediction. The framework learns vector representations of drugs and targets that not only retain higher-order proximity across all hyper-layers and layer-specific local invariance, but also approximate the interactions with their inner product. Furthermore, we propose an ensemble method, called MDMF2A, which integrates two instantiations of the MDMF model that optimize surrogate losses of the area under the precision-recall curve (AUPR) and the area under the receiver operating characteristic curve (AUC), respectively. The empirical study on real-world DTI datasets shows that our method achieves significant improvement over current state-of-the-art approaches in four different settings. Moreover, the validation of highly ranked non-interacting pairs also demonstrates the potential of MDMF2A to discover novel DTIs.
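
The idea of approximating interactions by the inner product of learned drug and target vectors can be sketched with a plain logistic matrix factorization. This is not MDMF: it uses no random walks, no similarity layers, and no AUPR/AUC surrogate losses. The toy interaction matrix `Y` and all hyperparameters (`rank`, `lr`, `reg`, `steps`) are made up for the example.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def bce(Y, P, eps=1e-12):
    """Mean binary cross-entropy between labels Y and probabilities P."""
    return float(-np.mean(Y * np.log(P + eps) + (1 - Y) * np.log(1 - P + eps)))

def factorize(Y, rank=2, lr=0.1, reg=0.01, steps=500, seed=0):
    """Fit sigmoid(U @ V.T) to Y by gradient descent (hypothetical settings)."""
    rng = np.random.default_rng(seed)
    U = 0.1 * rng.standard_normal((Y.shape[0], rank))  # drug embeddings
    V = 0.1 * rng.standard_normal((Y.shape[1], rank))  # target embeddings
    losses = []
    for _ in range(steps):
        P = sigmoid(U @ V.T)
        losses.append(bce(Y, P))
        G = P - Y  # gradient of the summed cross-entropy w.r.t. the logits
        U, V = U - lr * (G @ V + reg * U), V - lr * (G.T @ U + reg * V)
    losses.append(bce(Y, sigmoid(U @ V.T)))
    return U, V, losses

# Toy matrix: rows are drugs, columns are targets, 1 marks a known interaction.
Y = np.array([[1, 0, 0],
              [1, 0, 0],
              [0, 1, 1],
              [0, 1, 1]], dtype=float)

U, V, losses = factorize(Y)
scores = sigmoid(U @ V.T)  # predicted interaction probabilities
print("loss: first %.3f, last %.3f" % (losses[0], losses[-1]))
print(np.round(scores, 2))
```

Each entry of `scores` depends only on the inner product of one drug vector and one target vector, so ranking unobserved pairs by score is the simplest way such a model would propose candidate interactions.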

[3]  arXiv:2201.09837 [pdf]
Title: Dynamic optimization of volatile fatty acids to enrich biohydrogen production using a deep learning neural network
Subjects: Quantitative Methods (q-bio.QM); Biomolecules (q-bio.BM)

A new strategy was developed to investigate the effect of volatile fatty acids (VFAs) on the efficiency of biogas production, with a focus on improving bio-H$_2$. The inoculum, anaerobic granular sludge obtained from a UASB reactor treating poultry slaughterhouse wastewater, was subjected to five different pretreatments. The relationship between VFAs and biogas compounds was studied as time-dependent components. In time-dependent processes with small sample sizes, regression models may not estimate responses well enough. Therefore, a deep learning neural network (DNN) model was developed to estimate the biogas compounds based on the VFAs. The accuracy of this model in predicting the biogas compounds was higher than that of multivariate regression models. Further, it could predict the effect of time changes on biogas compounds. Analysis showed that all the pretreatments successfully increased the ratio of butyric acid to acetic acid, decreased propionic acid drastically, and increased the efficiency of bio-H$_2$ production. Butyric acid was found to have the greatest effect on bio-H$_2$, and propionic acid the greatest effect on CH$_4$ production. The best amounts of the VFAs were determined using an optimization method that integrated the DNN with desirability analysis and was dynamically retrained based on digestion time. Accordingly, optimal ranges of acetic, propionic, and butyric acids were 823.2 - 1534.3, 36.3 - 47.4, and 1522 - 1822 mg/L, respectively, determined for digestion time of 25.23 - 123.63 h. These values resulted in the production of bio-H$_2$, N$_2$, CO$_2$, and CH$_4$ in ranges of 6.4 - 26.2, 12.2 - 43.2, 5 - 25.3, and 0 - 1.4 mmol/L, respectively. The optimum ranges of VFAs are relatively wide and can be used in practice in biogas plants.

Cross-lists for Tue, 25 Jan 22

[4]  arXiv:2201.09351 (cross-list from stat.AP) [pdf]
Title: The risk of bias in denoising methods
Authors: Kendrick Kay (Center for Magnetic Resonance Research (CMRR), Department of Radiology, University of Minnesota)
Comments: 19 pages, 4 figures
Subjects: Applications (stat.AP); Image and Video Processing (eess.IV); Quantitative Methods (q-bio.QM)

Experimental datasets are growing rapidly in size, scope, and detail, but the value of these datasets is limited by unwanted measurement noise. It is therefore tempting to apply analysis techniques that attempt to reduce noise and enhance signals of interest. In this paper, we draw attention to the possibility that denoising methods may introduce bias and lead to incorrect scientific inferences. To present our case, we first review the basic statistical concepts of bias and variance. Denoising techniques typically reduce variance observed across repeated measurements, but this can come at the expense of introducing bias to the average expected outcome. We then conduct three simple simulations that provide concrete examples of how bias may manifest in everyday situations. These simulations reveal several findings that may be surprising and counterintuitive: (i) different methods can be equally effective at reducing variance but some incur bias while others do not, (ii) identifying methods that better recover ground truth does not guarantee the absence of bias, and (iii) bias can arise even if one has specific knowledge of properties of the signal of interest. We suggest that researchers should consider and possibly quantify bias before deploying denoising methods on important research data.
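
The variance-for-bias trade described above can be seen in a few lines. The sketch below is not one of the paper's three simulations; it is a made-up example in which a narrow Gaussian peak is measured repeatedly with additive noise and then "denoised" with a moving average. The signal shape, noise level, window width, and trial count are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n_trials, n_samples, peak = 2000, 101, 50

# Ground truth: a narrow Gaussian bump of height 5 centred on sample 50.
t = np.arange(n_samples)
signal = 5.0 * np.exp(-0.5 * ((t - peak) / 2.0) ** 2)

# Repeated measurements with independent unit-variance Gaussian noise.
noisy = signal + rng.standard_normal((n_trials, n_samples))

# "Denoising" by a 9-point moving average applied to each trial.
kernel = np.ones(9) / 9.0
smoothed = np.array([np.convolve(row, kernel, mode="same") for row in noisy])

def bias_and_variance(estimates):
    """Bias and across-trial variance of the estimate at the peak sample."""
    at_peak = estimates[:, peak]
    return at_peak.mean() - signal[peak], at_peak.var()

raw_bias, raw_var = bias_and_variance(noisy)
smooth_bias, smooth_var = bias_and_variance(smoothed)

print("raw:      bias %+.3f, variance %.3f" % (raw_bias, raw_var))
print("smoothed: bias %+.3f, variance %.3f" % (smooth_bias, smooth_var))
```

In this setup the moving average cuts the across-trial variance at the peak substantially, but it also averages the peak with its lower neighbours, so the mean estimate is systematically too small. Collecting more trials shrinks the variance of the raw mean but does not remove the bias of the smoothed one.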

[5]  arXiv:2201.09394 (cross-list from cs.LG) [pdf, other]
Title: An integrated recurrent neural network and regression model with spatial and climatic couplings for vector-borne disease dynamics
Subjects: Machine Learning (cs.LG); Numerical Analysis (math.NA); Populations and Evolution (q-bio.PE); Quantitative Methods (q-bio.QM)

We developed an integrated recurrent neural network and nonlinear regression spatio-temporal model for vector-borne disease evolution. We take into account climate data and seasonality as external factors that correlate with disease-transmitting insects (e.g. flies), as well as spill-over infections from neighboring regions surrounding a region of interest. The climate data is encoded into the model through a quadratic embedding scheme motivated by recommendation systems. The neighboring regions' influence is modeled by a long short-term memory neural network. The integrated model is trained by stochastic gradient descent and tested on leishmaniasis data in Sri Lanka from 2013-2018, where infection outbreaks occurred. Our model outperformed ARIMA models across a number of regions with high infections, and an associated ablation study lends support to our modeling hypothesis and ideas.

[6]  arXiv:2201.09637 (cross-list from cs.LG) [pdf, other]
Title: DrugOOD: Out-of-Distribution (OOD) Dataset Curator and Benchmark for AI-aided Drug Discovery -- A Focus on Affinity Prediction Problems with Noise Annotations
Comments: 54 pages, 11 figures
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Quantitative Methods (q-bio.QM)

AI-aided drug discovery (AIDD) is gaining increasing popularity due to its promise of making the search for new pharmaceuticals quicker, cheaper and more efficient. In spite of its extensive use in many fields, such as ADMET prediction, virtual screening, protein folding and generative chemistry, little has been explored in terms of the out-of-distribution (OOD) learning problem with noise, which is inevitable in real world AIDD applications. In this work, we present DrugOOD, a systematic OOD dataset curator and benchmark for AI-aided drug discovery, which comes with an open-source Python package that fully automates the data curation and OOD benchmarking processes. We focus on one of the most crucial problems in AIDD: drug target binding affinity prediction, which involves both a macromolecule (protein target) and a small molecule (drug compound). In contrast to only providing fixed datasets, DrugOOD offers an automated dataset curator with user-friendly customization scripts, rich domain annotations aligned with biochemistry knowledge, realistic noise annotations and rigorous benchmarking of state-of-the-art OOD algorithms. Since the molecular data is often modeled as irregular graphs using graph neural network (GNN) backbones, DrugOOD also serves as a valuable testbed for graph OOD learning problems. Extensive empirical studies have shown a significant performance gap between in-distribution and out-of-distribution experiments, which highlights the need to develop better schemes that can allow for OOD generalization under noise for AIDD.

Replacements for Tue, 25 Jan 22

[7]  arXiv:2108.00024 (replaced) [pdf]
Title: FRET nanoscopy enables seamless imaging of molecular assemblies with sub-nanometer resolution
Comments: Main Text (34 pages, 6 figures) with Supporting Information (90 pages, 29 figures, 20 tables)
Subjects: Optics (physics.optics); Mesoscale and Nanoscale Physics (cond-mat.mes-hall); Soft Condensed Matter (cond-mat.soft); Biomolecules (q-bio.BM); Quantitative Methods (q-bio.QM)