Quantitative Methods

New submissions

New submissions for Fri, 27 Jan 23

[1]  arXiv:2301.10772 [pdf]
Title: Gene-SGAN: a method for discovering disease subtypes with imaging and genetic signatures via multi-view weakly-supervised deep clustering
Subjects: Quantitative Methods (q-bio.QM); Machine Learning (cs.LG); Image and Video Processing (eess.IV)

Disease heterogeneity has been a critical challenge for precision diagnosis and treatment, especially in neurologic and neuropsychiatric diseases. Many diseases can display multiple distinct brain phenotypes across individuals, potentially reflecting disease subtypes that can be captured using MRI and machine learning methods. However, biological interpretability and treatment relevance are limited if the derived subtypes are not associated with genetic drivers or susceptibility factors. Herein, we describe Gene-SGAN - a multi-view, weakly-supervised deep clustering method - which dissects disease heterogeneity by jointly considering phenotypic and genetic data, thereby conferring genetic correlations to the disease subtypes and associated endophenotypic signatures. We first validate the generalizability, interpretability, and robustness of Gene-SGAN in semi-synthetic experiments. We then demonstrate its application to real multi-site datasets from 28,858 individuals, deriving subtypes of Alzheimer's disease and brain endophenotypes associated with hypertension, from MRI and SNP data. Derived brain phenotypes displayed significant differences in neuroanatomical patterns, genetic determinants, biological and clinical biomarkers, indicating potentially distinct underlying neuropathologic processes, genetic drivers, and susceptibility factors. Overall, Gene-SGAN is broadly applicable to disease subtyping and endophenotype discovery, and is herein tested on disease-related, genetically-driven neuroimaging phenotypes.

[2]  arXiv:2301.10865 [pdf, other]
Title: Persistent topological Laplacian analysis of SARS-CoV-2 variants
Subjects: Quantitative Methods (q-bio.QM)

Topological data analysis (TDA) is an emerging field in mathematics and data science. Its central technique, persistent homology, has had tremendous success in many science and engineering disciplines. However, persistent homology has limitations, including its inability to describe the homotopic shape evolution of data during filtration. Persistent topological Laplacians (PTLs), such as the persistent Laplacian and the persistent sheaf Laplacian, were proposed to overcome this drawback of persistent homology. In this work, we examine the modeling and analysis power of PTLs in the study of the protein structures of the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) spike receptor binding domain (RBD) and its variants, i.e., Alpha, Beta, Gamma, BA.1, and BA.2. First, we employ PTLs to study how the RBD mutation-induced structural changes of RBD-angiotensin-converting enzyme 2 (ACE2) binding complexes are captured in the changes of spectra of the PTLs among SARS-CoV-2 variants. Additionally, we use PTLs to analyze the binding of RBD and ACE2-induced structural changes of various SARS-CoV-2 variants. Finally, we explore the impacts of computationally generated RBD structures on PTL-based machine learning, including deep learning, and predictions of deep mutational scanning datasets for the SARS-CoV-2 Omicron BA.2 variant. Our results indicate that PTLs have advantages over persistent homology in analyzing protein structural changes and provide a powerful new TDA tool for data science.

[3]  arXiv:2301.11262 [pdf, ps, other]
Title: Better than DFA? A Bayesian Method for Estimating the Hurst Exponent in Behavioral Sciences
Comments: 50 pages, 14 figures, 6 tables
Subjects: Quantitative Methods (q-bio.QM)

Detrended Fluctuation Analysis (DFA) is the most popular fractal analytical technique used to evaluate the strength of long-range correlations in empirical time series in terms of the Hurst exponent, $H$. Specifically, DFA quantifies the linear regression slope in log-log coordinates representing the relationship between the time series' variability and the number of timescales over which this variability is computed. We compared the performance of two methods of fractal analysis -- the current gold standard, DFA, and a Bayesian method that is not currently well-known in behavioral sciences: the Hurst-Kolmogorov (HK) method -- in estimating the Hurst exponent of synthetic and empirical time series. Simulations demonstrate that the HK method consistently outperforms DFA in three important ways. The HK method: (i) accurately assesses long-range correlations when the measurement time series is short, (ii) shows minimal dispersion about the central tendency, and (iii) yields a point estimate that does not depend on the length of the measurement time series or its underlying Hurst exponent. Comparing the two methods using empirical time series from multiple settings further supports these findings. We conclude that applying DFA to synthetic time series and empirical time series during brief trials is unreliable and encourage the systematic application of the HK method to assess the Hurst exponent of empirical time series in behavioral sciences.
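
For readers unfamiliar with DFA, the log-log regression described in the abstract above can be sketched as follows. This is a minimal, standard-library illustration of first-order DFA only; it is not the authors' implementation and it does not cover the HK method. The function names `linear_fit` and `dfa_exponent` and the window sizes in the demo are assumptions made for this example.

```python
import math
import random


def linear_fit(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def dfa_exponent(series, scales):
    """Estimate the DFA scaling exponent with linear (order-1) detrending.

    `scales` is a list of window lengths (hypothetical choice by the caller).
    """
    # Step 1: integrate the mean-centred series to obtain the "profile".
    mean = sum(series) / len(series)
    profile, total = [], 0.0
    for value in series:
        total += value - mean
        profile.append(total)

    log_n, log_f = [], []
    for n in scales:
        n_windows = len(profile) // n
        if n < 4 or n_windows < 1:
            continue  # window too short or longer than the series
        t = list(range(n))
        squared_residuals = 0.0
        # Step 2: detrend each non-overlapping window with a straight line.
        for w in range(n_windows):
            segment = profile[w * n:(w + 1) * n]
            slope, intercept = linear_fit(t, segment)
            squared_residuals += sum(
                (y - (slope * ti + intercept)) ** 2 for ti, y in zip(t, segment)
            )
        # Step 3: root-mean-square fluctuation at this window length.
        fluctuation = math.sqrt(squared_residuals / (n_windows * n))
        if fluctuation > 0:
            log_n.append(math.log(n))
            log_f.append(math.log(fluctuation))

    if len(log_n) < 2:
        raise ValueError("need at least two usable window sizes")
    # Step 4: the exponent is the slope of log F(n) against log n.
    slope, _ = linear_fit(log_n, log_f)
    return slope


if __name__ == "__main__":
    random.seed(0)
    white_noise = [random.gauss(0.0, 1.0) for _ in range(4096)]
    print(dfa_exponent(white_noise, [8, 16, 32, 64, 128, 256]))
```

For uncorrelated noise the estimated slope should sit near 0.5, and for a random walk near 1.5. How the estimate degrades on short series is the issue the abstract raises and is not addressed by this sketch.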

Cross-lists for Fri, 27 Jan 23

[4]  arXiv:2301.10877 (cross-list from cs.CV) [pdf, other]
Title: The Projection-Enhancement Network (PEN)
Comments: Main text: 14 pages, 5 figures; Supplementary text: 4 pages, 2 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV); Quantitative Methods (q-bio.QM)

Contemporary approaches to instance segmentation in cell science use 2D or 3D convolutional networks depending on the experiment and data structures. However, limitations in microscopy systems or efforts to prevent phototoxicity commonly require recording sub-optimally sampled data, which greatly reduces the utility of such 3D data, especially in crowded environments with significant axial overlap between objects. In such regimes, 2D segmentations are both more reliable for cell morphology and easier to annotate. In this work, we propose the Projection Enhancement Network (PEN), a novel convolutional module which processes the sub-sampled 3D data and produces a 2D RGB semantic compression, and is trained in conjunction with an instance segmentation network of choice to produce 2D segmentations. Our approach combines augmentation to increase cell density using a low-density cell image dataset to train PEN, and curated datasets to evaluate PEN. We show that with PEN, the learned semantic representation in CellPose encodes depth and greatly improves segmentation performance in comparison to maximum intensity projection images as input, but does not similarly aid segmentation in region-based networks like Mask-RCNN. Finally, we dissect the segmentation strength against cell density of PEN with CellPose on disseminated cells from side-by-side spheroids. We present PEN as a data-driven solution to form compressed representations of 3D data that improve 2D segmentations from instance segmentation networks.
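
The baseline this abstract compares against, a maximum intensity projection, collapses a 3D stack to 2D by keeping the brightest voxel along the axial direction. A minimal sketch using plain nested lists is shown below; the function name `max_intensity_projection` and the `volume[z][y][x]` indexing are assumptions for illustration, and this is not part of PEN itself.

```python
def max_intensity_projection(volume):
    """Project a 3D stack indexed as volume[z][y][x] onto the y-x plane.

    Each output pixel is the maximum intensity over all z slices, so any
    information about the depth at which that maximum occurred is discarded.
    """
    depth = len(volume)
    rows = len(volume[0])
    cols = len(volume[0][0])
    return [
        [max(volume[z][y][x] for z in range(depth)) for x in range(cols)]
        for y in range(rows)
    ]


if __name__ == "__main__":
    # Two 2x2 slices; the values are arbitrary illustrative intensities.
    stack = [
        [[1, 5], [0, 2]],
        [[3, 4], [7, 1]],
    ]
    print(max_intensity_projection(stack))
```

Because the projection keeps only the per-pixel maximum, objects that overlap axially merge into one 2D blob, which is the loss of depth information that the abstract says the learned compression is meant to avoid.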

Replacements for Fri, 27 Jan 23

[5]  arXiv:2107.05741 (replaced) [pdf]
Title: Effects of Fine Particulate Matter on Cardiovascular Disease Morbidity: A Study on Seven Metropolitan Cities in South Korea
Subjects: Quantitative Methods (q-bio.QM)

[6]  arXiv:2203.14742 (replaced) [pdf, ps, other]
Title: A Bayesian Approach to Modelling Biological Pattern Formation with Limited Data
Comments: Minor errors corrected, additional clarification added, results unchanged
Subjects: Analysis of PDEs (math.AP); Quantitative Methods (q-bio.QM)