Applications
New submissions
New submissions for Fri, 21 Jan 22
 [1] arXiv:2201.07910 [pdf, other]

Title: A Complex-LASSO Approach for Localizing Forced Oscillations in Power Systems
Comments: 5 pages, submitted to IEEE PESGM 2022
Subjects: Applications (stat.AP)
We study the problem of localizing multiple sources of forced oscillations (FOs) and estimating their characteristics, such as frequency, phase, and amplitude, using noisy phasor measurement unit (PMU) measurements. For each source location, we model the input oscillation as a sum of unknown sinusoidal terms. This allows us to obtain, in the frequency domain, a linear relationship between the measurements and the inputs at the frequencies of the unknown sinusoids. We determine these frequencies by thresholding the empirical spectrum of the noisy measurements. Assuming sparsity in the number of FO locations and in the number of sinusoids at each location, we cast the location recovery problem as an $\ell_1$-regularized least squares problem in the complex domain, i.e., complex-LASSO (least absolute shrinkage and selection operator). We numerically solve this optimization problem using the complex-valued coordinate descent method, and show its efficiency on the IEEE 68-bus, 16-machine and WECC 179-bus, 29-machine systems.
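The $\ell_1$-regularized least squares problem described in this abstract can be illustrated with a minimal sketch of complex-valued coordinate descent. This is not the authors' implementation: the function names `soft_threshold` and `complex_lasso_cd`, the objective scaling (a factor of 0.5 on the squared error), and the fixed number of sweeps are all assumptions made for illustration.

```python
def soft_threshold(z: complex, t: float) -> complex:
    """Complex soft-thresholding: shrink the modulus of z by t, keep its phase."""
    mag = abs(z)
    if mag <= t:
        return 0j
    return z * (1.0 - t / mag)


def complex_lasso_cd(A, y, lam, n_sweeps=200):
    """Approximately minimise 0.5 * ||y - A x||^2 + lam * sum_j |x_j| over complex x.

    A is a list of rows of complex numbers, y a list of complex numbers.
    (Hypothetical helper: names and defaults are illustrative only.)
    """
    m, n = len(A), len(A[0])
    x = [0j] * n
    residual = list(y)  # equals y - A x, since x starts at zero
    col_norm_sq = [sum(abs(A[i][j]) ** 2 for i in range(m)) for j in range(n)]
    for _ in range(n_sweeps):
        for j in range(n):
            if col_norm_sq[j] == 0.0:
                continue  # an all-zero column cannot explain any signal
            # Correlation of column j with the residual that excludes x_j's contribution.
            rho = sum(A[i][j].conjugate() * residual[i] for i in range(m))
            rho += col_norm_sq[j] * x[j]
            new_xj = soft_threshold(rho, lam) / col_norm_sq[j]
            delta = new_xj - x[j]
            if delta != 0:
                for i in range(m):
                    residual[i] -= A[i][j] * delta
                x[j] = new_xj
    return x
```

With an identity matrix for `A`, each coordinate reduces to a single soft-thresholding of the corresponding entry of `y`, which makes the shrinkage and the sparsity pattern easy to inspect by hand.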
 [2] arXiv:2201.07945 [pdf, other]

Title: A Guideline for the Statistical Analysis of Compositional Data in Immunology
Subjects: Applications (stat.AP)
The study of immune cellular composition is of great scientific interest in immunology, and multiple large-scale datasets have recently been generated to support this investigation. From a statistical point of view, such immune cellular composition data are compositional data, which convey relative information. In compositional data, each element is positive and all the elements together sum to a constant, which can generally be set to one. Standard statistical methods are not directly applicable to compositional data because they do not appropriately handle the correlations among the elements of a composition. As this type of data has become more widely available, the need for statistical strategies that account for compositional features has grown. In this paper, we review statistical methods for compositional data analysis and illustrate them in the context of immunology. Specifically, we focus on regression analyses using log-ratio and Dirichlet approaches, discuss their theoretical foundations, and illustrate their applications with immune cellular fraction data generated from colorectal cancer patients.
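As a small illustration of the log-ratio idea mentioned in this abstract, the sketch below implements the centred log-ratio (clr) transform and its inverse. The abstract does not say which log-ratio transform the paper uses, so the choice of clr is an assumption, and the helper names `closure`, `clr`, and `clr_inverse` are hypothetical.

```python
import math


def closure(parts):
    """Rescale positive parts so that they sum to one."""
    total = sum(parts)
    return [p / total for p in parts]


def clr(parts):
    """Centred log-ratio: log of each part relative to the geometric mean."""
    if any(p <= 0 for p in parts):
        raise ValueError("clr requires strictly positive parts")
    logs = [math.log(p) for p in parts]
    mean_log = sum(logs) / len(logs)
    return [v - mean_log for v in logs]


def clr_inverse(z):
    """Map clr coordinates back to a composition that sums to one."""
    return closure([math.exp(v) for v in z])
```

The clr coordinates sum to zero and do not change when every part is multiplied by the same constant, which is why the transform carries only the relative information in a composition.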
 [3] arXiv:2201.08072 [pdf, other]

Title: Geometrically adapted Langevin dynamics for Markov chain Monte Carlo simulations
Comments: 43 pages, 9 figures
Subjects: Applications (stat.AP); Computation (stat.CO)
Markov Chain Monte Carlo (MCMC) is one of the most powerful methods for sampling from a given probability distribution. The Metropolis Adjusted Langevin Algorithm (MALA) is a variant in which the gradient of the distribution is used to achieve faster convergence. However, being set up in the Euclidean framework, MALA might perform poorly in higher-dimensional problems or in those involving anisotropic densities, as the underlying non-Euclidean aspects of the geometry of the sample space remain unaccounted for. We make use of concepts from differential geometry and stochastic calculus on Riemannian manifolds to geometrically adapt a stochastic differential equation with a non-trivial drift term. This adaptation is also referred to as a stochastic development. We apply this method specifically to the Langevin diffusion equation and arrive at a geometrically adapted Langevin dynamics (GALA). This new approach far outperforms MALA, certain manifold variants of MALA, and other approaches such as Hamiltonian Monte Carlo (HMC) and its adaptive variant, the no-U-turn sampler (NUTS) implemented in Stan, especially as the dimension of the problem increases, where GALA is often the only successful method. This is evidenced through several numerical examples that include parameter estimation for a broad class of probability distributions and a logistic regression problem.
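For orientation, here is a minimal sketch of plain Euclidean MALA, the baseline this abstract compares against. It does not implement the geometric adaptation proposed in the paper. The function names `mala_step` and `mala_sample`, the list-based state, and the step-size convention (proposal variance equal to `step`) are assumptions made for illustration.

```python
import math
import random


def mala_step(x, log_p, grad_log_p, step, rng):
    """One Metropolis-adjusted Langevin update for a list-valued state x."""

    def drift_mean(point):
        # Mean of the Gaussian proposal: a half gradient step from `point`.
        grad = grad_log_p(point)
        return [p + 0.5 * step * g for p, g in zip(point, grad)]

    def log_q(to, frm):
        # Log density, up to a constant, of proposing `to` from `frm`.
        mu = drift_mean(frm)
        return -sum((t - m) ** 2 for t, m in zip(to, mu)) / (2.0 * step)

    mu = drift_mean(x)
    proposal = [m + math.sqrt(step) * rng.gauss(0.0, 1.0) for m in mu]
    log_alpha = (log_p(proposal) + log_q(x, proposal)
                 - log_p(x) - log_q(proposal, x))
    # Metropolis-Hastings correction for the asymmetric proposal.
    if rng.random() < math.exp(min(0.0, log_alpha)):
        return proposal, True
    return x, False


def mala_sample(x0, log_p, grad_log_p, step, n_steps, seed=0):
    """Run MALA from x0; return the chain and the acceptance rate."""
    rng = random.Random(seed)
    x = list(x0)
    samples, accepted = [], 0
    for _ in range(n_steps):
        x, ok = mala_step(x, log_p, grad_log_p, step, rng)
        accepted += ok
        samples.append(x)
    return samples, accepted / n_steps
```

The proposal uses the same scalar step in every direction, which is the Euclidean assumption the abstract identifies as a weakness for anisotropic or high-dimensional targets.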
 [4] arXiv:2201.08171 [pdf, other]

Title: Use of Simulation Models for the Development of a Statistical Production Framework for Mobile Network Data with the simutils Package
Comments: 17 pages, 11 figures, presented at the Conference Use of R in Official Statistics 2021, 24-26 November 2021, Bucharest (Romania)
Subjects: Applications (stat.AP); Methodology (stat.ME)
We propose to use agent-based simulation models for the development of statistical methods in Official Statistics, especially in relation to the new digital data sources. We present a mobile network data simulator that is managed through the simutils R package, which provides geospatial representations of the simulated data. While the synthetic data are produced by an external tool, our simutils package allows an R user to parameterize and run this external simulation tool, to build geospatial data structures from the simulation output, and to compute several aggregates. The geospatial data structures were also designed for use in a visualization package. Useful simulation models require the incorporation of real metadata from mobile telecommunication networks, which led us to include functionality that allows the user to specify and validate them. All metadata are specified using XML files whose structure is defined in corresponding XSD files. Our R package includes example data sets, and we show here how to validate the metadata, how to run a simulation, how to build the geospatial data structures, and how to compute different aggregates.
 [5] arXiv:2201.08362 [pdf, ps, other]

Title: Generalised functional additive mixed models with compositional covariates for areal Covid-19 incidence curves
Comments: submitted for publication
Subjects: Applications (stat.AP); Methodology (stat.ME)
We extend the generalised functional additive mixed model to include (functional) compositional covariates carrying relative information of a whole. Relying on the isometric isomorphism of the Bayes Hilbert space of probability densities with a subspace of $L^2$, we include functional compositions as transformed functional covariates with a constrained effect function. The extended model allows for the estimation of linear, nonlinear and time-varying effects of scalar and functional covariates, as well as (correlated) functional random effects, in addition to the compositional effects. We use the model to estimate the effect of the age, sex and smoking (functional) composition of the population on regional Covid-19 incidence data for Spain, while accounting for climatological and socio-demographic covariate effects and spatial correlation.
Cross-lists for Fri, 21 Jan 22
 [6] arXiv:2201.07874 (cross-list from stat.ME) [pdf, ps, other]

Title: Bayesian Prediction with Covariates Subject to Detection Limits
Subjects: Methodology (stat.ME); Applications (stat.AP)
Missing values in covariates due to censoring by signal interference or lack of sensitivity in the measuring devices are common in industrial problems. We propose a fully Bayesian solution to the prediction problem with an efficient Markov Chain Monte Carlo (MCMC) algorithm that updates all the censored covariate values jointly in a random scan Gibbs sampler. We show that the joint updating of missing covariate values can be at least two orders of magnitude more efficient than univariate updating. This increased efficiency is shown to be crucial for quickly learning the missing covariate values and their uncertainty in a real-time decision-making context, in particular when there is substantial correlation in the posterior for the missing values. The approach is evaluated on simulated data and on data from the telecom sector. Our results show that the proposed Bayesian imputation gives substantially more accurate predictions than naïve imputation, and that the use of auxiliary variables in the imputation gives additional predictive power.
 [7] arXiv:2201.07896 (cross-list from stat.ME) [pdf, other]

Title: Generative Models for Periodicity Detection in Noisy Signals
Subjects: Methodology (stat.ME); Applications (stat.AP)
We introduce a new periodicity detection algorithm for binary time series of event onsets, the Gaussian Mixture Periodicity Detection Algorithm (GMPDA). The algorithm treats periodicity detection as the problem of inferring the parameters of a generative model. We specify two models, the Clock and the Random Walk, which describe two different periodic phenomena and provide a generative framework. The algorithm achieved strong results on test cases for single and multiple periodicity detection at varying noise levels. The performance of GMPDA was also evaluated on real data, recorded leg movements during sleep, where GMPDA was able to identify the expected periodicities despite high noise levels. The paper's key contributions are two new models for generating periodic event behavior and the GMPDA algorithm for multiple periodicity detection, which is highly accurate under noise.
 [8] arXiv:2201.08302 (cross-list from stat.CO) [pdf, other]

Title: The R Package HCV for Hierarchical Clustering from Vertex-links
Comments: 12 pages, 7 figures
Subjects: Computation (stat.CO); Applications (stat.AP)
The HCV package implements hierarchical clustering for spatial data. It requires that the samples within a cluster be not only homogeneous in their non-geographical features but also geographically close to each other. We modified commonly used hierarchical agglomerative clustering algorithms to introduce spatial homogeneity by treating geographical locations as vertices and converting spatial adjacency into whether a shared edge exists between a pair of vertices. The main function HCV, which obeys the constraints of the vertex links, automatically enforces the spatial contiguity property at each iteration. In addition, two methods are provided for finding an appropriate number of clusters and reporting cluster members.
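The idea of merging only clusters that share an edge can be sketched as follows. This is not the HCV package's code or API: the function name `contiguous_agglomerative`, the use of squared distance between cluster centroids as the merge criterion, and the list-based inputs are all assumptions chosen to keep the example short.

```python
def contiguous_agglomerative(features, edges, n_clusters):
    """Agglomerative clustering in which only adjacent clusters may merge.

    features: one list of floats per sample (non-geographical attributes).
    edges: pairs (i, j) of sample indices that are spatial neighbours.
    Returns the clusters as sorted lists of sample indices.
    """
    clusters = {i: [i] for i in range(len(features))}
    neighbours = {i: set() for i in clusters}
    for a, b in edges:
        if a != b:
            neighbours[a].add(b)
            neighbours[b].add(a)

    def centroid(members):
        dim = len(features[0])
        return [sum(features[m][d] for m in members) / len(members)
                for d in range(dim)]

    def dist(ca, cb):
        pa, pb = centroid(clusters[ca]), centroid(clusters[cb])
        return sum((u - v) ** 2 for u, v in zip(pa, pb))

    while len(clusters) > n_clusters:
        # Only pairs of clusters linked by at least one edge are candidates.
        candidates = [(dist(a, b), a, b)
                      for a in clusters for b in neighbours[a] if a < b]
        if not candidates:
            break  # the remaining clusters are not connected to each other
        _, a, b = min(candidates)
        clusters[a].extend(clusters.pop(b))
        # Cluster b disappears; its neighbours become neighbours of a.
        for c in neighbours.pop(b):
            neighbours[c].discard(b)
            if c != a:
                neighbours[c].add(a)
                neighbours[a].add(c)
    return sorted(sorted(members) for members in clusters.values())
```

A sample whose features resemble a distant group still joins its spatial neighbours under this rule, because a merge is considered only when an edge links the two clusters.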
Replacements for Fri, 21 Jan 22
 [9] arXiv:2102.08573 (replaced) [pdf, other]

Title: Robust Mean Estimation in High Dimensions: An Outlier Fraction Agnostic and Efficient Algorithm
Comments: arXiv admin note: text overlap with arXiv:2008.09239
Subjects: Applications (stat.AP); Information Theory (cs.IT)
 [10] arXiv:2111.08118 (replaced) [pdf, other]

Title: Neuro-Hotnet: A Graph Theoretic Approach for Brain FC Estimation
Comments: 36 pages, 10 figures, 3 tables, 2 algorithms
Subjects: Applications (stat.AP); Social and Information Networks (cs.SI); Neurons and Cognition (q-bio.NC)
 [11] arXiv:2104.01165 (replaced) [pdf, other]

Title: Distributional data analysis of accelerometer data from the NHANES database using nonparametric survey regression models
Subjects: Methodology (stat.ME); Applications (stat.AP); Other Statistics (stat.OT)
 [12] arXiv:2110.00224 (replaced) [pdf, other]

Title: Censored autoregressive regression models with Student-$t$ innovations
Comments: 22 pages, 10 figures and 3 tables
Subjects: Methodology (stat.ME); Applications (stat.AP)
 [13] arXiv:2110.00533 (replaced) [pdf, ps, other]

Title: Relative Contagiousness of Emerging Virus Variants: An Analysis of the Alpha, Delta, and Omicron SARS-CoV-2 Variants
Authors: Peter Reinhard Hansen
Subjects: Econometrics (econ.EM); Applications (stat.AP)
 [14] arXiv:2112.07602 (replaced) [pdf, other]

Title: A Framework for the Meta-Analysis of Randomized Experiments with Applications to Heavy-Tailed Response Data
Authors: Nilesh Tripuraneni, Dhruv Madeka, Dean Foster, Dominique Perrault-Joncas, Michael I. Jordan
Subjects: Methodology (stat.ME); Applications (stat.AP); Machine Learning (stat.ML)
 [15] arXiv:2201.06604 (replaced) [pdf, other]

Title: A tool set for random number generation on GPUs in R
Subjects: Computation (stat.CO); Applications (stat.AP)