New submissions

[ total of 32 entries: 1-32 ]

New submissions for Tue, 25 Jan 22

[1]  arXiv:2201.08946 [pdf, ps, other]
Title: Estimation and Hypothesis Testing of Strain-Specific Vaccine Efficacy with Missing Strain Types, with Applications to a COVID-19 Vaccine Trial
Subjects: Methodology (stat.ME)

Statistical methods are developed for analysis of clinical and virus genetics data from phase 3 randomized, placebo-controlled trials of vaccines against novel coronavirus COVID-19. Vaccine efficacy (VE) of a vaccine to prevent COVID-19 caused by one of finitely many genetic strains of SARS-CoV-2 may vary by strain. The problem of assessing differential VE by viral genetics can be formulated under a competing risks model where the endpoint is virologically confirmed COVID-19 and the cause-of-failure is the infecting SARS-CoV-2 genotype. Strain-specific VE is defined as one minus the cause-specific hazard ratio (vaccine/placebo). For the COVID-19 VE trials, the time to COVID-19 is right-censored, and a substantial percentage of failure cases are missing the infecting virus genotype. We develop estimation and hypothesis testing procedures for strain-specific VE when the failure time is subject to right censoring and the cause-of-failure is subject to missingness, focusing on $J \ge 2$ discrete categorical unordered or ordered virus genotypes. The stratified Cox proportional hazards model is used to relate the cause-specific outcomes to explanatory variables. The inverse probability weighted complete-case (IPW) estimator and the augmented inverse probability weighted complete-case (AIPW) estimator are investigated. Hypothesis tests are developed to assess whether the vaccine provides at least a specified level of efficacy against some viral genotypes and whether VE varies across genotypes, adjusting for covariates. The finite-sample properties of the proposed tests are studied through simulations, in which the tests are shown to perform well. In preparation for the real data analyses, the developed methods are applied to a pseudo dataset mimicking the Moderna COVE trial.
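
The definition of strain-specific VE in this abstract can be written out as a short formula. This is only a sketch: the symbols $\lambda_j$ (cause-specific hazard for genotype $j$), $z$ (covariates), and $\beta_j$ (log hazard ratio for vaccine versus placebo) are notation assumed here for illustration and are not taken from the paper.

```latex
\begin{align*}
\mathrm{VE}_j(t)
  &= 1 - \frac{\lambda_j(t \mid \text{vaccine},\, z)}{\lambda_j(t \mid \text{placebo},\, z)},
  \qquad j = 1, \dots, J, \\
\mathrm{VE}_j
  &= 1 - \exp(\beta_j)
  \qquad \text{(if the hazard ratio is constant in } t \text{, as under a Cox model).}
\end{align*}
```

Under this notation, asking whether VE varies across genotypes amounts to asking whether the $\beta_j$ are all equal.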

[2]  arXiv:2201.09033 [pdf, other]
Title: Sample Size Considerations for Bayesian Multilevel Hidden Markov Models: A Simulation Study on Multivariate Continuous Data with highly overlapping Component Distributions based on Sleep Data
Comments: main text: 35 pages, 9 figures. Submitted to Computational Statistics & Data Analysis
Subjects: Methodology (stat.ME); Computation (stat.CO)

Spurred in part by the ever-growing number of sensors and web-based methods of collecting data, the use of Intensive Longitudinal Data (ILD) is becoming more common in the social and behavioural sciences. The ILD collected in this field are often hypothesised to be the result of latent states (e.g. behaviour, emotions), and the promise of ILD lies in its ability to capture the dynamics of these states as they unfold in time. In particular, by collecting data for multiple subjects, researchers can observe how such dynamics differ between subjects. The Bayesian Multilevel Hidden Markov Model (mHMM) is a relatively novel model that is suited to modelling ILD of this kind while taking into account heterogeneity between subjects. While the mHMM has been applied in a variety of settings, large-scale studies that examine the required sample size for this model are lacking. In this paper, we address this research gap by conducting a simulation study to evaluate the effect of changing (1) the number of subjects, (2) the number of occasions, and (3) the between-subject variability on parameter estimates obtained by the mHMM. We frame this simulation study in the context of sleep research, which consists of multivariate continuous data that displays considerable overlap in the state-dependent component distributions. In addition, we generate a set of baseline scenarios with more general data properties. Overall, the number of subjects has the largest effect on model performance. However, the number of occasions is important to adequately model latent state transitions. We discuss how the characteristics of the data influence parameter estimation and provide recommendations to researchers seeking to apply the mHMM to their own data.

[3]  arXiv:2201.09098 [pdf, other]
Title: Estimation of the covariance structure from SNP allele frequencies
Subjects: Methodology (stat.ME)

We propose two new statistics, V and S, to disentangle the population history of related populations from SNP frequency data. If the populations are related by a tree, we show by theoretical means as well as by simulation that the new statistics are able to identify the root of a tree correctly, in contrast to standard statistics, such as the observed matrix of F2-statistics (distances between pairs of populations). The statistic V is obtained by averaging over all SNPs (similar to standard statistics). Its expectation is the true covariance matrix of the observed population SNP frequencies, offset by a matrix with identical entries. In contrast, the statistic S is put in a Bayesian context and is obtained by averaging over pairs of SNPs, such that each SNP is only used once. It thus makes use of the joint distribution of pairs of SNPs.
In addition, we provide a number of novel mathematical results about old and new statistics, and their mutual relationship.

[4]  arXiv:2201.09179 [pdf, other]
Title: Combining Mixed Effects Hidden Markov Models with Latent Alternating Recurrent Event Processes to Model Diurnal Active-Rest Cycles
Subjects: Methodology (stat.ME)

Data collected from wearable devices and smartphones can shed light on an individual's patterns of behavior and circadian routine. Phone use can be modeled as alternating between the state of active use and the state of being idle. Markov chains and alternating recurrent event models are commonly used to model state transitions in cases such as these, and the incorporation of random effects can be used to introduce time-of-day effects. While state labels can be derived prior to modeling dynamics, this approach omits informative regression covariates that can influence state memberships. We instead propose a recurrent event proportional hazards (PH) regression to model the transitions between latent states. We propose an Expectation-Maximization (EM) algorithm for imputing latent state labels and estimating regression parameters. We show that our E-step simplifies to the hidden Markov model (HMM) forward-backward algorithm, allowing us to recover an HMM in addition to PH models. We derive asymptotic distributions for our model parameter estimates and compare our approach against competing methods through simulation as well as in a digital phenotyping study that followed smartphone use in a cohort of adolescents with mood disorders.

[5]  arXiv:2201.09192 [pdf, ps, other]
Title: High-dimensional model-assisted inference for treatment effects with multi-valued treatments
Subjects: Methodology (stat.ME)

Consider estimation of average treatment effects with multi-valued treatments using augmented inverse probability weighted (IPW) estimators, depending on outcome regression and propensity score models in high-dimensional settings. These regression models are often fitted by regularized likelihood-based estimation, while ignoring how the fitted functions are used in the subsequent inference about the treatment parameters. Such separate estimation can be associated with known difficulties in existing methods. We develop regularized calibrated estimation for fitting propensity score and outcome regression models, where sparsity-inducing penalties are employed to facilitate variable selection but the loss functions are carefully chosen such that valid confidence intervals can be obtained under possible model misspecification. Unlike in the case of binary treatments, the usual augmented IPW estimator is generalized by allowing different copies of coefficient estimators in outcome regression to ensure just-identification. For propensity score estimation, the new loss function and estimating functions are directly tied to achieving covariate balance between weighted treatment groups. We develop practical numerical algorithms for computing the regularized calibrated estimators with group Lasso by innovatively exploiting Fisher scoring, and provide rigorous high-dimensional analysis for the resulting augmented IPW estimators under suitable sparsity conditions, while tackling technical issues absent or overlooked in previous analyses. We present simulation studies and an empirical application to estimate the effects of maternal smoking on birth weights. The proposed methods are implemented in the R package mRCAL.

[6]  arXiv:2201.09194 [pdf, other]
Title: Distributed Learning of Generalized Linear Causal Networks
Comments: 27 pages, 3 tables, 3 figures
Subjects: Methodology (stat.ME); Machine Learning (cs.LG); Computation (stat.CO)

We consider the task of learning causal structures from data stored on multiple machines, and propose a novel structure learning method called distributed annealing on regularized likelihood score (DARLS) to solve this problem. We model causal structures by a directed acyclic graph that is parameterized with generalized linear models, so that our method is applicable to various types of data. To obtain a high-scoring causal graph, DARLS simulates an annealing process to search over the space of topological sorts, where the optimal graphical structure compatible with a sort is found by a distributed optimization method. This distributed optimization relies on multiple rounds of communication between local and central machines to estimate the optimal structure. We establish its convergence to a global optimizer of the overall score that is computed on all data across local machines. To the best of our knowledge, DARLS is the first distributed method for learning causal graphs with such theoretical guarantees. In extensive simulation studies, DARLS shows competitive performance against existing methods on distributed data, and achieves structure learning accuracy and test-data likelihood comparable to those of competing methods applied to pooled data across all local machines. In a real-world application for modeling protein-DNA binding networks with distributed ChIP-Sequencing data, DARLS also exhibits higher predictive power than other methods, demonstrating a great advantage in estimating causal networks from distributed data.

[7]  arXiv:2201.09313 [pdf, other]
Title: Non-decimated 2D Wavelet Spectrum and Its Use in Breast Cancer Diagnostics
Subjects: Methodology (stat.ME)

To improve diagnostic accuracy of breast cancer detection, several researchers have used wavelet-based tools, which provide additional insight and information for aiding diagnostic decisions. The accuracy of such diagnoses, however, can be improved. This paper introduces a wavelet-based technique, non-decimated wavelet transform (NDWT)-based scaling estimation, that improves scaling parameter estimation over the traditional methods. One distinctive feature of NDWT is that it does not decimate wavelet coefficients at multiscale levels, resulting in redundant outputs which are used to lower the variance of scaling estimators. Another interesting feature of the proposed methodology is the freedom from dyadic constraints on inputs, which are typical for standard wavelet-based approaches. To compare the estimation performance of the NDWT method to a conventional orthogonal wavelet transform-based method, we use simulation to estimate the Hurst exponent in two-dimensional fractional Brownian fields. The results of the simulation show that the proposed method improves on the conventional estimators of scaling and yields estimators with smaller mean-squared errors. We apply the NDWT method to classification of mammograms as cancer or control and, for publicly available mammogram images from the database at the University of South Florida, find diagnostic accuracy in excess of 80%.

[8]  arXiv:2201.09320 [pdf, other]
Title: Robust Wavelet-based Assessment of Scaling with Applications
Comments: 26 pages, 2 figures, 6 tables
Subjects: Methodology (stat.ME); Machine Learning (stat.ML)

A number of approaches have dealt with statistical assessment of self-similarity, and many of those are based on multiscale concepts. Most rely on certain distributional assumptions which are usually violated by real data traces, often characterized by large temporal or spatial mean level shifts, missing values or extreme observations. A novel, robust approach based on Theil-type weighted regression is proposed for estimating self-similarity in two-dimensional data (images). The method is compared to two traditional estimation techniques that use wavelet decompositions: ordinary least squares (OLS) and the Abry-Veitch bias-correcting estimator (AV). As an application, the suitability of the self-similarity estimate resulting from the robust approach is illustrated as a predictive feature in the classification of digitized mammogram images as cancerous or non-cancerous. The diagnostic employed here is based on the properties of image backgrounds, which is typically an unused modality in breast cancer screening. Classification results show nearly 68% accuracy, varying slightly with the choice of wavelet basis and the range of multiresolution levels used.
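
To illustrate why a Theil-type regression is robust where OLS is not, here is a minimal sketch of the plain (unweighted) Theil-Sen slope, the median of all pairwise slopes. This is not the paper's weighted estimator: the weighting scheme is not described in the abstract, and the function name `theil_sen_slope` and the toy data are assumptions made for illustration only.

```python
import statistics
from itertools import combinations


def theil_sen_slope(x, y):
    """Median of pairwise slopes (unweighted Theil-Sen); an illustrative stand-in,
    not the weighted variant proposed in the paper."""
    slopes = [
        (y[j] - y[i]) / (x[j] - x[i])
        for i, j in combinations(range(len(x)), 2)
        if x[j] != x[i]  # skip vertical pairs
    ]
    return statistics.median(slopes)


# Toy "log-energy versus level" data lying on a line of slope -2,
# with the last point replaced by a gross outlier (hypothetical values).
levels = [1, 2, 3, 4, 5]
log_energy = [-2, -4, -6, -8, 10]

robust_slope = theil_sen_slope(levels, log_energy)
```

In this toy example six of the ten pairwise slopes equal -2, so the median is unaffected by the single outlier, whereas a least-squares fit would be pulled toward it.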

[9]  arXiv:2201.09485 [pdf, other]
Title: Spherical Poisson Point Process Intensity Function Modeling and Estimation with Measure Transport
Comments: 23 pages, 5 figures
Subjects: Methodology (stat.ME); Machine Learning (stat.ML)

Recent years have seen an increased interest in the application of methods and techniques commonly associated with machine learning and artificial intelligence to spatial statistics. Here, in a celebration of the ten-year anniversary of the journal Spatial Statistics, we bring together normalizing flows, commonly used for density function estimation in machine learning, and spherical point processes, a topic of particular interest to the journal's readership, to present a new approach for modeling non-homogeneous Poisson process intensity functions on the sphere. The central idea of this framework is to build, and estimate, a flexible bijective map that transforms the underlying intensity function of interest on the sphere into a simpler, reference, intensity function, also on the sphere. Map estimation can be done efficiently using automatic differentiation and stochastic gradient descent, and uncertainty quantification can be done straightforwardly via nonparametric bootstrap. We investigate the viability of the proposed method in a simulation study, and illustrate its use in a proof-of-concept study where we model the intensity of cyclone events in the North Pacific Ocean. Our experiments reveal that normalizing flows present a flexible and straightforward way to model intensity functions on spheres, but that their potential to yield a good fit depends on the architecture of the bijective map, which can be difficult to establish in practice.

[10]  arXiv:2201.09585 [pdf, ps, other]
Title: The Coupled Rejection Sampler
Comments: 23 pages
Subjects: Methodology (stat.ME); Probability (math.PR); Computation (stat.CO)

We propose a novel coupled rejection-sampling method for sampling from couplings of arbitrary distributions. The method relies on accepting or rejecting coupled samples coming from dominating marginals. Contrary to existing acceptance-rejection methods, the variance of the execution time of the proposed method is limited and stays finite as the two target marginals approach each other in the sense of the total variation norm. In the important special case of coupling multivariate Gaussians with different means and covariances, we derive positive lower bounds for the resulting coupling probability of our algorithm, and we then show how the coupling method can be optimised using convex optimisation. Finally, we show how we can modify the coupled-rejection method to propose from a coupled ensemble of proposals, so as to asymptotically recover a maximal coupling. We then apply the method to derive a novel parallel coupled particle filter resampling algorithm, and show how it can be used to speed up unbiased MCMC methods based on couplings.
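
For context, the sketch below shows the classical rejection-based maximal coupling of two densities p and q, which is the kind of existing acceptance-rejection method the abstract contrasts itself with; it is not the coupled rejection sampler proposed in the paper. The restriction to two univariate Gaussians with a shared standard deviation, and the helper names `normal_pdf` and `maximal_coupling`, are assumptions made to keep the example short.

```python
import math
import random


def normal_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def maximal_coupling(mu_p, mu_q, sigma, rng):
    """Classical rejection-based maximal coupling of p = N(mu_p, sigma^2)
    and q = N(mu_q, sigma^2). Returns (x, y) with x ~ p and y ~ q, and
    P(x == y) = 1 - TV(p, q)."""
    x = rng.gauss(mu_p, sigma)
    # Accept x as the common value with probability min(1, q(x) / p(x)).
    if rng.uniform(0.0, normal_pdf(x, mu_p, sigma)) <= normal_pdf(x, mu_q, sigma):
        return x, x
    # Otherwise draw y from the part of q not covered by p.
    # The expected number of iterations of this loop grows as p and q get closer.
    while True:
        y = rng.gauss(mu_q, sigma)
        if rng.uniform(0.0, normal_pdf(y, mu_q, sigma)) > normal_pdf(y, mu_p, sigma):
            return x, y


rng = random.Random(0)
pairs = [maximal_coupling(0.0, 0.5, 1.0, rng) for _ in range(20000)]
coupled_fraction = sum(1 for x, y in pairs if x == y) / len(pairs)
```

The second loop is where the run-time behaviour degrades as the marginals approach each other, which is the issue the abstract says the proposed method avoids.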

[11]  arXiv:2201.09681 [pdf, other]
Title: Multivariate sensitivity analysis for a large-scale climate impact and adaptation model
Subjects: Methodology (stat.ME); Applications (stat.AP)

We develop a new efficient methodology for Bayesian global sensitivity analysis for large-scale multivariate data. The focus is on computationally demanding models with correlated variables. A multivariate Gaussian process is used as a surrogate model to replace the expensive computer model. To improve the computational efficiency and performance of the model, compactly supported correlation functions are used. The goal is to generate sparse matrices, which give crucial advantages when dealing with large datasets, where we use cross-validation to determine the optimal degree of sparsity. This method was combined with a robust adaptive Metropolis algorithm coupled with a parallel implementation to speed up the convergence to the target distribution. The method was applied to a multivariate dataset from the IMPRESSIONS Integrated Assessment Platform (IAP2), an extension of the CLIMSAVE IAP, which has been widely applied in climate change impact, adaptation and vulnerability assessments. Our empirical results on synthetic and IAP2 data show that the proposed methods are efficient and accurate for global sensitivity analysis of complex models.

[12]  arXiv:2201.09706 [pdf, other]
Title: Valid belief updates for prequentially additive loss functions arising in Semi-Modular Inference
Comments: 39 pages including supplement, 6 figures
Subjects: Methodology (stat.ME)

Model-based Bayesian evidence combination leads to models with multiple parametric modules. In this setting the effects of model misspecification in one of the modules may in some cases be ameliorated by cutting the flow of information from the misspecified module. Semi-Modular Inference (SMI) is a framework allowing partial cuts which modulate but do not completely cut the flow of information between modules. We show that SMI is part of a family of inference procedures which implement partial cuts. It has been shown that additive losses determine an optimal, valid and order-coherent belief update. The losses which arise in Cut models and SMI are not additive. However, like the prequential score function, they have a kind of prequential additivity which we define. We show that prequential additivity is sufficient to determine the optimal valid and order-coherent belief update and that this belief update coincides with the belief update in each of our SMI schemes.

[13]  arXiv:2201.09811 [pdf, other]
Title: Imputing Missing Values in the Occupational Requirements Survey
Comments: A (preliminary) software package implementing our method and further downstream analyses is available on Github at this https URL
Subjects: Methodology (stat.ME)

The U.S. Bureau of Labor Statistics allows public access to much of the data acquired through its Occupational Requirements Survey (ORS). This data can be used to draw inferences about the requirements of various jobs and job classes within the United States workforce. However, the dataset contains a multitude of missing observations and estimates, which somewhat limits its utility. Here, we propose a method by which to impute these missing values that leverages many of the inherent features present in the survey data, such as known population limit and correlations between occupations and tasks. An iterative regression fit, implemented with a recent version of XGBoost and executed across a set of simulated values drawn from the distribution described by the known values and their standard deviations reported in the survey, is the approach used to arrive at a distribution of predicted values for each missing estimate. This allows us to calculate a mean prediction and bound said estimate with a 95% confidence interval. We discuss the use of our method and how the resulting imputations can be utilized to inform and pursue future areas of study stemming from the data collected in the ORS. Finally, we conclude with an outline of WIGEM, a generalized version of our weighted, iterative imputation algorithm that could be applied to other contexts.

Cross-lists for Tue, 25 Jan 22

[14]  arXiv:2201.09040 (cross-list from math.ST) [pdf, other]
Title: Optimal Estimation and Computational Limit of Low-rank Gaussian Mixtures
Subjects: Statistics Theory (math.ST); Information Theory (cs.IT); Methodology (stat.ME); Machine Learning (stat.ML)

Structural matrix-variate observations routinely arise in diverse fields such as multi-layer network analysis and brain image clustering. While data of this type have been extensively investigated with fruitful outcomes, fundamental questions such as statistical optimality and computational limits remain largely under-explored. In this paper, we propose a low-rank Gaussian mixture model (LrMM) assuming each matrix-valued observation has a planted low-rank structure. Minimax lower bounds for estimating the underlying low-rank matrix are established allowing a whole range of sample sizes and signal strength. Under a minimal condition on signal strength, referred to as the information-theoretical limit or statistical limit, we prove the minimax optimality of a maximum likelihood estimator which, in general, is computationally infeasible. If the signal is stronger than a certain threshold, called the computational limit, we design a computationally fast estimator based on spectral aggregation and demonstrate its minimax optimality. Moreover, when the signal strength is smaller than the computational limit, we provide evidence based on the low-degree likelihood ratio framework to claim that no polynomial-time algorithm can consistently recover the underlying low-rank matrix. Our results reveal multiple phase transitions in the minimax error rates and the statistical-to-computational gap. Numerical experiments confirm our theoretical findings. We further showcase the merit of our spectral aggregation method on the worldwide food trading dataset.

[15]  arXiv:2201.09350 (cross-list from math.ST) [pdf, ps, other]
Title: Elementary proofs of four standard results on false discovery rate
Authors: Ruodu Wang
Subjects: Statistics Theory (math.ST); Methodology (stat.ME)

We collect self-contained elementary proofs of four standard results in the literature on the false discovery rate of the Benjamini-Hochberg (BH) procedure for independent or positive-regression dependent p-values, the Benjamini-Yekutieli procedure for arbitrarily dependent p-values, and the e-BH procedure for arbitrarily dependent e-values. As a corollary, the above proofs also lead to some inequalities of Simes and Hommel.
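
As a reminder of what the Benjamini-Hochberg (BH) procedure does, here is a minimal sketch of the standard step-up rule: sort the m p-values, find the largest rank k with p_(k) <= alpha * k / m, and reject the k smallest. The function name `benjamini_hochberg` and the example p-values are illustrative assumptions, not taken from the paper.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return the sorted indices of hypotheses rejected by the BH step-up rule."""
    m = len(p_values)
    # Indices of the p-values in increasing order of p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    k = 0
    for rank, idx in enumerate(order, start=1):
        # Track the largest rank whose p-value falls under the BH line.
        if p_values[idx] <= alpha * rank / m:
            k = rank
    return sorted(order[:k])


# Hypothetical p-values for ten hypotheses.
p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
rejected = benjamini_hochberg(p, alpha=0.05)
```

Because the rule is step-up, a p-value above its own threshold can still be rejected when a larger p-value falls under the BH line at a higher rank.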

[16]  arXiv:2201.09366 (cross-list from cs.LG) [pdf, other]
Title: Optimal transport for causal discovery
Subjects: Machine Learning (cs.LG); Methodology (stat.ME)

Approaches based on Functional Causal Models (FCMs) have been proposed to determine causal direction between two variables, by properly restricting model classes; however, their performance is sensitive to the model assumptions, which makes them difficult for practitioners to use. In this paper, we provide a novel dynamical-system view of FCMs and propose a new framework for identifying causal direction in the bivariate case. We first show the connection between FCMs and optimal transport, and then study optimal transport under the constraints of FCMs. Furthermore, by exploiting the dynamical interpretation of optimal transport under the FCM constraints, we determine the corresponding underlying dynamical process of the static cause-effect pair data under the least action principle. It provides a new dimension for describing static causal discovery tasks, while enjoying more freedom for modeling the quantitative causal influences. In particular, we show that Additive Noise Models (ANMs) correspond to volume-preserving pressureless flows. Consequently, based on their velocity field divergence, we introduce a criterion to determine causal direction. With this criterion, we propose a novel optimal transport-based algorithm for ANMs which is robust to the choice of models and extend it to post-nonlinear models. Our method demonstrated state-of-the-art results on both synthetic and causal discovery benchmark datasets.

[17]  arXiv:2201.09722 (cross-list from stat.CO) [pdf, other]
Title: Uniformly Ergodic Data-Augmented MCMC for Fitting the General Stochastic Epidemic Model to Incidence Data
Comments: 43 pages, 7 figures, submitted to JASA, data and code to reproduce the experiments and the case study can be requested from the authors
Subjects: Computation (stat.CO); Methodology (stat.ME)

Stochastic epidemic models provide an interpretable probabilistic description of the spread of a disease through a population. Yet, fitting these models when the epidemic process is only partially observed is a notoriously difficult task due to the intractability of the likelihood for many classical models. To remedy this issue, this article introduces a novel data-augmented MCMC algorithm for fast and exact Bayesian inference for the stochastic SIR model given discretely observed infection incidence counts. In a Metropolis-Hastings step, new event times of the latent data are jointly proposed from a surrogate process that closely resembles the SIR, and from which we can efficiently generate epidemics compatible with the observed data.
The proposed DA-MCMC algorithm is fast and, since the latent data are generated from a faithful approximation of the target model, a large portion thereof can be updated per iteration without prohibitively lowering the acceptance rate. We find that the method explores the high-dimensional latent space efficiently and scales to outbreaks with hundreds of thousands of individuals, and we show that the Markov chain underlying the algorithm is uniformly ergodic. We validate its performance via thorough simulation experiments and a case study on the 2013-2015 Ebola outbreak in Western Africa.
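
To make the data setting concrete, the sketch below simulates a stochastic SIR epidemic with the Gillespie algorithm and then bins the infection times into discretely observed incidence counts, which is the kind of data the abstract conditions on. It is only a forward simulator, not the paper's DA-MCMC algorithm or its surrogate process; the function names, the frequency-dependent infection rate `beta * S * I / n`, and all parameter values are assumptions made for illustration.

```python
import math
import random


def simulate_sir(n, i0, beta, gamma, t_max, seed=0):
    """Gillespie simulation of a stochastic SIR model (assumed parameterization).
    Returns the infection event times plus the final S and I counts."""
    rng = random.Random(seed)
    s, i, t = n - i0, i0, 0.0
    infection_times = []
    while i > 0:
        rate_infection = beta * s * i / n
        rate_removal = gamma * i
        total_rate = rate_infection + rate_removal
        t += rng.expovariate(total_rate)  # waiting time to the next event
        if t > t_max:
            break
        if rng.random() < rate_infection / total_rate:
            s, i = s - 1, i + 1  # infection event
            infection_times.append(t)
        else:
            i -= 1  # removal event
    return infection_times, s, i


def incidence_counts(infection_times, t_max, width):
    """Number of new infections in each observation interval of the given width."""
    n_bins = int(math.ceil(t_max / width))
    counts = [0] * n_bins
    for t in infection_times:
        counts[min(int(t // width), n_bins - 1)] += 1
    return counts


times, s_final, i_final = simulate_sir(n=100, i0=1, beta=1.5, gamma=0.5, t_max=20.0, seed=1)
counts = incidence_counts(times, t_max=20.0, width=1.0)
```

Only `counts` would be observed in practice; the individual event times in `times` are the latent data that a data-augmented sampler has to impute.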

[18]  arXiv:2201.09782 (cross-list from stat.AP) [pdf, other]
Title: Inferring taxonomic placement from DNA barcoding allowing discovery of new taxa
Subjects: Applications (stat.AP); Methodology (stat.ME)

In ecology it has become common to apply DNA barcoding to biological samples, leading to datasets containing a large number of nucleotide sequences. The focus is then on inferring the taxonomic placement of each of these sequences by leveraging existing databases of reference sequences with known taxa. This is highly challenging because i) sequencing is typically only available for a relatively small region of the genome due to cost considerations; ii) many of the sequences are from organisms that are either unknown to science or for which there are no reference sequences available. These issues can lead to substantial classification uncertainty, particularly in inferring new taxa. To address these challenges, we propose a new class of Bayesian nonparametric taxonomic classifiers, BayesANT, which use species sampling model priors to allow new taxa to be discovered at each taxonomic rank. Using a simple product multinomial likelihood with conjugate Dirichlet priors at the lowest rank, a highly efficient algorithm is developed to provide a probabilistic prediction of the taxa placement of each sequence at each rank. BayesANT is shown to have excellent performance in real data, including when many sequences in the test set belong to taxa unobserved in training.

Replacements for Tue, 25 Jan 22

[19]  arXiv:2002.05550 (replaced) [pdf, other]
Title: Bayesian Kernel Two-Sample Testing
Subjects: Methodology (stat.ME); Computation (stat.CO)
[20]  arXiv:2004.05105 (replaced) [pdf, other]
Title: On the identifiability of Bayesian factor analytic models
Comments: to appear in STCO
Subjects: Methodology (stat.ME); Computation (stat.CO)
[21]  arXiv:2007.07953 (replaced) [pdf, other]
Title: A likelihood-based approach for multivariate categorical response regression in high dimensions
Comments: Accepted for publication in Journal of the American Statistical Association
Subjects: Methodology (stat.ME); Computation (stat.CO)
[22]  arXiv:2011.04168 (replaced) [pdf, other]
Title: Likelihood Inference for Possibly Non-Stationary Processes via Adaptive Overdifferencing
Subjects: Methodology (stat.ME)
[23]  arXiv:2012.08848 (replaced) [pdf, ps, other]
Title: Ensemble Kalman filter based Sequential Monte Carlo Sampler for sequential Bayesian inference
Subjects: Methodology (stat.ME); Computation (stat.CO)
[24]  arXiv:2107.06663 (replaced) [pdf, other]
Title: Time Series Estimation of the Dynamic Effects of Disaster-Type Shock
Subjects: Methodology (stat.ME); Econometrics (econ.EM)
[25]  arXiv:2109.05047 (replaced) [pdf, other]
Title: PAC Mode Estimation using PPR Martingale Confidence Sequences
Comments: 30 pages, 2 figures
Subjects: Methodology (stat.ME); Statistics Theory (math.ST); Applications (stat.AP); Machine Learning (stat.ML)
[26]  arXiv:2112.02621 (replaced) [pdf, other]
Title: Mean and median bias reduction: A concise review and application to adjacent-categories logit models
Authors: Ioannis Kosmidis
Subjects: Methodology (stat.ME); Applications (stat.AP)
[27]  arXiv:2201.00468 (replaced) [pdf, other]
Title: A General Framework for Treatment Effect Estimation in Semi-Supervised and High Dimensional Settings
Comments: Generalizations added (Appendix A); 59 pages (with supplement), 7 tables, 2 figures
Subjects: Methodology (stat.ME); Statistics Theory (math.ST); Machine Learning (stat.ML)
[28]  arXiv:2201.02958 (replaced) [pdf, other]
Title: Smooth Nested Simulation: Bridging Cubic and Square Root Convergence Rates in High Dimensions
Comments: Main body: 39 pages, 5 figures, 3 tables; Supplemental material: 19 pages
Subjects: Methodology (stat.ME); Portfolio Management (q-fin.PM); Risk Management (q-fin.RM); Machine Learning (stat.ML)
[29]  arXiv:2201.07396 (replaced) [pdf, other]
Title: Ordinal Causal Discovery
Authors: Yang Ni, Bani Mallick
Subjects: Methodology (stat.ME); Machine Learning (stat.ML)
[30]  arXiv:1803.03348 (replaced) [pdf, other]
Title: Joint Estimation and Inference for Data Integration Problems based on Multiple Multi-layered Gaussian Graphical Models
Comments: Journal of Machine Learning Research, 2022, this https URL
Subjects: Machine Learning (stat.ML); Statistics Theory (math.ST); Methodology (stat.ME)
[31]  arXiv:1906.02884 (replaced) [pdf, other]
Title: A Statistical Recurrent Stochastic Volatility Model for Stock Markets
Comments: 51 pages, 16 figures, 18 tables
Subjects: Econometrics (econ.EM); Methodology (stat.ME); Machine Learning (stat.ML)
[32]  arXiv:2001.09560 (replaced) [pdf, other]
Title: Estimating Marginal Treatment Effects under Unobserved Group Heterogeneity
Subjects: Econometrics (econ.EM); Methodology (stat.ME)