New submissions for Mon, 25 May 20

[1]  arXiv:2005.10892 [pdf, other]
Title: Combining Cluster Sampling and Link-Tracing Sampling to Estimate Totals and Means of Hidden Populations in Presence of Heterogeneous Probabilities of Links
Comments: 34 pages, 2 figures, technical report
Subjects: Methodology (stat.ME)

We propose Horvitz-Thompson-like and Hajek-like estimators of the total and mean of the values of a variable of interest associated with the elements of a hard-to-reach population sampled by the variant of link-tracing sampling proposed by Felix-Medina and Thompson (2004). Examples of this type of population are drug users, homeless people and sex workers. In this sampling variant, a frame of venues or places where the members of the population tend to gather, such as parks and bars, is constructed. The frame is not assumed to cover the whole population. An initial cluster sample of elements is selected from the frame, where the clusters are the venues, and the elements in the initial sample are asked to name their contacts who are also members of the population. The sample size is increased by including in the sample the named elements who are not in the initial sample. The proposed estimators do not use design-based inclusion probabilities but model-based inclusion probabilities, which are derived from a model proposed by Felix-Medina et al. (2015) and estimated by maximum likelihood. The inclusion probabilities are assumed to be heterogeneous, that is, they depend on the sampled people. Estimates of the variances of the proposed estimators are obtained by bootstrap and are used to construct confidence intervals for the totals and means. The performance of the proposed estimators and confidence intervals is evaluated in two numerical studies, one of them based on real data, and the results show that their performance is acceptable.
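
For orientation, the two estimator forms named in this abstract can be sketched generically. This is a minimal illustration of the textbook Horvitz-Thompson total and Hajek mean, assuming the inclusion probabilities are already available. It is not the paper's method: the model-based probabilities, their maximum likelihood estimation, and the bootstrap variance step are not reproduced. The function names and the sample values are hypothetical.

```python
from typing import Sequence


def _check(y: Sequence[float], pi: Sequence[float]) -> None:
    """Validate that y and pi align and that each probability lies in (0, 1]."""
    if len(y) != len(pi):
        raise ValueError("y and pi must have the same length")
    if len(y) == 0:
        raise ValueError("the sample must not be empty")
    if any(not (0.0 < p <= 1.0) for p in pi):
        raise ValueError("inclusion probabilities must lie in (0, 1]")


def horvitz_thompson_total(y: Sequence[float], pi: Sequence[float]) -> float:
    """Horvitz-Thompson-type total: sum of y_i / pi_i over sampled units."""
    _check(y, pi)
    return sum(yi / p for yi, p in zip(y, pi))


def hajek_mean(y: Sequence[float], pi: Sequence[float]) -> float:
    """Hajek-type mean: weighted total divided by the sum of the weights 1 / pi_i."""
    _check(y, pi)
    weights = [1.0 / p for p in pi]
    return sum(w * yi for w, yi in zip(weights, y)) / sum(weights)


# Hypothetical sample: three units with assumed inclusion probabilities.
y_sample = [2.0, 4.0, 6.0]
pi_sample = [0.5, 0.25, 0.5]
total_estimate = horvitz_thompson_total(y_sample, pi_sample)
mean_estimate = hajek_mean(y_sample, pi_sample)
```

The Hajek form divides by the estimated population size (the sum of the weights) rather than a known size, which is why it is usable when the population size is itself unknown, as with hidden populations.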

[2]  arXiv:2005.10998 [pdf, other]
Title: Navigated Weighting to Improve Inverse Probability Weighting for Missing Data Problems and Causal Inference
Authors: Hiroto Katsumata
Subjects: Methodology (stat.ME)

Inverse probability weighting (IPW) is widely used to address missing data problems, including causal inference, but may suffer from large variances and biases due to propensity score model misspecification. To solve these problems, I propose an estimation method called navigated weighting (NAWT), which utilizes estimating equations suitable for a specific pre-specified parameter of interest (e.g., the average treatment effect on the treated). Since these pre-specified parameters determine the relative importance of each unit as a function of propensity scores, the NAWT prioritizes important units in the propensity score estimation to improve efficiency and robustness to model misspecification. I investigate its large-sample properties and demonstrate its finite-sample improvements through simulation studies and an empirical example. An R package, nawtilus, which implements the NAWT, has been developed.
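
As background for this abstract, the ordinary IPW estimator of the average treatment effect on the treated can be sketched as follows. This is the standard normalized-weight form with propensity scores taken as given, not the NAWT itself: the navigated estimating equations are not reproduced, and this sketch does not use the nawtilus package. The function name and the toy data are hypothetical.

```python
from typing import Sequence


def ipw_att(y: Sequence[float], t: Sequence[int], e: Sequence[float]) -> float:
    """Normalized IPW estimate of the average treatment effect on the treated.

    y: outcomes, t: treatment indicators (1 = treated, 0 = control),
    e: propensity scores P(T = 1 | X), assumed already estimated.
    Controls are reweighted by the odds e / (1 - e) so that they resemble
    the treated group; treated units each get weight 1.
    """
    if not (len(y) == len(t) == len(e)):
        raise ValueError("y, t and e must have the same length")
    treated = [yi for yi, ti in zip(y, t) if ti == 1]
    controls = [(yi, ei) for yi, ti, ei in zip(y, t, e) if ti == 0]
    if not treated or not controls:
        raise ValueError("need at least one treated and one control unit")
    if any(not (0.0 < ei < 1.0) for _, ei in controls):
        raise ValueError("control propensity scores must lie strictly in (0, 1)")
    weights = [ei / (1.0 - ei) for _, ei in controls]
    treated_mean = sum(treated) / len(treated)
    control_mean = sum(w * yi for w, (yi, _) in zip(weights, controls)) / sum(weights)
    return treated_mean - control_mean


# Hypothetical toy data: two treated units and two controls.
outcomes = [3.0, 5.0, 1.0, 2.0]
treatment = [1, 1, 0, 0]
propensity = [0.5, 0.5, 0.5, 0.75]
att_estimate = ipw_att(outcomes, treatment, propensity)
```

In this form, controls with propensity scores near 1 receive very large weights. That is one source of the large variances the abstract mentions.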

[3]  arXiv:2005.11303 [pdf, other]
Title: Nonparametric inverse probability weighted estimators based on the highly adaptive lasso
Subjects: Methodology (stat.ME); Statistics Theory (math.ST); Machine Learning (stat.ML)

Inverse probability weighted estimators are the oldest and potentially most commonly used class of procedures for the estimation of causal effects. By adjusting for selection biases via a weighting mechanism, these procedures estimate an effect of interest by constructing a pseudo-population in which selection biases are eliminated. Despite their ease of use, these estimators require the correct specification of a model for the weighting mechanism, are known to be inefficient, and suffer from the curse of dimensionality. We propose a class of nonparametric inverse probability weighted estimators in which the weighting mechanism is estimated via undersmoothing of the highly adaptive lasso, a nonparametric regression function proven to converge at an $n^{-1/3}$ rate to the true weighting mechanism. We demonstrate that our estimators are asymptotically linear with variance converging to the nonparametric efficiency bound. Unlike doubly robust estimators, our procedures require neither derivation of the efficient influence function nor specification of the conditional outcome model. Our theoretical developments have broad implications for the construction of efficient inverse probability weighted estimators in large statistical models and a variety of problem settings. We assess the practical performance of our estimators in simulation studies and demonstrate the use of our proposed methodology with data from a large-scale epidemiologic study.

[4]  arXiv:1612.03608 (replaced) [pdf, other]
Title: A one-way ANOVA test for functional data with graphical interpretation
Comments: arXiv admin note: text overlap with arXiv:1506.01646
Subjects: Methodology (stat.ME)
[5]  arXiv:1910.10624 (replaced) [pdf, other]
Title: Doubly robust treatment effect estimation with missing attributes
Subjects: Methodology (stat.ME)
[6]  arXiv:1911.09260 (replaced) [pdf, other]
Title: A semiparametric instrumental variable approach to optimal treatment regimes under endogeneity
Subjects: Methodology (stat.ME); Statistics Theory (math.ST); Machine Learning (stat.ML)
[7]  arXiv:2005.07314 (replaced) [pdf, other]
Title: Hierarchical causal variance decomposition for institution and provider comparisons in healthcare
Authors: Bo Chen, Olli Saarela
Subjects: Methodology (stat.ME)
[8]  arXiv:1906.12072 (replaced) [pdf, other]
Title: Multiple Testing and Variable Selection along Least Angle Regression's path
Subjects: Statistics Theory (math.ST); Information Theory (cs.IT); Methodology (stat.ME); Machine Learning (stat.ML)
[9]  arXiv:2003.05221 (replaced) [pdf, other]
Title: A mixture autoregressive model based on Gaussian and Student's $t$-distributions
Authors: Savi Virolainen
Subjects: Econometrics (econ.EM); Statistics Theory (math.ST); Methodology (stat.ME)
