Title: Bayesian clustering of multiple zero-inflated outcomes

Abstract: Several applications involving counts present a large proportion of zeros (excess-of-zeros data). A popular model for such data is the Hurdle model, which explicitly models the probability of a zero count, while assuming a sampling distribution on the positive integers. We consider data from multiple count processes. In this context, it is of interest to study the patterns of counts and cluster the subjects accordingly. We introduce a novel Bayesian nonparametric approach to cluster multiple, possibly related, zero-inflated processes. We propose a joint model for zero-inflated counts, specifying a Hurdle model for each process with a shifted Negative Binomial sampling distribution. Conditionally on the model parameters, the different processes are assumed independent, leading to a substantial reduction in the number of parameters compared with traditional multivariate approaches. The subject-specific probabilities of zero-inflation and the parameters of the sampling distribution are flexibly modelled via an enriched finite mixture with a random number of components. This induces a two-level clustering of the subjects, based on the zero/non-zero patterns (outer clustering) and on the sampling distribution (inner clustering). Posterior inference is performed through tailored MCMC schemes. We demonstrate the proposed approach on an application involving the use of the messaging service WhatsApp.
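
To make the Hurdle construction described in the abstract concrete, the following is a minimal sketch of the probability mass function for a single count process. It assumes one common parameterization, which the abstract does not specify: `p_zero` is the probability of a zero count, and the positive counts follow a Negative Binomial with hypothetical parameters `r` (size) and `q` (success probability), shifted by one so that its support is the positive integers. The function names are illustrative and are not taken from the paper.

```python
import math


def neg_binomial_pmf(k, r, q):
    """Negative Binomial pmf on k = 0, 1, 2, ... (assumed parameterization:
    number of failures before r successes, success probability q in (0, 1))."""
    if k < 0:
        return 0.0
    # Work on the log scale so that non-integer r and large k stay stable.
    log_coef = math.lgamma(k + r) - math.lgamma(r) - math.lgamma(k + 1)
    return math.exp(log_coef + r * math.log(q) + k * math.log1p(-q))


def hurdle_shifted_nb_pmf(y, p_zero, r, q):
    """Hurdle pmf: a point mass p_zero at y = 0, and for y >= 1 a Negative
    Binomial shifted by one, weighted by the probability of a non-zero count."""
    if y < 0:
        return 0.0
    if y == 0:
        return p_zero
    return (1.0 - p_zero) * neg_binomial_pmf(y - 1, r, q)


if __name__ == "__main__":
    # Illustrative values only; they are not estimates from the paper.
    for y in range(5):
        print(y, hurdle_shifted_nb_pmf(y, p_zero=0.6, r=2.5, q=0.3))
```

Under the conditional independence assumption stated in the abstract, the joint probability of a subject's counts across several processes would be the product of such per-process terms, given the parameters.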
Subjects: Methodology (stat.ME); Applications (stat.AP)
Cite as: arXiv:2205.05054 [stat.ME]
  (or arXiv:2205.05054v3 [stat.ME] for this version)

Submission history

From: Beatrice Franzolini
[v1] Tue, 10 May 2022 17:11:58 GMT (3882kb,D)
[v2] Wed, 11 May 2022 10:42:52 GMT (3332kb,D)
[v3] Mon, 29 Aug 2022 13:16:56 GMT (1121kb,D)
