
Title: A Bayesian Finite Mixture Model with Variable Selection for Data with Mixed-type Variables

Abstract: Finite mixture models are an important class of clustering methods and can be applied to data sets with mixed types of variables. However, several challenges arise in practice. First, these models typically rely on the EM algorithm, which can be sensitive to the choice of initial values. Second, biomarkers subject to limits of detection (LOD) are common in clinical data, which introduces censored variables into the finite mixture model. Third, researchers are increasingly interested in variable importance because of the growing number of variables available for clustering.
To address these challenges, we propose a Bayesian finite mixture model that simultaneously performs variable selection, accounts for biomarker LODs, and produces clustering results. We take a Bayesian approach to estimating the parameters and cluster memberships, which bypasses the limitations of the EM algorithm. To account for LODs, we add a step to the Gibbs sampler that iteratively imputes biomarker values below or above the LODs. In addition, we place a spike-and-slab type prior on each variable to obtain variable importance. Simulations across various scenarios were conducted to examine the performance of the method, and the method was also applied to real electronic health record data.
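The abstract does not spell out the imputation step, so the following is only a minimal sketch of what one such Gibbs update could look like. It assumes each biomarker is Gaussian within a cluster and draws censored values from the matching truncated normal by inverse-CDF sampling. The function names, the `status` labels, and all numeric values are hypothetical and are not taken from the paper.

```python
import random
from statistics import NormalDist


def sample_truncated_normal(mu, sigma, lower, upper, rng):
    """Draw from N(mu, sigma^2) restricted to (lower, upper); None means unbounded."""
    dist = NormalDist(mu, sigma)
    p_lo = 0.0 if lower is None else dist.cdf(lower)
    p_hi = 1.0 if upper is None else dist.cdf(upper)
    # Uniform draw on the probability mass between the two bounds.
    u = p_lo + rng.random() * (p_hi - p_lo)
    # inv_cdf requires 0 < u < 1, so keep u strictly inside that interval.
    u = min(max(u, 1e-12), 1.0 - 1e-12)
    x = dist.inv_cdf(u)
    # Guard against rounding pushing the draw past a bound.
    if lower is not None:
        x = max(x, lower)
    if upper is not None:
        x = min(x, upper)
    return x


def impute_step(values, status, labels, means, sds, lod_low, lod_high, rng):
    """One imputation pass: redraw censored entries given current cluster parameters."""
    out = list(values)
    for i, s in enumerate(status):
        mu, sigma = means[labels[i]], sds[labels[i]]
        if s == "below":      # censored below the lower LOD
            out[i] = sample_truncated_normal(mu, sigma, None, lod_low, rng)
        elif s == "above":    # censored above the upper LOD
            out[i] = sample_truncated_normal(mu, sigma, lod_high, None, rng)
    return out


# Hypothetical toy data: two clusters, one value censored on each side.
rng = random.Random(0)
values = [1.2, None, 3.4, None]
status = ["observed", "below", "observed", "above"]
labels = [0, 0, 1, 1]
filled = impute_step(values, status, labels, means=[1.0, 3.0], sds=[0.5, 0.8],
                     lod_low=0.5, lod_high=4.5, rng=rng)
```

In a full sampler this pass would alternate with updates of the cluster labels, the cluster parameters, and the variable-selection indicators, so the imputed values are refreshed at every iteration rather than fixed once.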
Comments: 34 pages, 12 tables and figures
Subjects: Machine Learning (stat.ML); Machine Learning (cs.LG); Applications (stat.AP)
Cite as: arXiv:1905.03680 [stat.ML]
  (or arXiv:1905.03680v1 [stat.ML] for this version)

Submission history

From: Shu Wang
[v1] Thu, 9 May 2019 15:13:13 GMT (384kb,D)
