Title: Linear Discriminant Analysis with High-dimensional Mixed Variables

Abstract: Datasets containing both categorical and continuous variables are frequently encountered in many areas, and with the rapid development of modern measurement technologies, the dimensions of these variables can be very high. Despite the recent progress made in modelling high-dimensional data for continuous variables, there is a scarcity of methods that can deal with a mixed set of variables. To fill this gap, this paper develops a novel approach for classifying high-dimensional observations with mixed variables. Our framework builds on a location model, in which the distributions of the continuous variables conditional on the categorical ones are assumed to be Gaussian. We overcome the challenge of having to split data into exponentially many cells, or combinations of the categorical variables, by kernel smoothing, and provide new perspectives for its bandwidth choice to ensure an analogue of Bochner's Lemma, which is different from the usual bias-variance tradeoff. We show that the two sets of parameters in our model can be separately estimated and provide penalized likelihood methods for their estimation. Results on the estimation accuracy and the misclassification rates are established, and the competitive performance of the proposed classifier is illustrated by extensive simulation and real data studies.
Subjects: Methodology (stat.ME); Machine Learning (stat.ML)
MSC classes: 62H30, 62H12, 62G05
Cite as: arXiv:2112.07145 [stat.ME]
  (or arXiv:2112.07145v3 [stat.ME] for this version)

Submission history

From: Binyan Jiang
[v1] Tue, 14 Dec 2021 03:57:56 GMT (85kb,D)
[v2] Thu, 5 May 2022 09:04:40 GMT (88kb,D)
[v3] Tue, 2 Jan 2024 09:27:21 GMT (107kb,D)
