Title: Robust Classification of High Dimension Low Sample Size Data

Abstract: The robustification of pattern recognition techniques has been the subject of intense research in recent years. Despite the many papers on the subject, very few have explored in depth the topic of robust classification in the high dimension low sample size context. In this work, we explore and compare the predictive performance of robust classification techniques, with special emphasis on robust discriminant analysis and robust PCA, applied to a wide variety of large $p$ small $n$ data sets. We also examine the performance of random forest as a way of comparing and contrasting single-model methods and ensemble methods in this context. Our work reveals that random forest, although not inherently designed to be robust to outliers, substantially outperforms the existing techniques specifically designed to achieve robustness. Indeed, random forest achieves the best predictive performance on both real-life and simulated data.
Comments: 17 pages, 29 figures
Subjects: Applications (stat.AP); Methodology (stat.ME)
MSC classes: 60K35
Cite as: arXiv:1501.00592 [stat.AP]
  (or arXiv:1501.00592v1 [stat.AP] for this version)

Submission history

From: Necla Gunduz
[v1] Sat, 3 Jan 2015 18:50:19 GMT (68kb)
