Title: Dimension Reduction Forests: Local Variable Importance using Structured Random Forests

Abstract: Random forests are one of the most popular machine learning methods due to their accuracy and variable importance assessment. However, random forests only provide variable importance in a global sense. There is an increasing need for such assessments at a local level, motivated by applications in personalized medicine, policy-making, and bioinformatics. We propose a new nonparametric estimator that pairs the flexible random forest kernel with local sufficient dimension reduction to adapt to a regression function's local structure. This allows us to estimate a meaningful directional local variable importance measure at each prediction point. We develop a computationally efficient fitting procedure and provide sufficient conditions for the recovery of the splitting directions. We demonstrate significant accuracy gains of our proposed estimator over competing methods on simulated and real regression problems. Finally, we apply the proposed method to seasonal particulate matter concentration data collected in Beijing, China, which yields meaningful local importance measures. The methods presented here are available in the drforest Python package.
Comments: 36 pages, 7 figures
Subjects: Methodology (stat.ME)
Cite as: arXiv:2103.13233 [stat.ME]
  (or arXiv:2103.13233v1 [stat.ME] for this version)

Submission history

From: Joshua Loyal
[v1] Wed, 24 Mar 2021 14:50:43 GMT (1073kb,D)
