Title: Generalised Boosted Forests

Abstract: This paper extends recent work on boosting random forests to model non-Gaussian responses. Given an exponential family model with $\mathbb{E}[Y|X] = g^{-1}(f(X))$, our goal is to obtain an estimate of $f$. We start with an MLE-type estimate in the link space and define generalised residuals from it. We use these residuals and corresponding weights to fit a base random forest, then repeat the procedure to obtain a boost random forest. We call the sum of these three estimators a \textit{generalised boosted forest}. We show with simulated and real data that both random forest steps reduce test-set log-likelihood, which we treat as our primary metric. We also provide a variance estimator, which can be obtained at the same computational cost as the original estimate. Empirical experiments on real-world data and simulations demonstrate that the methods can effectively reduce bias, and that confidence interval coverage is conservative in the bulk of the covariate distribution.
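
The three-stage construction described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact form of the generalised residuals or weights, so the code below assumes Newton-style working residuals for a Bernoulli response with a logit link. All function names are hypothetical, and the grouped weighted mean is only a stand-in for a weighted random forest, used to keep the example dependency-free.

```python
import math

def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))

def initial_estimate(y):
    """Constant MLE in link space for a Bernoulli response: logit of the mean.
    Assumes the sample mean is strictly between 0 and 1."""
    p = sum(y) / len(y)
    return math.log(p / (1.0 - p))

def residuals_and_weights(y, eta):
    """Assumed Newton-style residuals and weights for the logit link."""
    res, wts = [], []
    for yi, ei in zip(y, eta):
        p = sigmoid(ei)
        w = p * (1.0 - p)         # minus the second derivative of the log-likelihood
        res.append((yi - p) / w)  # first derivative divided by w
        wts.append(w)
    return res, wts

def generalised_boost(X, y, fit_weighted, n_stages=2):
    """fit_weighted(X, targets, weights) must return a callable x -> prediction.
    With n_stages=2 the result is the sum of three estimators:
    the constant, a base fit, and a boost fit."""
    f0 = initial_estimate(y)
    eta = [f0] * len(y)
    stages = []
    for _ in range(n_stages):
        r, w = residuals_and_weights(y, eta)
        model = fit_weighted(X, r, w)
        stages.append(model)
        eta = [e + model(x) for e, x in zip(eta, X)]
    return lambda x: f0 + sum(m(x) for m in stages)

def grouped_weighted_mean(X, targets, weights):
    """Stand-in base learner (NOT a random forest): weighted mean per x value."""
    num, den = {}, {}
    for x, t, w in zip(X, targets, weights):
        num[x] = num.get(x, 0.0) + w * t
        den[x] = den.get(x, 0.0) + w
    return lambda x: num[x] / den[x]

# Toy data: one binary covariate, Bernoulli response.
X = [0, 0, 0, 0, 1, 1, 1, 1]
y = [0, 0, 0, 1, 0, 1, 1, 1]
predict = generalised_boost(X, y, grouped_weighted_mean)
print(sigmoid(predict(0)), sigmoid(predict(1)))
```

On this toy data the fitted link-space values move toward the per-group log-odds after the two boosting stages. Swapping `grouped_weighted_mean` for a regressor that accepts sample weights would bring the sketch closer to the setting the abstract describes.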
Comments: Paper: 14 pages, 4 figures, 3 tables; Appendix: 34 pages, 28 figures, 1 table
Subjects: Methodology (stat.ME); Machine Learning (stat.ML)
Cite as: arXiv:2102.12561 [stat.ME]
  (or arXiv:2102.12561v2 [stat.ME] for this version)

Submission history

From: Indrayudh Ghosal
[v1] Wed, 24 Feb 2021 21:17:31 GMT (2918kb,D)
[v2] Tue, 2 Mar 2021 23:15:16 GMT (2918kb,D)
