Fast Interpretable Greedy-Tree Sums

Tan, Yan Shuo; Singh, Chandan; Nasseri, Keyan; Agarwal, Abhineet; Duncan, James; Ronen, Omer; Epland, Matthew; Kornblith, Aaron; Yu, Bin

Full-text links:

Download:

Current browse context:

cs.LG

< prev | next >

new | recent | 2201

Computer Science > Machine Learning

Title: Fast Interpretable Greedy-Tree Sums

Authors: Yan Shuo Tan, Chandan Singh, Keyan Nasseri, Abhineet Agarwal, James Duncan, Omer Ronen, Matthew Epland, Aaron Kornblith, Bin Yu

(Submitted on 28 Jan 2022 (v1), last revised 8 Jul 2023 (this version, v3))

Abstract: Modern machine learning has achieved impressive prediction performance, but often sacrifices interpretability, a critical consideration in high-stakes domains such as medicine. In such settings, practitioners often use highly interpretable decision tree models, but these suffer from inductive bias against additive structure. To overcome this bias, we propose Fast Interpretable Greedy-Tree Sums (FIGS), which generalizes the CART algorithm to simultaneously grow a flexible number of trees in summation. By combining logical rules with addition, FIGS is able to adapt to additive structure while remaining highly interpretable. Extensive experiments on real-world datasets show that FIGS achieves state-of-the-art prediction performance. To demonstrate the usefulness of FIGS in high-stakes domains, we adapt FIGS to learn clinical decision instruments (CDIs), which are tools for guiding clinical decision-making. Specifically, we introduce a variant of FIGS known as G-FIGS that accounts for the heterogeneity in medical data. G-FIGS derives CDIs that reflect domain knowledge and enjoy improved specificity (by up to 20% over CART) without sacrificing sensitivity or interpretability. To provide further insight into FIGS, we prove that FIGS learns components of additive models, a property we refer to as disentanglement. Further, we show (under oracle conditions) that unconstrained tree-sum models leverage disentanglement to generalize more efficiently than single decision tree models when fitted to additive regression functions. Finally, to avoid overfitting with an unconstrained number of splits, we develop Bagging-FIGS, an ensemble version of FIGS that borrows the variance reduction techniques of random forests. Bagging-FIGS enjoys competitive performance with random forests and XGBoost on real-world datasets.

Subjects:	Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Applications (stat.AP); Methodology (stat.ME); Machine Learning (stat.ML)
Cite as:	arXiv:2201.11931 [cs.LG]
	(or arXiv:2201.11931v3 [cs.LG] for this version)

Submission history

From: Abhineet Agarwal [view email]
[v1] Fri, 28 Jan 2022 04:50:37 GMT (804kb,D)
[v2] Thu, 17 Feb 2022 17:04:11 GMT (832kb,D)
[v3] Sat, 8 Jul 2023 16:18:03 GMT (2501kb,D)

Which authors of this paper are endorsers? | Disable MathJax (What is MathJax?)

Link back to: arXiv, form interface, contact.

> cs > arXiv:2201.11931

Download:

Current browse context:

Change to browse by:

References & Citations

DBLP - CS Bibliography

Bookmark

Computer Science > Machine Learning

Title: Fast Interpretable Greedy-Tree Sums

Submission history