Fast Interpretable Greedy-Tree Sums (FIGS)

Tan, Yan Shuo; Singh, Chandan; Nasseri, Keyan; Agarwal, Abhineet; Yu, Bin

(Submitted on 28 Jan 2022 (v1), revised 17 Feb 2022 (this version, v2), latest version 8 Jul 2023 (v3))

Abstract: Modern machine learning has achieved impressive prediction performance, but often sacrifices interpretability, a critical consideration in many problems. Here, we propose Fast Interpretable Greedy-Tree Sums (FIGS), an algorithm for fitting concise rule-based models. Specifically, FIGS generalizes the CART algorithm to simultaneously grow a flexible number of trees in a summation. The total number of splits across all the trees can be restricted by a pre-specified threshold, thereby keeping both the size and number of its trees under control. When both are small, the fitted tree-sum can be easily visualized and written out by hand, making it highly interpretable. A partially oracle theoretical result hints at the potential for FIGS to overcome a key weakness of single-tree models by disentangling additive components of generative additive models, thereby reducing redundancy from repeated splits on the same feature. Furthermore, given oracle access to optimal tree structures, we obtain L2 generalization bounds for such generative models in the case of C1 component functions, matching known minimax rates in some cases. Extensive experiments across a wide array of real-world datasets show that FIGS achieves state-of-the-art prediction performance (among all popular rule-based methods) when restricted to just a few splits (e.g., fewer than 20). We find empirically that FIGS is able to avoid repeated splits, and often provides more concise decision rules than fitted decision trees, without sacrificing predictive performance. All code and models are released in a full-fledged package on GitHub.
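The idea of growing several trees in a summation under a shared split budget can be illustrated with a small sketch. This is not the authors' implementation: it is a simplified, standard-library-only illustration for regression, and every name in it (`Node`, `best_split`, `fit_tree_sum`, `max_splits`) is hypothetical. At each step it compares splitting any leaf of an existing tree against starting a new tree, scores each candidate by the reduction in squared error on the residuals left by the other trees, and applies the best one. As a simplification, only the two newly created leaves are refit after each split.

```python
from statistics import fmean


class Node:
    """A leaf (feature is None) or an internal split node."""
    def __init__(self, value=0.0):
        self.value = value
        self.feature = None
        self.threshold = None
        self.left = None
        self.right = None


def predict_tree(node, x):
    # Walk down until a leaf is reached.
    while node.feature is not None:
        node = node.left if x[node.feature] <= node.threshold else node.right
    return node.value


def predict_sum(trees, x):
    # The model's prediction is the sum of the individual tree predictions.
    return sum(predict_tree(t, x) for t in trees)


def sse(values):
    # Sum of squared deviations from the mean.
    if not values:
        return 0.0
    m = fmean(values)
    return sum((v - m) ** 2 for v in values)


def best_split(X, r, idx):
    """Return (gain, feature, threshold, left_idx, right_idx) or None."""
    best = None
    base = sse([r[i] for i in idx])
    for f in range(len(X[0])):
        # Dropping the largest value guarantees both children are non-empty.
        for t in sorted({X[i][f] for i in idx})[:-1]:
            left = [i for i in idx if X[i][f] <= t]
            right = [i for i in idx if X[i][f] > t]
            gain = base - sse([r[i] for i in left]) - sse([r[i] for i in right])
            if best is None or gain > best[0]:
                best = (gain, f, t, left, right)
    return best


def fit_tree_sum(X, y, max_splits):
    n = len(X)
    trees = []   # root node of each tree
    leaves = []  # leaves[k]: list of (leaf_node, sample_indices) for tree k
    for _ in range(max_splits):
        preds = [[predict_tree(t, x) for x in X] for t in trees]
        total = [sum(p[i] for p in preds) for i in range(n)]
        candidates = []
        # Option A: split a leaf of tree k, scored on residuals excluding tree k.
        for k in range(len(trees)):
            r = [y[i] - (total[i] - preds[k][i]) for i in range(n)]
            for j, (_leaf, idx) in enumerate(leaves[k]):
                s = best_split(X, r, idx)
                if s:
                    candidates.append((s[0], k, j, r, s))
        # Option B: start a new tree, scored on the full residuals.
        r = [y[i] - total[i] for i in range(n)]
        s = best_split(X, r, list(range(n)))
        if s:
            candidates.append((s[0], len(trees), 0, r, s))
        if not candidates:
            break
        gain, k, j, r, (_, f, t, left, right) = max(candidates, key=lambda c: c[0])
        if gain <= 1e-12:
            break  # no candidate reduces the error any further
        if k == len(trees):
            trees.append(Node())
            leaves.append([(trees[-1], list(range(n)))])
        leaf, _ = leaves[k].pop(j)
        leaf.feature, leaf.threshold = f, t
        leaf.left = Node(fmean(r[i] for i in left))
        leaf.right = Node(fmean(r[i] for i in right))
        leaves[k] += [(leaf.left, left), (leaf.right, right)]
    return trees


# Toy additive target: y = 2*x0 + 3*x1 on the four corners of the unit square.
X = [[0, 0], [0, 1], [1, 0], [1, 1]]
y = [0, 3, 2, 5]
trees = fit_tree_sum(X, y, max_splits=2)
predictions = [predict_sum(trees, x) for x in X]
print(len(trees), predictions)
```

On this toy additive target, a single tree needs three splits to fit the data exactly, while the sketch fits it with two single-split trees, one per feature. This mirrors the abstract's point about disentangling additive components and avoiding repeated splits on the same feature.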

Subjects:	Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Applications (stat.AP); Methodology (stat.ME); Machine Learning (stat.ML)
Cite as:	arXiv:2201.11931 [cs.LG]
	(or arXiv:2201.11931v2 [cs.LG] for this version)

Submission history

From: Abhineet Agarwal
[v1] Fri, 28 Jan 2022 04:50:37 GMT (804kb,D)
[v2] Thu, 17 Feb 2022 17:04:11 GMT (832kb,D)
[v3] Sat, 8 Jul 2023 16:18:03 GMT (2501kb,D)
