Boosted and Differentially Private Ensembles of Decision Trees

Nock, Richard; Henecka, Wilko

(Submitted on 26 Jan 2020 (v1), last revised 3 Feb 2020 (this version, v2))

Abstract: Boosted ensembles of decision tree (DT) classifiers are extremely popular in international competitions, yet to our knowledge nothing is formally known about how to make them also differentially private (DP), to the point that random forests currently reign supreme in the DP stage. Our paper starts with the proof that the privacy vs boosting picture for DT involves a notable and general technical tradeoff: the sensitivity tends to increase with the boosting rate of the loss, for any proper loss. Since DT induction algorithms are fundamentally iterative, our finding implies non-trivial choices to select or tune the loss to balance noise against utility when splitting nodes. To address this, we craft a new parameterized proper loss, called the M$\alpha$-loss, which, as we show, allows one to finely tune the tradeoff in the complete spectrum of sensitivity vs boosting guarantees. We then introduce objective calibration as a method to adaptively tune the tradeoff during DT induction to limit the privacy budget spent while formally being able to keep boosting-compliant convergence on limited-depth nodes with high probability. Extensive experiments on 19 UCI domains reveal that objective calibration is highly competitive, even in the DP-free setting. Our approach tends to very significantly beat random forests, in particular in high DP regimes ($\varepsilon \leq 0.1$) and even with boosted ensembles containing ten times fewer trees, which could be crucial to keep a key feature of DT models under differential privacy: interpretability.

Subjects: Machine Learning (cs.LG); Machine Learning (stat.ML)
ACM classes: I.2.6
Cite as: arXiv:2001.09384 [cs.LG] (or arXiv:2001.09384v2 [cs.LG] for this version)

Submission history

From: Richard Nock
[v1] Sun, 26 Jan 2020 01:28:03 GMT (7091kb,D)
[v2] Mon, 3 Feb 2020 22:34:23 GMT (7901kb,D)
