Bolt-on Differential Privacy for Scalable Stochastic Gradient Descent-based Analytics

Wu, Xi; Li, Fengan; Kumar, Arun; Chaudhuri, Kamalika; Jha, Somesh; Naughton, Jeffrey F.



(Submitted on 15 Jun 2016 (v1), last revised 23 Mar 2017 (this version, v3))

Abstract: While significant progress has been made separately on analytics systems for scalable stochastic gradient descent (SGD) and private SGD, none of the major scalable analytics frameworks have incorporated differentially private SGD. There are two inter-related issues for this disconnect between research and practice: (1) low model accuracy due to added noise to guarantee privacy, and (2) high development and runtime overhead of the private algorithms. This paper takes a first step to remedy this disconnect and proposes a private SGD algorithm to address both issues in an integrated manner. In contrast to the white-box approach adopted by previous work, we revisit and use the classical technique of output perturbation to devise a novel "bolt-on" approach to private SGD. While our approach trivially addresses (2), it makes (1) even more challenging. We address this challenge by providing a novel analysis of the $L_2$-sensitivity of SGD, which allows, under the same privacy guarantees, better convergence of SGD when only a constant number of passes can be made over the data. We integrate our algorithm, as well as other state-of-the-art differentially private SGD, into Bismarck, a popular scalable SGD-based analytics system on top of an RDBMS. Extensive experiments show that our algorithm can be easily integrated, incurs virtually no overhead, scales well, and most importantly, yields substantially better (up to 4X) test accuracy than the state-of-the-art algorithms on many real datasets.
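
The output-perturbation idea in the abstract can be illustrated with a minimal sketch: run ordinary SGD as a black box, then add noise to the final model, with the noise scaled by an $L_2$-sensitivity bound. This is not the authors' implementation. The function names, the choice of logistic loss, the step size, and the `sensitivity` parameter are all assumptions made for illustration; in particular, the sensitivity value must come from an analysis such as the one in the paper and is simply passed in here. The noise sampler uses the standard construction for pure epsilon-differential privacy under an $L_2$-sensitivity bound (uniform direction, Gamma-distributed norm).

```python
import math
import random


def sgd_logistic(X, y, passes=1, lr=0.1, seed=0):
    """Plain, non-private SGD for logistic loss, treated as a black box.

    X is a list of feature lists, y a list of labels in {-1.0, +1.0}.
    The loss, step size, and pass count are illustrative assumptions.
    """
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    order = list(range(len(X)))
    for _ in range(passes):
        rng.shuffle(order)  # one random permutation per pass
        for i in order:
            margin = y[i] * sum(wj * xj for wj, xj in zip(w, X[i]))
            # Gradient of log(1 + exp(-margin)) is scale * x_i.
            scale = -y[i] / (1.0 + math.exp(margin))
            w = [wj - lr * scale * xj for wj, xj in zip(w, X[i])]
    return w


def l2_noise(d, sensitivity, epsilon, rng):
    """Sample noise with density proportional to exp(-epsilon * ||k||_2 / sensitivity).

    The direction is uniform on the unit sphere; the norm follows
    Gamma(shape=d, scale=sensitivity / epsilon).
    """
    g = [rng.gauss(0.0, 1.0) for _ in range(d)]
    norm = math.sqrt(sum(v * v for v in g))
    magnitude = rng.gammavariate(d, sensitivity / epsilon)
    return [magnitude * v / norm for v in g]


def bolt_on_private_sgd(X, y, sensitivity, epsilon, seed=0):
    """Run SGD unchanged, then perturb only the final output.

    `sensitivity` is a caller-supplied L2-sensitivity bound (hypothetical
    here); the privacy guarantee holds only if that bound is valid.
    """
    w = sgd_logistic(X, y, seed=seed)
    # Python's `random` is not a cryptographically secure source of noise;
    # it is used here only to keep the sketch dependency-free.
    noise_rng = random.Random(seed + 1)
    noise = l2_noise(len(w), sensitivity, epsilon, noise_rng)
    return [wj + nj for wj, nj in zip(w, noise)]
```

Because the noise is added once, after training, the SGD routine itself needs no modification, which is the sense in which the approach is "bolt-on". A call such as `bolt_on_private_sgd(X, y, sensitivity=0.1, epsilon=1.0)` returns a perturbed weight vector of the same length as the non-private one.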

Subjects:	Machine Learning (cs.LG); Cryptography and Security (cs.CR); Databases (cs.DB); Machine Learning (stat.ML)
Cite as:	arXiv:1606.04722 [cs.LG]
	(or arXiv:1606.04722v3 [cs.LG] for this version)

Submission history

From: Xi Wu [view email]
[v1] Wed, 15 Jun 2016 11:14:29 GMT (506kb,D)
[v2] Sun, 26 Feb 2017 16:26:59 GMT (728kb,D)
[v3] Thu, 23 Mar 2017 17:35:09 GMT (4819kb,D)
