### References & Citations

# Computer Science > Machine Learning

# Title: Optimal Schemes for Discrete Distribution Estimation under Locally Differential Privacy

(Submitted on 2 Feb 2017)

Abstract: We consider the minimax estimation problem of a discrete distribution with support size $k$ under privacy constraints. A privatization scheme is applied to each raw sample independently, and we need to estimate the distribution of the raw samples from the privatized samples. A positive number $\epsilon$ measures the privacy level of a privatization scheme. For a given $\epsilon,$ we consider the problem of constructing optimal privatization schemes with $\epsilon$-privacy level, i.e., schemes that minimize the expected estimation loss for the worst-case distribution. Two schemes in the literature provide order optimal performance in the high privacy regime where $\epsilon$ is very close to $0,$ and in the low privacy regime where $e^{\epsilon}\approx k,$ respectively.

In this paper, we propose a new family of schemes which substantially improve the performance of the existing schemes in the medium privacy regime when $1\ll e^{\epsilon} \ll k.$ More concretely, we prove that when $3.8 < \epsilon <\ln(k/9) ,$ our schemes reduce the expected estimation loss by $50\%$ under $\ell_2^2$ metric and by $30\%$ under $\ell_1$ metric over the existing schemes. We also prove a lower bound for the region $e^{\epsilon} \ll k,$ which implies that our schemes are order optimal in this regime.

Link back to: arXiv, form interface, contact.