Identification Risk Evaluation of Continuous Synthesized Variables

Hornby, Ryan; Hu, Jingchen

(Submitted on 1 Jun 2020 (this version), latest version 6 Apr 2021 (v4))

Abstract: We propose a general approach to evaluating identification risk of continuous synthesized variables in partially synthetic data. We introduce the use of a radius $r$ in the construction of identification risk probability of each target record, and illustrate with working examples for one or more continuous synthesized variables. We demonstrate our methods with applications to a data sample from the Consumer Expenditure Surveys (CE), and discuss the impacts on risk and data utility of 1) the choice of radius $r$, 2) the choice of synthesized variables, and 3) the choice of number of synthetic datasets. We give recommendations for statistical agencies for synthesizing and evaluating identification risk of continuous variables. An R package is created to perform our proposed methods of identification risk evaluation, and sample R scripts are included.
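To make the role of the radius $r$ concrete, here is a minimal sketch of one plausible reading of a radius-based matching risk. It is not the authors' R package, and it is not taken from the paper's definitions. The function name `expected_match_risk`, the relative interval $y_i \pm r\,|y_i|$, the grouping variable `keys` (standing in for unsynthesized variables an intruder might know), and the per-record score $T_i / c_i$ are all assumptions made for illustration.

```python
from typing import Hashable, List, Sequence


def expected_match_risk(
    original: Sequence[float],
    synthetic: Sequence[float],
    keys: Sequence[Hashable],
    r: float,
) -> List[float]:
    """Hypothetical per-record risk score T_i / c_i.

    For target record i, a synthetic record j counts as a candidate match
    when it shares the same key and its synthesized value lies within
    original[i] +/- r * |original[i]| (an assumed form of the radius).
    c_i is the number of candidates; T_i is 1 if record i's own synthetic
    value is among them, else 0.
    """
    n = len(original)
    risks = []
    for i in range(n):
        half_width = r * abs(original[i])
        low, high = original[i] - half_width, original[i] + half_width
        candidates = [
            j
            for j in range(n)
            if keys[j] == keys[i] and low <= synthetic[j] <= high
        ]
        c_i = len(candidates)
        t_i = 1 if i in candidates else 0
        # With no candidates at all, the record cannot be matched under
        # this rule, so the score is set to 0.
        risks.append(t_i / c_i if c_i > 0 else 0.0)
    return risks


# Toy data, invented purely for illustration.
original = [100.0, 200.0, 105.0]
synthetic = [102.0, 150.0, 98.0]
keys = ["a", "a", "a"]
risks = expected_match_risk(original, synthetic, keys, r=0.1)
```

In this toy setup, a larger `r` widens every interval. That tends to pull in more candidates (raising `c_i`) while also making it more likely that the true record is among them (raising `T_i`), which is one way to see why the choice of radius affects the reported risk.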

Comments: 16 pages with 11 figures. Submitted to Privacy in Statistical Databases 2020
Subjects: Methodology (stat.ME); Applications (stat.AP)
MSC classes: 62-08 (Primary) 62P99 (Secondary)
Cite as: arXiv:2006.01298 [stat.ME] (or arXiv:2006.01298v1 [stat.ME] for this version)

Submission history

From: Ryan Hornby
[v1] Mon, 1 Jun 2020 22:35:36 GMT (529kb,D)
[v2] Tue, 9 Mar 2021 00:03:00 GMT (503kb,D)
[v3] Wed, 10 Mar 2021 06:47:11 GMT (505kb,D)
[v4] Tue, 6 Apr 2021 02:05:50 GMT (525kb,D)
