We gratefully acknowledge support from
the Simons Foundation and member institutions.
Full-text links:


Current browse context:


Change to browse by:

References & Citations

DBLP - CS Bibliography


(what is this?)
CiteULike logo BibSonomy logo Mendeley logo del.icio.us logo Digg logo Reddit logo ScienceWISE logo

Computer Science > Machine Learning

Title: Privacy-Preserving Data Synthetisation for Secure Information Sharing

Abstract: We can protect user data privacy via many approaches, such as statistical transformation or generative models. However, each of them has critical drawbacks. On the one hand, creating a transformed data set using conventional techniques is highly time-consuming. On the other hand, in addition to long training phases, recent deep learning-based solutions require significant computational resources. In this paper, we propose PrivateSMOTE, a technique designed for competitive effectiveness in protecting cases at maximum risk of re-identification while requiring much less time and computational resources. It works by synthetic data generation via interpolation to obfuscate high-risk cases while minimizing data utility loss of the original data. Compared to multiple conventional and state-of-the-art privacy-preservation methods on 20 data sets, PrivateSMOTE demonstrates competitive results in re-identification risk. Also, it presents similar or higher predictive performance than the baselines, including generative adversarial networks and variational autoencoders, reducing their energy consumption and time requirements by a minimum factor of 9 and 12, respectively.
Comments: 10 pages, 7 figures and 3 tables
Subjects: Machine Learning (cs.LG); Cryptography and Security (cs.CR)
Cite as: arXiv:2212.00484 [cs.LG]
  (or arXiv:2212.00484v1 [cs.LG] for this version)

Submission history

From: Tânia Carvalho [view email]
[v1] Thu, 1 Dec 2022 13:20:37 GMT (874kb,D)

Link back to: arXiv, form interface, contact.