Private, fair and accurate: Training large-scale, privacy-preserving AI models in medical imaging

Arasteh, Soroosh Tayebi; Ziller, Alexander; Kuhl, Christiane; Makowski, Marcus; Nebelung, Sven; Braren, Rickmer; Rueckert, Daniel; Truhn, Daniel; Kaissis, Georgios

doi:10.1038/s43856-024-00462-6

(Submitted on 3 Feb 2023 (v1), last revised 16 Mar 2024 (this version, v5))

Abstract: Artificial intelligence (AI) models are increasingly used in the medical domain. However, as medical data is highly sensitive, special precautions to ensure its protection are required. The gold standard for privacy preservation is the introduction of differential privacy (DP) to model training. Prior work indicates that DP has negative implications for model accuracy and fairness, which are unacceptable in medicine and represent a main barrier to the widespread use of privacy-preserving techniques. In this work, we evaluated the effect of privacy-preserving training of AI models on accuracy and fairness compared to non-private training. For this, we used two datasets: (1) a large dataset (N=193,311) of high-quality clinical chest radiographs, and (2) a dataset (N=1,625) of 3D abdominal computed tomography (CT) images, with the task of classifying the presence of pancreatic ductal adenocarcinoma (PDAC). Both were retrospectively collected and manually labeled by experienced radiologists. We then compared non-private deep convolutional neural networks (CNNs) and privacy-preserving (DP) models with respect to privacy-utility trade-offs, measured as area under the receiver operating characteristic curve (AUROC), and privacy-fairness trade-offs, measured as Pearson's r or Statistical Parity Difference. We found that, while privacy-preserving training yielded lower accuracy, it largely did not amplify discrimination against age, sex or co-morbidity. Our study shows that, under the challenging realistic circumstances of a real-life clinical dataset, the privacy-preserving training of diagnostic deep learning models is possible with excellent diagnostic accuracy and fairness.
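
The two metrics named in the abstract, AUROC for utility and Statistical Parity Difference for fairness, can be illustrated with a minimal sketch. This is not the authors' evaluation code: the abstract does not give their exact definitions, so the sketch assumes the common convention that Statistical Parity Difference is the positive-prediction rate of one subgroup minus that of the other. The function names and the toy arrays are hypothetical.

```python
import numpy as np


def auroc(y_true, y_score):
    """AUROC as the probability that a positive case scores higher than a
    negative case (ties count as one half)."""
    y_true = np.asarray(y_true, dtype=bool)
    y_score = np.asarray(y_score, dtype=float)
    pos = y_score[y_true]
    neg = y_score[~y_true]
    # Compare every positive score against every negative score.
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (greater + 0.5 * ties) / (pos.size * neg.size)


def statistical_parity_difference(y_pred, group):
    """Assumed definition: P(pred = 1 | group) - P(pred = 1 | not group)."""
    y_pred = np.asarray(y_pred, dtype=bool)
    group = np.asarray(group, dtype=bool)
    return y_pred[group].mean() - y_pred[~group].mean()


# Toy data, invented for illustration only.
y_true = [0, 0, 1, 1]
y_score = [0.1, 0.4, 0.35, 0.8]
y_pred = [1, 0, 1, 1]
group = [1, 1, 0, 0]  # hypothetical binary subgroup indicator

auc = auroc(y_true, y_score)
spd = statistical_parity_difference(y_pred, group)
print(auc, spd)
```

Under this convention, a Statistical Parity Difference of zero means both subgroups receive positive predictions at the same rate. Comparing the value between a non-private and a DP-trained model is one way to read the privacy-fairness trade-off the abstract describes.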

Comments:	Published in Communications Medicine. Nature Portfolio
Subjects:	Image and Video Processing (eess.IV); Artificial Intelligence (cs.AI); Cryptography and Security (cs.CR); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
Journal reference:	Commun Med 4(1), 46 (2024)
DOI:	10.1038/s43856-024-00462-6
Cite as:	arXiv:2302.01622 [eess.IV]
	(or arXiv:2302.01622v5 [eess.IV] for this version)

Submission history

From: Soroosh Tayebi Arasteh
[v1] Fri, 3 Feb 2023 09:49:13 GMT (1122kb)
[v2] Tue, 7 Mar 2023 10:00:43 GMT (2619kb,D)
[v3] Thu, 25 Jan 2024 14:55:17 GMT (13271kb,D)
[v4] Tue, 27 Feb 2024 09:42:10 GMT (13271kb,D)
[v5] Sat, 16 Mar 2024 12:52:18 GMT (13271kb,D)
