We gratefully acknowledge support from
the Simons Foundation and member institutions.
Full-text links:

Download:

Current browse context:

cs.AI

Change to browse by:

References & Citations

DBLP - CS Bibliography

Bookmark

(what is this?)
CiteULike logo BibSonomy logo Mendeley logo del.icio.us logo Digg logo Reddit logo ScienceWISE logo

Computer Science > Machine Learning

Title: PPSGCN: A Privacy-Preserving Subgraph Sampling Based Distributed GCN Training Method

Abstract: Graph convolutional networks (GCNs) have been widely adopted for graph representation learning and achieved impressive performance. For larger graphs stored separately on different clients, distributed GCN training algorithms were proposed to improve efficiency and scalability. However, existing methods directly exchange node features between different clients, which results in data privacy leakage. Federated learning was incorporated in graph learning to tackle data privacy, while they suffer from severe performance drop due to non-iid data distribution. Besides, these approaches generally involve heavy communication and memory overhead during the training process. In light of these problems, we propose a Privacy-Preserving Subgraph sampling based distributed GCN training method (PPSGCN), which preserves data privacy and significantly cuts back on communication and memory overhead. Specifically, PPSGCN employs a star-topology client-server system. We firstly sample a local node subset in each client to form a global subgraph, which greatly reduces communication and memory costs. We then conduct local computation on each client with features or gradients of the sampled nodes. Finally, all clients securely communicate with the central server with homomorphic encryption to combine local results while preserving data privacy. Compared with federated graph learning methods, our PPSGCN model is trained on a global graph to avoid the negative impact of local data distribution. We prove that our PPSGCN algorithm would converge to a local optimum with probability 1. Experiment results on three prevalent benchmarks demonstrate that our algorithm significantly reduces communication and memory overhead while maintaining desirable performance. Further studies not only demonstrate the fast convergence of PPSGCN, but discuss the trade-off between communication and local computation cost as well.
Comments: 9 pages, 5 figures
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Cryptography and Security (cs.CR)
Cite as: arXiv:2110.12906 [cs.LG]
  (or arXiv:2110.12906v1 [cs.LG] for this version)

Submission history

From: Zhang Binchi [view email]
[v1] Fri, 22 Oct 2021 08:22:36 GMT (248kb,D)

Link back to: arXiv, form interface, contact.