Title: Denoising Noisy Neural Networks: A Bayesian Approach with Compensation

Abstract: Noisy neural networks (NoisyNNs) refer to the inference and training of NNs in the presence of noise. Noise is inherent in most communication and storage systems; hence, NoisyNNs emerge in many new applications, including federated edge learning, where wireless devices collaboratively train an NN over a noisy wireless channel, or when NNs are implemented/stored in an analog storage medium. This paper studies a fundamental problem of NoisyNNs: how to estimate the uncontaminated NN weights from their noisy observations or manifestations. Whereas all prior works relied on maximum likelihood (ML) estimation to maximize the likelihood function of the estimated NN weights, this paper demonstrates that the ML estimator is in general suboptimal. To overcome the suboptimality of the conventional ML estimator, we put forth an $\text{MMSE}_{pb}$ estimator to minimize a compensated mean squared error (MSE) with a population compensator and a bias compensator. Our approach works well for NoisyNNs arising in both 1) noisy inference, where noise is introduced only in the inference phase on the already-trained NN weights; and 2) noisy training, where noise is introduced over the course of training. Extensive experiments on the CIFAR-10 and SST-2 datasets with different NN architectures verify the significant performance gains of the $\text{MMSE}_{pb}$ estimator over the ML estimator when used to denoise the NoisyNN. For noisy inference, the average gains are up to $156\%$ for a noisy ResNet34 model and $14.7\%$ for a noisy BERT model; for noisy training, the average gains are up to $18.1$ dB for a noisy ResNet18 model.
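The gap between ML and Bayesian estimation that the abstract refers to can be illustrated with a toy sketch. It is not the paper's $\text{MMSE}_{pb}$ estimator (the population and bias compensators are not reproduced here). It assumes, purely for illustration, additive Gaussian noise and an i.i.d. Gaussian prior on the weights, in which case the ML estimate is the noisy observation itself and the classical MMSE estimate shrinks it toward the prior mean. All function names and the variance values are hypothetical.

```python
import random

def ml_estimate(noisy):
    # Under additive Gaussian noise, the ML estimate of each weight
    # is simply the noisy observation itself.
    return list(noisy)

def gaussian_mmse_estimate(noisy, prior_mean, prior_var, noise_var):
    # Classical MMSE (posterior mean) for a Gaussian prior and Gaussian noise:
    # shrink each observation toward the prior mean.
    # Assumption: this is the textbook estimator, not the paper's MMSE_pb.
    gain = prior_var / (prior_var + noise_var)
    return [prior_mean + gain * (y - prior_mean) for y in noisy]

def mse(estimate, truth):
    return sum((e - t) ** 2 for e, t in zip(estimate, truth)) / len(truth)

# Hypothetical setting: small weights observed through comparatively strong noise.
rng = random.Random(0)
prior_mean, prior_var, noise_var = 0.0, 0.01, 0.04
weights = [rng.gauss(prior_mean, prior_var ** 0.5) for _ in range(10000)]
noisy = [w + rng.gauss(0.0, noise_var ** 0.5) for w in weights]

mse_ml = mse(ml_estimate(noisy), weights)
mse_mmse = mse(
    gaussian_mmse_estimate(noisy, prior_mean, prior_var, noise_var), weights
)
print(f"ML MSE:   {mse_ml:.4f}")
print(f"MMSE MSE: {mse_mmse:.4f}")
```

In this toy model the expected MSE of the ML estimate equals the noise variance, while the shrinkage estimate achieves `prior_var * noise_var / (prior_var + noise_var)`, which is always smaller. The advantage grows as the noise becomes strong relative to the spread of the weights.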
Comments: Keywords: Noisy neural network, Bayesian estimation, analog device, federated edge learning, over-the-air computation, analog storage
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV); Distributed, Parallel, and Cluster Computing (cs.DC); Information Theory (cs.IT); Signal Processing (eess.SP)
Cite as: arXiv:2105.10699 [cs.LG]
  (or arXiv:2105.10699v1 [cs.LG] for this version)

Submission history

From: Yulin Shao
[v1] Sat, 22 May 2021 11:51:20 GMT (3444kb,D)
[v2] Wed, 15 Dec 2021 15:23:27 GMT (3512kb,D)
[v3] Thu, 19 May 2022 15:28:09 GMT (4767kb,D)