
Title: Denoising Noisy Neural Networks: A Bayesian Approach with Compensation

Abstract: Deep neural networks (DNNs) with noisy weights, which we refer to as noisy neural networks (NoisyNNs), arise from the training and inference of DNNs in the presence of noise. NoisyNNs emerge in many new applications, including the wireless transmission of DNNs, the efficient deployment or storage of DNNs in analog devices, and the truncation or quantization of DNN weights. This paper studies a fundamental problem of NoisyNNs: how to reconstruct the DNN weights from their noisy manifestations. While all prior works relied on maximum likelihood (ML) estimation, this paper puts forth a denoising approach that reconstructs DNNs with the aim of maximizing the inference accuracy of the reconstructed models. The superiority of our denoiser is rigorously proven in two small-scale problems, which consider a quadratic neural network function and a shallow feedforward neural network, respectively. When applied to advanced learning tasks with modern DNN architectures, our denoiser exhibits significantly better performance than the ML estimator. Performance is measured as the average test accuracy of the denoised DNN model versus the weight-variance-to-noise-power ratio (WNR). When denoising a noisy BERT model arising from noisy inference, our denoiser outperforms ML estimation by 1.1 dB in WNR to achieve a test accuracy of 75%. When denoising a noisy ResNet18 model arising from noisy training, our denoiser outperforms ML estimation by 13.4 dB and 8.3 dB in WNR to achieve test accuracies of 60% and 80%, respectively.
Comments: Keywords: Noisy neural network, denoiser, wireless transmission of neural networks, federated edge learning, analog device. 17 pages, 9 figures
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV); Distributed, Parallel, and Cluster Computing (cs.DC); Information Theory (cs.IT); Signal Processing (eess.SP)
Cite as: arXiv:2105.10699 [cs.LG]
  (or arXiv:2105.10699v2 [cs.LG] for this version)
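
The WNR metric used in the abstract can be illustrated with a small sketch. It assumes additive white Gaussian noise on the weights, which the abstract does not specify, and under that assumption the ML estimate is simply the noisy observation. The shrinkage estimator is a generic textbook stand-in, not the paper's denoiser: it minimizes weight error rather than inference error and includes no compensation. All function names are hypothetical.

```python
import math
import random
import statistics


def wnr_db(weight_var, noise_power):
    """Weight-variance-to-noise-power ratio (WNR) in decibels."""
    return 10.0 * math.log10(weight_var / noise_power)


def noise_power_for_wnr(weight_var, target_wnr_db):
    """Noise power that yields the target WNR for a given weight variance."""
    return weight_var / (10.0 ** (target_wnr_db / 10.0))


def add_gaussian_noise(weights, target_wnr_db, rng):
    """Return r = w + z with z ~ N(0, noise_power). Gaussian noise is an assumption."""
    weight_var = statistics.pvariance(weights)
    sigma = math.sqrt(noise_power_for_wnr(weight_var, target_wnr_db))
    return [w + rng.gauss(0.0, sigma) for w in weights]


def ml_estimate(noisy):
    """Under additive Gaussian noise, the ML estimate is the observation itself."""
    return list(noisy)


def shrinkage_estimate(noisy, noise_power):
    """Generic linear shrinkage toward the sample mean (NOT the paper's denoiser)."""
    mean = statistics.fmean(noisy)
    obs_var = statistics.pvariance(noisy)
    # Estimated weight variance: observed variance minus noise power, floored at 0.
    weight_var = max(obs_var - noise_power, 0.0)
    gain = weight_var / obs_var if obs_var > 0 else 0.0
    return [mean + gain * (r - mean) for r in noisy]


def mse(estimate, truth):
    """Mean squared error between two equal-length weight vectors."""
    return statistics.fmean((x - y) ** 2 for x, y in zip(estimate, truth))


rng = random.Random(0)
weights = [rng.gauss(0.0, 0.1) for _ in range(10000)]  # synthetic "weights"
target_wnr = 0.0  # dB: noise power equals weight variance
noisy = add_gaussian_noise(weights, target_wnr, rng)
noise_power = noise_power_for_wnr(statistics.pvariance(weights), target_wnr)

mse_ml = mse(ml_estimate(noisy), weights)
mse_shrink = mse(shrinkage_estimate(noisy, noise_power), weights)
print(f"MSE (ML): {mse_ml:.5f}, MSE (shrinkage): {mse_shrink:.5f}")
```

At low WNR the shrinkage estimate should show a lower weight-space error than the ML estimate, which hints at why exploiting prior knowledge of the weights can help. The paper's own comparison is made in terms of test accuracy, not this mean squared error.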

Submission history

From: Yulin Shao
[v1] Sat, 22 May 2021 11:51:20 GMT (3444kb,D)
[v2] Wed, 15 Dec 2021 15:23:27 GMT (3512kb,D)
[v3] Thu, 19 May 2022 15:28:09 GMT (4767kb,D)
