Computer Science > Machine Learning

Title: Learning Discretized Neural Networks under Ricci Flow

Abstract: In this paper, we consider Discretized Neural Networks (DNNs) consisting of low-precision weights and activations, which suffer from either infinite or zero gradients during training because the discretization function is non-differentiable. Most training-based DNNs therefore use the standard Straight-Through Estimator (STE) to approximate the gradient w.r.t. discrete values. However, the STE causes a gradient mismatch: the approximated gradient is a perturbed version of the true one. Through the lens of duality theory, we show that this mismatch can be viewed as a metric perturbation in a Riemannian manifold. To address this problem, we draw on information geometry to construct the Linearly Nearly Euclidean (LNE) manifold for DNNs as a background for handling perturbations. By introducing a partial differential equation on metrics, the Ricci flow, we prove the dynamical stability and convergence of the LNE metric under an $L^2$-norm perturbation. Unlike previous perturbation theory, in which the convergence rate is a fractional power, we show that the metric perturbation under the Ricci flow decays exponentially in the LNE manifold. Experimental results on various datasets demonstrate that our method achieves better and more stable performance for DNNs than other representative training-based methods.
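To make the gradient mismatch concrete, the following is a minimal NumPy sketch of a straight-through estimator on a toy linear model. It is an illustration only, not the paper's method: the sign quantizer, the clipping threshold `clip`, the learning rate, and the helper names `quantize_sign`, `ste_grad`, and `loss_and_ste_grad` are all assumptions introduced here.

```python
import numpy as np

def quantize_sign(w):
    """Discretize weights to {-1, +1}; the true derivative is zero almost everywhere."""
    return np.where(w >= 0.0, 1.0, -1.0)

def ste_grad(w, grad_wrt_quantized, clip=1.0):
    """Straight-through estimator (assumed clipped variant).

    The upstream gradient is passed through the quantizer as if it were
    the identity, and zeroed where |w| exceeds `clip`.
    """
    return grad_wrt_quantized * (np.abs(w) <= clip)

def loss_and_ste_grad(w, x, y):
    """Mean squared error of a linear model that uses quantized weights."""
    q = quantize_sign(w)
    err = x @ q - y
    loss = 0.5 * np.mean(err ** 2)
    grad_q = x.T @ err / len(y)      # exact gradient w.r.t. the quantized weights
    return loss, ste_grad(w, grad_q) # approximate gradient w.r.t. the latent weights

# Toy data: targets generated by a binary weight vector (hypothetical example).
rng = np.random.default_rng(0)
x = rng.normal(size=(64, 4))
y = x @ np.array([1.0, -1.0, 1.0, -1.0])

w = np.array([0.3, 0.2, -0.4, -0.1])  # latent full-precision weights
for _ in range(200):
    loss, g = loss_and_ste_grad(w, x, y)
    w = np.clip(w - 0.1 * g, -1.0, 1.0)  # update latent weights, keep them in [-1, 1]
```

The gradient returned by `ste_grad` is not the gradient of the loss with respect to `w` (which is zero almost everywhere); the difference between the two is the mismatch that the abstract interprets as a metric perturbation.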
Subjects: Machine Learning (cs.LG); Information Theory (cs.IT); Neural and Evolutionary Computing (cs.NE)
Cite as: arXiv:2302.03390 [cs.LG]
  (or arXiv:2302.03390v2 [cs.LG] for this version)

Submission history

From: Jun Chen
[v1] Tue, 7 Feb 2023 10:51:53 GMT (474kb,D)
[v2] Mon, 6 Mar 2023 07:11:19 GMT (418kb,D)
