Title: On Information Plane Analyses of Neural Network Classifiers -- A Review

Abstract: We review the current literature concerned with information plane analyses of neural network classifiers. While the underlying information bottleneck theory and the claim that information-theoretic compression is causally linked to generalization are plausible, the empirical evidence is mixed, with some studies supporting the claim and others contradicting it. We review this evidence together with a detailed analysis of how the respective information quantities were estimated. Our survey suggests that compression visualized in information planes is not necessarily information-theoretic, but is rather often compatible with geometric compression of the latent representations. This insight gives the information plane a renewed justification.
Aside from this, we shed light on the problem of estimating mutual information in deterministic neural networks and its consequences. Specifically, we argue that even in feed-forward neural networks the data processing inequality need not hold for estimates of mutual information. Similarly, while a fitting phase, in which the mutual information between the latent representation and the target increases, is necessary (but not sufficient) for good classification performance, depending on the specifics of mutual information estimation such a fitting phase need not be visible in the information plane.
Comments: 12 pages, 3 figures; accepted for publication in IEEE Transactions on Neural Networks and Learning Systems. (c) 2021 IEEE
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV); Information Theory (cs.IT); Machine Learning (stat.ML)
Journal reference: IEEE Trans. Neural Networks and Learning Systems 33(12):7039-7051
DOI: 10.1109/TNNLS.2021.3089037
Cite as: arXiv:2003.09671 [cs.LG]
  (or arXiv:2003.09671v3 [cs.LG] for this version)

Submission history

From: Bernhard C. Geiger
[v1] Sat, 21 Mar 2020 14:43:45 GMT (77kb,D)
[v2] Thu, 27 Aug 2020 15:19:33 GMT (138kb,D)
[v3] Thu, 10 Jun 2021 15:06:30 GMT (605kb,D)