Computer Science > Machine Learning
Title: Learning relevant features for statistical inference
(Submitted on 23 Apr 2019 (v1), last revised 24 Mar 2020 (this version, v4))
Abstract: Given two views of data, we consider the problem of finding the features of one view which can be most faithfully inferred from the other. We find that these are also the most correlated variables in the sense of deep canonical correlation analysis (DCCA). Moreover, we show that these variables can be used to construct a non-parametric representation of the implied joint probability distribution, which can be thought of as a classical version of the Schmidt decomposition of quantum states. This representation can be used to compute the expectations of functions over one view of data conditioned on the other, such as Bayesian estimators and their standard deviations. We test the approach using inference on occluded MNIST images, and show that our representation contains multiple modes. Surprisingly, when applied to supervised learning (one dataset consists of labels), this approach automatically provides regularization and faster convergence compared to the cross-entropy objective. We also explore using this approach to discover salient independent variables of a single dataset.
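For finite discrete views, the "classical Schmidt decomposition" mentioned in the abstract can be illustrated with an ordinary singular value decomposition. The sketch below is not the paper's neural-network method; it is a toy version that assumes a small joint probability table with strictly positive marginals. The function names `schmidt_like_decomposition` and `conditional_expectation` are hypothetical and chosen only for this illustration. It factors Q(x, y) = P(x, y) / sqrt(p(x) p(y)), whose leading singular value is 1 and whose remaining singular values act as correlation coefficients between the paired features, and then uses the factors to compute E[f(Y) | X = x].

```python
import numpy as np

def schmidt_like_decomposition(joint):
    """SVD of the normalized table Q[x, y] = P(x, y) / sqrt(p(x) p(y)).

    Assumes every row and column of `joint` has positive total mass.
    """
    joint = np.asarray(joint, dtype=float)
    joint = joint / joint.sum()          # normalize to a probability table
    px = joint.sum(axis=1)               # marginal of the first view
    py = joint.sum(axis=0)               # marginal of the second view
    q = joint / np.sqrt(np.outer(px, py))
    u, s, vt = np.linalg.svd(q, full_matrices=False)
    return px, py, u, s, vt

def conditional_expectation(f_y, px, py, u, s, vt, rank=None):
    """E[f(Y) | X = x] for every x, using the leading `rank` components.

    Uses P(y | x) = sum_k s_k u_k(x) v_k(y) sqrt(p(y)) / sqrt(p(x)).
    """
    k = len(s) if rank is None else rank
    coeffs = vt[:k] @ (np.sqrt(py) * np.asarray(f_y, dtype=float))
    return (u[:, :k] * s[:k]) @ coeffs / np.sqrt(px)

# Toy joint distribution over two binary views (illustrative numbers only).
joint = np.array([[0.30, 0.10],
                  [0.10, 0.50]])
px, py, u, s, vt = schmidt_like_decomposition(joint)
f_y = np.array([0.0, 1.0])               # f(Y) = indicator that Y == 1
full = conditional_expectation(f_y, px, py, u, s, vt)
prior_only = conditional_expectation(f_y, px, py, u, s, vt, rank=1)
```

Keeping only the first component (`rank=1`) discards all dependence between the views, so the estimate collapses to the unconditional mean of f(Y); each additional component adds back one pair of correlated features.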
Submission history
From: Cédric Bény
[v1] Tue, 23 Apr 2019 15:29:04 GMT (727kb,D)
[v2] Mon, 6 May 2019 03:27:20 GMT (718kb,D)
[v3] Sun, 2 Jun 2019 02:56:39 GMT (733kb,D)
[v4] Tue, 24 Mar 2020 13:47:56 GMT (526kb,D)