Title: Estimating Conditional Mutual Information for Discrete-Continuous Mixtures using Multi-Dimensional Adaptive Histograms

Abstract: Estimating conditional mutual information (CMI) is an essential yet challenging step in many machine learning and data mining tasks. Estimating CMI from data that contains both discrete and continuous variables, or even discrete-continuous mixture variables, is a particularly hard problem. In this paper, we show that CMI for such mixture variables, defined based on the Radon-Nikodym derivative, can be written as a sum of entropies, just like CMI for purely discrete or purely continuous data. Further, we show that CMI can be consistently estimated for discrete-continuous mixture variables by learning an adaptive histogram model. In practice, we estimate such a model by iteratively discretizing the continuous data points in the mixture variables. To evaluate the performance of our estimator, we benchmark it against state-of-the-art CMI estimators and evaluate it in a causal discovery setting.
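
For readers unfamiliar with the entropy decomposition mentioned in the abstract, the standard identity for discrete variables is I(X;Y|Z) = H(X,Z) + H(Y,Z) - H(X,Y,Z) - H(Z). The sketch below is only an illustration of that identity with a plug-in entropy estimate. It is not the paper's estimator: the function names (`entropy`, `discretize`, `cmi`) are hypothetical, and `discretize` uses a fixed equal-width grid as a stand-in assumption where the paper learns an adaptive histogram.

```python
import math
from collections import Counter


def entropy(samples):
    """Plug-in Shannon entropy (in nats) of a sequence of hashable symbols."""
    n = len(samples)
    counts = Counter(samples)
    return -sum((c / n) * math.log(c / n) for c in counts.values())


def discretize(values, bins):
    """Equal-width binning of continuous values.

    Assumption for illustration only: a fixed grid, NOT the adaptive
    histogram model described in the abstract.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)
    width = (hi - lo) / bins
    # Clamp the maximum value into the last bin.
    return [min(int((v - lo) / width), bins - 1) for v in values]


def cmi(x, y, z):
    """I(X;Y|Z) = H(X,Z) + H(Y,Z) - H(X,Y,Z) - H(Z) for discrete symbols."""
    xz = list(zip(x, z))
    yz = list(zip(y, z))
    xyz = list(zip(x, y, z))
    return entropy(xz) + entropy(yz) - entropy(xyz) - entropy(list(z))
```

With a constant Z, `cmi` reduces to the plain mutual information of X and Y: passing identical binary sequences for `x` and `y` gives log 2 nats, while two independent, uniformly paired binary sequences give 0. Continuous columns would first be passed through `discretize`; how the bin boundaries are chosen is exactly where the paper's adaptive approach differs from this fixed grid.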
Comments: Extended version, including supplementary material, of the main paper to appear in the Proceedings of the SIAM International Conference on Data Mining (SDM'21)
Subjects: Information Theory (cs.IT); Applications (stat.AP)
Cite as: arXiv:2101.05009 [cs.IT]
  (or arXiv:2101.05009v1 [cs.IT] for this version)

Submission history

From: Alexander Marx
[v1] Wed, 13 Jan 2021 11:21:25 GMT (500kb,D)