Computer Science > Machine Learning
Title: Analyzing Data Selection Techniques with Tools from the Theory of Information Losses
(Submitted on 25 Feb 2019 (v1), last revised 19 Jan 2022 (this version, v4))
Abstract: In this paper, we present and illustrate some new tools for rigorously analyzing training data selection methods. These tools focus on the information-theoretic losses that occur when sampling data. We use this framework to prove that two methods, Facility Location Selection and Transductive Experimental Design, reduce these losses. These proofs are meant to act as generalizable theoretical examples of applying Information Theoretic Deep Learning Theory to the fields of data selection and active learning. Both analyses yield insight into their respective methods and increase their interpretability. In the case of Transductive Experimental Design, the analysis also greatly increases the method's scope.
Submission history
From: Brandon Foggo
[v1] Mon, 25 Feb 2019 20:43:28 GMT (685kb,D)
[v2] Tue, 14 Jan 2020 06:48:43 GMT (2903kb,D)
[v3] Wed, 15 Jan 2020 19:44:16 GMT (2963kb,D)
[v4] Wed, 19 Jan 2022 23:03:35 GMT (2451kb,D)