Title: Feature Selection based on the Local Lift Dependence Scale
(Submitted on 11 Nov 2017 (v1), last revised 18 Dec 2017 (this version, v3))
Abstract: This paper uses a classical approach to feature selection: minimization of a cost function applied to estimated joint distributions. However, the search space in which this minimization is performed is extended. In the original formulation, the search space is the Boolean lattice of feature sets (BLFS), while in the present formulation it is a collection of Boolean lattices of ordered pairs (features, associated value) (CBLOP), indexed by the elements of the BLFS. In this approach, we may select not only the features that are most related to a variable Y, but also the values of the features that most influence the variable or that are most prone to have a specific value of Y. A local formulation of Shannon's mutual information is applied on a CBLOP to select features, namely the Local Lift Dependence Scale, a scale for measuring variable dependence at multiple resolutions. The main contribution of this paper is to define and apply this local measure, which makes it possible to analyse local properties of joint distributions that are neglected by Shannon's classical global measure. The proposed approach is applied to a dataset consisting of student performances on a university entrance exam and in undergraduate courses. The approach is also applied to two datasets from the UCI Machine Learning Repository.
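The abstract does not state the formula behind the local measure. As a rough illustration only, the sketch below assumes the standard lift function L(x, y) = P(x, y) / (P(x) P(y)), whose expected logarithm under the joint distribution is Shannon's mutual information. The function names `lift_table` and `mutual_information` and the example joint distribution are hypothetical and are not taken from the paper.

```python
import math


def lift_table(joint):
    """Pointwise lift L[i][j] = P(x_i, y_j) / (P(x_i) * P(y_j)).

    `joint` is a list of rows with joint[i][j] = P(X = x_i, Y = y_j).
    Assumption: this is the usual lift function, not necessarily the
    exact definition used in the paper.
    """
    px = [sum(row) for row in joint]          # marginal of X
    py = [sum(col) for col in zip(*joint)]    # marginal of Y
    table = []
    for i, row in enumerate(joint):
        out = []
        for j, p in enumerate(row):
            denom = px[i] * py[j]
            # Lift is undefined where a marginal is zero.
            out.append(p / denom if denom > 0 else float("nan"))
        table.append(out)
    return table


def mutual_information(joint):
    """Global measure: expectation of log-lift under the joint (in nats)."""
    lift = lift_table(joint)
    total = 0.0
    for row_p, row_l in zip(joint, lift):
        for p, l in zip(row_p, row_l):
            if p > 0:  # cells with zero probability contribute nothing
                total += p * math.log(l)
    return total


# Hypothetical 2x2 joint distribution of a feature X and a variable Y.
joint = [[0.4, 0.1],
         [0.1, 0.4]]
lift = lift_table(joint)
mi = mutual_information(joint)
```

In this sketch a lift above 1 in a cell means that value pair occurs more often than it would under independence, and a lift below 1 means less often. The mutual information averages these local values into a single number, which is the kind of local detail the abstract says the global measure neglects.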
Submission history
From: Diego Marcondes
[v1] Sat, 11 Nov 2017 18:51:13 GMT (106kb,D)
[v2] Wed, 15 Nov 2017 10:34:57 GMT (106kb,D)
[v3] Mon, 18 Dec 2017 20:03:31 GMT (113kb,D)