Title: Feature Selection for Regression Problems Based on the Morisita Estimator of Intrinsic Dimension

Abstract: Data acquisition, storage and management have improved, yet the key factors of many phenomena remain poorly understood. Consequently, irrelevant and redundant features artificially increase the size of datasets, which complicates learning tasks such as regression. To address this problem, feature selection methods have been proposed. This paper introduces a new supervised filter based on the Morisita estimator of intrinsic dimension. It can identify relevant features and distinguish between redundant and irrelevant information. It also offers a clear graphical representation of the results and can be easily implemented in different programming languages. Comprehensive numerical experiments are conducted using simulated datasets characterized by different levels of complexity, sample size and noise. The suggested algorithm is also successfully tested on a selection of real-world applications and compared with RReliefF using extreme learning machines. In addition, a new measure of feature relevance is presented and discussed.
Subjects: Machine Learning (stat.ML); Machine Learning (cs.LG)
Cite as: arXiv:1602.00216 [stat.ML]
  (or arXiv:1602.00216v6 [stat.ML] for this version)

Submission history

From: Jean Golay
[v1] Sun, 31 Jan 2016 09:59:27 GMT (2090kb,D)
[v2] Wed, 3 Feb 2016 17:03:26 GMT (2090kb,D)
[v3] Mon, 7 Mar 2016 20:40:06 GMT (2218kb,D)
[v4] Fri, 11 Mar 2016 14:39:24 GMT (2218kb,D)
[v5] Fri, 8 Apr 2016 18:37:17 GMT (2222kb,D)
[v6] Tue, 4 Apr 2017 13:28:48 GMT (1203kb,D)
