Computer Science > Machine Learning
Title: FsNet: Feature Selection Network on High-dimensional Biological Data
(Submitted on 23 Jan 2020 (v1), last revised 18 Dec 2020 (this version, v3))
Abstract: Biological data, including gene expression data, are generally high-dimensional and require efficient, generalizable, and scalable machine-learning methods to discover their complex nonlinear patterns. The recent advances in machine learning can be attributed to deep neural networks (DNNs), which excel at various tasks in computer vision and natural language processing. However, standard DNNs are not appropriate for high-dimensional datasets generated in biology because they have many parameters, which in turn require many samples. In this paper, we propose a DNN-based, nonlinear feature selection method, called the feature selection network (FsNet), for high-dimensional data with a small number of samples. Specifically, FsNet comprises a selection layer that selects features and a reconstruction layer that stabilizes the training. Because a large number of parameters in the selection and reconstruction layers can easily result in overfitting under a limited number of samples, we use two tiny networks to predict the large, virtual weight matrices of the selection and reconstruction layers. Experimental results on several real-world, high-dimensional biological datasets demonstrate the efficacy of the proposed method.
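The idea of predicting a large, virtual weight matrix with a tiny network can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the per-feature embeddings `E`, the two-layer `tiny_net`, the column-wise softmax used as a soft selection, and all sizes are assumptions chosen only to show why the parameter count stops depending on the number of input features. The reconstruction layer and training loop are omitted.

```python
import numpy as np

def tiny_net(embed, w1, w2):
    """Hypothetical two-layer MLP applied row-wise to per-feature embeddings."""
    return np.tanh(embed @ w1) @ w2

def softmax(z, axis):
    """Numerically stable softmax along the given axis."""
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

rng = np.random.default_rng(0)

# Assumed sizes: few samples (n), many features (d), k features to select.
n, d, k = 20, 5000, 10
m, h = 8, 16  # embedding size and hidden width of the tiny network (assumed)

X = rng.normal(size=(n, d))   # toy data standing in for gene expression
E = rng.normal(size=(d, m))   # hypothetical fixed embedding for each feature

# Only these two small matrices are trainable parameters.
w1 = rng.normal(size=(m, h)) * 0.1
w2 = rng.normal(size=(h, k)) * 0.1

# The (d, k) selection weights are computed on the fly, not stored as parameters.
W_virtual = tiny_net(E, w1, w2)

# Assumed soft selection: each column is a distribution over the d features.
P = softmax(W_virtual, axis=0)
selected = X @ P              # (n, k) softly selected features

tiny_params = w1.size + w2.size   # parameters actually trained
dense_params = d * k              # parameters a plain dense selection layer would need
print(selected.shape, tiny_params, dense_params)
```

In this sketch the trainable parameter count depends on `m`, `h`, and `k` but not on `d`, which is the property the abstract relies on to limit overfitting when samples are scarce.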
Submission history
From: Makoto Yamada
[v1] Thu, 23 Jan 2020 00:49:57 GMT (162kb,D)
[v2] Tue, 29 Sep 2020 06:46:11 GMT (275kb,D)
[v3] Fri, 18 Dec 2020 00:48:37 GMT (275kb,D)