Title: A User-Guided Bayesian Framework for Ensemble Feature Selection in Life Science Applications (UBayFS)

Abstract: Training predictive models on high-dimensional datasets is a challenging task in artificial intelligence. Users must take measures to prevent overfitting and keep model complexity low. Feature selection therefore plays a key role in data preprocessing and delivers insights into the systematic variation in the data. The latter aspect is crucial in domains that rely on model interpretability, such as the life sciences. We propose UBayFS, an ensemble feature selection technique embedded in a Bayesian statistical framework. Our approach enhances the feature selection process by considering two sources of information: data and domain knowledge. To this end, we build an ensemble of elementary feature selectors that extract information from empirical data, leading to a meta-model that compensates for inconsistencies between the elementary feature selectors. The user guides UBayFS by weighting features and penalizing specific feature combinations. The framework builds on a multinomial likelihood and a novel version of a constrained Dirichlet-type prior distribution, involving initial feature weights and side constraints. In a quantitative evaluation, we demonstrate that the presented framework allows for a balanced trade-off between user knowledge and data observations. A comparison with standard feature selectors underlines that UBayFS achieves competitive performance while providing additional flexibility to incorporate domain knowledge.
Subjects: Machine Learning (cs.LG); Methodology (stat.ME)
Cite as: arXiv:2104.14787 [cs.LG]
  (or arXiv:2104.14787v1 [cs.LG] for this version)
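
The abstract's pairing of a multinomial likelihood with a Dirichlet-type prior can be illustrated with the standard conjugate update, in which prior feature weights and ensemble selection counts simply add. The sketch below is a simplified illustration, not the paper's method: it uses a plain (unconstrained) Dirichlet prior, omits the side constraints entirely, and all function names, the example selections, and the prior weights are hypothetical.

```python
from collections import Counter


def selection_counts(selections, n_features):
    """Count how often each feature index was picked across the ensemble."""
    tally = Counter(idx for chosen in selections for idx in chosen)
    return [tally.get(j, 0) for j in range(n_features)]


def dirichlet_posterior(prior_weights, counts):
    """Conjugate update: a Dirichlet(alpha) prior combined with multinomial
    counts yields a Dirichlet(alpha + counts) posterior."""
    if len(prior_weights) != len(counts):
        raise ValueError("prior_weights and counts must have equal length")
    return [a + c for a, c in zip(prior_weights, counts)]


def posterior_mean(params):
    """Expected feature importance under a Dirichlet with these parameters."""
    total = sum(params)
    return [p / total for p in params]


def top_k(scores, k):
    """Indices of the k largest scores (ties broken by lower index)."""
    ranked = sorted(range(len(scores)), key=lambda j: (-scores[j], j))
    return sorted(ranked[:k])


# Hypothetical output of four elementary selectors over five features;
# each selector picked two features.
selections = [{0, 1}, {0, 2}, {0, 1}, {1, 3}]

# Hypothetical user knowledge: feature 4 is believed to be important.
prior_weights = [1.0, 1.0, 1.0, 1.0, 4.0]

counts = selection_counts(selections, n_features=5)
params = dirichlet_posterior(prior_weights, counts)
importance = posterior_mean(params)
selected = top_k(importance, k=3)
```

In this toy setting, feature 4 is never chosen by any elementary selector, yet its prior weight is large enough to place it among the three selected features. That is the kind of trade-off between user knowledge and data observations the abstract describes.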

Submission history

From: Stefan Schrunner
[v1] Fri, 30 Apr 2021 06:51:33 GMT (367kb,D)
[v2] Fri, 28 May 2021 16:21:17 GMT (519kb,D)
[v3] Sat, 11 Dec 2021 09:59:13 GMT (596kb,D)
