Importance measures derived from random forests: characterisation and extension

Sutera, Antonio

(Submitted on 17 Jun 2021 (v1), last revised 21 Jun 2021 (this version, v2))

Abstract: New technologies, and artificial intelligence in particular, are increasingly established in our society. Big data analysis and machine learning, two sub-fields of artificial intelligence, are at the core of recent breakthroughs in many application fields (e.g., medicine, communication, finance), including some closely tied to day-to-day life (e.g., social networks, computers, smartphones). In machine learning, significant improvements are usually achieved at the price of increasing computational complexity and thanks to bigger datasets. Cutting-edge models built by the most advanced machine learning algorithms have typically become very efficient and profitable but also extremely complex. Their complexity is such that these models are commonly seen as black boxes providing a prediction or a decision which cannot be interpreted or justified. Nevertheless, whether these models are used autonomously or as a simple decision-making support tool, they are already being used in machine learning applications where health and human life are at stake. It is therefore necessary not to blindly trust everything coming out of those models without a detailed understanding of their predictions or decisions. Accordingly, this thesis aims at improving the interpretability of models built by a specific family of machine learning algorithms, the so-called tree-based methods. Several mechanisms have been proposed to interpret these models, and throughout this thesis we aim to improve their understanding, study their properties, and define their limitations.

Comments:	PhD thesis, Liège, Belgium, June 2019. Permalink: this http URL
Subjects:	Machine Learning (stat.ML); Machine Learning (cs.LG)
Cite as:	arXiv:2106.09473 [stat.ML]
	(or arXiv:2106.09473v2 [stat.ML] for this version)

Submission history

From: Antonio Sutera
[v1] Thu, 17 Jun 2021 13:23:57 GMT (1743kb,D)
[v2] Mon, 21 Jun 2021 08:15:29 GMT (1743kb,D)
