Inferring statistically significant features from random forests

Paul, Jérôme; Dupont, Pierre

Paul, Jérôme [UCL]; Dupont, Pierre [UCL]

Embedded feature selection can be performed by analyzing the variables used in a Random Forest. Such a multivariate selection takes into account the interactions between variables but is not straightforward to interpret in a statistical sense. We propose a statistical procedure to measure variable importance that tests whether variables are significantly useful in combination with others in a forest. We show experimentally that this new importance index correctly identifies relevant variables. The top of the variable ranking is largely correlated with Breiman's importance index based on a permutation test. Our measure has the additional benefit of producing p-values from the forest voting process. Such p-values offer a very natural way to decide which features are significantly relevant while controlling the false discovery rate. Practical experiments are conducted on synthetic and real data, including low- and high-dimensional datasets for binary or multi-class problems. Results show that the proposed technique is effective and outperforms recent alternatives by reducing the computational complexity of the selection process by an order of magnitude while keeping similar performance.
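
To make the last selection step concrete, here is a minimal sketch of how per-feature p-values could be thresholded while controlling the false discovery rate. The abstract does not say which FDR procedure is used; the Benjamini-Hochberg step-up rule below is an assumption chosen for illustration, and the function name `benjamini_hochberg`, the level `alpha`, and the example p-values are hypothetical rather than taken from the paper.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return the indices of features declared significant at FDR level alpha.

    Assumption: Benjamini-Hochberg step-up rule; the paper's abstract does
    not name the FDR procedure it uses.
    """
    m = len(p_values)
    if m == 0:
        return []
    # Sort feature indices by ascending p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest 1-based rank k such that p_(k) <= (k / m) * alpha.
    cutoff = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            cutoff = rank
    # Every feature ranked at or before the cutoff is selected.
    return sorted(order[:cutoff])


# Hypothetical p-values for six features (illustrative only).
example_p = [0.001, 0.20, 0.012, 0.74, 0.03, 0.04]
selected = benjamini_hochberg(example_p, alpha=0.05)
print(selected)
```

Because the rule is step-up, a feature whose own p-value exceeds its rank threshold can still be selected when a larger p-value further down the ranking meets its threshold.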

metadata

Document type	Journal article – Research article
Access type	Open access
Publication date	2015
Language	English
Journal information	"Neurocomputing" - Vol. 150, no. part B, p. 471–480 (20 February 2015)
Peer reviewed	yes
Publisher	Elsevier BV (Amsterdam, Netherlands)
ISSN	0925-2312
e-ISSN	1872-8286
Publication status	Published
Affiliation	UCL - SST/ICTM/INGI - Pôle en ingénierie informatique
Links	http://hdl.handle.net/2078.1/153478 [Handle]; https://doi.org/10.1016/j.neucom.2014.07.067 [DOI]

Bibliographic reference	Paul, Jérôme; Dupont, Pierre. Inferring statistically significant features from random forests. In: Neurocomputing, Vol. 150, no. part B, p. 471–480 (20 February 2015)
Permanent URL	http://hdl.handle.net/2078.1/153478
