Title: Non-asymptotic model selection in block-diagonal mixture of polynomial experts models

Abstract: Model selection, via penalized likelihood type criteria, is a standard task in many statistical inference and machine learning problems. Progress has led to deriving criteria with asymptotic consistency results and an increasing emphasis on introducing non-asymptotic criteria. We focus on the problem of modeling non-linear relationships in regression data with potential hidden graph-structured interactions between the high-dimensional predictors, within the mixture of experts modeling framework. In order to deal with such a complex situation, we investigate a block-diagonal localized mixture of polynomial experts (BLoMPE) regression model, which is constructed upon an inverse regression and block-diagonal structures of the Gaussian expert covariance matrices. We introduce a penalized maximum likelihood selection criterion to estimate the unknown conditional density of the regression model. This model selection criterion allows us to handle the challenging problem of inferring the number of mixture components, the degree of polynomial mean functions, and the hidden block-diagonal structures of the covariance matrices, which reduces the number of parameters to be estimated and leads to a trade-off between complexity and sparsity in the model. In particular, we provide a strong theoretical guarantee: a finite-sample oracle inequality satisfied by the penalized maximum likelihood estimator with a Jensen-Kullback-Leibler type loss, to support the introduced non-asymptotic model selection criterion. The penalty shape of this criterion depends on the complexity of the considered random subcollection of BLoMPE models, including the relevant graph structures, the degree of polynomial mean functions, and the number of mixture components.
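For orientation, penalized maximum likelihood selection criteria of the kind the abstract describes usually take the generic shape sketched below. This is an illustration in assumed notation, not the paper's exact statement: the symbols $\mathcal{M}$ (the model collection), $S_m$ (one model), $\widehat{s}_m$, $\operatorname{pen}(m)$, $s_0$ (the unknown conditional density), and the constants $C_1, C_2$ are hypothetical names that do not appear in the abstract, and the precise loss, penalty, and remainder term are given only in the full text.

```latex
% Generic penalized maximum likelihood selection (assumed notation).
% Each index m would encode the number of mixture components, the
% polynomial degree, and the block-diagonal covariance structure.
\widehat{s}_m \in \operatorname*{arg\,min}_{s_m \in S_m}
    \sum_{i=1}^{n} -\ln s_m(y_i \mid x_i),
\qquad
\widehat{m} \in \operatorname*{arg\,min}_{m \in \mathcal{M}}
    \left\{ \sum_{i=1}^{n} -\ln \widehat{s}_m(y_i \mid x_i)
            + \operatorname{pen}(m) \right\}.

% Schematic shape of a finite-sample oracle inequality with a
% Jensen-Kullback-Leibler (JKL) type loss on the left-hand side.
% C_1, C_2 and the exact remainder term are placeholders.
\mathbb{E}\left[ \mathrm{JKL}\bigl(s_0, \widehat{s}_{\widehat{m}}\bigr) \right]
\le C_1 \inf_{m \in \mathcal{M}}
    \left( \inf_{s_m \in S_m} \mathrm{KL}(s_0, s_m)
           + \frac{\operatorname{pen}(m)}{n} \right)
    + \frac{C_2}{n}.
```

Read this way, the selected model balances goodness of fit (the log-likelihood term) against model complexity (the penalty), and the inequality says the selected estimator performs nearly as well as the best model in the collection, up to constants.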
Comments: Corrected typos. Extended results from arXiv:2104.02640
Subjects: Statistics Theory (math.ST); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Methodology (stat.ME); Machine Learning (stat.ML)
MSC classes: 62E17 (Primary) 62H30, 62H12 (Secondary)
Cite as: arXiv:2104.08959 [math.ST]
  (or arXiv:2104.08959v2 [math.ST] for this version)

Submission history

From: TrungTin Nguyen
[v1] Sun, 18 Apr 2021 21:32:20 GMT (37kb)
[v2] Mon, 10 May 2021 21:05:06 GMT (37kb)
