Learning with tree tensor networks: complexity estimates and model selection

Michel, Bertrand; Nouy, Anthony

(Submitted on 2 Jul 2020 (this version), latest version 19 May 2021 (v3))

Abstract: In this paper, we propose and analyze a model selection method for tree tensor networks in an empirical risk minimization framework. Tree tensor networks, or tree-based tensor formats, are prominent model classes for the approximation of high-dimensional functions in numerical analysis and data science. They correspond to sum-product neural networks with a sparse connectivity associated with a dimension partition tree $T$, widths given by a tuple $r$ of tensor ranks, and multilinear activation functions (or units). The approximation power of these model classes has been proved to be near-optimal for classical smoothness classes. However, in an empirical risk minimization framework with a limited number of observations, the dimension tree $T$ and ranks $r$ should be selected carefully to balance estimation and approximation errors. In this paper, we propose a complexity-based model selection strategy à la Barron, Birgé, Massart. Given a family of model classes, with different trees, ranks and tensor product feature spaces, a model is selected by minimizing a penalized empirical risk, with a penalty depending on the complexity of the model class. After deriving bounds of the metric entropy of tree tensor networks with bounded parameters, we deduce a form of the penalty from bounds on suprema of empirical processes. This choice of penalty yields a risk bound for the predictor associated with the selected model. For classical smoothness spaces, we show that the proposed strategy is minimax optimal in a least-squares setting. In practice, the amplitude of the penalty is calibrated with a slope heuristics method. Numerical experiments in a least-squares regression setting illustrate the performance of the strategy for the approximation of multivariate functions and univariate functions identified with tensors by tensorization (quantization).
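
The selection step described in the abstract (choose the model class that minimizes a penalized empirical risk) can be illustrated with a minimal sketch. This is not the authors' implementation. The function name `select_model`, the penalty form `lam * complexity / n_samples`, and the numbers below are all assumptions made for illustration; the abstract does not state the actual penalty, and in the paper its amplitude is calibrated with a slope heuristics method rather than fixed by hand.

```python
def select_model(empirical_risks, complexities, n_samples, lam):
    """Return the index of the model minimizing a penalized empirical risk.

    Assumed (hypothetical) criterion: risk_m + lam * complexity_m / n_samples.
    The penalty form is an illustrative stand-in, not the one derived in the paper.
    """
    scores = [
        risk + lam * complexity / n_samples
        for risk, complexity in zip(empirical_risks, complexities)
    ]
    # Index of the smallest penalized score.
    return min(range(len(scores)), key=scores.__getitem__)


# Made-up example: four candidate model classes of increasing complexity,
# each with the empirical risk of its fitted predictor.
risks = [1.0, 0.5, 0.45, 0.44]
complexities = [1, 2, 4, 8]

best_penalized = select_model(risks, complexities, n_samples=10, lam=0.5)
best_unpenalized = select_model(risks, complexities, n_samples=10, lam=0.0)
```

With these made-up values the unpenalized criterion picks the most complex candidate, while the penalized one picks a simpler candidate, which is the estimation versus approximation trade-off the abstract refers to.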

Subjects:	Statistics Theory (math.ST); Machine Learning (stat.ML)
Cite as:	arXiv:2007.01165 [math.ST]
	(or arXiv:2007.01165v1 [math.ST] for this version)

Submission history

From: Anthony Nouy [view email]
[v1] Thu, 2 Jul 2020 14:52:08 GMT (1494kb)
[v2] Thu, 11 Mar 2021 15:59:26 GMT (1510kb)
[v3] Wed, 19 May 2021 10:15:30 GMT (1510kb)
