Scalable and Flexible Multiview MAX-VAR Canonical Correlation Analysis

Fu, Xiao; Huang, Kejun; Hong, Mingyi; Sidiropoulos, Nicholas D.; So, Anthony Man-Cho

doi:10.1109/TSP.2017.2698365

(Submitted on 31 May 2016 (v1), last revised 4 May 2017 (this version, v4))

Abstract: Generalized canonical correlation analysis (GCCA) aims at finding latent low-dimensional common structure from multiple views (feature vectors in different domains) of the same entities. Unlike principal component analysis (PCA), which handles a single view, (G)CCA is able to integrate information from different feature spaces. Here we focus on MAX-VAR GCCA, a popular formulation that has recently gained renewed interest in multilingual processing and speech modeling. The classic MAX-VAR GCCA problem can be solved optimally via eigen-decomposition of a matrix that compounds the (whitened) correlation matrices of the views, but this solution has serious scalability issues and is not directly amenable to incorporating pertinent structural constraints, such as non-negativity and sparsity, on the canonical components. We pose regularized MAX-VAR GCCA as a non-convex optimization problem and propose an alternating optimization (AO)-based algorithm to handle it. Our algorithm alternates between inexact solutions of a regularized least squares subproblem and a manifold-constrained non-convex subproblem, thereby achieving substantial memory and computational savings. An important benefit of our design is that it can easily handle structure-promoting regularization. We show that the algorithm globally converges to a critical point at a sublinear rate, and approaches a globally optimal solution at a linear rate when no regularization is considered. Judiciously designed simulations and large-scale word embedding tasks are employed to showcase the effectiveness of the proposed algorithm.
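
As a rough illustration of the two approaches the abstract contrasts, the sketch below assumes the commonly used unregularized MAX-VAR formulation: minimize sum_i ||X_i Q_i - G||_F^2 over Q_i and G, subject to G^T G = I. This formulation is not stated in the abstract itself, and the code is not the authors' implementation. The function names, the small ridge term, the use of plain gradient steps as the "inexact" least squares solver, and the step size 1/sigma_max(X_i)^2 are all assumptions made for illustration. No structure-promoting regularization is included.

```python
import numpy as np


def maxvar_cost(views, G, Qs):
    """Assumed MAX-VAR objective: sum_i ||X_i Q_i - G||_F^2."""
    return sum(np.linalg.norm(X @ Q - G, "fro") ** 2 for X, Q in zip(views, Qs))


def maxvar_gcca_eig(views, k, ridge=1e-8):
    """Classic solution via eigendecomposition (hypothetical helper).

    Builds the n-by-n matrix M = sum_i X_i (X_i^T X_i)^{-1} X_i^T, which is
    what limits scalability when the number of entities n is large.
    """
    n = views[0].shape[0]
    M = np.zeros((n, n))
    for X in views:
        # Small ridge added only for numerical stability (an assumption).
        gram = X.T @ X + ridge * np.eye(X.shape[1])
        M += X @ np.linalg.solve(gram, X.T)
    # eigh returns eigenvalues in ascending order; keep the k largest.
    _, vecs = np.linalg.eigh(M)
    G = vecs[:, -k:]
    Qs = [
        np.linalg.solve(X.T @ X + ridge * np.eye(X.shape[1]), X.T @ G)
        for X in views
    ]
    return G, Qs


def maxvar_gcca_ao(views, k, n_outer=200, n_inner=5, seed=0):
    """Alternating optimization sketch (hypothetical helper).

    Q-step: a few gradient steps on each least squares subproblem (inexact).
    G-step: orthogonal Procrustes solution via a thin SVD, which keeps
    G on the set of matrices with orthonormal columns.
    """
    rng = np.random.default_rng(seed)
    n = views[0].shape[0]
    G, _ = np.linalg.qr(rng.standard_normal((n, k)))
    Qs = [np.zeros((X.shape[1], k)) for X in views]
    # Step size 1/L, where L = sigma_max(X)^2 is the gradient Lipschitz constant.
    steps = [1.0 / np.linalg.norm(X, 2) ** 2 for X in views]
    for _ in range(n_outer):
        for i, X in enumerate(views):
            for _ in range(n_inner):
                # Gradient of 0.5 * ||X Q - G||_F^2 with respect to Q.
                Qs[i] -= steps[i] * (X.T @ (X @ Qs[i] - G))
        S = sum(X @ Q for X, Q in zip(views, Qs))
        U, _, Vt = np.linalg.svd(S, full_matrices=False)
        G = U @ Vt
    return G, Qs
```

Both functions take a list of n-by-d_i arrays (one per view, rows aligned across views) and return an n-by-k matrix G with orthonormal columns plus one d_i-by-k matrix Q_i per view. The AO variant never forms an n-by-n matrix, which is the kind of memory saving the abstract refers to; `maxvar_cost` can be used to compare the two outputs on small problems.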

Subjects:	Machine Learning (stat.ML)
DOI:	10.1109/TSP.2017.2698365
Cite as:	arXiv:1605.09459 [stat.ML]
	(or arXiv:1605.09459v4 [stat.ML] for this version)

Submission history

From: Xiao Fu
[v1] Tue, 31 May 2016 01:01:52 GMT (397kb)
[v2] Fri, 17 Jun 2016 16:20:21 GMT (397kb)
[v3] Fri, 30 Sep 2016 14:56:25 GMT (218kb,D)
[v4] Thu, 4 May 2017 21:19:48 GMT (223kb,D)
