Title: Scalable and Optimal Generalized Canonical Correlation Analysis via Alternating Optimization

Abstract: This paper considers generalized (multiview) canonical correlation analysis (GCCA) for large-scale datasets. A memory-efficient and computationally lightweight algorithm is proposed for the classic MAX-VAR GCCA formulation, which is gaining renewed interest in various applications, such as speech recognition and natural language processing. The MAX-VAR GCCA problem can be solved optimally via eigen-decomposition of a matrix that compounds the (whitened) correlation matrices of the views. However, this route can easily lead to memory explosion and a heavy computational burden when the size of the views becomes large. Instead, we propose an alternating optimization (AO)-based algorithm, which avoids instantiating the correlation matrices of the views and thus achieves substantial memory savings. The algorithm also maintains data sparsity, which can be exploited to alleviate the computational burden. Consequently, the proposed algorithm is highly scalable. Despite the non-convexity of the MAX-VAR GCCA problem, the proposed iterative algorithm is shown to converge to a globally optimal solution under certain mild conditions. The proposed framework ensures global convergence even when the subproblems are inexactly solved, which can further reduce the complexity in practice. Simulations and large-scale word embedding tasks are employed to showcase the effectiveness of the proposed algorithm.
Comments: 17 pages, 3 figures
Subjects: Machine Learning (stat.ML)
Cite as: arXiv:1605.09459 [stat.ML]
  (or arXiv:1605.09459v1 [stat.ML] for this version)
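
The two routes contrasted in the abstract can be illustrated with a small dense sketch. It assumes the commonly used statement of MAX-VAR GCCA, minimizing sum_i ||X_i Q_i - G||_F^2 subject to G^T G = I, which the abstract itself does not spell out. The function names (maxvar_gcca_eig, maxvar_gcca_ao, objective) are hypothetical, and the alternating scheme shown (exact least-squares updates for each Q_i, then an orthogonal Procrustes update for G) is one plausible AO instance, not necessarily the paper's exact algorithm or its inexact-subproblem variant.

```python
import numpy as np


def maxvar_gcca_eig(views, k):
    """Eigen-decomposition route: builds a dense n-by-n matrix (memory heavy)."""
    n = views[0].shape[0]
    M = np.zeros((n, n))
    for X in views:
        # X @ pinv(X) is the orthogonal projector onto the column space of X.
        M += X @ np.linalg.pinv(X)
    _, eigvecs = np.linalg.eigh(M)      # eigenvalues come back in ascending order
    G = eigvecs[:, -k:]                 # top-k eigenvectors
    Qs = [np.linalg.lstsq(X, G, rcond=None)[0] for X in views]
    return G, Qs


def maxvar_gcca_ao(views, k, iters=200, seed=0):
    """Alternating route: only products X @ Q are formed, never an n-by-n matrix."""
    rng = np.random.default_rng(seed)
    n = views[0].shape[0]
    # Random orthonormal starting point (an assumption of this sketch).
    G, _ = np.linalg.qr(rng.standard_normal((n, k)))
    for _ in range(iters):
        # Q-step: least squares per view with G fixed.
        Qs = [np.linalg.lstsq(X, G, rcond=None)[0] for X in views]
        # G-step: closest orthonormal matrix to the summed views (Procrustes).
        S = sum(X @ Q for X, Q in zip(views, Qs))
        U, _, Vt = np.linalg.svd(S, full_matrices=False)
        G = U @ Vt
    Qs = [np.linalg.lstsq(X, G, rcond=None)[0] for X in views]
    return G, Qs


def objective(views, G, Qs):
    """Sum of squared Frobenius residuals between each projected view and G."""
    return sum(np.linalg.norm(X @ Q - G, "fro") ** 2 for X, Q in zip(views, Qs))
```

For sparse views, the dense least-squares call in the Q-step would typically be replaced by an iterative solver that touches X only through matrix-vector products, which is where the sparsity mentioned in the abstract would pay off.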

Submission history

From: Xiao Fu
[v1] Tue, 31 May 2016 01:01:52 GMT (397kb)
[v2] Fri, 17 Jun 2016 16:20:21 GMT (397kb)
[v3] Fri, 30 Sep 2016 14:56:25 GMT (218kb,D)
[v4] Thu, 4 May 2017 21:19:48 GMT (223kb,D)
