We gratefully acknowledge support from
the Simons Foundation and member institutions.
Full-text links:

Download:

Current browse context:

stat.ML

Change to browse by:

References & Citations

Bookmark

(what is this?)
CiteULike logo BibSonomy logo Mendeley logo del.icio.us logo Digg logo Reddit logo

Statistics > Machine Learning

Title: Sparse Quadratic Logistic Regression in Sub-quadratic Time

Abstract: We consider support recovery in the quadratic logistic regression setting - where the target depends on both p linear terms $x_i$ and up to $p^2$ quadratic terms $x_i x_j$. Quadratic terms enable prediction/modeling of higher-order effects between features and the target, but when incorporated naively may involve solving a very large regression problem. We consider the sparse case, where at most $s$ terms (linear or quadratic) are non-zero, and provide a new faster algorithm. It involves (a) identifying the weak support (i.e. all relevant variables) and (b) standard logistic regression optimization only on these chosen variables. The first step relies on a novel insight about correlation tests in the presence of non-linearity, and takes $O(pn)$ time for $n$ samples - giving potentially huge computational gains over the naive approach. Motivated by insights from the boolean case, we propose a non-linear correlation test for non-binary finite support case that involves hashing a variable and then correlating with the output variable. We also provide experimental results to demonstrate the effectiveness of our methods.
Subjects: Machine Learning (stat.ML); Information Theory (cs.IT); Machine Learning (cs.LG)
Cite as: arXiv:1703.02682 [stat.ML]
  (or arXiv:1703.02682v1 [stat.ML] for this version)

Submission history

From: Murat Kocaoglu [view email]
[v1] Wed, 8 Mar 2017 03:21:08 GMT (233kb,D)

Link back to: arXiv, form interface, contact.