Title: DS-MLR: Exploiting Double Separability for Scaling up Distributed Multinomial Logistic Regression

Abstract: Scaling multinomial logistic regression to datasets with a very large number of data points and classes is challenging, primarily because the log-partition function must be computed for every data point, which makes the computation hard to distribute. In this paper, we present DS-MLR, a distributed optimization method based on stochastic gradient descent that scales multinomial logistic regression to massive datasets without hitting storage constraints on either the data or the model parameters. Our algorithm exploits double-separability, an attractive property that allows us to achieve data parallelism and model parallelism simultaneously. In addition, we introduce a non-blocking, asynchronous variant of our algorithm that avoids bulk synchronization. We demonstrate the versatility of DS-MLR across various data-parallel and model-parallel scenarios through an extensive empirical study on several real-world datasets. In particular, we demonstrate the scalability of DS-MLR by solving an extreme multi-class classification problem on the Reddit dataset (159 GB data, 358 GB parameters) where, to the best of our knowledge, no other existing method applies.
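
To illustrate why the log-partition function is the bottleneck, here is a minimal sketch of the standard multinomial logistic regression negative log-likelihood. It is not the DS-MLR algorithm itself; the function names (`log_partition`, `mlr_nll`) and the list-based data layout are assumptions made for illustration only. Note that each data point's log-partition term sums over all K classes, so every data point is coupled to every class's weight vector.

```python
import math

def log_partition(scores):
    """Numerically stable log-sum-exp over the per-class scores of one data point."""
    m = max(scores)
    return m + math.log(sum(math.exp(s - m) for s in scores))

def mlr_nll(W, X, y):
    """Average negative log-likelihood of plain multinomial logistic regression.

    W: list of K weight vectors (hypothetical layout, one per class)
    X: list of N feature vectors
    y: list of N integer class labels in [0, K)
    """
    total = 0.0
    for x_i, y_i in zip(X, y):
        # Scoring one data point touches ALL K weight vectors; this coupling
        # is what makes naive data/model partitioning difficult.
        scores = [sum(w_d * x_d for w_d, x_d in zip(w_k, x_i)) for w_k in W]
        total += log_partition(scores) - scores[y_i]
    return total / len(X)
```

With all-zero weights and K classes, every class is equally likely, so the loss in this sketch reduces to log K.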
Subjects: Machine Learning (cs.LG); Machine Learning (stat.ML)
Cite as: arXiv:1604.04706 [cs.LG]
  (or arXiv:1604.04706v7 [cs.LG] for this version)

Submission history

From: Parameswaran Raman
[v1] Sat, 16 Apr 2016 07:26:58 GMT (3347kb,D)
[v2] Fri, 31 Mar 2017 18:45:59 GMT (3320kb,D)
[v3] Tue, 23 May 2017 08:06:02 GMT (2899kb,D)
[v4] Thu, 15 Feb 2018 01:02:54 GMT (2585kb,D)
[v5] Wed, 18 Apr 2018 01:15:04 GMT (2586kb,D)
[v6] Mon, 21 May 2018 23:44:36 GMT (2701kb,D)
[v7] Fri, 3 Aug 2018 22:13:06 GMT (2701kb,D)