Statistics > Applications
Title: Data Sharing and Resampled LASSO: A word based sentiment Analysis for IMDb data
(Submitted on 16 May 2017 (v1), last revised 18 May 2017 (this version, v2))
Abstract: In this article we study the variable selection problem using LASSO with new modifications. LASSO uses an $\ell_{1}$ penalty, which shrinks most of the coefficients to zero when the number of explanatory variables $(p)$ is much larger than the number of observations $(N)$. The novelty of the approach developed in this article is that it blends the basic ideas behind resampling and LASSO, which provides a significant variable reduction and improved prediction accuracy in terms of mean squared error in the test sample. Different weighting schemes have been explored using Bootstrapped LASSO, the basic methodology developed here. The weighting schemes determine the extent of data blending in the case of grouped data. The data sharing (DSL) technique developed by [11] lies at the root of the present methodology. We apply the technique to analyze the IMDb dataset discussed in [11] and compare our results with those of [11].
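The general idea of combining resampling with LASSO can be illustrated with a minimal sketch. This is not the article's algorithm: it omits the weighting schemes and the data sharing step, and simply refits a LASSO on bootstrap resamples and records how often each variable receives a nonzero coefficient. The function names `lasso_cd` and `selection_frequency`, the penalty value `lam=0.5`, and the toy data are all assumptions made for illustration.

```python
import random


def soft_threshold(z, lam):
    """Soft-thresholding operator: closed-form solution of the 1-D LASSO problem."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso_cd(X, y, lam, sweeps=50):
    """Minimize (1/2N)*||y - X b||^2 + lam*||b||_1 by cyclic coordinate descent."""
    n, p = len(X), len(X[0])
    cols = [[row[j] for row in X] for j in range(p)]
    norms = [sum(v * v for v in col) / n for col in cols]
    beta = [0.0] * p
    resid = list(y)  # residual y - X b, with b = 0 at the start
    for _ in range(sweeps):
        for j in range(p):
            if norms[j] == 0.0:
                continue
            # Correlation of column j with the partial residual
            # (the residual with column j's own contribution added back).
            rho = sum(c * r for c, r in zip(cols[j], resid)) / n + norms[j] * beta[j]
            new = soft_threshold(rho, lam) / norms[j]
            delta = new - beta[j]
            if delta != 0.0:
                resid = [r - c * delta for r, c in zip(resid, cols[j])]
                beta[j] = new
    return beta


def selection_frequency(X, y, lam, n_boot=30, seed=0):
    """Fraction of bootstrap resamples in which each coefficient is nonzero."""
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    counts = [0] * p
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # sample rows with replacement
        beta = lasso_cd([X[i] for i in idx], [y[i] for i in idx], lam)
        for j, b in enumerate(beta):
            if b != 0.0:
                counts[j] += 1
    return [c / n_boot for c in counts]


# Toy data (an assumption): only the first of five variables drives the response.
data_rng = random.Random(1)
n_obs, n_vars = 40, 5
X = [[data_rng.gauss(0, 1) for _ in range(n_vars)] for _ in range(n_obs)]
y = [3.0 * row[0] + data_rng.gauss(0, 0.1) for row in X]

freq = selection_frequency(X, y, lam=0.5)
print(freq)
```

Variables that are selected in most resamples are natural candidates to keep, while those selected rarely can be dropped. How the resampled fits are weighted and combined is where the article's schemes would come in, and that part is not shown here.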
Submission history
From: Ashutosh Maurya
[v1] Tue, 16 May 2017 14:13:47 GMT (471kb,D)
[v2] Thu, 18 May 2017 05:34:10 GMT (471kb,D)