Title: Study of sampling methods in sentiment analysis of imbalanced data

Abstract: This work investigates the application of sampling methods to sentiment analysis on two highly imbalanced datasets. One dataset contains online user reviews from the cooking platform Epicurious, and the other contains comments addressed to the Planned Parenthood organization. In both datasets, the classes of interest are rare. Word n-grams extracted from these datasets were used as features. A feature selection technique based on information gain is first applied to reduce the number of features to a manageable size. Several sampling methods are then applied to mitigate the class imbalance problem, and their effects are analyzed.
Subjects: Computation and Language (cs.CL)
Cite as: arXiv:2106.06673 [cs.CL]
  (or arXiv:2106.06673v1 [cs.CL] for this version)
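
The pipeline outlined in the abstract (word n-gram features, information-gain feature selection, then resampling) can be sketched roughly as follows. This is a minimal, standard-library illustration and not the authors' implementation: the abstract does not name the sampling methods studied, so random oversampling is used here purely as one common example, and all function names and parameters (`word_ngrams`, `select_features`, `random_oversample`, `n_max`, `k`) are hypothetical.

```python
import math
import random
from collections import Counter


def word_ngrams(text, n_max=2):
    """Return the set of word n-grams (n = 1..n_max) found in a text."""
    tokens = text.lower().split()
    return {
        " ".join(tokens[i:i + n])
        for n in range(1, n_max + 1)
        for i in range(len(tokens) - n + 1)
    }


def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    return -sum(
        (count / total) * math.log2(count / total)
        for count in Counter(labels).values()
    )


def information_gain(docs, labels, feature):
    """Entropy reduction obtained by splitting on presence of one n-gram."""
    present = [y for d, y in zip(docs, labels) if feature in d]
    absent = [y for d, y in zip(docs, labels) if feature not in d]
    n = len(labels)
    conditional = (
        (len(present) / n) * entropy(present)
        + (len(absent) / n) * entropy(absent)
    )
    return entropy(labels) - conditional


def select_features(docs, labels, k):
    """Keep the k n-grams with the highest information gain (ties by name)."""
    vocabulary = set().union(*docs)
    ranked = sorted(
        vocabulary,
        key=lambda f: (-information_gain(docs, labels, f), f),
    )
    return ranked[:k]


def random_oversample(rows, labels, seed=0):
    """Duplicate random minority-class rows until every class is as large
    as the largest one. Illustrative choice only; the paper's methods may
    differ."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, count in counts.items():
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for _ in range(target - count):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels
```

In use, each document would be converted to its n-gram set, the top-`k` features selected on the training split, each document encoded as a binary presence vector over those features, and only the training split resampled, so that the evaluation data keeps its original class distribution.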

Submission history

From: Zeeshan Sayyed
[v1] Sat, 12 Jun 2021 03:16:18 GMT (222kb,D)