We gratefully acknowledge support from
the Simons Foundation and member institutions.
Full-text links:

Download:

Current browse context:

cs

Change to browse by:

References & Citations

DBLP - CS Bibliography

Bookmark

(what is this?)
CiteULike logo BibSonomy logo Mendeley logo del.icio.us logo Digg logo Reddit logo

Computer Science > Social and Information Networks

Title: Utilizing Deep Learning to Identify Drug Use on Twitter Data

Abstract: The collection and examination of social media has become a useful mechanism for studying the mental activity and behavior tendencies of users. Through the analysis of collected Twitter data, models were developed for classifying drug-related tweets. Using topic pertaining keywords, such as slang and methods of drug consumption, a set of tweets was generated. Potential candidates were then preprocessed resulting in a dataset of 3,696,150 rows. The classification power of multiple methods was compared including support vector machines (SVM), XGBoost, and convolutional neural network (CNN) based classifiers. Rather than simple feature or attribute analysis, a deep learning approach was implemented to screen and analyze the tweets' semantic meaning. The two CNN-based classifiers presented the best result when compared against other methodologies. The first was trained with 2,661 manually labeled samples, while the other included synthetically generated tweets culminating in 12,142 samples. The accuracy scores were 76.35% and 82.31%, with an AUC of 0.90 and 0.91. Additionally, association rule mining showed that commonly mentioned drugs had a level of correspondence with frequently used illicit substances, proving the practical usefulness of the system. Lastly, the synthetically generated set provided increased scores, improving the classification capability and proving the worth of this methodology.
Comments: 20 pages, 12 figures, 8 tables
Subjects: Social and Information Networks (cs.SI); Computation and Language (cs.CL); Machine Learning (cs.LG)
Cite as: arXiv:2003.11522 [cs.SI]
  (or arXiv:2003.11522v1 [cs.SI] for this version)

Submission history

From: Joseph Tassone [view email]
[v1] Sun, 8 Mar 2020 07:52:40 GMT (3836kb,D)

Link back to: arXiv, form interface, contact.