Title: Surveillance of COVID-19 Pandemic using Social Media: A Reddit Study in North Carolina

Abstract: The coronavirus disease (COVID-19) pandemic has changed various aspects of people's lives and behaviors. At this stage, the only ways to control the natural progression of the disease are mitigation strategies such as wearing masks, keeping distance, and washing hands. Moreover, at this time of social distancing, social media plays a key role in connecting people and providing a platform for expressing their feelings. In this study, we tap into social media to surveil the uptake of mitigation and detection strategies and to capture issues and concerns about the pandemic. In particular, we explore the research question, "How much can be learned regarding the public uptake of mitigation strategies and concerns about the COVID-19 pandemic by using natural language processing on Reddit posts?" After extracting COVID-related posts from the four largest subreddit communities of North Carolina over six months, we performed NLP-based preprocessing to clean the noisy data. We employed a custom named-entity recognition (NER) system and a Latent Dirichlet Allocation (LDA) method for topic modeling on the Reddit corpus. We observed that 'mask', 'flu', and 'testing' are the most prevalent named entities for the "Personal Protective Equipment", "symptoms", and "testing" categories, respectively. We also observed that the most discussed topics are related to testing, masks, and employment. Mitigation measures are the most prevalent theme of discussion across all subreddits.
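The per-category prevalence count described in the abstract can be illustrated with a minimal sketch. This is not the authors' custom NER system: it swaps in a plain dictionary (gazetteer) lookup as a stand-in, and the term lists, the names `GAZETTEER`, `count_entities`, and `most_prevalent`, and any example posts are assumptions made purely for illustration.

```python
import re
from collections import Counter, defaultdict

# Hypothetical gazetteer (category -> surface terms). Illustrative only;
# the paper's actual entity categories and vocabulary are not reproduced here.
GAZETTEER = {
    "Personal Protective Equipment": {"mask", "gloves", "face shield"},
    "symptoms": {"flu", "fever", "cough"},
    "testing": {"testing", "test", "swab"},
}

def count_entities(posts, gazetteer=GAZETTEER):
    """Count case-insensitive whole-word matches of each term, per category."""
    counts = defaultdict(Counter)
    for post in posts:
        text = post.lower()
        for category, terms in gazetteer.items():
            for term in terms:
                # \b anchors keep "test" from matching inside "testing".
                pattern = r"\b" + re.escape(term) + r"\b"
                counts[category][term] += len(re.findall(pattern, text))
    return counts

def most_prevalent(counts):
    """Return the most frequent term for each category that has any terms."""
    return {cat: c.most_common(1)[0][0] for cat, c in counts.items() if c}
```

Calling `most_prevalent(count_entities(posts))` on a list of post strings returns one top term per category. A real system would also need to handle plurals, misspellings, and context, which a simple lookup like this ignores.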
Comments: 12 pages, 6 figures, 7 tables, to be published in ACM-BCB 2021, corrected misspelled author
Subjects: Social and Information Networks (cs.SI); Information Retrieval (cs.IR); Machine Learning (cs.LG)
Cite as: arXiv:2106.04515 [cs.SI]
  (or arXiv:2106.04515v3 [cs.SI] for this version)

Submission history

From: Christopher Whitfield
[v1] Mon, 7 Jun 2021 06:55:25 GMT (1376kb)
[v2] Wed, 9 Jun 2021 01:04:35 GMT (2086kb)
[v3] Thu, 10 Jun 2021 03:48:19 GMT (2086kb)
