Title: Addressing machine learning concept drift reveals declining vaccine sentiment during the COVID-19 pandemic

Abstract: Social media analysis has become a common approach to assess public opinion on various topics, including those about health, in near real-time. The growing volume of social media posts has led to an increased usage of modern machine learning methods in natural language processing. While the rapid dynamics of social media can capture underlying trends quickly, they also pose a technical problem: algorithms trained on annotated data in the past may underperform when applied to contemporary data. This phenomenon, known as concept drift, can be particularly problematic when rapid shifts occur either in the topic of interest itself, or in the way the topic is discussed. Here, we explore the effect of machine learning concept drift by focussing on vaccine sentiments expressed on Twitter, a topic of central importance especially during the COVID-19 pandemic. We show that while vaccine sentiment has declined considerably during the COVID-19 pandemic in 2020, algorithms trained on pre-pandemic data would have largely missed this decline due to concept drift. Our results suggest that social media analysis systems must address concept drift in a continuous fashion in order to avoid the risk of systematic misclassification of data, which is particularly likely during a crisis when the underlying data can change suddenly and rapidly.
Comments: 9 pages, 4 figures, 3 pages of SI; Minor correction in Figure 1: Bracket was not visible
Subjects: Social and Information Networks (cs.SI); Computation and Language (cs.CL)
ACM classes: I.2.7; J.3
Cite as: arXiv:2012.02197 [cs.SI]
  (or arXiv:2012.02197v2 [cs.SI] for this version)

Submission history

From: Martin Muller
[v1] Thu, 3 Dec 2020 18:53:57 GMT (640kb,D)
[v2] Mon, 7 Dec 2020 11:28:31 GMT (640kb,D)
