We gratefully acknowledge support from
the Simons Foundation and member institutions.
Full-text links:


Current browse context:


Change to browse by:

References & Citations

DBLP - CS Bibliography


(what is this?)
CiteULike logo BibSonomy logo Mendeley logo del.icio.us logo Digg logo Reddit logo ScienceWISE logo

Computer Science > Machine Learning

Title: Checkovid: A COVID-19 misinformation detection system on Twitter using network and content mining perspectives

Abstract: During the COVID-19 pandemic, social media platforms were ideal for communicating due to social isolation and quarantine. Also, it was the primary source of misinformation dissemination on a large scale, referred to as the infodemic. Therefore, automatic debunking misinformation is a crucial problem. To tackle this problem, we present two COVID-19 related misinformation datasets on Twitter and propose a misinformation detection system comprising network-based and content-based processes based on machine learning algorithms and NLP techniques. In the network-based process, we focus on social properties, network characteristics, and users. On the other hand, we classify misinformation using the content of the tweets directly in the content-based process, which contains text classification models (paragraph-level and sentence-level) and similarity models. The evaluation results on the network-based process show the best results for the artificial neural network model with an F1 score of 88.68%. In the content-based process, our novel similarity models, which obtained an F1 score of 90.26%, show an improvement in the misinformation classification results compared to the network-based models. In addition, in the text classification models, the best result was achieved using the stacking ensemble-learning model by obtaining an F1 score of 95.18%. Furthermore, we test our content-based models on the Constraint@AAAI2021 dataset, and by getting an F1 score of 94.38%, we improve the baseline results. Finally, we develop a fact-checking website called Checkovid that uses each process to detect misinformative and informative claims in the domain of COVID-19 from different perspectives.
Comments: 20 Pages, 18 Figures, 7 Tables, Submitted for Review Process in a Journal
Subjects: Machine Learning (cs.LG); Computation and Language (cs.CL); Social and Information Networks (cs.SI)
MSC classes: 68T05, 68T07
ACM classes: I.2; I.5
Cite as: arXiv:2107.09768 [cs.LG]
  (or arXiv:2107.09768v1 [cs.LG] for this version)

Submission history

From: Mehdi Ghatee Dr. [view email]
[v1] Tue, 20 Jul 2021 20:58:23 GMT (1971kb)

Link back to: arXiv, form interface, contact.