On Analyzing Annotation Consistency in Online Abusive Behavior Datasets

Awal, Md Rabiul; Cao, Rui; Lee, Roy Ka-Wei; Mitrović, Sandra

(Submitted on 24 Jun 2020)

Abstract: Online abusive behavior is an important issue that breaks the cohesiveness of online social communities and even raises public safety concerns in our societies. Motivated by this rising issue, researchers have proposed, collected, and annotated online abusive content datasets. These datasets play a critical role in facilitating research on online hate speech and abusive behavior. However, annotating such datasets is difficult: the true label of a given text is often contentious, because the semantic differences between labels may be blurred (e.g., abusive and hate) and the judgments are often subjective. In this study, we propose an analytical framework to study annotation consistency in online hate and abusive content datasets. We apply the framework to evaluate the consistency of the annotations in three popular datasets that are widely used in online hate speech and abusive behavior studies. We find that there is still a substantial amount of annotation inconsistency in the existing datasets, particularly when the labels are semantically similar.

Subjects:	Social and Information Networks (cs.SI); Computation and Language (cs.CL)
Cite as:	arXiv:2006.13507 [cs.SI]
	(or arXiv:2006.13507v1 [cs.SI] for this version)

Submission history

From: Md Rabiul Awal
[v1] Wed, 24 Jun 2020 06:34:25 GMT (98kb,D)
