Identifying and Categorizing Offensive Language in Social Media

Oswal, Nikhil

Title: Identifying and Categorizing Offensive Language in Social Media

Authors: Nikhil Oswal

(Submitted on 10 Apr 2021)

Abstract: Offensive language is pervasive in social media. Individuals frequently take advantage of the perceived anonymity of computer-mediated communication to engage in behavior that many of them would not consider in real life. The automatic identification of offensive content online is an important task that has gained more attention in recent years. This task can be modeled as a supervised classification problem in which systems are trained on a dataset of posts annotated for the presence of some form(s) of abusive or offensive content. The objective of this study is to describe a classification system built for SemEval-2019 Task 6: OffensEval. This system classifies a tweet as either offensive or not offensive (Sub-task A) and further classifies offensive tweets into categories (Sub-tasks B & C). We trained machine learning and deep learning models, combined with data preprocessing and sampling techniques, to obtain the best results. Models discussed include Naive Bayes, SVM, Logistic Regression, Random Forest and LSTM.
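
To make the supervised framing of Sub-task A concrete, the following is a minimal sketch of a Naive Bayes text classifier, one of the model families named in the abstract. It is not the author's system: the tokenizer, the `NaiveBayes` class, and the six training sentences are hypothetical and exist only for illustration. The `OFF` and `NOT` label names are assumed here to follow the OffensEval convention for Sub-task A.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split into word tokens (a deliberately crude preprocessing step)."""
    return re.findall(r"[a-z0-9']+", text.lower())


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing. Illustrative only."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)       # number of documents per class
        self.word_counts = defaultdict(Counter)   # token counts per class
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        vocab_size = len(self.vocab)
        scores = {}
        for label, n_docs in self.class_counts.items():
            total_words = sum(self.word_counts[label].values())
            score = math.log(n_docs / total_docs)  # log prior
            for token in tokenize(text):
                if token not in self.vocab:
                    continue  # ignore tokens never seen during training
                count = self.word_counts[label][token]
                score += math.log((count + 1) / (total_words + vocab_size))
            scores[label] = score
        return max(scores, key=scores.get)


# Hypothetical toy data; a real system would train on the annotated task dataset.
train_texts = [
    "you are an idiot",
    "what a stupid idiot",
    "shut up you stupid fool",
    "have a nice day",
    "what a lovely day",
    "thank you for the nice help",
]
train_labels = ["OFF", "OFF", "OFF", "NOT", "NOT", "NOT"]

model = NaiveBayes().fit(train_texts, train_labels)
print(model.predict("you stupid idiot"))
print(model.predict("have a lovely day"))
```

The same pattern would extend to Sub-tasks B and C by training further classifiers on only the tweets labeled offensive, although the abstract does not say whether the described system was structured that way.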

Subjects:	Computation and Language (cs.CL); Social and Information Networks (cs.SI)
Cite as:	arXiv:2104.04871 [cs.CL]
	(or arXiv:2104.04871v1 [cs.CL] for this version)

Submission history

From: Nikhil Oswal
[v1] Sat, 10 Apr 2021 22:53:43 GMT (2443kb,D)
