Sentiment analysis model for Twitter data in Polish language

Chlasta, Karol

(Submitted on 3 Nov 2019)

Abstract: This work is a text mining analysis of tweets gathered during the Polish presidential election on May 10th, 2015. The project included implementing an engine to retrieve information from Twitter, building document corpora, cleaning the corpora, and creating a Term-Document Matrix. Each tweet in the text corpora was assigned a category based on its sentiment score. The score was calculated from the number of positive and/or negative emoticons and Polish words in each document. The resulting data set was used to train and test four machine learning classifiers in order to select those providing the most accurate automatic tweet classification results. The Naive Bayes and Maximum Entropy algorithms achieved the best accuracy, 71.76% and 77.32% respectively. All implementation tasks were completed using the R programming language.
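
The scoring and matrix-building steps described in the abstract can be illustrated with a minimal sketch. It is written in Python for illustration only (the paper's implementation was in R) and is not the author's code. The lexicons, the tokenizer pattern, the "positive minus negative" scoring rule, and the three category names are all assumptions, since the abstract does not specify them.

```python
import re
from collections import Counter

# Hypothetical, tiny lexicons for illustration only; the paper's actual
# Polish word lists and emoticon sets are not given in the abstract.
POSITIVE = {"dobry", "super", "brawo", ":)", ":-)", ":D"}
NEGATIVE = {"zły", "wstyd", "porażka", ":(", ":-("}

# Assumed tokenizer: a few common emoticons first, then runs of word characters.
TOKEN_RE = re.compile(r"[:;=]-?[()D]|\w+")


def tokenize(tweet):
    """Split a tweet into tokens; lowercase words, keep emoticons unchanged."""
    tokens = []
    for tok in TOKEN_RE.findall(tweet):
        tokens.append(tok.lower() if tok[0].isalnum() else tok)
    return tokens


def sentiment_score(tokens):
    """Assumed rule: count of positive tokens minus count of negative tokens."""
    pos = sum(1 for t in tokens if t in POSITIVE)
    neg = sum(1 for t in tokens if t in NEGATIVE)
    return pos - neg


def categorize(score):
    """Map a score to an assumed three-way category."""
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def term_document_matrix(docs):
    """Return (terms, matrix), where matrix[i][j] counts term i in document j."""
    counts = [Counter(doc) for doc in docs]
    terms = sorted(set().union(*counts))
    matrix = [[c[t] for c in counts] for t in terms]
    return terms, matrix


# Made-up example tweets, not data from the paper.
tweets = ["Brawo, super debata :)", "Wstyd i porażka :(", "Wybory w niedzielę"]
docs = [tokenize(t) for t in tweets]
labels = [categorize(sentiment_score(d)) for d in docs]
terms, tdm = term_document_matrix(docs)
```

In this sketch, `labels` holds one category per tweet and `tdm` has one row per term and one column per tweet. Labels produced this way could then serve as training targets for classifiers such as Naive Bayes or Maximum Entropy, which the abstract reports as the most accurate of the four classifiers tested.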

Subjects:	Computation and Language (cs.CL); Information Retrieval (cs.IR); Machine Learning (cs.LG); Social and Information Networks (cs.SI)
Cite as:	arXiv:1911.00985 [cs.CL]
	(or arXiv:1911.00985v1 [cs.CL] for this version)

Submission history

From: Karol Chlasta
[v1] Sun, 3 Nov 2019 22:06:03 GMT (766kb)
