SMS Sentiment Classification based on Lexical Features, Emoticons and Informal Abbreviations

Authors

  • Branislava Šandrih, Faculty of Philology, University of Belgrade, Belgrade, Serbia

DOI:

https://doi.org/10.55630/sjc.2019.13.81-96

Keywords:

computer application in arts and humanities, web-based services, document analysis

Abstract

In this paper, we investigate the influence of emoticons, informal speech, and lexical and other linguistic features on the sentiment contained in SMS messages. Using a dataset of ~6,000 samples, we trained a linear SVM classifier that distinguishes positive, negative and neutral sentiment. The dataset consists mostly of messages in Serbian, with some in English and German. The classifier achieved an average accuracy of 92.3% in 5-fold cross-validation, with F1-scores of 92.1%, 74.0% and 93.3% for the positive, negative and neutral classes, respectively.
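To make the reported metrics concrete, the following is a minimal sketch of how accuracy and per-class (one-vs-rest) F1 can be computed for a three-class sentiment task on a single evaluation fold. It is not the paper's code: the function names `accuracy` and `per_class_f1` are hypothetical, and the toy `gold` and `pred` labels are made up for illustration and are not drawn from the SMS dataset. In a 5-fold setting, such scores would presumably be computed per fold and then averaged.

```python
from collections import Counter

# The three sentiment classes named in the abstract.
LABELS = ("positive", "negative", "neutral")


def accuracy(y_true, y_pred):
    """Fraction of messages whose predicted label equals the gold label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def per_class_f1(y_true, y_pred, labels=LABELS):
    """One-vs-rest F1 per label, using F1 = 2*TP / (2*TP + FP + FN)."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class gains a false positive
            fn[t] += 1  # gold class gains a false negative
    scores = {}
    for label in labels:
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores[label] = 2 * tp[label] / denom if denom else 0.0
    return scores


# Toy labels (illustrative only, not data from the paper).
gold = ["positive", "positive", "negative", "neutral", "neutral", "neutral"]
pred = ["positive", "neutral", "negative", "neutral", "neutral", "positive"]

print(accuracy(gold, pred))
print(per_class_f1(gold, pred))
```

Reporting F1 per class alongside accuracy matters when classes are imbalanced: a class with few examples can score much lower than the overall accuracy suggests, which is consistent with the lower F1 reported for the negative class.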

Published

2019-10-03

Section

Articles