JCT at SemEval-2023 Tasks 12 A and 12B: Sentiment Analysis for Tweets Written in Low-resource African Languages using Various Machine Learning and Deep Learning Methods, Resampling, and HyperParameter Tuning

Ron Keinan, Yaakov Hacohen-Kerner

The 17th International Workshop on Semantic Evaluation (SemEval-2023) Task 12: AfriSenti-SemEval: Sentiment Analysis for Low-resource African Languages using Twitter Dataset Paper

Abstract: In this paper, we describe our submissions to the SemEval-2023 contest. We tackled Task 12 (subtasks A and B), "AfriSenti-SemEval: Sentiment Analysis for Low-resource African Languages using Twitter Dataset". We developed a separate model for each of 12 African languages and a 13th model for a multilingual dataset built from these 12 languages. We applied a wide variety of word and character n-gram features weighted by their TF-IDF values, 4 classical machine learning methods, 2 deep learning methods, and 3 oversampling methods. We also used 12 sentiment lexicons and applied extensive hyperparameter tuning.
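To make the feature and resampling steps named in the abstract more concrete, the following is a minimal, standard-library-only sketch of character n-gram TF-IDF weighting followed by random oversampling. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not say which n-gram ranges, which idf variant, or which three oversampling methods were used, so the n-gram range (2 to 3), the smoothed idf formula, the choice of random oversampling, the function names, and the toy English texts are all hypothetical.

```python
import math
import random
from collections import Counter


def char_ngrams(text, n_min=2, n_max=3):
    """Return all character n-grams of length n_min..n_max (range is an assumption)."""
    return [text[i:i + n]
            for n in range(n_min, n_max + 1)
            for i in range(len(text) - n + 1)]


def tfidf_vectors(docs, n_min=2, n_max=3):
    """Map each document to a sparse dict {ngram: tf * idf}.

    Uses a smoothed idf, ln((1 + N) / (1 + df)) + 1, chosen here for
    illustration; the abstract does not specify the variant.
    """
    counts = [Counter(char_ngrams(d, n_min, n_max)) for d in docs]
    # Document frequency: number of documents containing each n-gram.
    df = Counter(g for c in counts for g in c)
    n_docs = len(docs)
    idf = {g: math.log((1 + n_docs) / (1 + k)) + 1.0 for g, k in df.items()}
    return [{g: tf * idf[g] for g, tf in c.items()} for c in counts]


def random_oversample(samples, labels, seed=0):
    """Duplicate randomly chosen minority-class samples until every class
    has as many samples as the largest class (one possible oversampling method)."""
    rng = random.Random(seed)
    by_label = {}
    for s, y in zip(samples, labels):
        by_label.setdefault(y, []).append(s)
    target = max(len(group) for group in by_label.values())
    out_samples, out_labels = [], []
    for y, group in by_label.items():
        extra = [rng.choice(group) for _ in range(target - len(group))]
        out_samples.extend(group + extra)
        out_labels.extend([y] * target)
    return out_samples, out_labels


# Made-up toy data, not taken from the AfriSenti dataset.
texts = ["good day", "very good", "bad day", "so good"]
labels = ["positive", "positive", "negative", "positive"]

vectors = tfidf_vectors(texts)
balanced_x, balanced_y = random_oversample(vectors, labels)
print(Counter(balanced_y))
```

In a sketch like this, oversampling would normally be applied to the training split only, so that duplicated examples do not leak into evaluation data; the balanced vectors could then be passed to any classifier.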