UMUTeam at SemEval-2023 Task 12: Ensemble Learning of LLMs applied to Sentiment Analysis for Low-resource African Languages

José Antonio García-Díaz; Camilo Caparros-laiz; Ángela Almela; Gema Alcaráz-Mármol; María José Marín-Pérez; Rafael Valencia-García


The 17th International Workshop on Semantic Evaluation (SemEval-2023), Task 12: AfriSenti-SemEval: Sentiment Analysis for Low-resource African Languages using Twitter Dataset



Abstract: These working notes summarize the participation of the UMUTeam in the SemEval 2023 shared task AfriSenti, focused on Sentiment Analysis in several African languages. Two subtasks are proposed: one in which each language is considered separately, and another in which all languages are merged. Our proposal to solve both subtasks is based on the combination of features extracted from several multilingual Large Language Models and a subset of language-independent linguistic features. Our best results are achieved with the African languages less represented in the training set: Xitsonga, a Mozambique dialect, with a weighted F1-score of 54.89%; Algerian Arabic, with a weighted F1-score of 68.52%; Swahili, with a weighted F1-score of 60.52%; and Twi, with a weighted F1-score of 71.14%.
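
For readers unfamiliar with the metric reported above, the following is a minimal sketch of how a weighted F1-score is typically computed: the F1 of each class is averaged with weights proportional to the number of gold instances of that class. The function name `weighted_f1` and the toy labels are illustrative assumptions and are not taken from the paper or from the official task scorer.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Support-weighted average of per-class F1 scores (illustrative sketch)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one example is required")

    support = Counter(y_true)  # number of gold instances per class
    total = len(y_true)
    score = 0.0
    for label, n_gold in support.items():
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = n_gold - tp
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        # Each class contributes in proportion to its share of gold labels.
        score += (n_gold / total) * f1
    return score


# Toy data, invented purely for illustration.
gold = ["positive", "positive", "negative", "neutral"]
pred = ["positive", "negative", "negative", "neutral"]
print(round(weighted_f1(gold, pred), 4))
```

Because the weights follow the gold label distribution, frequent classes dominate the score, which is worth keeping in mind when comparing figures across languages with different class balances.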