HausaNLP at SemEval-2023 Task 10: Transfer Learning, Synthetic Data and Side-information for Multi-level Sexism Classification

Saminu Mohammad Aliyu, Idris Abdulmumin, Shamsuddeen Hassan Muhammad, Ibrahim Said Ahmad, Saheed Abdullahi Salahudeen, Aliyu Yusuf, Falalu Ibrahim Lawan

The 17th International Workshop on Semantic Evaluation (SemEval-2023) Task 10: towards explainable detection of online sexism Paper

Abstract: We present the findings of our participation in SemEval-2023 Task 10: Explainable Detection of Online Sexism (EDOS), a shared task on offensive language (sexism) detection in English Gab and Reddit data. We investigated the effects of transferring two language models, XLM-T (sentiment classification) and HateBERT (same domain: Reddit), for multi-level classification into sexist or not sexist, and for the subsequent sub-classifications of the sexist data. We also used synthetic classification of the unlabelled dataset and intermediary class information to maximize the performance of our models. We submitted a system for Task A, and it ranked 49th with an F1-score of 0.82. This result is competitive, as it under-performed the best system by only 0.052% F1-score.
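
The abstract does not say how the synthetic classification of the unlabelled data was produced. As a rough illustration only, one common way to do this is confidence-thresholded pseudo-labelling: a model trained on the labelled data scores each unlabelled text, and only confident predictions are kept as synthetic training examples. The sketch below assumes that approach; the function `pseudo_label`, the stand-in classifier `toy_model`, and the `0.9` threshold are all hypothetical and are not taken from the paper.

```python
from typing import Callable, List, Tuple


def pseudo_label(
    texts: List[str],
    predict_proba: Callable[[str], float],
    threshold: float = 0.9,  # assumed cut-off, not from the paper
) -> List[Tuple[str, int]]:
    """Return (text, label) pairs for texts the model labels confidently."""
    labelled = []
    for text in texts:
        p_sexist = predict_proba(text)  # probability of the "sexist" class
        confidence = max(p_sexist, 1.0 - p_sexist)
        if confidence >= threshold:
            # 1 = sexist, 0 = not sexist
            labelled.append((text, int(p_sexist >= 0.5)))
    return labelled


def toy_model(text: str) -> float:
    """Hypothetical stand-in for a fine-tuned classifier such as HateBERT."""
    if "flagged" in text:
        return 0.95
    if "unsure" in text:
        return 0.55
    return 0.03


unlabelled = ["a flagged post", "an unsure post", "a neutral post"]
synthetic = pseudo_label(unlabelled, toy_model)
```

In this sketch the uncertain example is dropped, and the two confident ones could be appended to the labelled training set before fine-tuning again.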