NLP_CHRISTINE at SemEval-2023 Task 10: Utilizing Transformer Contextual Representations and Ensemble Learning for Sexism Detection on Social Media Texts

Christina Christodoulou

The 17th International Workshop on Semantic Evaluation (SemEval-2023) Task 10: towards explainable detection of online sexism Paper

TLDR: The paper describes the SemEval-2023 Task 10: "Explainable Detection of Online Sexism (EDOS)", which investigates the detection of sexism on two social media sites, Gab and Reddit, by encouraging the development of machine learning models that perform binary and multi-class classification on English texts.
Abstract: The paper describes the SemEval-2023 Task 10: "Explainable Detection of Online Sexism (EDOS)", which investigates the detection of sexism on two social media sites, Gab and Reddit, by encouraging the development of machine learning models that perform binary and multi-class classification on English texts. The EDOS Task consisted of three hierarchical sub-tasks: binary sexism detection in sub-task A, detection of the category of sexism in sub-task B, and detection of the fine-grained vector of sexism in sub-task C. My participation in EDOS comprised fine-tuning different layer representations of Transformer-based pre-trained language models, namely BERT, ALBERT and RoBERTa, and ensemble learning via majority voting over the best-performing models. Although the system ranked low, mainly because of a submission error, it employed the largest versions of the aforementioned Transformer models (BERT-Large, ALBERT-XXLarge-v1, ALBERT-XXLarge-v2, RoBERTa-Large), experimented with their multi-layer structure, and aggregated their predictions to obtain the final result. My predictions on the test sets achieved Macro-F1 scores of 82.88%, 63.77% and 43.08% in sub-tasks A, B and C respectively.
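
The majority-voting step mentioned in the abstract can be illustrated with a minimal sketch. This is not the system's actual code: the function name `majority_vote`, the example predictions, and the tie-breaking rule (the label from the earliest-listed model wins) are all assumptions made for illustration, since the abstract does not say how ties were resolved.

```python
from collections import Counter


def majority_vote(predictions_per_model):
    """Combine per-model label lists into one label per example.

    `predictions_per_model` is a list with one entry per model; each entry
    is a list of predicted labels, aligned by example index.
    """
    if not predictions_per_model:
        raise ValueError("need predictions from at least one model")
    n_examples = len(predictions_per_model[0])
    if any(len(p) != n_examples for p in predictions_per_model):
        raise ValueError("all models must predict the same number of examples")

    ensembled = []
    for votes in zip(*predictions_per_model):
        # most_common(1) returns the label with the highest count. On a tie,
        # Counter keeps first-encountered order, so the label voted by the
        # earliest-listed model wins (an assumed tie-breaking rule).
        ensembled.append(Counter(votes).most_common(1)[0][0])
    return ensembled


# Hypothetical sub-task A predictions from three fine-tuned models.
model_a = ["sexist", "not sexist", "sexist", "not sexist"]
model_b = ["sexist", "sexist", "not sexist", "not sexist"]
model_c = ["not sexist", "sexist", "sexist", "not sexist"]

final_labels = majority_vote([model_a, model_b, model_c])
print(final_labels)
```

With an odd number of models and binary labels, as in sub-task A, ties cannot occur; in the multi-class sub-tasks B and C the tie-breaking rule starts to matter, so listing the strongest model first is one reasonable choice under this sketch.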