HHS at SemEval-2023 Task 10: A Comparative Analysis of Sexism Detection Based on the RoBERTa Model
Yao Zhang, Liqing Wang
The 17th International Workshop on Semantic Evaluation (SemEval-2023) Task 10: towards explainable detection of online sexism Paper
Abstract:
This paper describes the methods and models applied by our team, HHS, in SubTask-A of SemEval-2023 Task 10 on sexism detection. We trained on the officially released data and compared the performance of five models: TextCNN, BERT, RoBERTa, XLNet, and Sup-SimCSE-RoBERTa. The experiments show that most of the models achieve good results. We then applied data augmentation, model ensembling, dropout, and other operations to several of these models and compared the results. The approach that performed best on the test set augmented the sexist data using dropout and fed it to the Sup-SimCSE-RoBERTa model, fed the raw data to the XLNet model, and then combined the outputs of the two models. This method achieved a Macro-F1 score of 0.823 in the final evaluation phase of SubTask-A.
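The abstract does not say how the two models' outputs were combined, so the following is only a minimal sketch of one common option: a weighted average of each model's predicted probability for the sexist class, followed by a threshold and a Macro-F1 computation. The function names, the `weight_a` and `threshold` parameters, and the toy values are all hypothetical illustrations, not the authors' implementation or data.

```python
from typing import List, Sequence


def average_probs(p_a: Sequence[float], p_b: Sequence[float],
                  weight_a: float = 0.5) -> List[float]:
    """Blend two models' positive-class probabilities, example by example.

    Assumption: simple weighted averaging; the paper may combine outputs differently.
    """
    if len(p_a) != len(p_b):
        raise ValueError("both models must score the same examples")
    return [weight_a * a + (1.0 - weight_a) * b for a, b in zip(p_a, p_b)]


def to_labels(probs: Sequence[float], threshold: float = 0.5) -> List[int]:
    """Map probabilities to binary labels (1 = sexist, 0 = not sexist)."""
    return [1 if p >= threshold else 0 for p in probs]


def macro_f1(y_true: Sequence[int], y_pred: Sequence[int],
             labels: Sequence[int] = (0, 1)) -> float:
    """Unweighted mean of the per-class F1 scores."""
    scores = []
    for c in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never appears.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy values for illustration only; they are not results from the paper.
y_true = [1, 0, 1, 0]
probs_model_a = [0.9, 0.2, 0.4, 0.6]   # hypothetical scores from one model
probs_model_b = [0.7, 0.1, 0.8, 0.2]   # hypothetical scores from the other
ensemble_labels = to_labels(average_probs(probs_model_a, probs_model_b))
ensemble_score = macro_f1(y_true, ensemble_labels)
```

In this toy case each model alone misclassifies some examples, while the averaged scores fall on the correct side of the threshold for all four. This is the kind of complementary behavior an ensemble is meant to exploit.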