CNLP-NITS at SemEval-2023 Task 10: Online sexism prediction, PREDHATE!

Advaitha Vetagiri, Prottay Adhikary, Partha Pakray, Amitava Das

The 17th International Workshop on Semantic Evaluation (SemEval-2023) Task 10: towards explainable detection of online sexism Paper

TLDR: Online sexism is a rising issue that threatens women's safety, fosters hostile situations, and upholds social inequities. We describe a task SemEval-2023 Task 10 for creating English-language models that can precisely identify and categorize sexist content on internet forums and social platforms lik
You can open the #paper-SemEval_128 channel in a separate window.
Abstract: Online sexism is a rising issue that threatens women's safety, fosters hostile situations, and upholds social inequities. We describe a task SemEval-2023 Task 10 for creating English-language models that can precisely identify and categorize sexist content on internet forums and social platforms like Gab and Reddit as well to provide an explainability in order to address this problem. The problem is divided into three hierarchically organized subtasks: binary sexism detection, sexism by category, and sexism by fine-grained vector. The dataset consists of 20,000 labelled entries. For Task A, pertained models like Convolutional Neural Network (CNN) and Bidirectional Long Short-Term Memory (BiLSTM), which is called CNN-BiLSTM and Generative Pretrained Transformer 2 (GPT-2) models were used, as well as the GPT-2 model for Task B and C, and have provided experimental configurations. According to our findings, the GPT-2 model performs better than the CNN-BiLSTM model for Task A, while GPT-2 is highly accurate for Tasks B and C on the training, validation and testing splits of the training data provided in the task. Our proposed models allow researchers to create more precise and understandable models for identifying and categorizing sexist content in online forums, thereby empowering users and moderators.