Appeal for Attention at SemEval-2023 Task 3: Data augmentation extension strategies for detection of online news persuasion techniques

Sergiu Amihaesei, Laura Cornei, George Stoica

The 17th International Workshop on Semantic Evaluation (SemEval-2023) Task 3: detecting the category, the framing, and the persuasion techniques in online news in a multi-lingual setup Paper

TLDR: In this paper, we proposed and explored the impact of four different dataset augmentation andextension strategies that we used for solving the subtask 3 of SemEval-2023 Task 3: multi-label persuasion techniques classification in a multi-lingual context. We consider two types of augmentation methods
You can open the #paper-SemEval_95 channel in a separate window.
Abstract: In this paper, we proposed and explored the impact of four different dataset augmentation andextension strategies that we used for solving the subtask 3 of SemEval-2023 Task 3: multi-label persuasion techniques classification in a multi-lingual context. We consider two types of augmentation methods (one based on a modified version of synonym replacement and one based on translations) and two ways of extending the training dataset (using filtered data generated by GPT-3 and using a dataset from a previous competition). We studied the effects of the aforementioned techniques by using theaugmented and/or extended training dataset to fine-tune a pretrained XLM-RoBERTa-Large model. Using the augmentation methods alone, we managed to obtain 3rd place for English, 13th place for Italian and between the 5th to 9th places for the other 7 languages during the competition.