iREL at SemEval-2023 Task 9: Improving understanding of multilingual Tweets using Translation-Based Augmentation and Domain Adapted Pre-Trained Models

Bhavyajeet Singh, Ankita Maity, Pavan Kandru, Aditya Hari, Vasudeva Varma

The 17th International Workshop on Semantic Evaluation (SemEval-2023) Task 9: multilingual tweet intimacy analysis Paper

TLDR: This paper describes our system (iREL) for the Tweet intimacy analysis shared task of the SemEval 2023 workshop at ACL 2023. Our system achieved an overall Pearson's r score of 0.5924 and ranked 10th on the overall leaderboard. For the unseen languages, we ranked third on the leaderboard and achieved a Pearson's r score of 0.485.
Abstract: This paper describes our system (iREL) for the Tweet intimacy analysis shared task of the SemEval 2023 workshop at ACL 2023. Our system achieved an overall Pearson's r score of 0.5924 and ranked 10th on the overall leaderboard. For the unseen languages, we ranked third on the leaderboard and achieved a Pearson's r score of 0.485. We used a single multilingual model for all languages, as discussed in this paper. We provide a detailed description of our pipeline, along with multiple ablation experiments that analyse each component of the pipeline. We demonstrate how translation-based augmentation, domain-specific features, and domain-adapted pre-trained models improve the understanding of intimacy in tweets. The code can be found at https://github.com/bhavyajeet/Multilingual-tweet-intimacy
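Since the reported scores are Pearson's r values, a minimal sketch of how that correlation between predicted and gold intimacy scores could be computed may help in reading them. This is the standard textbook formula, not the task's official scorer or the authors' evaluation code; the function name `pearson_r` and the toy score lists are assumptions made purely for illustration.

```python
import math


def pearson_r(predicted, gold):
    """Pearson correlation coefficient between two equal-length score lists.

    Hypothetical helper for illustration; not taken from the iREL repository.
    """
    if len(predicted) != len(gold) or len(predicted) < 2:
        raise ValueError("need two lists of equal length with at least 2 items")

    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_g = sum(gold) / n

    # Sum of products of deviations from the means (unnormalised covariance).
    cov = sum((p - mean_p) * (g - mean_g) for p, g in zip(predicted, gold))
    # Sums of squared deviations (unnormalised variances).
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_g = sum((g - mean_g) ** 2 for g in gold)

    if var_p == 0 or var_g == 0:
        raise ValueError("correlation is undefined for constant inputs")

    return cov / math.sqrt(var_p * var_g)


# Toy values, invented for this example only (not data from the shared task).
gold_scores = [1.0, 2.5, 3.0, 4.5]
model_scores = [1.2, 2.0, 3.5, 4.0]
print(round(pearson_r(model_scores, gold_scores), 4))
```

A value of 1 indicates a perfect positive linear relationship between predictions and gold labels, 0 indicates none, and -1 a perfect negative one, which is the scale on which the 0.5924 and 0.485 scores above are expressed.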