iREL at SemEval-2023 Task 9: Improving understanding of multilingual Tweets using Translation-Based Augmentation and Domain Adapted Pre-Trained Models
Bhavyajeet Singh, Ankita Maity, Pavan Kandru, Aditya Hari, Vasudeva Varma
The 17th International Workshop on Semantic Evaluation (SemEval-2023) Task 9: multilingual tweet intimacy analysis Paper
TLDR:
This paper describes our system (iREL) for the Tweet intimacy analysis shared task of the SemEval 2023 workshop at ACL 2023. Our system achieved an overall Pearson's r score of 0.5924 and ranked 10th on the overall leaderboard. For the unseen languages, we ranked third on the leaderboard and achieved a Pearson's r score of 0.485.
Abstract:
This paper describes our system (iREL) for the Tweet intimacy analysis shared task of the SemEval 2023 workshop at ACL 2023. Our system achieved an overall Pearson's r score of 0.5924 and ranked 10th on the overall leaderboard. For the unseen languages, we ranked third on the leaderboard and achieved a Pearson's r score of 0.485. We used a single multilingual model for all languages, as discussed in this paper. We provide a detailed description of our pipeline along with multiple ablation experiments to further analyse each component of the pipeline. We demonstrate how translation-based augmentation, domain-specific features, and domain-adapted pre-trained models improve the understanding of intimacy in tweets. The code can be found at https://github.com/bhavyajeet/Multilingual-tweet-intimacy