GSAC: A Gujarati Sentiment Analysis Corpus from Twitter

Monil Gokani, Radhika Mamidi

The 13th Workshop on Computational Approaches to Subjectivity, Sentiment, & Social Media Analysis Long Paper

TLDR: Sentiment Analysis is an important task for analysing online content across languages for tasks such as content moderation and opinion mining. Though a significant amount of resources are available for Sentiment Analysis in several Indian languages, there do not exist any large-scale, open-access co
You can open the #paper-WASSA_22 channel in a separate window.
Abstract: Sentiment Analysis is an important task for analysing online content across languages for tasks such as content moderation and opinion mining. Though a significant amount of resources are available for Sentiment Analysis in several Indian languages, there do not exist any large-scale, open-access corpora for Gujarati. Our paper presents and describes the Gujarati Sentiment Analysis Corpus (GSAC), which has been sourced from Twitter and manually annotated by native speakers of the language. We describe in detail our collection and annotation processes and conduct extensive experiments on our corpus to provide reliable baselines for future work using our dataset.