Zero-Shot Text Classification via Self-Supervised Tuning

Chaoqun Liu, Wenxuan Zhang, Guizhen Chen, Xiaobao Wu, Anh Tuan Luu, Chip Hong Chang, Lidong Bing

Findings Paper: NLP Applications

Session 7: NLP Applications (Virtual Poster)
Conference Room: Pier 7&8
Conference Time: July 12, 11:00-12:30 (EDT) (America/Toronto)
Global Time: July 12, Session 7 (15:00-16:30 UTC)
Keywords: zero-shot text classification, self-supervised tuning, prompting
TLDR: Existing solutions to zero-shot text classification either conduct prompting with pre-trained language models, which is sensitive to the choice of templates, or rely on large-scale annotated data of relevant tasks for meta-tuning. In this work, we propose a new paradigm based on self-supervised learning that tunes language models with unlabeled data, called self-supervised tuning.
Abstract: Existing solutions to zero-shot text classification either conduct prompting with pre-trained language models, which is sensitive to the choice of templates, or rely on large-scale annotated data of relevant tasks for meta-tuning. In this work, we propose a new paradigm based on self-supervised learning to solve zero-shot text classification tasks by tuning language models with unlabeled data, called self-supervised tuning. By exploiting the inherent structure of free texts, we propose a new learning objective called first sentence prediction to bridge the gap between unlabeled data and text classification tasks. After tuning the model to predict the first sentence of a paragraph based on the rest, the model is able to conduct zero-shot inference on unseen tasks such as topic classification and sentiment analysis. Experimental results show that our model outperforms the state-of-the-art baselines on 7 out of 10 tasks. Moreover, the analysis reveals that our model is less sensitive to the prompt design. Our code and pre-trained models are publicly available at https://github.com/DAMO-NLP-SG/SSTuning.
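The abstract compresses the first-sentence-prediction objective into a single sentence, so a sketch may help make it concrete. The snippet below illustrates how unlabeled paragraphs could be turned into multiple-choice training examples: the paragraph's own first sentence is the correct option among sampled distractor first sentences. All function names, the option count, and the naive sentence splitting are our own assumptions for illustration; the authors' exact construction is in the linked repository.

```python
# A minimal sketch of the "first sentence prediction" (FSP) objective,
# assuming plain-text paragraphs as input. Names and settings here are
# illustrative, not the paper's exact configuration.
import random

OPTION_LETTERS = "ABCDEFGHIJKLMNOPQRST"  # index labels for the candidate options


def make_fsp_instance(paragraph: str, distractor_pool: list[str],
                      num_options: int = 5, seed: int | None = None) -> dict:
    """Turn one unlabeled paragraph into a multiple-choice FSP example.

    The paragraph's first sentence is the correct answer; first sentences
    drawn from other paragraphs serve as distractors. The model is tuned to
    predict the option index of the true first sentence given the rest of
    the paragraph, which mirrors picking a label name given an input text.
    """
    rng = random.Random(seed)
    sentences = paragraph.split(". ")  # naive split; a real pipeline would use a sentence tokenizer
    if len(sentences) < 2:
        raise ValueError("paragraph must contain at least two sentences")
    first, rest = sentences[0], ". ".join(sentences[1:])

    # Avoid distractors identical to the true first sentence.
    pool = [s for s in distractor_pool if s != first]
    options = rng.sample(pool, num_options - 1) + [first]
    rng.shuffle(options)
    label = options.index(first)  # index the classifier must predict

    prompt = " ".join(f"({OPTION_LETTERS[i]}) {opt}" for i, opt in enumerate(options))
    return {"text": f"{prompt} {rest}", "label": label}
```

At inference time, the same input format can plausibly be reused on an unseen task by replacing the candidate first sentences with the task's label descriptions (e.g., "It was great." vs. "It was terrible." for sentiment), so the tuned model's predicted option index becomes the class prediction; see the repository for the released checkpoints and the exact prompt format.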