Pre-trained Language Models Can be Fully Zero-Shot Learners

Xuandong Zhao; Siqi Ouyang; Zhiguo Yu; Ming Wu; Lei Li

Main: Large Language Models Main-oral Paper

Session 4: Large Language Models (Oral)

Conference Room: Metropolitan Centre

Conference Time: July 11, 11:00-12:30 (EDT) (America/Toronto)

Global Time: July 11, Session 4 (15:00-16:30 UTC)

Keywords: prompting

Abstract: How can we extend a pre-trained model to many language understanding tasks without labeled or additional unlabeled data? Pre-trained language models (PLMs) have been effective for a wide range of NLP tasks. However, existing approaches either require fine-tuning on downstream labeled datasets or rely on manually constructed prompts. In this paper, we propose nonparametric prompting PLM (NPPrompt) for fully zero-shot language understanding. Unlike previous methods, NPPrompt uses only pre-trained language models and does not require any labeled data or additional raw corpus for further fine-tuning, nor does it rely on humans to construct a comprehensive set of prompt label words. We evaluate NPPrompt against previous major few-shot and zero-shot learning methods on diverse NLP tasks, including text classification, text entailment, similar text retrieval, paraphrasing, and multiple-choice question answering. Experimental results demonstrate that NPPrompt outperforms the previous best fully zero-shot method by large margins, with absolute gains of 12.8% in accuracy on text classification and 15.6% on the GLUE benchmark. Our source code is available at https://anonymous.4open.science/r/NPPrompt.
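
The abstract does not spell out how label words are obtained without human curation. As a rough illustration only, the sketch below assumes one plausible mechanism: for each category name, take its nearest neighbors in the model's word-embedding space as label words, then score the category by a similarity-weighted average of the masked-LM logits for those words. The embedding table, the logits, and every function name here (`nearest_label_words`, `class_score`, `predict`) are hypothetical toy stand-ins, not the released NPPrompt code or any library API.

```python
import math

# Hypothetical toy embedding table standing in for a PLM's word embeddings.
TOY_EMBEDDINGS = {
    "sports": [0.9, 0.1, 0.0],
    "football": [0.8, 0.2, 0.1],
    "tennis": [0.85, 0.05, 0.1],
    "business": [0.1, 0.9, 0.0],
    "finance": [0.15, 0.85, 0.1],
    "market": [0.2, 0.8, 0.05],
    "the": [0.3, 0.3, 0.9],
}

# Hypothetical masked-LM logits at the [MASK] position for one input text.
TOY_MASK_LOGITS = {
    "sports": 2.0, "football": 3.5, "tennis": 1.0,
    "business": 0.2, "finance": 0.1, "market": 0.4, "the": 1.5,
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def nearest_label_words(label, embeddings, k):
    """Return the k (word, similarity) pairs closest to the category name."""
    anchor = embeddings[label]
    scored = [(word, cosine(anchor, vec)) for word, vec in embeddings.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


def class_score(label, mask_logits, embeddings, k=3):
    """Average the logits of the label words, weighted by softmax(similarity)."""
    neighbors = nearest_label_words(label, embeddings, k)
    normalizer = sum(math.exp(sim) for _, sim in neighbors)
    return sum(math.exp(sim) / normalizer * mask_logits[word]
               for word, sim in neighbors)


def predict(labels, mask_logits, embeddings, k=3):
    """Pick the category whose automatically chosen label words score highest."""
    return max(labels, key=lambda lab: class_score(lab, mask_logits, embeddings, k))


if __name__ == "__main__":
    print(predict(["sports", "business"], TOY_MASK_LOGITS, TOY_EMBEDDINGS))
```

With these toy values, the call selects `sports`, because the words nearest to that category name carry the highest logits. In a real setting the embeddings and logits would come from the pre-trained model itself, so no labeled data or hand-picked label words enter the pipeline; the neighbor count `k` is an assumed tuning knob.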