Is Fine-tuning Needed? Pre-trained Language Models Are Near Perfect for Out-of-Domain Detection

Rheeya Uppaal, Junjie Hu, Yixuan Li

Main: Machine Learning for NLP Main-poster Paper

Poster Session 6: Machine Learning for NLP (Poster)
Conference Room: Frontenac Ballroom and Queen's Quay
Conference Time: July 12, 09:00-10:30 (EDT) (America/Toronto)
Global Time: July 12, Poster Session 6 (13:00-14:30 UTC)
Keywords: representation learning
TLDR: Out-of-distribution (OOD) detection is a critical task for reliable predictions over text. Fine-tuning with pre-trained language models has been a de facto procedure to derive OOD detectors with respect to in-distribution (ID) data. Despite its common use, the understanding of the role of fine-tunin...
You can open the #paper-P5688 channel in a separate window.
Abstract: Out-of-distribution (OOD) detection is a critical task for reliable predictions over text. Fine-tuning with pre-trained language models has been a de facto procedure to derive OOD detectors with respect to in-distribution (ID) data. Despite its common use, the understanding of the role of fine-tuning and its necessity for OOD detection is largely unexplored. In this paper, we raise the question: is fine-tuning necessary for OOD detection? We present a study investigating the efficacy of directly leveraging pre-trained language models for OOD detection, without any model fine-tuning on the ID data. We compare the approach with several competitive fine-tuning objectives, and offer new insights under various types of distributional shifts. Extensive experiments demonstrate near-perfect OOD detection performance (with 0\% FPR95 in many cases), strongly outperforming the fine-tuned counterpart.