Improving Syntactic Probing Correctness and Robustness with Control Tasks

Weicheng Ma; Brian C Wang; Hefan Zhang; Lili Wang; Rolando Coto-Solano; Saeed Hassanpour; Soroush Vosoughi

Main: Interpretability and Analysis of Models for NLP Main-poster Paper

Poster Session 2: Interpretability and Analysis of Models for NLP (Poster)

Conference Room: Frontenac Ballroom and Queen's Quay

Conference Time: July 10, 14:00-15:30 (EDT) (America/Toronto)

Global Time: July 10, Poster Session 2 (18:00-19:30 UTC)

Keywords: probing, robustness

TLDR: Syntactic probing methods have been used to examine whether and how pre-trained language models (PLMs) encode syntactic features. However, the probing methods are usually biased by the PLMs' memorization of common word co-occurrences, even if they do not form syntactic relations. This paper presents...

Abstract: Syntactic probing methods have been used to examine whether and how pre-trained language models (PLMs) encode syntactic features. However, the probing methods are usually biased by the PLMs' memorization of common word co-occurrences, even if they do not form syntactic relations. This paper presents a random-word-substitution and random-label-matching control task to reduce these biases and improve the robustness of syntactic probing methods. Our control tasks are also shown to notably improve the consistency of probing results between different probing methods and make the methods more robust with respect to the text attributes of the probing instances. Our control tasks make syntactic probing methods better at reconstructing syntactic features and more generalizable to unseen text domains. Our experiments show that our proposed control tasks are effective on different PLMs, probing methods, and syntactic features.
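
The abstract names the two ingredients of the control task but not their details, so the following is only a minimal sketch of one possible reading, not the authors' implementation. It assumes that random word substitution replaces each token with a uniformly sampled vocabulary word with some probability, and that random label matching assigns each position a uniformly sampled label. The function name `make_control_instance` and the parameters `vocab`, `label_set`, `sub_prob`, and `seed` are all hypothetical.

```python
import random


def make_control_instance(tokens, labels, vocab, label_set, sub_prob=0.5, seed=0):
    """Build one control instance from a labeled sentence.

    Assumption (not from the paper): each token is swapped for a random
    vocabulary word with probability `sub_prob`, and every gold label is
    replaced by a label drawn uniformly from `label_set`.
    """
    if len(tokens) != len(labels):
        raise ValueError("tokens and labels must have the same length")

    rng = random.Random(seed)  # local generator keeps the result reproducible

    # Random word substitution: break real word co-occurrence patterns.
    control_tokens = [
        rng.choice(vocab) if rng.random() < sub_prob else tok
        for tok in tokens
    ]

    # Random label matching: labels no longer reflect syntactic structure.
    control_labels = [rng.choice(label_set) for _ in labels]

    return control_tokens, control_labels


if __name__ == "__main__":
    toks = ["the", "cat", "sat", "on", "the", "mat"]
    gold = ["det", "nsubj", "root", "case", "det", "obl"]
    vocab = ["dog", "runs", "blue", "under", "quickly"]
    label_set = ["det", "nsubj", "root", "case", "obl"]
    print(make_control_instance(toks, gold, vocab, label_set))
```

Under this reading, a probe that scores well on the control instances is likely relying on memorized surface patterns rather than syntax, so the gap between the original and control scores is the quantity of interest.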