Robust Natural Language Understanding with Residual Attention Debiasing

Fei Wang; James Y. Huang; Tianyi Yan; Wenxuan Zhou; Muhao Chen

Findings: Interpretability and Analysis of Models for NLP Findings Paper

Session 1: Interpretability and Analysis of Models for NLP (Virtual Poster)

Conference Room: Pier 7&8

Conference Time: July 10, 11:00-12:30 (EDT) (America/Toronto)

Global Time: July 10, Session 1 (15:00-16:30 UTC)

Spotlight Session: Spotlight - Metropolitan West (Spotlight)

Conference Room: Metropolitan West

Conference Time: July 10, 19:00-21:00 (EDT) (America/Toronto)

Global Time: July 10, Spotlight Session (23:00-01:00 UTC)

Keywords: data shortcuts/artifacts, robustness

Abstract: Natural language understanding (NLU) models often suffer from unintended dataset biases. Among bias mitigation methods, ensemble-based debiasing methods, especially product-of-experts (PoE), have stood out for their impressive empirical success. However, previous ensemble-based debiasing methods typically apply debiasing to top-level logits without directly addressing biased attention patterns. Attention is the main medium of feature interaction and aggregation in pretrained language models (PLMs) and plays a crucial role in producing robust predictions. In this paper, we propose REsidual Attention Debiasing (READ), an end-to-end debiasing method that mitigates unintended biases from attention. Experiments on three NLU benchmarks show that READ significantly improves the out-of-distribution (OOD) performance of BERT-based models, including +12.9% accuracy on HANS, +11.0% accuracy on FEVER-Symmetric, and +2.7% F1 on PAWS. Detailed analyses demonstrate the crucial role of unbiased attention in robust NLU models and show that READ effectively mitigates biases in attention.
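
For readers unfamiliar with the logit-level product-of-experts (PoE) baseline that the abstract contrasts READ with, the following is a minimal NumPy sketch of the commonly used PoE training loss. It is not the READ method and not the authors' code; the function names (`log_softmax`, `poe_loss`, `plain_loss`) and the toy logits are assumptions made for illustration only.

```python
import numpy as np


def log_softmax(logits):
    """Numerically stable log-softmax over the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))


def plain_loss(main_logits, labels):
    """Standard cross-entropy of the main model alone."""
    log_probs = log_softmax(main_logits)
    return -log_probs[np.arange(len(labels)), labels].mean()


def poe_loss(main_logits, bias_logits, labels):
    """Cross-entropy of the product of the main and bias-only experts.

    The two experts are multiplied in probability space, which is a sum in
    log space, and the product is renormalized. During training only the
    main model would be updated through this loss (an assumption of this
    sketch; the bias-only model is treated as fixed).
    """
    combined = log_softmax(main_logits) + log_softmax(bias_logits)
    log_probs = log_softmax(combined)
    return -log_probs[np.arange(len(labels)), labels].mean()


# Hypothetical 3-class example: the main model is undecided, while the
# bias-only model is already confident in the correct class (index 0).
main_logits = np.array([[0.0, 0.0, 0.0]])
bias_logits = np.array([[5.0, 0.0, 0.0]])
labels = np.array([0])

loss_plain = plain_loss(main_logits, labels)
loss_poe = poe_loss(main_logits, bias_logits, labels)
```

In this toy case the PoE loss is much smaller than the plain loss, so an example that the bias-only model already solves contributes little training signal to the main model. This combination happens only at the output logits, which is the limitation the abstract points to when motivating debiasing at the attention level.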