A Multi-dimensional study on Bias in Vision-Language models
Gabriele Ruggeri, Debora Nozza
Findings: Ethics and NLP (Findings Paper)
Session 1: Ethics and NLP (Virtual Poster)
Conference Room: Pier 7&8
Conference Time: July 10, 11:00-12:30 (EDT) (America/Toronto)
Global Time: July 10, Session 1 (15:00-16:30 UTC)
Keywords:
model bias/fairness evaluation
TLDR:
In recent years, joint Vision-Language (VL) models have increased in popularity and capability. Very few studies have attempted to investigate bias in VL models, even though it is a well-known issue in both individual modalities.
This paper presents the first multi-dimensional analysis of bias in English VL models, focusing on gender, ethnicity, and age as dimensions.
Abstract:
In recent years, joint Vision-Language (VL) models have increased in popularity and capability. Very few studies have attempted to investigate bias in VL models, even though it is a well-known issue in both individual modalities.
This paper presents the first multi-dimensional analysis of bias in English VL models, focusing on gender, ethnicity, and age as dimensions.
When subjects are input as images, pre-trained VL models complete a neutral template with a hurtful word 5% of the time, with higher percentages for female and young subjects.
We also tested for bias in downstream models on the Visual Question Answering task. We developed a novel bias metric called the Vision-Language Association Test, based on questions designed to elicit biased associations between stereotypical concepts and targets. Our findings demonstrate that pre-trained VL models contain biases that are perpetuated in downstream tasks.