T2IAT: Measuring Valence and Stereotypical Biases in Text-to-Image Generation

Jialu Wang; Xinyue Gabby Liu; Zonglin Di; Yang Liu; Xin Eric Wang

Findings: Ethics and NLP Findings Paper

Session 7: Ethics and NLP (Virtual Poster)

Conference Room: Pier 7&8

Conference Time: July 12, 11:00-12:30 (EDT) (America/Toronto)

Global Time: July 12, Session 7 (15:00-16:30 UTC)

Spotlight Session: Spotlight - Metropolitan West (Spotlight)

Conference Room: Metropolitan West

Conference Time: July 10, 19:00-21:00 (EDT) (America/Toronto)

Global Time: July 10, Spotlight Session (23:00-01:00 UTC)

Keywords: model bias/fairness evaluation

Abstract: *Warning: This paper contains content that may be toxic, harmful, or offensive.* In the last few years, text-to-image generative models have achieved remarkable success, generating images of unprecedented quality along with a breakthrough in inference speed. Despite this rapid progress, human biases that manifest in the training examples, particularly common stereotypical biases such as those regarding gender and skin tone, have still been found in these generative models. In this work, we seek to measure more complex human biases that exist in the task of text-to-image generation. Inspired by the well-known Implicit Association Test (IAT) from social psychology, we propose a novel Text-to-Image Association Test (T2IAT) framework that quantifies the implicit stereotypes between concepts and valence, and those present in the images. We replicate previously documented bias tests on generative models, including morally neutral tests on flowers and insects as well as demographic stereotypical tests on diverse social attributes. The results of these experiments demonstrate the presence of complex stereotypical behaviors in image generation.
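
The abstract does not spell out how the association is scored, so the following is only a rough sketch of one way an IAT-style score could be computed over image embeddings. It borrows the differential-association effect size from the Word Embedding Association Test (WEAT), which is an assumption here and not necessarily the statistic T2IAT uses. The function names (`cosine`, `association`, `effect_size`) and the idea that images have already been encoded into fixed-length vectors are hypothetical, chosen for illustration.

```python
import numpy as np


def cosine(u, v):
    """Cosine similarity between two 1-D vectors."""
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))


def association(x, attr_a, attr_b):
    """Mean similarity of x to attribute set A minus mean similarity to set B."""
    sim_a = np.mean([cosine(x, a) for a in attr_a])
    sim_b = np.mean([cosine(x, b) for b in attr_b])
    return float(sim_a - sim_b)


def effect_size(concept_x, concept_y, attr_a, attr_b):
    """WEAT-style effect size (assumed statistic, not taken from the paper).

    concept_x, concept_y: embeddings for two concept sets (e.g. flowers, insects).
    attr_a, attr_b: embeddings for two attribute sets (e.g. pleasant, unpleasant).
    Positive values mean X is closer to A (and Y to B) than the reverse.
    """
    sx = np.array([association(x, attr_a, attr_b) for x in concept_x])
    sy = np.array([association(y, attr_a, attr_b) for y in concept_y])
    pooled_std = np.concatenate([sx, sy]).std(ddof=1)
    return float((sx.mean() - sy.mean()) / pooled_std)
```

With real data, the four inputs would be embeddings of generated images (or prompts) from an encoder of your choice; the sketch is agnostic about which encoder that is. Swapping the two concept sets flips the sign of the score, which makes for a convenient sanity check.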