A Weakly Supervised Classifier and Dataset of White Supremacist Language

Michael Miller Yoder; Ahmad Diab; David West Brown; Kathleen M Carley

Main: Computational Social Science and Cultural Analytics Main-poster Paper

Poster Session 2: Computational Social Science and Cultural Analytics (Poster)

Conference Room: Frontenac Ballroom and Queen's Quay

Conference Time: July 10, 14:00-15:30 (EDT) (America/Toronto)

Global Time: July 10, Poster Session 2 (18:00-19:30 UTC)

Keywords: hate-speech detection

Abstract: We present a dataset and classifier for detecting the language of white supremacist extremism, a growing issue in online hate speech. Our weakly supervised classifier is trained on large datasets of text from explicitly white supremacist domains paired with neutral and anti-racist data from similar domains. We demonstrate that this approach improves generalization performance to new domains. Incorporating anti-racist texts as counterexamples to white supremacist language mitigates bias.
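
The weak supervision idea in the abstract can be illustrated with a minimal sketch: each document inherits its label from the source it came from rather than from per-document annotation, and anti-racist text is added to the negative class as counterexamples. This is not the authors' classifier. The source names, the placeholder tokens, and the small Naive Bayes model below are all assumptions chosen only to keep the example self-contained.

```python
import math
from collections import Counter

# Hypothetical source-to-label map: a document's label comes only from its source.
SOURCE_LABELS = {
    "extremist_forum": 1,   # explicitly white supremacist domain -> positive
    "general_forum": 0,     # neutral text from a similar domain -> negative
    "antiracist_forum": 0,  # anti-racist counterexamples -> negative
}

def weak_label(corpus):
    """Turn (text, source) pairs into (tokens, label) pairs using only the source."""
    return [(text.lower().split(), SOURCE_LABELS[source]) for text, source in corpus]

def train_naive_bayes(examples):
    """Collect per-class token counts and document counts."""
    counts = {0: Counter(), 1: Counter()}
    docs = Counter()
    for tokens, label in examples:
        counts[label].update(tokens)
        docs[label] += 1
    vocab = set(counts[0]) | set(counts[1])
    return counts, docs, vocab

def predict(model, text):
    """Return the class with the higher add-one-smoothed log score."""
    counts, docs, vocab = model
    total_docs = sum(docs.values())
    scores = {}
    for label in (0, 1):
        total = sum(counts[label].values())
        score = math.log(docs[label] / total_docs)
        for tok in text.lower().split():
            if tok in vocab:  # tokens unseen in training are ignored
                score += math.log((counts[label][tok] + 1) / (total + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)

# Toy corpus with placeholder tokens; "topic_a" stands for a topic word
# that appears in every kind of source.
corpus = [
    ("topic_a slogan_x slogan_y", "extremist_forum"),
    ("topic_a slogan_x claim_z", "extremist_forum"),
    ("topic_a news weather", "general_forum"),
    ("sports news weather", "general_forum"),
    ("topic_a equality solidarity", "antiracist_forum"),
    ("topic_a equality rights", "antiracist_forum"),
]

model = train_naive_bayes(weak_label(corpus))
print(predict(model, "slogan_x slogan_y"))
print(predict(model, "topic_a equality"))
```

In this toy setup the anti-racist documents also mention `topic_a`, so the model cannot rely on the topic word alone to separate the classes. That is the intuition behind using counterexamples to reduce bias, shown here only on made-up data.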