A Weakly Supervised Classifier and Dataset of White Supremacist Language
Michael Miller Yoder, Ahmad Diab, David West Brown, Kathleen M Carley
Main: Computational Social Science and Cultural Analytics Main-poster Paper
Poster Session 2: Computational Social Science and Cultural Analytics (Poster)
Conference Room: Frontenac Ballroom and Queen's Quay
Conference Time: July 10, 14:00-15:30 (EDT) (America/Toronto)
Global Time: July 10, Poster Session 2 (18:00-19:30 UTC)
Keywords:
hate-speech detection
Abstract:
We present a dataset and classifier for detecting the language of white supremacist extremism, a growing issue in online hate speech. Our weakly supervised classifier is trained on large datasets of text from explicitly white supremacist domains paired with neutral and anti-racist data from similar domains. We demonstrate that this approach improves generalization performance to new domains. Incorporating anti-racist texts as counterexamples to white supremacist language mitigates bias.
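The weak supervision idea in the abstract can be illustrated with a minimal sketch: each text is labeled by the domain it came from rather than by manual annotation, and anti-racist texts are added to the negative class as counterexamples. The source names, the placeholder tokens, and the small Naive Bayes model below are all assumptions made for illustration; the abstract does not specify the data sources or the model used, and this is not the authors' implementation.

```python
import math
from collections import Counter, defaultdict

# Hypothetical source names; the abstract does not list the actual data sources.
SOURCE_LABELS = {
    "extremist_forum": 1,   # text from an explicitly white supremacist domain
    "general_forum": 0,     # neutral text from a similar domain
    "antiracist_forum": 0,  # anti-racist counterexamples, also labeled negative
}


def weak_label(documents):
    """Label each (source, text) pair by its source alone, with no manual annotation."""
    return [(text, SOURCE_LABELS[source]) for source, text in documents]


class NaiveBayes:
    """Tiny multinomial Naive Bayes with add-one smoothing (a stand-in model)."""

    def fit(self, labeled):
        self.word_counts = defaultdict(Counter)
        self.class_counts = Counter()
        self.vocab = set()
        for text, label in labeled:
            tokens = text.lower().split()
            self.class_counts[label] += 1
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        scores = {}
        for label, n_docs in self.class_counts.items():
            # Log prior plus smoothed log likelihood of each known token.
            score = math.log(n_docs / total_docs)
            denom = sum(self.word_counts[label].values()) + len(self.vocab)
            for tok in text.lower().split():
                if tok in self.vocab:
                    score += math.log((self.word_counts[label][tok] + 1) / denom)
            scores[label] = score
        return max(scores, key=scores.get)


# Placeholder tokens stand in for real text.
documents = [
    ("extremist_forum", "alpha beta alpha"),
    ("extremist_forum", "alpha gamma"),
    ("general_forum", "delta epsilon"),
    ("antiracist_forum", "beta delta zeta"),
    ("antiracist_forum", "zeta epsilon"),
]
model = NaiveBayes().fit(weak_label(documents))
```

In this toy data the token `beta` appears in both an extremist-source text and an anti-racist text, so the model cannot treat it as a reliable signal on its own. That is one plausible reading of how counterexamples could reduce bias against texts that discuss the same topics from an opposing stance.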