ExtremeBB: A Database for Large-Scale Research into Online Hate, Harassment, the Manosphere and Extremism

Anh V. Vu, Lydia Wilson, Yi Ting Chua, Ilia Shumailov, Ross Anderson

The 7th Workshop on Online Abuse and Harms (WOAH) Non-archival Paper

TLDR: We introduce ExtremeBB, a textual database of over 53.5M posts made by 38.5k users on 12 extremist bulletin board forums promoting online hate, harassment, the manosphere and other forms of extremism. It enables large-scale analyses of qualitative and quantitative historical trends going back two de
You can open the #paper-ACL_33 channel in a separate window.
Abstract: We introduce ExtremeBB, a textual database of over 53.5M posts made by 38.5k users on 12 extremist bulletin board forums promoting online hate, harassment, the manosphere and other forms of extremism. It enables large-scale analyses of qualitative and quantitative historical trends going back two decades: measuring hate speech and toxicity; tracing the evolution of different strands of extremist ideology; tracking the relationships between online subcultures, extremist behaviours, and real-world violence; and monitoring extremist communities in near real time. This can shed light not only on the spread of problematic ideologies but also the effectiveness of interventions. ExtremeBB comes with a robust ethical data-sharing regime that allows us to share data with academics worldwide. Since 2020, access has been granted to 49 licensees in 16 research groups from 12 institutions.