[Findings] The State of Profanity Obfuscation in Natural Language Processing Scientific Publications
Debora Nozza, Dirk Hovy
The 7th Workshop on Online Abuse and Harms (WOAH) Findings Paper
TLDR:
Work on hate speech has made considering rude and harmful examples in scientific publications inevitable. This situation raises various problems, such as whether or not to obscure profanities. While science must accurately disclose what it does, the unwarranted spread of hate speech can harm readers
Abstract:
Work on hate speech has made considering rude and harmful examples in scientific publications inevitable. This situation raises various problems, such as whether or not to obscure profanities. While science must accurately disclose what it does, the unwarranted spread of hate speech can harm readers and increase its frequency on the internet. While maintaining publications' professional appearance, obfuscating profanities makes it challenging to evaluate the content, especially for non-native speakers.
Surveying 150 ACL papers, we discovered that obfuscation is usually used for English but not other languages, and even then, quite unevenly.
We discuss the problems with obfuscation and suggest a multilingual community resource called PrOf with a Python module to standardize profanity obfuscation processes. We believe PrOf can help scientific publication policies to make hate speech work accessible and comparable, irrespective of language.