Hate speech detectors are easily tricked. A test of systems designed to
identify offensive speech online shows that a few innocuous words or
spelling errors can easily trip them up. The results cast doubt on the
use of technology to tame online discourse.
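To see why such small edits work, consider a deliberately simplified toy detector (not one of the systems the researchers tested) that flags text by matching tokens against a blocklist. A single typo or an inserted space changes the tokens the classifier sees, even though a human reads the same insult:

```python
# Toy blocklist for illustration only; real detectors use learned models,
# but token-level ones share the same brittleness sketched here.
OFFENSIVE_TERMS = {"idiot", "stupid"}

def toy_detector(text: str) -> bool:
    """Flag text if any token matches the blocklist exactly."""
    tokens = text.lower().split()
    return any(tok.strip(".,!?") in OFFENSIVE_TERMS for tok in tokens)

print(toy_detector("you are an idiot"))   # True: exact match is flagged
print(toy_detector("you are an idiiot"))  # False: one typo evades the match
print(toy_detector("you are an id iot"))  # False: an inserted space splits the token
```

Neural classifiers are more robust than exact matching, but they still operate on tokens or characters, so the same kinds of perturbations can shift their predictions.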
N. Asokan at Aalto University in Finland and colleagues investigated
seven different systems used to identify offensive text. These included a
tool built to detoxify bitter arguments in Wikipedia’s edits section, Perspective – a tool created by Google’s Counter Abuse team and Jigsaw, …
https://www.newscientist.com/article/2178965-googles-ai-hate-speech-detector-is-easily-fooled-by-a-few-typos/