Explaining Offensive Language Detection
July 2020 • Julian Risch, Robin Ruff, Ralf Krestel
Machine learning approaches have proven to match or even exceed human-level accuracy on the task of offensive language detection. In contrast to human experts, however, they often lack the capability to explain their decisions. This article compares four different approaches to making offensive language detection explainable: an interpretable machine learning model (naive Bayes), a model-agnostic explainability method (LIME), a model-based explainability method (LRP), and a self-explanatory model (LSTM…
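To make the first two approaches concrete, here is a minimal sketch, not the paper's implementation: it trains a naive Bayes classifier (interpretable by design) on toy examples and then asks LIME, the model-agnostic method, for a per-word explanation of one prediction. The lime and scikit-learn packages, the toy texts, the labels, and the class names are all assumptions for illustration, not the paper's dataset or setup.

# Minimal sketch (hypothetical, not the paper's code): naive Bayes + LIME
# on toy data. Assumes `pip install lime scikit-learn`.
from lime.lime_text import LimeTextExplainer
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Toy placeholder data, not the paper's corpus.
texts = [
    "you are a stupid idiot",          # offensive (toy label)
    "shut up you moron",               # offensive
    "what a pathetic loser",           # offensive
    "have a nice day everyone",        # neutral
    "thanks for the helpful answer",   # neutral
    "great article, well written",     # neutral
]
labels = [1, 1, 1, 0, 0, 0]

# Interpretable model: naive Bayes over TF-IDF features.
pipeline = make_pipeline(TfidfVectorizer(), MultinomialNB())
pipeline.fit(texts, labels)

# Model-agnostic explanation: LIME perturbs the input text, queries the
# classifier on the perturbations, and fits a local linear surrogate,
# yielding a weight per word for the predicted class.
explainer = LimeTextExplainer(class_names=["neutral", "offensive"])
explanation = explainer.explain_instance(
    "you are such an idiot",
    pipeline.predict_proba,
    num_features=5,
)
print(explanation.as_list())  # e.g. [("idiot", 0.4), ("you", 0.1), ...]

Because LIME only needs the predict_proba function, the same explainer call would work unchanged if the naive Bayes pipeline were swapped for any other classifier, which is what "model-agnostic" means in the comparison above.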