Online hate speech and disinformation have emerged as major problems for democratic societies worldwide. Governments, companies, and civil society groups have responded by increasingly turning to Artificial Intelligence (AI) as a tool that can detect, decelerate, and remove online extreme speech. Such efforts, however, confront many challenges. One key challenge is the quality, scope, and inclusivity of training datasets. Another is the lack of procedural guidelines and frameworks that can bring cultural contextualization to these systems. This lack of cultural contextualization has resulted in false positives, over-application, and systemic bias.

Project AI4Dignity takes a step towards addressing this problem through a bottom-up process model of collaborative coding.

AI4Dignity is a proof-of-concept project funded by the European Research Council (ERC). It is the latest project of For Digital Dignity, a research initiative on Digital Politics and Digital Cultures steered by Prof. Dr. Sahana Udupa at LMU Munich, Germany.

The ongoing ERC project ONLINERPOL, the parent project of AI4Dignity, has identified the need for a global comparative framework for AI-assisted solutions that addresses cultural variation, since no catch-all algorithm can work across different contexts. Following this, AI4Dignity will address major challenges facing AI-assisted extreme speech moderation by pioneering a community-based classification approach that moves beyond keyword-based detection systems. It identifies fact-checkers as critical human interlocutors who can bring cultural contextualization to AI-assisted speech moderation in a meaningful and feasible manner.
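To make the limitation concrete, the short sketch below is a hypothetical illustration, not AI4Dignity's actual pipeline: it shows how a purely keyword-based detector treats a dehumanizing post and a benign, literal one identically, producing exactly the false positives described above. The blocklist entries and example posts are invented for the demonstration.

```python
# Hypothetical illustration: a toy keyword-based detector.
# This is NOT AI4Dignity's system; it shows why lexical matching
# alone misfires without cultural context.

KEYWORDS = {"vermin", "cockroach"}  # invented blocklist entries

def keyword_flag(text: str) -> bool:
    """Flag a post if any blocklisted keyword appears in it."""
    tokens = {token.strip(".,!?").lower() for token in text.split()}
    return bool(tokens & KEYWORDS)

posts = [
    "They are vermin and must be driven out.",  # dehumanizing: flagged
    "Found a cockroach in the kitchen again!",  # benign: also flagged
]
for post in posts:
    print(keyword_flag(post), "-", post)
# Both print True: the detector cannot tell extreme speech from
# everyday usage, which is the false-positive problem a
# community-based classification approach aims to address.
```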

The project will develop a replicable process model that enables collaboration between fact-checkers, AI developers, and academic intermediaries in a facilitated event space, along with an open-access toolkit for adopting this model in different locations. AI4Dignity will be a significant step towards setting procedural benchmarks that operationalize “the human in the loop” principle and towards building more inclusive training datasets for AI systems tackling the urgent issues of digital hate and extreme speech.