TRUTH DECAY
Grok’s ‘White Genocide’ Responses Show How Generative AI Can Be Weaponized

By James Foulds, Phil Feldman, and Shimei Pan

Published 18 June 2025

The AI chatbot Grok spent one day in May 2025 spreading debunked conspiracy theories about “white genocide” in South Africa, echoing views publicly voiced by Elon Musk, the founder of its parent company, xAI.

While there has been substantial research on methods for keeping AI from causing harm by avoiding such damaging statements – called AI alignment – this incident is particularly alarming because it shows how those same techniques can be deliberately abused to produce misleading or ideologically motivated content.

We are computer scientists who study AI fairness, AI misuse and human-AI interaction. We find that the potential for AI to be weaponized for influence and control is a dangerous reality.

The Grok Incident
On May 14, 2025, Grok repeatedly raised the topic of white genocide in replies to unrelated posts. In its responses on X to topics ranging from baseball to Medicaid to HBO Max to the new pope, Grok steered the conversation to this subject, frequently mentioning debunked claims of “disproportionate violence” against white farmers in South Africa or the controversial anti-apartheid song “Kill the Boer.”

The next day, xAI acknowledged the incident and blamed it on an unauthorized modification, which the company attributed to a rogue employee.

AI Chatbots and AI Alignment
AI chatbots are based on large language models, which are machine learning models trained to mimic natural language. Large language models are pretrained on vast bodies of text, including books, academic papers and web content, to learn complex, context-sensitive patterns in language. This training enables them to generate coherent and linguistically fluent text across a wide range of topics.
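
To make the pretraining idea concrete, here is a minimal sketch, assuming the open-source Hugging Face transformers library and the small GPT-2 model (both stand-ins, not anything Grok itself uses): a pretrained language model simply extends a prompt one predicted word at a time, producing fluent text whether or not that text is accurate.

# A minimal sketch of text generation with a pretrained language model.
# Assumes the Hugging Face "transformers" library and the small GPT-2 model;
# production chatbots use far larger models plus further alignment training.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")

prompt = "The new pope was elected after"
outputs = generator(prompt, max_new_tokens=40, do_sample=True, temperature=0.8)

# The model continues the prompt with statistically plausible words,
# which is why fluency alone is no guarantee of factual accuracy.
print(outputs[0]["generated_text"])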

However, pretraining alone is insufficient to ensure that AI systems behave as intended. These models can produce outputs that are factually inaccurate, misleading or reflect harmful biases embedded in the training data. In some cases, they may also generate toxic or offensive content. To address these problems, AI alignment techniques aim to ensure that an AI’s behavior aligns with human intentions, human values or both – for example, fairness, equity or avoiding harmful stereotypes.
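
One widely used family of alignment techniques, reinforcement learning from human feedback, begins by training a reward model on pairs of responses that human raters have compared. The sketch below is a hypothetical illustration in PyTorch, not any company’s actual code: the reward model is nudged to score the human-preferred response above the rejected one.

# Hypothetical sketch of the reward-model step in preference-based alignment
# (the first stage of reinforcement learning from human feedback).
# The embeddings and example responses are placeholders, not real data.
import torch
import torch.nn.functional as F

# Toy reward model: maps a fixed-size text embedding to a scalar score.
reward_model = torch.nn.Linear(768, 1)
optimizer = torch.optim.Adam(reward_model.parameters(), lr=1e-4)

def embed(text: str) -> torch.Tensor:
    # Placeholder embedding; a real system would reuse the language
    # model's own representation of the text.
    torch.manual_seed(hash(text) % (2**31))
    return torch.randn(768)

# Each pair holds a response human raters preferred and one they rejected.
preference_pairs = [
    ("A factual, even-handed answer.", "A misleading, conspiratorial answer."),
]

for chosen, rejected in preference_pairs:
    score_chosen = reward_model(embed(chosen))
    score_rejected = reward_model(embed(rejected))
    # Preference loss: push the preferred response's score above the other's.
    loss = -F.logsigmoid(score_chosen - score_rejected).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

In a full pipeline, the learned reward then guides further fine-tuning of the chatbot itself, which is why whoever selects the preference data effectively selects the values the system is aligned to; pointed at a different notion of “preferred,” the same machinery can steer a model toward ideologically motivated output.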