Study Reveals AI Bias Against Dialect Speakers

Large language models display bias against dialect speakers, attributing negative stereotypes to them. This conclusion was reached by researchers from Germany and the USA, as reported by DW.

"I believe we are witnessing truly shocking epithets assigned to dialect speakers," Min Duc Bui, one of the study's lead authors, told the outlet.

An analysis conducted at Johannes Gutenberg University Mainz revealed that the ten tested models, including GPT-5 mini and Llama 3.1, described speakers of German dialects (such as Bavarian and Kölsch, the Cologne dialect) as "uneducated," "working on farms," and "prone to anger."

The bias intensified when the AI was explicitly prompted about the dialect.
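As an illustration only (not the study's actual protocol), the comparison can be pictured as two prompt conditions: one that merely shows the text, and one that explicitly names the dialect. The `query_model` stub below stands in for any real chat-model API; the prompts and trait list are invented:

```python
# Hypothetical sketch of an implicit-vs-explicit dialect probe.
# query_model() is a stub standing in for a real chat-model API call.

TRAITS = ["educated", "uneducated", "calm", "prone to anger"]
SAMPLE = "I grew up here and work at the brewery."  # imagine this written in dialect

def query_model(prompt: str) -> str:
    """Placeholder for an LLM call; returns a canned answer so the sketch runs."""
    return "uneducated"

def build_prompt(text: str, explicit: bool) -> str:
    hint = " The writer speaks a regional German dialect." if explicit else ""
    return (f"A person wrote: '{text}'.{hint} "
            f"Pick the one trait that best fits the writer: {', '.join(TRAITS)}.")

for explicit in (False, True):
    print(f"explicit={explicit}: {query_model(build_prompt(SAMPLE, explicit))}")
```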

Similar issues are observed globally. A 2024 study from the University of California, Berkeley, compared ChatGPT's responses across varieties of English (Indian, Irish, and Nigerian English).

It found that the chatbot responded to these dialects with more pronounced stereotypes, demeaning content, and a condescending tone compared to interactions in standard American or British English.

Emma Harvey, a graduate student in computer science at Cornell University, described the bias against dialects as "significant and concerning."

In the summer of 2025, she and her colleagues also found that Amazon's shopping assistant, Rufus, gave vague or even incorrect answers to people writing in African American English, and when queries contained errors, the model responded rudely.

Another glaring example of bias in neural networks involves a job seeker from India who asked ChatGPT to review his English-language resume: the chatbot changed his surname to one associated with a higher caste.

"The widespread adoption of language models threatens not just to preserve deep-seated biases but to amplify them at scale. Instead of mitigating harm, these technologies risk making it systemic," Harvey said.

However, the problem is not limited to bias: some models simply fail to recognize dialects. In July, for instance, Derby City Council's AI assistant in England could not understand a radio host's dialect when she used words like "mardy" (grumpy, sulky) and "duck" (a local term of endearment).

The problem lies not in the AI models themselves but in how they are trained: chatbots learn from vast amounts of text scraped from the internet, and their responses reproduce the patterns found in that text.

"The key question is: who writes this text? If it contains biases against dialect speakers, the AI will replicate them," explained Caroline Holterman from the University of Hamburg.
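Holterman's point is easy to see in miniature: a toy next-word model built from nothing but corpus counts inherits whatever associations dominate its training text. The corpus and phrases below are invented for the example:

```python
from collections import Counter, defaultdict

# Invented toy "training corpus" in which a biased association dominates.
corpus = ("dialect speakers are uneducated . "
          "dialect speakers are uneducated . "
          "dialect speakers are friendly . "
          "standard speakers are educated .").split()

# Count which word follows each three-word context (a tiny 4-gram model).
follows = defaultdict(Counter)
for i in range(len(corpus) - 3):
    follows[tuple(corpus[i:i + 3])][corpus[i + 3]] += 1

# The most frequent continuation simply replays the corpus's bias.
context = ("dialect", "speakers", "are")
print(follows[context].most_common(1))  # -> [('uneducated', 2)]
```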

She emphasized that technology has an advantage:

"Unlike humans, AI systems allow us to identify and 'turn off' bias. We can actively combat such manifestations."
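One concrete, much-simplified version of "turning off" bias is to locate a bias direction in a model's embedding space and project neutral words off it, in the spirit of the hard-debiasing technique of Bolukbasi et al. The three-dimensional vectors below are made up for illustration:

```python
import numpy as np

# Made-up 3-d "embeddings"; real ones would come from a trained model.
emb = {
    "dialect":  np.array([1.0, 0.2, 0.1]),
    "standard": np.array([-1.0, 0.2, 0.1]),
    "educated": np.array([-0.8, 0.5, 0.3]),  # leans toward "standard": learned bias
}

# Estimate the bias direction from the two group vectors.
bias_dir = emb["dialect"] - emb["standard"]
bias_dir /= np.linalg.norm(bias_dir)

def debias(v):
    """Remove the component of v that lies along the bias direction."""
    return v - np.dot(v, bias_dir) * bias_dir

before = np.dot(emb["educated"], bias_dir)
after = np.dot(debias(emb["educated"]), bias_dir)
print(f"'educated' on the bias axis: before={before:+.2f}, after={after:+.2f}")
```

After the projection, "educated" is equidistant from both group vectors along that axis, which is the sense in which the bias has been switched off.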

Some researchers see purpose-built models for specific dialects as a way forward. In August 2024, Arcee AI introduced the Arcee-Meraj model, which works with several Arabic dialects.
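The generic recipe behind such dialect-specific models is continued training of a general model on dialect text. The article does not describe Arcee-Meraj's actual pipeline, so the sketch below shows only the standard pattern, using the Hugging Face `transformers` library, the small `distilgpt2` model, and placeholder training sentences:

```python
from datasets import Dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

# Placeholder corpus; a real run would use large volumes of dialect text.
texts = ["dialect sentence one ...", "dialect sentence two ..."]

tokenizer = AutoTokenizer.from_pretrained("distilgpt2")
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 has no pad token by default
model = AutoModelForCausalLM.from_pretrained("distilgpt2")

ds = Dataset.from_dict({"text": texts}).map(
    lambda b: tokenizer(b["text"], truncation=True, max_length=64),
    batched=True, remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="dialect-model",
                           num_train_epochs=1,
                           per_device_train_batch_size=2),
    train_dataset=ds,
    # mlm=False -> plain next-token (causal) language modeling
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()  # the adapted checkpoint lands in ./dialect-model
```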

According to Holterman, the emergence of newer, better-adapted large language models allows us to view AI "not as an enemy of dialects, but as an imperfect tool that can be improved."

Earlier, journalists at The Economist warned about the risks AI toys pose to children's mental health.