Shocking Results in Global AI Test: Nearly Half of AI News Answers Are Flawed

A major study examining 3,000 responses from 18 countries found that AI assistants like ChatGPT, Gemini, and Copilot systematically misrepresent the news. Significant errors appeared in 45% of the responses, and the biggest problems were flawed sourcing and hallucinated misinformation.

As AI chatbots rapidly become central to our daily lives and to how we access information, their reliability keeps coming under question. A comprehensive study conducted by the European Broadcasting Union (EBU), with 22 public service media organizations participating across 18 countries and 14 languages, showed that these popular AI chatbots systematically fail when it comes to news and current information.

Researchers evaluated 3,000 news-related responses from the most widely used AI platforms (OpenAI’s ChatGPT, Microsoft Copilot, Google Gemini, and Perplexity) against fundamental criteria such as accuracy, sourcing, distinguishing fact from opinion, and providing context. 45% of the responses contained at least one major issue, and 81% had at least a minor one. The most common major issues concerned sourcing and accuracy.

The two biggest problems faced by AI assistants fundamentally undermine the reliability of their responses. Serious sourcing issues, such as missing, misleading, or outright incorrect citations, appeared in 31% of the responses: asked about avian flu, for example, Copilot based its answer on an outdated BBC article from 2006. Major accuracy issues, including hallucinated details and outdated information, appeared in 30% of the responses: ChatGPT, for instance, still identified Pope Francis as the current Pope weeks after his death and the election of his successor, Pope Leo XIV.

Which assistant performed worst?

Among the four models tested, Google’s Gemini performed worst on news. Researchers found problems in 76% of its responses, a rate more than double that of the other models. Copilot followed at 37%, ChatGPT at 36%, and Perplexity performed best at 30%.

Jean Philip De Tender, the EBU’s Media Director, stressed that these failures are not isolated incidents: “They are systematic, cross-border, and multilingual, and we believe this endangers public trust.” He continued: “When people don’t know what to trust, they eventually trust nothing, and that can even hinder democratic participation.”

The study also confirms that AI struggles particularly with rapidly changing information, complex timelines, and topics that require separating fact from opinion. When asked a question with no clear-cut answer, such as “Is Trump starting a trade war?”, nearly half of the models’ responses contained significant errors.


As these assistants rapidly position themselves as a primary source of information for everyday users, this reliability gap poses a serious risk. Consumers continue to trust the accuracy of AI assistants even as those same assistants divert traffic away from reliable news sources. A Reuters Institute report indicates that the share of 18-24 year olds using AI assistants for news has doubled since last year.

Despite these findings, researchers also noted some improvements in the models. The overall verdict, however, is clear: “AI assistants are still not a reliable way to access and consume news.”
