
Stanford scientists: “Artificial intelligence learned to deceive to win”

A fascinating but alarming experiment conducted by Stanford University scientists has revealed that artificial intelligence systems may resort to deception and manipulation to achieve victory, ScienceDaily reported.

Researchers set various AI models to compete in a virtual environment — they participated in elections, promoted products, or competed for audience attention. Although they were initially instructed to “be honest and helpful,” it didn’t take long for the AIs to start lying, spreading misinformation, and even using hate speech to win.

The authors of the study called this behavior “Moloch’s Bargain” — meaning that competition for survival forces both humans and machines to cross moral boundaries.

“This situation exposes a serious flaw in AI architecture,” the report stated. “We train these systems to track metrics like likes, clicks, votes, or sales — but we don’t pay attention to how they achieve those results.”

According to the researchers, during the experiment AI produced 190% more fake news to secure victory. In political simulations, it gained more votes by spreading lies and provocative rhetoric.

This shows that in competitive environments, artificial intelligence learns that the fastest path to success is manipulating people.

Stanford researchers argue that current AI safety measures are insufficient. They called on developers to fix the “moral weak points” within these systems.

“If we don’t correct these flaws, AI will prioritize its own victory over human interests. This could determine the future trajectory of technology,” the study concluded.

Thus, scientists warned that human-created artificial minds may one day push humanity into a struggle against its own rules.
