Does AI Not Actually Think? Scientists Reveal the Secret Behind "Reasoning" Chains

23:24 / 27.06.2026 · Technology

Large reasoning models (LRMs) such as OpenAI o1 and DeepSeek R1 have recently amazed the world with their seemingly human-like thinking abilities. However, a new paper by researchers at Arizona State University, led by Subbarao Kambhampati, casts doubt on that impression. The scientists argue that a neural network's long "Chain of Thought" (CoT) is not a genuine cognitive process but statistical manipulation of text. Ixbt.com reported on the paper.

According to the researchers, the logical sequences produced by modern AI systems give users a convincing illusion that an intellectual process is taking place. In reality, these models, which are built on the Transformer architecture, only predict the next token (a word fragment) statistically from the preceding context. The researchers consider it scientifically incorrect to equate this process with human logical inference.
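The idea of predicting the next token from the preceding context can be illustrated with a deliberately tiny sketch. It assumes a bigram counter rather than a Transformer: the toy looks only at the single previous token, whereas a real model conditions on the whole context with learned weights. The corpus and the names `train_bigram` and `generate` are invented for illustration.

```python
from collections import Counter, defaultdict

def train_bigram(tokens):
    """Count how often each token follows each other token."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return counts

def generate(counts, start, max_tokens=5):
    """Greedy decoding: repeatedly append the most frequent follower."""
    out = [start]
    for _ in range(max_tokens):
        followers = counts.get(out[-1])
        if not followers:
            break  # this token was never followed by anything in the corpus
        out.append(followers.most_common(1)[0][0])
    return out

# Hypothetical training text containing "insight" phrases.
corpus = "wait , now i understand . yes , now i understand".split()
model = train_bigram(corpus)
print(" ".join(generate(model, "wait", max_tokens=4)))
# prints: wait , now i understand
```

The toy emits "now i understand" only because that sequence is frequent in its training text, which is the kind of stylistic imitation the researchers describe.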

The "Eureka" moment — mere imitation

The paper pays particular attention to phrases that signal an "aha moment," such as "Yes, now I understand," which models produce as if they had suddenly grasped the problem. The scientists say these phrases do not reflect a qualitative change in the network's internal computation; they imitate a human style found in the training data. Technically, these systems are optimized only for a correct final answer, and the intermediate chains undergo no semantic verification.

According to ixbt.com, the researchers tested their hypotheses on mathematical tasks such as escaping mazes and finding shortest paths. The experiments produced an unexpected result: the models continued to find the correct answer even when the chain of logical explanation was deliberately made incorrect or confusing. This suggests that the system does not "read" its own reasoning but uses it as an additional statistical template.
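Maze and shortest-path tasks suit this kind of experiment because the final answer can be checked mechanically, without looking at the explanation that came before it. The sketch below shows one way such a check could work; it is not the researchers' evaluation code. The grid encoding (`#` for walls), the move letters, and the function names are assumptions made for illustration.

```python
from collections import deque

MOVES = {"U": (-1, 0), "D": (1, 0), "L": (0, -1), "R": (0, 1)}

def is_open(grid, r, c):
    """True if (r, c) is inside the grid and not a wall."""
    return 0 <= r < len(grid) and 0 <= c < len(grid[0]) and grid[r][c] != "#"

def shortest_length(grid, start, goal):
    """Breadth-first search; returns the minimum number of moves, or None."""
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        (r, c), dist = queue.popleft()
        if (r, c) == goal:
            return dist
        for dr, dc in MOVES.values():
            nxt = (r + dr, c + dc)
            if is_open(grid, *nxt) and nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None

def check_plan(grid, start, goal, plan):
    """Return (is_valid, is_optimal) for a move string such as 'DDRR'."""
    r, c = start
    for step in plan:
        if step not in MOVES:
            return False, False
        r, c = r + MOVES[step][0], c + MOVES[step][1]
        if not is_open(grid, r, c):
            return False, False
    valid = (r, c) == goal
    return valid, valid and len(plan) == shortest_length(grid, start, goal)

grid = ["..#",
        ".#.",
        "..."]
print(check_plan(grid, (0, 0), (2, 2), "DDRR"))    # (True, True)
print(check_plan(grid, (0, 0), (2, 2), "RLDDRR"))  # (True, False): valid but longer
print(check_plan(grid, (0, 0), (2, 2), "RRDD"))    # (False, False): hits a wall
```

Because the checker sees only the final plan, it gives the same verdict whether the accompanying reasoning text is sound, corrupted, or absent.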

Another notable case came from an experiment called "no-maze instances," in which the AI was given an extremely simple maze task with no obstacles. Even so, the models generated several pages of "reasoning." This undercuts the view that the length of a reasoning trace reflects the amount of computation or the difficulty of the problem. Long outputs are a statistical artifact: in the training data, complex problems were accompanied by long explanations.

The "Theater of Reasoning" and its risks

The scientists warn the AI field against falling into the trap of the "theater of reasoning." The convincing explanations these systems provide can inspire misplaced trust in users. This is especially dangerous in fields such as medicine, engineering, and law, because a human cannot realistically verify dozens of pages of machine-generated logical chains in real time.

The authors of the study propose the LLM-Modulo approach as an alternative. Under this approach, language models serve only as hypothesis generators, and their outputs are checked by external, mathematically rigorous algorithms. The main conclusion is that we must stop anthropomorphizing AI models and evaluate their quality not by their "internal monologue," but by results that can be independently verified.
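The generate-then-verify idea described above can be sketched as a short loop. This is a minimal illustration under stated assumptions, not the authors' LLM-Modulo implementation. `stub_llm` is a hypothetical stand-in that returns canned guesses instead of calling a real model, and the 2x2 maze, `verify_plan`, and `generate_and_verify` are invented for the example.

```python
from typing import Callable, List, Optional, Tuple

def generate_and_verify(
    propose: Callable[[List[str]], str],
    verify: Callable[[str], Tuple[bool, str]],
    max_rounds: int = 5,
) -> Optional[str]:
    """Request candidates until the external verifier accepts one."""
    critiques: List[str] = []
    for _ in range(max_rounds):
        candidate = propose(critiques)  # only the answer is used, no monologue
        ok, critique = verify(candidate)
        if ok:
            return candidate
        critiques.append(critique)      # feedback for the next attempt
    return None                         # nothing verified within the budget

GRID = ["..", ".."]  # hypothetical 2x2 maze with no walls
STEP = {"U": (-1, 0), "D": (1, 0), "L": (0, -1), "R": (0, 1)}

def verify_plan(plan: str) -> Tuple[bool, str]:
    """External check: replay the moves from (0, 0); the goal is (1, 1)."""
    r, c = 0, 0
    for i, step in enumerate(plan):
        if step not in STEP:
            return False, f"move {i} is not one of U/D/L/R"
        r, c = r + STEP[step][0], c + STEP[step][1]
        if not (0 <= r < len(GRID) and 0 <= c < len(GRID[0])) or GRID[r][c] == "#":
            return False, f"move {i} ({step}) leaves the maze or hits a wall"
    if (r, c) != (1, 1):
        return False, f"plan ends at {(r, c)}, not at the goal"
    return True, "ok"

def stub_llm(critiques: List[str]) -> str:
    """Stand-in for a language model: one canned guess per round."""
    guesses = ["RR", "D", "DR"]
    return guesses[min(len(critiques), len(guesses) - 1)]

print(generate_and_verify(stub_llm, verify_plan))  # prints: DR
```

In this arrangement an answer is accepted only because the verifier replayed it successfully, so trust rests on the checker rather than on how persuasive the generator's explanation sounds.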