OpenAI Cut AI Costs by More Than Half Without New Chips

22:56 / 01.07.2026 · Technology

Engineers at OpenAI, a leader in AI technology, have found a way to sharply reduce system operating costs without purchasing new hardware. According to The Information, the company has cut the computing power required to process ChatGPT user requests by more than half. The achievement brings not only financial savings but also a strategic advantage amid the global shortage of computing resources, Ixbt.com reports.

The focus is on optimizing inference, the process by which a trained model responds to user queries. Inference is now the largest cost item for companies developing generative AI: model training is confined to a limited period, whereas serving users consumes compute on every single request.

Rational Resource Utilization Strategy

Sources indicate that the new optimization system introduced by OpenAI targets users who access ChatGPT without registering or on the free tier. As a result, the number of NVIDIA GPUs needed to serve these users over a given period has fallen by several hundred. Even for a service operating on a global scale, this reduction is unexpectedly significant.

So far, OpenAI has not officially disclosed the technical methods behind this result. Experts speculate that the gains came not from installing additional hardware but from more efficient use of existing server infrastructure, improved memory management, or refined batch-processing algorithms.
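To illustrate one of the techniques the experts mention, here is a minimal sketch of why batching can lower the GPU time spent per request. It does not describe OpenAI's method, which remains undisclosed. The cost model (a fixed overhead per forward pass plus a small cost per request) and every number in it are hypothetical assumptions chosen only for illustration.

```python
import math

# Hypothetical cost model: each forward pass pays a fixed overhead
# (for example, reading model weights from memory) plus a small cost
# for every request in the batch. Both figures are assumed, not measured.
FIXED_MS_PER_PASS = 20.0
MS_PER_REQUEST = 2.0


def gpu_time_ms(num_requests: int, batch_size: int) -> float:
    """Total GPU time to serve num_requests when grouped into batches."""
    if num_requests < 0 or batch_size < 1:
        raise ValueError("num_requests must be >= 0 and batch_size >= 1")
    passes = math.ceil(num_requests / batch_size)
    return passes * FIXED_MS_PER_PASS + num_requests * MS_PER_REQUEST


if __name__ == "__main__":
    for batch_size in (1, 8, 32):
        total = gpu_time_ms(1000, batch_size)
        print(f"batch={batch_size:>2}: {total:.0f} ms total, "
              f"{total / 1000:.2f} ms per request")
```

Under these assumed figures, larger batches spread the fixed overhead across more requests, so the same hardware spends less time on each one. Real systems trade this gain against added latency while a batch fills.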

This news could shift the economic balance in the AI market. At a time when buyers are queuing for NVIDIA chips and billions of dollars are being spent on building data centers, cutting costs through software is the most effective path. It also allows OpenAI to offer its services to an even wider audience.

Future Strategic Opportunities

If this technology is applied on a large scale, OpenAI could gain the following opportunities:

Expanding the scope of free services;
Lowering tariffs for corporate clients;
Increasing the computing power of AI agents without additional costs;
Serving more users with the existing infrastructure.
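
The arithmetic behind the last point can be sketched as follows. This is a back-of-the-envelope model, not a description of OpenAI's infrastructure: the load, the GPU-seconds per request, and the fleet size are hypothetical, perfect GPU utilization is assumed, and "more than half" is approximated as exactly half.

```python
import math


def gpus_needed(requests_per_second: float, gpu_seconds_per_request: float) -> int:
    """GPUs required to sustain a load, assuming perfect utilization."""
    return math.ceil(requests_per_second * gpu_seconds_per_request)


def max_load(gpu_count: int, gpu_seconds_per_request: float) -> float:
    """Requests per second a fixed fleet can sustain, same assumption."""
    return gpu_count / gpu_seconds_per_request


if __name__ == "__main__":
    load = 2000.0                   # hypothetical requests per second
    cost_before = 0.5               # hypothetical GPU-seconds per request
    cost_after = cost_before / 2    # compute per request cut in half

    # Same load, fewer GPUs.
    print("GPUs before:", gpus_needed(load, cost_before))
    print("GPUs after: ", gpus_needed(load, cost_after))

    # Same fleet, more requests served.
    fleet = 1000                    # hypothetical fleet size
    print("Load before:", max_load(fleet, cost_before))
    print("Load after: ", max_load(fleet, cost_after))
```

In this simplified model, halving the compute per request either halves the GPUs needed for a fixed load or doubles the load a fixed fleet can carry.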

It remains unknown whether this optimization applies to paid subscribers or to the company's most complex reasoning models. However, such breakthroughs in software optimization show that in the AI race, it is not just the number of chips but the efficiency of their use that plays a decisive role.

Such news is also of great importance for developing markets like Uzbekistan, because lower computing costs should make ChatGPT and similar technologies cheaper and more widely used in the future.