“You are no longer sending one query and getting one answer. You are running chains of reasoning, tool calls, memory retrievals, and context windows that keep expanding. So even if the rate per token drops, the volume goes up sharply and the net bill stays high or climbs further,” he said.
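The arithmetic behind this claim is easy to sketch: when each agent turn re-sends an ever-growing context, total billed tokens grow roughly quadratically with the number of turns, so a lower per-token rate can still produce a much larger bill. The figures below are invented purely for illustration, not drawn from the article.

```python
# Hypothetical illustration: cumulative tokens across an agent run whose
# context window grows each turn (all numbers are invented for this sketch).

def agent_run_tokens(turns, base_context, growth_per_turn):
    """Total input tokens billed when each turn re-sends the expanded context."""
    total = 0
    context = base_context
    for _ in range(turns):
        total += context            # context billed again on this turn
        context += growth_per_turn  # tool results, memories, outputs accumulate
    return total

single_query = agent_run_tokens(1, 2_000, 0)       # one prompt, one answer
agent_chain = agent_run_tokens(20, 2_000, 1_500)   # a 20-step agent loop

old_price = 30 / 1_000_000  # $/token in the single-query era (hypothetical)
new_price = 10 / 1_000_000  # per-token rate drops to a third (hypothetical)

print(f"single query: {single_query:>7,} tokens -> ${single_query * old_price:.2f}")
print(f"agent chain:  {agent_chain:>7,} tokens -> ${agent_chain * new_price:.2f}")
```

Under these made-up numbers the per-token price falls threefold, yet the agent run consumes over 160x the tokens of a single query, so the net bill rises anyway.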
Beyond that, a few specific pressures make this worse. First, the token-to-accuracy ratio in many real-world tasks remains unfavourable, meaning developers end up spending more tokens just to get a reliable output. “That adds a hidden cost most people do not account for upfront. Second, agentic systems often require a human to review or correct the AI output, which means you are paying for both the AI tokens and the human time. That combined cost can quickly surpass what a straightforward human-led process would have cost. Third, dependence on LLMs makes every workflow subject to the uptime of these systems. Any outage or rate-limiting episode directly disrupts operations, which adds a reliability cost that does not show up in token pricing at all,” said Padmanabhuni.
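The second point, that AI tokens plus human review can exceed a human-only process, can be shown with a back-of-envelope comparison. Every rate and duration below is hypothetical, chosen only to illustrate the shape of the argument.

```python
# Hypothetical cost comparison (all rates invented): an agentic workflow
# that still needs human review vs. the same task done by a person alone.

def agentic_cost(tokens, price_per_mtok, review_minutes, hourly_rate):
    """AI token spend plus the human time spent reviewing/correcting output."""
    return tokens / 1_000_000 * price_per_mtok + review_minutes / 60 * hourly_rate

def human_only_cost(minutes, hourly_rate):
    """Cost of a person simply doing the task themselves."""
    return minutes / 60 * hourly_rate

# Assume a task a person finishes in 30 minutes at $60/hour; the agent
# burns 500k tokens at $10/Mtok and its output takes 28 minutes to review.
ai_plus_review = agentic_cost(500_000, 10, 28, 60)  # token cost + review time
human_alone = human_only_cost(30, 60)

print(f"AI + review: ${ai_plus_review:.2f}")  # $5 tokens + $28 review
print(f"Human only:  ${human_alone:.2f}")
```

With these assumed numbers the token spend itself is small; it is the unavoidable review time that pushes the combined cost past the human-only baseline.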