In late April 2026, DeepSeek quietly removed the expiration date from its most aggressive pricing promotion. What had been a temporary discount on the V4 Pro model became permanent. The 75 percent reduction on output tokens was no longer a limited-time offer. It was the new normal.

The decision sent a shockwave through the AI industry. Competitors like OpenAI, Anthropic, and Google had been bracing for continued price compression, but locking in such steep cuts just one month after launching the V4 family signaled a strategic pivot that few expected. The deepseek price cuts were not merely a response to market pressure. They were a deliberate attempt to reshape the economics of large language model access.
This article breaks down the five specific permanent price reductions DeepSeek implemented, why each one matters, and how they collectively change the calculus for enterprise buyers, startups, and developers.
The Five Permanent Price Cuts That Shifted the AI Economic Landscape
The scale of the reductions becomes clear only when you examine each cut individually. DeepSeek did not apply a single across-the-board discount. Instead, it restructured pricing across multiple tiers and usage types, creating a new floor for the entire industry.
V4 Pro Output Tokens: 75 Percent Off, Made Permanent
The flagship cut is the one that draws the most attention. DeepSeek reduced the cost of output tokens on the V4 Pro model from $3.48 per million tokens to $0.87. That is a 75 percent reduction. The previous price was already competitive with mid-range models from other providers. The new price undercuts even the cheapest offerings from Google and Anthropic.
The permanent nature of the cut is what changes the conversation. Temporary discounts create uncertainty. Enterprises hesitate to build workflows around pricing that might revert. By locking in the 75 percent discount indefinitely, DeepSeek gave procurement teams a stable cost basis for budgeting. For any application that generates long responses, such as code generation, report writing, or customer support summarisation, the savings compound quickly.
At the old GPT-5 output price of $10 per million tokens, a company processing 100 million output tokens per month would pay $1,000. At DeepSeek’s new permanent rate, the same volume costs $87. The gap is not small enough to ignore.
V4 Pro Input Tokens: Near-Commodity Pricing for Prompts
Input tokens rarely receive the same spotlight as outputs, but they account for a large share of total costs in applications that send lengthy prompts. DeepSeek cut V4 Pro input pricing to $0.003625 per million tokens. That is down from $0.0145, a 75 percent reduction as well.
To put that in context, Google’s Gemini 3.5 Flash charges $0.15 per million input tokens. OpenAI’s GPT-5 charges $2.50. DeepSeek’s input rate is roughly 40 times cheaper than Google’s cost-optimised model and nearly 700 times cheaper than GPT-5. For document analysis, legal contract review, or any task that requires feeding an entire file into the prompt, input costs are the binding constraint. DeepSeek’s permanent input cut removes that barrier almost entirely.
A company analyzing 500 million input tokens per month would pay about $1,812 at the new rate. At GPT-5 input pricing, the same volume would cost $1.25 million. The deepseek price cuts on input tokens alone justify a migration for any high-volume text processing pipeline.
V4 Base Model: Full Range Reduction for General-Purpose Tasks
Beyond the Pro tier, DeepSeek also permanently reduced pricing on the standard V4 model. The new range runs from $0.003625 per million tokens on the low end to $0.87 on the high end. Previously, the range was $0.0145 to $3.48. That represents a broad compression of more than 75 percent across all usage types.
The base V4 model is designed for tasks that do not require the full reasoning capability of the Pro variant. Many internal chatbots, content classifiers, and data extraction tools run well on the base model. The permanent pricing cut means teams no longer have to choose between a cheaper model that lacks context and a more expensive one that strains the budget. They can default to the base V4 and pay rates that were unthinkable a year earlier.
For startups that process millions of tokens daily, the difference between $3.48 and $0.87 per million output tokens can mean the difference between a profitable product and a loss leader. Permanent pricing provides the predictability needed to build a business model around AI inference.
One-Million-Token Context Window at Reduced Rates
DeepSeek marketed the V4 models as welcoming an “era of cost-effective 1M context length.” Supporting a one-million-token context window is technically demanding. The cost per token typically rises as the context lengthens because the model must attend to more positions. DeepSeek decided not to pass that premium on to customers.
The permanent price cut applies fully to long-context usage. There is no surcharge for exceeding a certain context length. A developer who feeds a 500,000-token document into the prompt pays the same per-token rate as someone sending a short query. That is rare among frontier models. OpenAI and Anthropic both charge escalating rates for extended contexts. DeepSeek’s flat pricing for up to one million tokens effectively cuts the cost of long-document analysis by a factor of four or more compared to the closest competitor.
Legal teams reviewing contracts, medical researchers parsing clinical trial documents, and software engineers analyzing entire codebases can now do so without worrying about context-length price spikes. This is the quietest of the deepseek price cuts, but for niche applications, it is the most transformative.
Enterprise High-Volume Tiers: Implicit Structural Discounts
The fifth permanent price cut is not a specific number. It is the structural shift in how DeepSeek approaches enterprise pricing. By making the published rates the permanent standard, DeepSeek eliminated the need for negotiated discounts. Any enterprise that consumes millions of tokens per day automatically receives the lowest rates without signing a volume commitment.
In contrast, OpenAI and Anthropic typically reserve their best pricing for customers who commit to annual spending targets. Those targets often run into the millions of dollars. DeepSeek’s published rates are now so low that even a small startup can access the same per-token cost as a Fortune 500 company. The implicit discount comes from not having to negotiate, not having to lock in spending, and not having to worry about price increases at renewal time.
You may also enjoy reading: 7 Reasons This Gorgeous Lego Set Brings Disneyland Home.
Salesforce projects $300 million in Anthropic token spending this year. At DeepSeek’s new permanent pricing, the same volume of tokens would cost roughly $30 million to $40 million, depending on the mix of input and output. The savings potential for large enterprises is enormous. The permanent nature of the cuts means those savings are not a one-time promotion. They are the new baseline.
Why These Cuts Are Different from Temporary Discounts
Price promotions are common in the AI industry. Anthropic and OpenAI have both run limited-time offers. The difference with DeepSeek’s approach is the permanence. A temporary discount creates urgency but also uncertainty. Enterprises cannot build long-term infrastructure around pricing that may vanish in a month. The permanent lock-in changes that dynamic.
DeepSeek signalled that it is willing to forego short-term per-unit profit for market share. The company appears to be betting that volume growth will compensate for lower margins. If that bet pays off, it will force competitors to respond with their own permanent reductions, accelerating the commoditisation of AI tokens.
The deepseek price cuts also differ in scope. Instead of lowering the price of a single model, DeepSeek compressed the entire pricing tier structure. Every model, every usage type, and every context length benefits from the reduction. That makes the cost comparison straightforward. A buyer does not have to wonder whether the discount applies to input or output, short or long context. It applies to everything.
Impact on Competitors: Margins Under Pressure
Anthropic’s annualised revenue surged from $9 billion to $30 billion between the end of 2025 and early April 2026. Much of that growth came from enterprise adoption of Claude Code. DeepSeek’s permanent price cuts threaten the revenue-per-token economics that support that trajectory.
If enterprise customers begin routing low-complexity tasks to DeepSeek while reserving Claude for high-stakes reasoning, Anthropic’s token volume could hold steady while revenue per token declines. That would compress margins even as usage grows. The accusation of “distillation attacks” from Anthropic highlights the tension. DeepSeek has not addressed the claim publicly. If substantiated, it would mean that some of DeepSeek’s capability advantage was built on Claude’s responses, making the price differential a form of intellectual property arbitrage rather than pure engineering efficiency. Unresolved, the allegation hangs over any cost comparison.
OpenAI faces a similar challenge. Its GPT-5 charges $2.50 input and $10 output per million tokens. DeepSeek’s rates are roughly 10 to 40 times cheaper across the board. OpenAI has pivoted toward consumer platform features, including personal finance tools and advertising. That suggests a recognition that API token revenue alone may not sustain its $852 billion valuation. DeepSeek’s cuts make that recognition more urgent.
Google’s Gemini 3.5 Flash, its cost-optimised model, charges $0.15 input and $0.60 output. DeepSeek’s input price is about 40 times cheaper. Even Google’s margin headroom disappears when the cheapest competitor is an order of magnitude lower.
Geopolitical Risk: The Hidden Cost of Low Prices
The question for enterprise buyers is whether DeepSeek’s model quality, reliability, and compliance posture justify the switch. The price advantage is clear. But the geopolitical risk of routing sensitive workloads through a Chinese AI provider may offset the savings for regulated industries.
A compliance officer in a financial institution faces a difficult calculus. The cost of using DeepSeek could be a fraction of what the bank pays for GPT-5. Yet data sovereignty requirements, export control regulations, and the possibility of government-mandated data access create legal exposure that no spreadsheet captures. The cheapest option is also the one with the most geopolitical complexity.
The calculus varies by industry. A small startup processing publicly available text data may face minimal risk. A defence contractor handling classified information would face prohibitive risk. For most mid-size enterprises operating in non-sensitive sectors, the trade-off may tilt toward DeepSeek, especially for non-critical workloads.
The era of high-margin AI tokens may be ending faster than anyone expected. DeepSeek’s permanent price cuts are not just a pricing decision. They are a declaration that the cost of intelligence is about to become a commodity.






