DeepSeek slashes V4‑Pro API prices by 75% and cuts cache fees to one‑tenth

The Next Web

Key Points

  • DeepSeek offers a 75% promotional discount on its V4‑Pro model until May 5, 2026.
  • Cache‑hit fees across the entire API are reduced to 10% of previous rates.
  • At full price, V4‑Pro costs $0.145 per million input tokens, already cheaper than GPT‑5.5, Gemini 3.1 Pro, and Claude Opus 4.7.
  • The model supports a 1 million‑token context window and runs on Huawei Ascend 950 and Cambricon chips.
  • The pricing move follows U.S. accusations of Chinese AI model distillation and comes after similar price cuts by OpenAI, Anthropic and Google.
  • Analysts cite V4‑Pro’s architecture as a potential turning point for long‑context AI applications.
  • DeepSeek aims to attract developers, startups and small enterprises with low‑cost, open‑weight AI.

DeepSeek announced a 75% promotional discount on its new V4‑Pro model and reduced cache‑hit charges across its entire API to 10% of previous rates. The price cut, effective immediately and running through May 5, 2026, makes the model cheaper than OpenAI, Anthropic and Google offerings even at full price. The move intensifies a pricing battle amid U.S. accusations that Chinese firms are distilling American AI models at scale, positioning DeepSeek as a low‑cost alternative for developers and enterprises.

DeepSeek unveiled a sweeping price reduction on Monday, offering a 75% discount on its flagship V4‑Pro model and trimming cache‑hit fees across the whole API suite to one‑tenth of earlier levels. The promotional rates apply until May 5, 2026, and are live immediately.

At standard pricing, V4‑Pro already costs $0.145 per million input tokens and $3.48 per million output tokens, undercutting OpenAI’s GPT‑5.5, Google’s Gemini 3.1 Pro, and Anthropic’s Claude Opus 4.7 on a per‑token basis. The new discount drives the input‑token price down to roughly $0.036 per million tokens, a stark contrast to its rivals.
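The discount arithmetic can be verified directly. The figures below are the ones reported in the article; the sketch just applies the stated 75% cut:

```python
# Promotional discount math for V4-Pro input tokens (figures from the article).
full_price_per_m_input = 0.145   # USD per million input tokens at standard pricing
discount = 0.75                  # 75% promotional discount, running through May 5, 2026

promo_price_per_m_input = full_price_per_m_input * (1 - discount)
print(f"Discounted input price: ${promo_price_per_m_input:.3f} per million tokens")
# 0.145 * 0.25 = 0.03625, i.e. roughly $0.036 per million tokens
```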

The cache‑hit price cut targets frequent users and enterprise developers who send repetitive requests—a dominant pattern in production‑grade, agentic applications. By charging only ten percent of the previous cache rate, DeepSeek aims to lower the total cost of running large‑context workloads.
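A rough sketch of what the cache-fee cut means for a repetitive workload. The article does not publish the cache-hit rate itself, so `old_cache_price` and the workload numbers below are hypothetical placeholders; only the "reduced to 10% of previous rates" relationship comes from the source:

```python
def monthly_input_cost(total_m_tokens, cache_hit_fraction,
                       fresh_price, old_cache_price):
    """Estimate monthly input-token spend before and after the cache-fee cut.

    Prices are USD per million tokens. Per the announcement, the new
    cache-hit price is 10% of the old one.
    """
    cached = total_m_tokens * cache_hit_fraction
    fresh = total_m_tokens - cached
    before = fresh * fresh_price + cached * old_cache_price
    after = fresh * fresh_price + cached * old_cache_price * 0.10
    return before, after

# Hypothetical workload: 1,000M input tokens/month, 60% served from cache.
# The $0.014/M cache price is a placeholder, not a published figure.
before, after = monthly_input_cost(1000, 0.60, 0.145, 0.014)
```

The savings grow with the cache-hit fraction, which is why the cut is aimed at production agentic workloads, where prompts repeat heavily.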

V4‑Pro, released last Friday, is a mixture‑of‑experts model with 1.6 trillion total parameters and 49 billion active parameters per task. It supports a 1 million‑token context window, enabling developers to process extensive codebases or long documents without splitting calls. The model runs on Huawei’s Ascend 950 chips and Cambricon hardware rather than Nvidia GPUs, a design choice that could reshape the AI hardware landscape.
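To gauge whether a codebase actually fits in a single 1 million-token call, a back-of-the-envelope check is enough. The ~4 characters-per-token heuristic below is a rough rule of thumb for English-heavy text, not a property of DeepSeek's tokenizer, and the helper name is illustrative:

```python
import os

CONTEXT_WINDOW = 1_000_000  # tokens, per the V4-Pro announcement
CHARS_PER_TOKEN = 4         # rough heuristic; the real tokenizer will differ

def fits_in_one_call(root, exts=(".py", ".md")):
    """Roughly estimate whether a codebase fits the 1M-token window."""
    total_chars = 0
    for dirpath, _, files in os.walk(root):
        for name in files:
            if name.endswith(exts):
                try:
                    with open(os.path.join(dirpath, name),
                              encoding="utf-8", errors="ignore") as f:
                        total_chars += len(f.read())
                except OSError:
                    continue  # skip unreadable files
    est_tokens = total_chars // CHARS_PER_TOKEN
    return est_tokens, est_tokens <= CONTEXT_WINDOW
```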

Industry observers note the strategic timing. The discount arrives just days after the White House warned that foreign entities, primarily in China, were conducting “industrial‑scale” distillation of U.S. AI models. While DeepSeek was not named in the memo, the company has faced accusations from OpenAI and Anthropic of model distillation. Rather than engage directly, DeepSeek answered with aggressive pricing, signaling confidence that cost competitiveness, open‑weight access, and long‑context capability will win developers over.

Analysts highlight the broader impact. Zhang Yi of iiMedia called V4’s architecture a “genuine inflection point” for ultra‑long‑context AI, predicting rapid adoption beyond research labs. Wei Sun of Counterpoint Research added that using domestic chips reduces reliance on Nvidia and could accelerate both Chinese and global AI deployment.

DeepSeek’s pricing strategy follows a pattern set earlier this year when its R1 model entered the market at a fraction of OpenAI’s cost. The company continues to pair open‑source model availability with aggressive API rates, aiming to remove both access and cost barriers for startups, small enterprises, and individual developers. Akshar Keremane, co‑founder of Bangalore‑based AI startup O‑Health, described the combination as lowering entry hurdles for “developers, startups and small enterprises.”

U.S. AI providers have also been trimming prices. OpenAI, Anthropic and Google have each adjusted their API fees in recent months. DeepSeek’s latest move stands out for its scale—a 75% promotional cut layered on an already low‑priced model—while the timing coincides with the rollout of OpenAI’s GPT‑5.5 and heightened geopolitical tension over AI technology transfer.

For developers weighing API options, DeepSeek now offers a compelling proposition: a high‑parameter, long‑context model at a cost that undercuts the leading Western alternatives, bundled with reduced cache fees that further shrink operational expenses. Whether the price war will drive broader industry consolidation or spark new innovation remains to be seen, but the immediate effect is clear—DeepSeek is positioning itself as the most affordable gateway to frontier AI performance.

#DeepSeek #AI-pricing #V4-Pro #API-costs #large-language-model #machine-learning #cloud-computing #hardware #geopolitics #developer-tools
Generated with News Factory - Source: The Next Web
