DeepSeek Unveils Open‑Source V4 Models, Claiming Lead in Coding Benchmarks and Low‑Cost Token Pricing

Digital Trends

Key Points

  • DeepSeek released two open‑source LLMs: V4‑Pro (1.6 T total parameters) and V4‑Flash (284 B total parameters).
  • Both models support a one‑million token context window, a rare feature in the industry.
  • V4‑Pro achieved a Codeforces rating of 3,206, surpassing GPT‑5.4 and Gemini in coding benchmarks.
  • On LiveCodeBench, V4‑Pro scored a 93.5% pass rate, outpacing Claude Opus 4.6 and Gemini.
  • V4‑Flash matches V4‑Pro on simple agent tasks while using less compute.
  • Pricing set at $3.48 per million output tokens, dramatically lower than OpenAI’s $30 and Anthropic’s $25.
  • Models are available on Hugging Face for local deployment; V4‑Pro requires substantial VRAM.
  • Claude Opus 4.6 still leads on long‑context retrieval, and GPT‑5.4 leads on Terminal Bench 2.0.

Chinese AI firm DeepSeek released two new large language models, V4‑Pro and V4‑Flash, both featuring a one‑million token context window and open‑source licenses on Hugging Face. V4‑Pro, a 1.6‑trillion‑parameter model, outperformed leading U.S. models on coding and agentic benchmarks, while V4‑Flash matched it on simpler agent tasks at a fraction of the compute cost. DeepSeek also set a price of $3.48 per million output tokens, dramatically undercutting OpenAI's and Anthropic's rates and positioning the models as cost‑effective alternatives for developers.

DeepSeek, the Hangzhou‑based artificial‑intelligence startup, announced on April 24 that it is making two new large language models publicly available. The company calls the offerings V4‑Pro, an "Expert" mode with 1.6 trillion total parameters and 49 billion active ones, and V4‑Flash, an "Instant" mode that runs on 284 billion total parameters and 13 billion active ones. Both models support a one‑million token context window, a capability rarely seen outside a handful of proprietary systems.

Unlike most cutting‑edge models, DeepSeek released the code and weights on Hugging Face, allowing developers to download and run the models on their own hardware. While V4‑Flash can operate on more modest GPU setups, V4‑Pro demands substantial VRAM, reflecting its larger scale. The open‑source stance marks a clear departure from the closed‑source approach of competitors such as OpenAI, Google and Anthropic.
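A back‑of‑envelope estimate illustrates why V4‑Pro's scale translates into heavy VRAM demands even though only a fraction of its parameters are active per token. The sketch below uses the parameter counts reported above; the 8‑bit weight assumption is illustrative and not stated in the article.

```python
def weight_memory_gb(total_params: float, bytes_per_param: float = 1.0) -> float:
    """Rough GPU memory needed just to hold the model weights.

    Mixture-of-experts models activate only a subset of parameters per
    token, which reduces compute, but every expert's weights must still
    be resident in memory. bytes_per_param is an assumption: 1.0 for
    8-bit weights, 2.0 for FP16/BF16.
    """
    return total_params * bytes_per_param / 1e9

# Parameter counts reported in the article.
v4_pro_total = 1.6e12    # 1.6 T total parameters
v4_flash_total = 284e9   # 284 B total parameters

print(f"V4-Pro  @ 8-bit: ~{weight_memory_gb(v4_pro_total):,.0f} GB")   # ~1,600 GB
print(f"V4-Flash @ 8-bit: ~{weight_memory_gb(v4_flash_total):,.0f} GB")  # ~284 GB
```

Even before activations and the key‑value cache for a million‑token context (which this estimate ignores), weights alone put V4‑Pro well beyond a single consumer GPU, consistent with the article's note that smaller teams may struggle to deploy it.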

Benchmark Results and Pricing

In a series of public benchmarks, V4‑Pro posted a Codeforces rating of 3,206, edging out GPT‑5.4’s 3,168 and Google Gemini’s 3,052, making it the strongest open model for competitive‑programming tasks. On LiveCodeBench, the model achieved a 93.5% pass rate, surpassing Claude Opus 4.6’s 88.8% and Gemini’s 91.7%. For agentic workloads, V4‑Pro scored 51.8 on Toolathlon, again beating Claude (47.2) and Gemini (48.8). V4‑Flash matched V4‑Pro on simpler agent tasks while consuming far less compute.

DeepSeek’s models did not dominate every category. Claude Opus 4.6 retained the lead on long‑context retrieval, scoring 92.9 on the MRCR 1M benchmark versus V4‑Pro’s 83.5. Likewise, OpenAI’s GPT‑5.4 remained ahead on Terminal Bench 2.0, posting 75.1 compared with V4‑Pro’s 67.9.

The pricing announcement drew particular attention. DeepSeek set the cost of V4‑Pro at $3.48 per million output tokens, a fraction of OpenAI’s $30 and Anthropic’s $25 for comparable usage. The company argues that the price gap could make its models attractive to developers building AI‑powered applications, especially those needing extensive context windows.
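The scale of that price gap is easy to quantify. The sketch below applies the per‑million output‑token rates quoted in the article to an illustrative workload; the 10‑million‑token volume is an assumption chosen for the example, and the comparison covers output tokens only.

```python
def output_cost_usd(tokens: int, price_per_million: float) -> float:
    """Cost of generating `tokens` output tokens at a per-million-token rate."""
    return tokens / 1_000_000 * price_per_million

# Per-million output-token prices cited in the article.
prices = {
    "DeepSeek V4-Pro": 3.48,
    "OpenAI": 30.00,
    "Anthropic": 25.00,
}

tokens = 10_000_000  # illustrative monthly volume (assumed, not from the article)
for provider, rate in prices.items():
    print(f"{provider}: ${output_cost_usd(tokens, rate):,.2f}")
```

At that volume the gap is roughly $34.80 versus $300 and $250, so a workload's output‑token bill shrinks by almost an order of magnitude, which is the economics behind DeepSeek's pitch to developers.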

Industry observers note that the combination of open‑source availability, strong performance on coding and agentic tasks, and aggressive pricing could shift the dynamics of the large‑model market. However, the hardware requirements for V4‑Pro may limit adoption among smaller teams lacking high‑end GPU clusters.

DeepSeek’s move underscores a broader trend toward democratizing access to powerful AI models. By publishing the weights and offering a low‑cost pricing tier, the firm hopes to spur innovation across the developer community while challenging the dominance of closed‑source providers.

Tags: artificial intelligence, large language models, open source, DeepSeek, AI benchmarking, token pricing, coding AI, machine learning, competitive programming, Hugging Face
Generated with News Factory - Source: Digital Trends
