The Middle Kingdom Moves
DeepSeek V4 has launched on Chinese silicon explicitly targeting Western integration, triggering immediate backlash from Washington.
DeepSeek released V4 in preview on April 24, the Hangzhou lab’s first major model since R1 triggered a trillion-dollar market selloff in January 2025. V4 comes in two variants: V4-Pro, a mixture-of-experts architecture with 1.6 trillion total parameters and 49 billion active per task, and V4-Flash, a lighter model at 284 billion total parameters with 13 billion active. Both support a one-million-token context window, ship under MIT licensing, and are available through an API that serves two formats natively. The model’s headline benchmarks place it near but below the US frontier, trailing GPT-5.4 and Claude Opus 4.6 on DeepSeek’s own numbers and falling further behind on independent assessments. The product decisions surrounding the release tell a more specific story than the benchmarks do.
Juggling APIs
The first thing V4’s documentation advertises is API format compatibility. The model serves both the OpenAI ChatCompletions and Anthropic Messages formats natively through api.deepseek.com. Existing OpenAI-SDK callers can switch to V4 by changing the base URL and model string, with no SDK replacement required. The Anthropic Messages support covers tool calls, thinking-mode passthrough, streaming responses, and tool results, with unsupported fields limited to an explicit short list: image and document content blocks, citations, cache-control hints, server-side tool use, and the MCP servers field.
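What "change the base URL and model string" amounts to can be sketched with the stdlib alone: the request body stays in OpenAI ChatCompletions format, and only the endpoint and model identifier change. The article confirms the api.deepseek.com host; the model string "deepseek-v4-pro" is a hypothetical placeholder, not a documented identifier.

```python
import json
import urllib.request

BASE_URL = "https://api.deepseek.com"  # was https://api.openai.com/v1

# Same ChatCompletions-shaped payload an OpenAI caller already sends.
payload = {
    "model": "deepseek-v4-pro",  # assumed name; was e.g. "gpt-5.4"
    "messages": [
        {"role": "user", "content": "Summarize this diff."},
    ],
}

req = urllib.request.Request(
    f"{BASE_URL}/chat/completions",
    data=json.dumps(payload).encode(),
    headers={
        "Authorization": "Bearer YOUR_DEEPSEEK_API_KEY",
        "Content-Type": "application/json",
    },
)
# urllib.request.urlopen(req) would send the request; omitted here.
print(req.full_url)
```

An SDK-based caller makes the same two changes through its client constructor rather than a raw request.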
DeepSeek’s documentation goes further than compatibility tables. The company publishes step-by-step integration guides for Claude Code, OpenClaw, OpenCode, and GitHub Copilot CLI, each with ready-made environment variable configurations. The Claude Code guide is the most revealing: it maps every model tier — Opus, Sonnet, Haiku, and subagent — to DeepSeek equivalents, treating V4-Pro as the primary model across the first three tiers and V4-Flash as the lightweight fallback for subagent tasks. A developer who follows these instructions has a working DeepSeek-powered Claude Code installation in under a minute.
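The tier mapping the guide describes reduces to a handful of environment variables. The variable names and model strings below are assumptions for illustration, not quotations from DeepSeek's published guide; only the host and the Pro/Flash split come from the article.

```shell
# Point Claude Code at DeepSeek's Anthropic-compatible endpoint.
# Variable names and model identifiers are assumed, not documented.
export ANTHROPIC_BASE_URL="https://api.deepseek.com/anthropic"
export ANTHROPIC_AUTH_TOKEN="YOUR_DEEPSEEK_API_KEY"

# Opus, Sonnet, and Haiku tiers all resolve to V4-Pro per the guide;
# the lightweight subagent tier falls back to V4-Flash.
export ANTHROPIC_MODEL="deepseek-v4-pro"
export ANTHROPIC_SMALL_FAST_MODEL="deepseek-v4-flash"
```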
A merciless undercut
V4-Pro’s standard pricing is $1.74 per million input tokens and $3.48 per million output tokens. For comparison, GPT-5.5 lists at $5/$30 and Claude Opus 4.7 at $15/$75. A 75% promotional discount running through May 31, 2026 drops V4-Pro’s effective rate to $0.435 per million input and $0.87 per million output, which makes the ratios difficult to express without sounding like a typo. V4-Flash, the lighter variant at 284 billion total parameters with 13 billion active, is cheaper still: $0.14 per million input and $0.28 per million output at standard rates, with no promotional discount needed to undercut every Western model in its class.
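The discount arithmetic is worth making explicit. The figures below are the list prices as stated above, with the 75% promotion applied to V4-Pro and Claude Opus 4.7 as the comparison point.

```python
# ($/M input, $/M output) at standard list prices.
V4_PRO = (1.74, 3.48)
OPUS = (15.0, 75.0)

# The 75% promotion leaves a quarter of the list price.
PROMO = tuple(p * 0.25 for p in V4_PRO)

print(PROMO)               # (0.435, 0.87), matching the stated rates
print(OPUS[0] / PROMO[0])  # roughly 34x cheaper on input
print(OPUS[1] / PROMO[1])  # roughly 86x cheaper on output
```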
Two days after launch, DeepSeek cut the cache-hit input price to one-tenth of the listed rate. Cache hits occur when a request reuses a prompt prefix that the system has already processed, and agentic coding loops with stable system prompts see cache-hit rates above 70%. The discount rewards exactly the prompt-reuse patterns that production coding agents generate at scale. At promotional pricing, an eight-hour autonomous coding run that costs $50–200 on Claude Opus 4.7 lands at $1.50–6 on V4-Pro.
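A back-of-envelope cost model shows why the cache discount matters for agentic workloads. The token volumes below are illustrative assumptions, and the cache-hit price is assumed to be one-tenth of the promotional input rate; only the list prices and the 70% cache-hit figure come from the reporting above.

```python
# V4-Pro promotional rates, $ per token.
PROMO_INPUT = 0.435 / 1e6
PROMO_OUTPUT = 0.87 / 1e6
CACHE_HIT = PROMO_INPUT / 10  # assumed: one-tenth of promo input rate

def run_cost(input_tokens, output_tokens, cache_hit_rate):
    """Dollar cost of a run, splitting input tokens by cache status."""
    hit = input_tokens * cache_hit_rate * CACHE_HIT
    miss = input_tokens * (1 - cache_hit_rate) * PROMO_INPUT
    return hit + miss + output_tokens * PROMO_OUTPUT

# Hypothetical eight-hour agentic run: 20M input tokens (mostly
# re-sent context that hits the cache), 1M output tokens.
cost = run_cost(20_000_000, 1_000_000, 0.70)
print(f"${cost:.2f}")  # → $4.09, inside the article's $1.50–6 range
```

With the same token volumes and a 0% cache-hit rate, the run would cost more than twice as much, which is the sense in which the discount rewards prompt-reuse patterns specifically.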
An infrastructural interloper
DeepSeek trained V4 on Huawei Ascend 950 and Cambricon chips, with Huawei publicly confirming on launch day that its Ascend clusters had supported the model. R1, the January 2025 release that put DeepSeek on the global map, ran on Nvidia H800 GPUs. V4 is DeepSeek’s first frontier-class release built on primarily domestic Chinese silicon, and the largest open-weight model — 1.6 trillion total parameters in the V4-Pro variant, with 49 billion active per task — to train on non-Nvidia hardware.
Jensen Huang framed the shift as a strategic threat before V4 even launched. On the Dwarkesh Podcast, the Nvidia CEO said that DeepSeek models optimized for Huawei chips would represent “a horrible outcome for our nation.” His concern centers on the CUDA ecosystem: Nvidia’s dominance rests on its software stack’s position as the default development environment for AI, and a Chinese lab demonstrating that frontier-adjacent models can train outside that stack weakens the argument for American hardware as an irreplaceable dependency.
The investment floodgates open
DeepSeek began talks in mid-April to raise at least $300 million at a valuation of at least $10 billion, its first external capital raise. Within days, investor demand pushed the valuation benchmark past $20 billion. Tencent proposed acquiring up to 20% of the company, a stake DeepSeek has resisted on grounds of control. Alibaba entered separate talks. The shares of both prospective investors slid in Hong Kong trading on the news, which suggests that the market read the investment less as a growth bet for Tencent and Alibaba than as a concession that DeepSeek had become too important to leave unfunded.
The valuation prices a company that distributes its models under MIT licensing, has no subscription tier, and undercuts every Western API on price. DeepSeek remains owned by Zhejiang High-Flyer Asset Management, the hedge fund whose co-founder Liang Wenfeng formed the lab in 2023, and the round would mark the first time either Tencent or Alibaba held a formal stake in China’s most closely watched AI company. The capital buys a position in a lab whose combination of technical capability, strategic hardware independence, and open distribution has no equivalent elsewhere in the Chinese AI ecosystem.
Freakout on cue
On April 23, the day before V4 launched, OSTP director Michael Kratsios issued a memo accusing foreign entities, principally in China, of running “industrial-scale” campaigns to extract capabilities from US frontier AI models through distillation. The next day, the State Department sent a diplomatic cable to every embassy and consulate worldwide, instructing diplomats to warn foreign governments about the risks of AI models derived from US proprietary systems. The cable named DeepSeek alongside Moonshot AI and MiniMax. It built on earlier allegations from Anthropic, which had reported that the three companies used 24,000 fraudulent accounts to conduct 16 million exchanges with its Claude model, and from OpenAI, which accused DeepSeek of similar activity in communications with US lawmakers.
On May 1, NIST’s Center for AI Standards and Innovation published an independent evaluation that measured V4-Pro against US frontier models using non-public benchmarks. CAISI found that V4’s capabilities lag the frontier by approximately eight months, a wider gap than DeepSeek’s self-reported results suggest, since the company’s own benchmarks had placed V4 roughly on par with GPT-5.4 and Claude Opus 4.6. The same evaluation found V4 more cost-efficient than the most competitive US reference model on five of seven benchmarks. The two findings sit comfortably together: V4 trails on raw capability and leads on cost efficiency, which is precisely the competitive position that its API compatibility and pricing were designed to exploit.
The siren call
V4 remains in preview, with no full-release date announced and the promotional pricing set to expire on May 31. DeepSeek’s response to the political escalation that surrounded the launch has followed the pattern that the company established with R1: cut the price, open the code, and leave developers to weigh economic gravity against political risk on their own timelines. The NIST evaluation confirmed that the capability gap remains real, and it confirmed that the cost advantage remains real. The history of enterprise software suggests that cost advantages at this scale tend to close capability debates faster than capability gaps close cost debates.