AI Central

The February Model Blitz

Every major lab shipped this month, but DeepSeek has yet to join them.

Jordamøn
Feb 23, 2026

The first three weeks of February 2026 have delivered what may be the most compressed stretch of frontier model releases the AI industry has yet seen. OpenAI launched GPT-5.3-Codex on February 5, a model so capable at cybersecurity tasks that the company rated it “High” risk under its own Preparedness Framework and rolled it out with unusually tight controls. Anthropic followed with Claude Opus 4.6 on February 5 and then Claude Sonnet 4.6 just twelve days later, the latter closing the gap between Anthropic’s premium and workhorse tiers so aggressively that some early users preferred it to last year’s Opus. In China, Zhipu AI released GLM-5 on February 11, a 744-billion-parameter open-source model trained entirely on Huawei Ascend chips, which promptly sent the company’s Hong Kong shares up 34%. ByteDance updated Doubao. MiniMax dropped two models that claim near-frontier performance at a twentieth of Claude Opus pricing.

But DeepSeek, the Hangzhou startup whose R1 launch triggered a trillion-dollar market selloff in January 2025, has released nothing.

That silence is the most interesting thing about February’s release calendar. Reuters reported in January, citing The Information’s sources with direct knowledge of the project, that V4 was targeting a mid-February release around the Lunar New Year holiday on February 17. The timing would have mirrored R1’s strategy of launching during a major holiday for maximum visibility. Research papers published in early January on Engram conditional memory and manifold-constrained hyper-connections were immediately read as V4’s architectural blueprints. Internal benchmarks were said to show coding performance exceeding both Claude and GPT. The anticipation infrastructure was enormous: tracker pages, GitHub watchers, integration checklists, even a dedicated fan site at deepseekv4.app.

February 17 came and went. Then the 18th, the 19th, the weekend. As of this morning, there is no V4. DeepSeek remains characteristically taciturn.

The company did make one move during the window: on February 11, the same day Zhipu launched GLM-5, DeepSeek expanded its V3 model’s context handling tenfold. That looked less like a flagship launch and more like competitive positioning to avoid ceding the news cycle entirely. It is, in any case, a far cry from a new generation.

There are several possible readings. The most boring is that V4 simply slipped, which is not unusual in the industry and which DeepSeek’s own history makes plausible, given the already long gap between V3 and whatever follows it. The more interesting reading is that DeepSeek is watching the competitive field fill in and recalibrating. When the reporting surfaced in January, GPT-5.3-Codex had not yet launched, Sonnet 4.6 did not exist, and GLM-5 was still a rumor. The bar that V4 would need to clear on arrival has risen considerably since then, and DeepSeek, whose whole brand is built on exceeding expectations, may prefer to wait rather than land amid a crowded field with merely competitive numbers.

The models shipped by every other lab tell their own story. The common thread across almost every release this month is not general intelligence or reasoning but coding and agentic capability. GPT-5.3-Codex is OpenAI’s most specialized model yet, built explicitly for long-running development tasks where the AI operates autonomously for hours, debugging its own training runs, managing deployments, and writing code that builds more of itself. OpenAI’s blog post noted, with something approaching wonder, that the model was instrumental in creating itself. Anthropic’s Sonnet 4.6 similarly emphasizes coding and computer use; its scores on OSWorld, a benchmark for navigating software interfaces the way a human would, have improved steadily across sixteen months of Sonnet releases. GLM-5 was pitched explicitly as moving AI from “vibe coding” to what Zhipu calls “agentic engineering.”

This convergence on coding is not coincidental. Writing software is one of the few domains where AI capability translates directly into measurable economic value on a per-task basis, which is what matters when the industry is under pressure to justify the hundreds of billions flowing into infrastructure. It is also the domain where models can most obviously improve themselves, a feedback loop that every lab is now leaning into openly. When OpenAI says GPT-5.3-Codex helped build GPT-5.3-Codex, that is not marketing whimsy; it is a description of a development process that is becoming standard.

The other pattern worth noting is speed of commodification. Anthropic released Opus 4.6 on February 5 as its premium flagship. Twelve days later, it released Sonnet 4.6 at one-third the price, with performance that approaches Opus on real-world office tasks. MiniMax’s M2.5 models claim near-state-of-the-art at a twentieth of Opus pricing. GLM-5 is available for roughly a sixth of what Opus costs per token. The shelf life of a frontier model’s pricing premium is now measured in days, not months, and the labs that hold proprietary advantages are compressing their own product lines faster than anyone else can.

© 2026 Infogalactic AG