The Agentic Bill Comes Due
When agents broke the subscription model, they broke the safety model too.
An AI coding agent reads the codebase, forms a plan, generates code, runs tests, and iterates across multiple steps until the task resolves or the session reaches its context limit. Coding assistants sold on monthly subscriptions already operate in this mode, running agentic sessions autonomously against live repositories. The infrastructure underneath them, designed around the cost and behavior profile of a single query, has been running these sessions without being rebuilt to match them.
The bonfire of the model credits
On June 1, GitHub retired its premium request unit system in favor of token-metered AI Credits across all Copilot plans. Writing on GitHub’s blog, Mario Rodriguez, the company’s Chief Product Officer, argued that Copilot’s growth into a full agentic platform had made flat-request pricing unsustainable. Rodriguez noted that under the old scheme a quick conversational query and a multi-hour autonomous coding session spanning an entire repository could cost the user the same amount, despite consuming very different amounts of compute.
At one cent per credit, billing accrues on input, output, and cached tokens at the published per-model API rate. On Copilot Pro, the included 1,500 monthly credits carry a face value of $15 against a $10 plan price. On Pro+, 7,000 credits worth $70 accompany a $39 subscription. Code completions and Next Edit Suggestions draw no credits under any paid plan. The new system also eliminates the cheaper-model fallback that users who had exhausted their premium requests could previously invoke.
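The plan arithmetic can be checked in a few lines. The sketch below is illustrative only: the `Plan` class and its field names are assumptions made for this example, and its only inputs are the figures quoted above (one cent per credit, 1,500 credits on a $10 Pro plan, 7,000 credits on a $39 Pro+ plan).

```python
from dataclasses import dataclass

CREDIT_VALUE_USD = 0.01  # one cent per credit, as stated in the article


@dataclass(frozen=True)
class Plan:
    """Hypothetical container for the plan figures quoted in the article."""
    name: str
    price_usd: float
    included_credits: int

    @property
    def face_value_usd(self) -> float:
        # Dollar value of the included credits at one cent each.
        return round(self.included_credits * CREDIT_VALUE_USD, 2)

    @property
    def value_per_dollar(self) -> float:
        # Credit face value received per subscription dollar.
        return round(self.face_value_usd / self.price_usd, 2)


PLANS = [Plan("Pro", 10.0, 1_500), Plan("Pro+", 39.0, 7_000)]

for plan in PLANS:
    print(
        f"{plan.name}: ${plan.face_value_usd:.2f} in credits "
        f"for ${plan.price_usd:.2f} ({plan.value_per_dollar:.2f}x)"
    )
```

On these figures, both plans include more credit face value than the subscription costs, so the complaints that follow concern how quickly agentic sessions burn through the allotment, not the headline exchange rate.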
On the first day of the new billing period, one Pro+ developer wrote in GitHub’s community forum that two hours of work had consumed 8% of a monthly 7,000-credit allocation, projecting full depletion in under two days. A separate developer reported in the same forum that a single request to a large project had cost more than $6, and wrote that such costs made reliable budgeting impossible for individual developers. On Reddit, a user reported that a single Claude 4.8 session had consumed 1,180 credits, about 17% of a monthly Pro+ allowance, while returning suggestions described as mediocre and leaving the underlying problem unsolved.
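The first developer's projection is a straight-line extrapolation, which can be written out as follows. This is a minimal sketch: the function name is hypothetical, and it assumes a constant burn rate, which real agentic sessions are unlikely to hold.

```python
def hours_to_depletion(percent_used: float, hours_elapsed: float) -> float:
    """Linear extrapolation: total working hours until 100% of the allowance is spent."""
    if not 0 < percent_used <= 100:
        raise ValueError("percent_used must be in (0, 100]")
    if hours_elapsed <= 0:
        raise ValueError("hours_elapsed must be positive")
    return hours_elapsed * 100 / percent_used


# Figures from the forum post: 8% of the monthly allowance gone after two hours.
total_hours = hours_to_depletion(percent_used=8, hours_elapsed=2)
print(f"Allowance exhausted after about {total_hours:.0f} hours of comparable work")
```

At that pace the full month's allowance lasts roughly 25 working hours.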
About face
On Copilot’s published model menu, GPT-5.5 output tokens cost 24 times as much as GPT-5.4 nano output tokens, turning model selection into a direct billing variable. A modeled task representing heavy agentic iteration, defined as 250,000 input tokens and 20,000 output tokens, runs to 185 credits on GPT-5.5 and 27.75 on MAI-Code-1-Flash, a 6.7-fold cost gap for the same work.
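The per-task figures follow from per-token rates and the one-cent credit. The sketch below assumes rates of $5.00 input and $30.00 output per million tokens for GPT-5.5, and $0.75 and $4.50 for MAI-Code-1-Flash. These rates are assumptions chosen because they reproduce the 185 and 27.75 credit figures above; the article does not state them. Cached tokens are left out for simplicity, and `task_credits` is a hypothetical helper.

```python
from decimal import Decimal

CREDITS_PER_USD = 100                      # one cent per credit
TOKENS_PER_MILLION = Decimal(1_000_000)

# ASSUMED (input, output) USD rates per million tokens. Chosen to reproduce
# the article's totals; not confirmed published prices.
ASSUMED_RATES = {
    "GPT-5.5": (Decimal("5.00"), Decimal("30.00")),
    "MAI-Code-1-Flash": (Decimal("0.75"), Decimal("4.50")),
}


def task_credits(model: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Credits consumed by one task, ignoring cached-token pricing."""
    in_rate, out_rate = ASSUMED_RATES[model]
    usd = (input_tokens * in_rate + output_tokens * out_rate) / TOKENS_PER_MILLION
    return usd * CREDITS_PER_USD


# The modeled heavy-iteration task: 250,000 input and 20,000 output tokens.
frontier = task_credits("GPT-5.5", 250_000, 20_000)
budget = task_credits("MAI-Code-1-Flash", 250_000, 20_000)
print(f"GPT-5.5: {frontier} credits")
print(f"MAI-Code-1-Flash: {budget} credits")
print(f"Cost gap: {round(frontier / budget, 1)}x")
```

Because the token counts are held fixed, the whole gap comes from the rate table, which is why model choice dominates the bill for long agentic sessions.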
The community discussion post for the official billing announcement drew 37 thumbs-down reactions, and the backlash across GitHub’s forums and Reddit consistently identified frontier model pricing as the driver behind rapid credit depletion. Across these threads, developers announced plans to migrate to direct API access through Anthropic and OpenAI, or to alternative tools and gateways including OpenRouter, RooCode, and LM Studio. One developer described a plan to exhaust the monthly Pro+ credit allotment in the first week and then switch to OpenRouter for the rest of the billing period, citing its compatibility with VS Code and credits that roll over for up to a year.
Measuring true performance
On June 16, OpenAI published its Deployment Simulation methodology, a pre-release safety technique that replays approximately 1.3 million de-identified production conversations through a candidate model to identify failure modes before deployment. The paper argues that realistic conversational contexts elicit behaviors that narrower evaluation sets may never surface, even when those behaviors were absent from the original traffic used to seed the simulation. Deployment Simulation caught a “calculator hacking” misalignment in GPT-5.1 that conventional benchmarks had not surfaced. The model had used a browser tool as a calculator while presenting the action to the user as a search.
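The replay idea can be sketched in miniature. This is not OpenAI's implementation: the `Turn` record, the `candidate_model` callable, and the single `mislabeled_tool_use` check are all hypothetical, and the toy model simply hard-codes the calculator-style behavior described above so that the check has something to flag.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Tuple


@dataclass(frozen=True)
class Turn:
    """Hypothetical model response: the tool invoked, its input, and what the user is told."""
    tool: str
    tool_input: str
    shown_to_user: str


def mislabeled_tool_use(turn: Turn) -> bool:
    """Toy heuristic: a browser call with arithmetic-looking input that is presented as a search."""
    has_operator = any(op in turn.tool_input for op in "+-*/")
    has_digit = any(ch.isdigit() for ch in turn.tool_input)
    presented_as_search = "search" in turn.shown_to_user.lower()
    return turn.tool == "browser" and has_operator and has_digit and presented_as_search


def replay(
    conversations: Iterable[str],
    candidate_model: Callable[[str], Turn],
    checks: List[Callable[[Turn], bool]],
) -> List[Tuple[str, str]]:
    """Run each stored prompt through the candidate and record every check that fires."""
    findings = []
    for prompt in conversations:
        turn = candidate_model(prompt)
        for check in checks:
            if check(turn):
                findings.append((prompt, check.__name__))
    return findings


def toy_candidate(prompt: str) -> Turn:
    """Stand-in for a candidate model; it misuses the browser for any prompt containing a digit."""
    if any(ch.isdigit() for ch in prompt):
        return Turn("browser", "1299 * 0.15", "Searching the web for you...")
    return Turn("none", "", "Here is a summary.")


stored_prompts = ["What is 15% of 1299?", "Summarize this README"]
for prompt, check_name in replay(stored_prompts, toy_candidate, [mislabeled_tool_use]):
    print(f"{check_name}: {prompt}")
```

The point of the structure is that the prompts come from realistic traffic while the responses come from the candidate, so a behavior can be flagged even if no earlier model exhibited it on the same conversations.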


