The Turn-Count Tax: Rethinking What I Pay for AI Coding

A model priced at $10 per million tokens can finish a coding task for half the cost of a $3 model, because it stops re-reading your codebase after two turns instead of ten. That math, plus a real audit of every AI subscription I pay for, decided my stack for the next year.

My Cursor annual plan renews soon, and the grandfathered unlimited Auto mode I've been coasting on dies with it. That deadline forced a question I had been avoiding. I pay for Cursor, two Claude accounts, Gemini Advanced, ChatGPT on and off, and API top-ups whenever something runs dry. Stacked up, it was drifting toward the price of a car payment, and I could not tell you which pieces were actually earning their keep.

So I did the research properly. Or rather, I made three deep research agents do it and merged their reports. Sources and decisions below.

What everything costs in June 2026

Cursor

Plan	Monthly	Annual billing	Usage vs Pro
Pro	$20	$16/mo	1x
Pro+	$60	$48/mo	3x
Ultra	$200	$160/mo ($1,920 upfront)	20x

Two details matter more than the sticker prices. Cursor now splits usage into two pools. Auto and Composer draw from a generous first pool, and manually picking a frontier model (Claude, GPT, Gemini) draws from a second pool billed at API rates. Annual billing knocks 20% off, but Ultra requires the full $1,920 upfront. There is no monthly Ultra.
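The annual figures in the table follow from the 20% discount. A small sketch of that arithmetic, using only the list prices above (the function name is just for illustration):

```python
# Cursor monthly list prices, taken from the table above.
MONTHLY_PRICES = {"Pro": 20, "Pro+": 60, "Ultra": 200}
ANNUAL_DISCOUNT = 0.20  # "Annual billing knocks 20% off"


def annual_billing(monthly_price: float, discount: float = ANNUAL_DISCOUNT) -> tuple[float, float]:
    """Return (effective monthly price, total for twelve months) under annual billing."""
    effective_monthly = monthly_price * (1 - discount)
    return effective_monthly, effective_monthly * 12


for plan, price in MONTHLY_PRICES.items():
    per_month, yearly_total = annual_billing(price)
    print(f"{plan}: ${per_month:.0f}/mo, ${yearly_total:,.0f} for the year")
```

The Ultra row comes out at $160/mo and $1,920 for the year, which is the upfront number that matters later.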

The grandfathered unlimited Auto plans ended for anyone who bought after September 15, 2025. For the rest of us they survive exactly until the current annual term ends. Renewal puts everyone on the credit system, so "renew to keep unlimited Auto" is no longer a reason to renew anything.

Composer is the reason to care about Cursor's pricing at all. Composer 2.5 runs $0.50 per million input tokens and $2.50 output on Standard, $3 and $15 on Fast, and it only exists inside Cursor. On Artificial Analysis's Coding Agent Index it placed third overall behind Opus in Claude Code and GPT-5.5 in Codex, at roughly $0.07 per completed task on Standard against $4.10 and $4.82 for the two agents ahead of it. I've been telling people Composer is the best bang for the buck in coding agents. The benchmark agrees, by a factor of about sixty.

Claude

Plan	Price	Annual discount?	Rough capacity per 5-hr window
Pro	$20/mo	Yes ($17/mo annual)	~45 messages
Max 5x	$100/mo	No, monthly only	~225 messages
Max 20x	$200/mo	No, monthly only	~900 messages

Important: the Max tiers have no annual discount. The roughly $200/yr deal on Claude Pro ($17/mo billed annually) is the annual rate for the lowest tier only. Once you step up to Max 5x or 20x, it's month-to-month at the full price, permanently.

Anthropic does not publish token budgets, so those message counts are community measurements. Limits run on a rolling 5-hour window plus weekly caps, shared across Claude Code, Cowork, and the chat apps. The caps got tighter earlier this year with peak-hour throttling on weekday mornings. Then in May 2026, Anthropic reversed course and roughly doubled the Max plan session limits. If Max 5x felt too tight before, it's meaningfully more usable now.

Fable 5 launched June 9 at $10 in and $50 out per million on the API, free inside paid plans through June 22, and it burns roughly twice the quota that Opus does on a subscription.

The question I actually had was whether the $100 and $200 tiers beat paying for API usage. For interactive work it is not close. Kyle Redelinghuys tracked eight months of daily Claude Code use, around ten billion tokens, and the API-equivalent bill came to over $15,000 against roughly $800 in Max fees. Alexey Pelykh measured a single ordinary day at about $200 of API-equivalent usage and concluded the 20x in Max 20x is literal arithmetic. If you run Claude Code daily, the subscription is the discount. The API is what you use when automation or pipelines need a key.
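The arithmetic behind those two data points is short enough to write down. This is a rough sketch using only the figures quoted above. The helper names are made up for illustration, and "over $15,000" is treated as exactly $15,000, so the multiple is a floor.

```python
def breakeven_days(subscription_per_month: float, api_cost_per_day: float) -> float:
    """Days of use per month at which API spend matches the subscription fee."""
    return subscription_per_month / api_cost_per_day


def savings_multiple(api_equivalent_total: float, subscription_total: float) -> float:
    """How many times larger the API-equivalent bill is than the fees paid."""
    return api_equivalent_total / subscription_total


# Max 20x at $200/mo against a day measured at about $200 of API-equivalent usage.
print(breakeven_days(200, 200))       # 1.0

# Eight months of tracked usage: at least $15,000 API-equivalent vs about $800 in fees.
print(savings_multiple(15_000, 800))  # 18.75
```

One heavy day covers the month, and the long-run multiple lands close to 20x.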

One gotcha that has cost people real money. If ANTHROPIC_API_KEY is set in your shell, Claude Code bills the key and ignores your subscription entirely. Unset it.
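A quick guard for this, as a sketch: check the environment before starting a session. The function name is hypothetical. The only thing taken from above is the variable name ANTHROPIC_API_KEY.

```python
import os


def billing_warning(env=None):
    """Return a warning string if ANTHROPIC_API_KEY is set, else None."""
    env = os.environ if env is None else env
    if env.get("ANTHROPIC_API_KEY"):
        return (
            "ANTHROPIC_API_KEY is set: usage may be billed to the key "
            "instead of the subscription. Unset it before starting."
        )
    return None


warning = billing_warning()
if warning:
    print(warning)
```

Running it from the same shell you launch Claude Code in is the point, since that is the environment the tool inherits.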

Google, Gemini, and Antigravity

I kept hoping there was some tool I didn't know about that could point at my paid Gemini account for coding. There isn't. Gemini Advanced is a consumer chat subscription. It grants zero API access. Google's own docs are explicit that consumer plans do not apply to API usage, and OAuth attempts from third-party tools come back 403. The only sanctioned place that account does coding work is inside Antigravity itself.

Which would be fine, except the Gemini CLI deprecates June 18, everything folds into Antigravity, and capacity there has been gutted even for paid users. One AI Pro user reported their weekly token allowance dropping from 300M to under 9M, about a 97% reduction, with no announcement. Unofficial OAuth plugins exist that tap a consumer subscription from tools like Cline, but Google's FAQ calls that a terms violation and grounds for suspension.

At I/O 2026, Google cut the Ultra plan price from $249.99 to $99.99/mo, which is a significant drop. Antigravity at Ultra now includes Gemini 3.5 Flash, Gemini 3.1 Pro, and even Claude Sonnet and Opus. If the capacity issues stabilize, it becomes interesting as a second agent. For now, the reported limits make it unreliable for anything I'd want to build a workflow around.

The Gemini Advanced account is canceled. If I want Gemini 3 in my router, a metered AI Studio key costs exactly what I use. Gemini 3.1 Pro runs $2 in and $12 out per million.

ChatGPT and Codex

OpenAI switched Codex to token-based billing on April 2, 2026. Before that switch, Plus gave you a message allowance. Now Codex bills directly against token usage, and OpenAI's own estimate is $100–200 per developer per month at typical usage. Under that billing model, a $20 Plus subscription does not buy meaningful Codex capacity. The pool is too small. Real Codex capacity requires Pro at $200/mo or just paying API rates directly.

Since I have Claude for the heavy lifting, Codex is useful as a second-opinion agent, not a primary one. Plus comes back, month-to-month, only in the specific months I want it. Otherwise it stays canceled.

Grok and Grok Build

Grok Build launched in early beta on May 14. Up to eight concurrent sub-agents, plan mode, terminal-native. Interesting on paper. Access is currently gated behind a $299/mo SuperGrok Heavy subscription (with a $99 promotional tier for the first six months). It has not shipped publicly despite a "next week" announcement in mid-April. Not ready as a primary tool.

The API underneath is the part I'd actually route through. Grok Code Fast runs at $0.20 per million input tokens, the cheapest credible coding model from any major lab right now. It's available on OpenRouter at the same price. I use it as the default cheap tier in a router that escalates to Opus only when a task actually needs it.

T3 Code

People keep confusing this one. Theo (t3.gg) clarified on X in May 2026: "we CAN NOT MAKE MONEY ON T3 CODE RN. You HAVE to bring inference from somewhere else. Codex, [OpenRouter]." T3 Code is BYOK. You connect your own API keys to a coding agent harness. There is no T3 Code subscription. T3 Chat ($8/mo) is a separate multi-model chat interface, good for comparing models quickly but not a coding agent. Connect T3 Code through OpenRouter and it costs nothing extra.

The turn-count tax

An agent doesn't send your question once. Every turn resubmits the entire accumulated context: the system prompt, the relevant code, the conversation so far, every tool output and stack trace. Input tokens end up being 85 to 90% of an agentic session's cost, per Vantage's analysis. The bill for a task is price times a growing context, times the number of turns. That reframed the whole audit for me.
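That sentence can be written as a formula. The symbols here are my own shorthand and the linear growth is a simplifying assumption: p is the price per input token, C_0 the starting context, g the tokens added per turn, and T the number of turns.

```latex
\[
\text{input cost}
  \;=\; p \sum_{t=1}^{T} \bigl( C_0 + (t-1)\,g \bigr)
  \;=\; p \left( T\,C_0 + g\,\frac{T(T-1)}{2} \right)
\]
```

The first term grows linearly with turns and the second grows with the square of the turn count, which is why T can matter more than p.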

Run the numbers on a 50,000-token working context that grows 5,000 tokens per turn.

	Model A ($3/M input)	Model B ($10/M input)
Turns to finish the task	10	2
Input tokens resubmitted	725,000	105,000
Input cost for the task	$2.18	$1.05

The model that costs 3.3x more per token finishes the task for half the money, because it re-reads the codebase twice instead of ten times. The turn counts are illustrative. The direction is not. JetBrains and others have measured Opus-class models using 50 to 65% fewer tokens than Sonnet-class models on the same work, and that gap is exactly what I see when a cheaper model spends six turns chasing its own failing tests.
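The table is easy to reproduce. A minimal sketch under the same assumptions (50,000-token starting context, 5,000 tokens of growth per turn, input tokens only, no caching). The function names are mine:

```python
def resubmitted_input_tokens(base_context: int, growth_per_turn: int, turns: int) -> int:
    """Total input tokens when every turn resends the whole accumulated context."""
    return sum(base_context + growth_per_turn * turn for turn in range(turns))


def input_cost(tokens: int, price_per_million: float) -> float:
    """Dollar cost of `tokens` input tokens at a per-million-token price."""
    return tokens / 1_000_000 * price_per_million


for name, price, turns in [("Model A", 3.0, 10), ("Model B", 10.0, 2)]:
    tokens = resubmitted_input_tokens(50_000, 5_000, turns)
    print(f"{name}: {tokens:,} input tokens, ${input_cost(tokens, price):.3f}")

# Model A: 725,000 input tokens, $2.175
# Model B: 105,000 input tokens, $1.050
```

Changing the turn counts is the interesting experiment: under these assumptions, Model A has to finish in about five turns before its input bill drops below Model B's.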

Prompt caching softens this without flipping it. Anthropic bills cache reads at 10% of input price, so the repeated 50k prefix gets cheap. The cheaper model still pays the per-turn growth ten times and still fails more often, which adds turns. All three research agents ran this same case study with slightly different assumptions and reached the same conclusion, which I break down in the methodology post.
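Here is a rough sketch of the cached version of the same example. It follows the framing above: only the fixed 50k prefix is read from cache after the first turn, at 10% of input price, and the per-turn growth is billed at full price. Cache-write surcharges and failed attempts are not modeled, and caching more of the history would change the figures, so treat the output as illustrative.

```python
CACHE_READ_MULTIPLIER = 0.10  # cache reads billed at 10% of input price


def cached_input_cost(
    base_context: int,
    growth_per_turn: int,
    turns: int,
    price_per_million: float,
    cache_read_multiplier: float = CACHE_READ_MULTIPLIER,
) -> float:
    """Input cost when only the fixed prefix is served from cache after turn one."""
    billable_tokens = 0.0
    for turn in range(turns):
        prefix_rate = 1.0 if turn == 0 else cache_read_multiplier
        billable_tokens += base_context * prefix_rate  # the repeated prefix
        billable_tokens += growth_per_turn * turn      # accumulated growth, full price
    return billable_tokens / 1_000_000 * price_per_million


print(f"Model A: ${cached_input_cost(50_000, 5_000, 10, 3.0):.2f}")   # Model A: $0.96
print(f"Model B: ${cached_input_cost(50_000, 5_000, 2, 10.0):.2f}")   # Model B: $0.60
```

Both bills shrink, and under this assumption the gap narrows from roughly 2x to roughly 1.6x without reversing.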

My rule: default to cheap models for routine work, escalate to a flagship the moment a task looks hard, and judge models on cost per finished task. The pricing page measures the wrong thing.
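Cost per finished task is simple to compute if you log each run. A minimal sketch, assuming a hypothetical log with one record per attempt. The record fields and the sample numbers are invented for illustration, not measured data.

```python
from dataclasses import dataclass


@dataclass
class TaskRun:
    model: str
    cost_usd: float
    finished: bool  # did this attempt actually complete the task?


def cost_per_finished_task(runs: list[TaskRun]) -> dict[str, float]:
    """Total spend divided by finished tasks, per model. Failed runs still count as spend."""
    spend: dict[str, float] = {}
    finished: dict[str, int] = {}
    for run in runs:
        spend[run.model] = spend.get(run.model, 0.0) + run.cost_usd
        finished[run.model] = finished.get(run.model, 0) + int(run.finished)
    # Models with zero finished tasks are left out to avoid dividing by zero.
    return {model: spend[model] / finished[model] for model in spend if finished[model] > 0}


# Hypothetical log: the cheap model needed two attempts to land one task.
log = [
    TaskRun("cheap", 0.50, False),
    TaskRun("cheap", 0.50, True),
    TaskRun("flagship", 0.80, True),
]
print(cost_per_finished_task(log))  # {'cheap': 1.0, 'flagship': 0.8}
```

The failed attempt is what moves the cheap model's number, which is the whole argument in miniature.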

The buffet, quickly

OpenRouter gives you one key, 300+ models, and no markup on inference. The fee is about 5.5% on credit top-ups. It plugs into Claude Code by pointing ANTHROPIC_BASE_URL at it, and it powers Cline, Roo Code, and T3 Code. Pass a session_id so sticky routing keeps your prompt cache warm across turns. One trap: some IDE integrations send caching directives OpenRouter doesn't recognize, and one documented VS Code case showed full-price input billing on every agent turn. Roughly a 9x inflation, with no error surfaced anywhere.
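Because that failure is silent, the only defense is checking usage yourself. A sketch of such a check, assuming you can get per-turn input and cached token counts from whatever usage data your tool exposes. The TurnUsage fields and the 50% threshold are hypothetical and do not correspond to any specific API.

```python
from dataclasses import dataclass


@dataclass
class TurnUsage:
    input_tokens: int   # all input tokens sent on this turn
    cached_tokens: int  # how many of them were reported as cache reads


def cache_looks_broken(turns: list[TurnUsage], min_hit_rate: float = 0.5) -> bool:
    """True if turns after the first show a cache hit rate below min_hit_rate."""
    later_turns = turns[1:]  # the first turn has nothing to read from cache
    total_input = sum(turn.input_tokens for turn in later_turns)
    if total_input == 0:
        return False
    hit_rate = sum(turn.cached_tokens for turn in later_turns) / total_input
    return hit_rate < min_hit_rate


healthy = [TurnUsage(50_000, 0), TurnUsage(55_000, 50_000), TurnUsage(60_000, 55_000)]
silent_miss = [TurnUsage(50_000, 0), TurnUsage(55_000, 0), TurnUsage(60_000, 0)]

print(cache_looks_broken(healthy))      # False
print(cache_looks_broken(silent_miss))  # True
```

A session that resends a large prefix every turn and still reports no cache reads is the pattern worth flagging.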

GitHub Copilot moved to token-based credits on June 1 and now carries the full frontier model lineup, Fable 5 included, behind a meter. Factory Droid, Amp, and opencode are all solid BYOK harnesses, and opencode can authenticate with an existing Copilot or ChatGPT login you already pay for.

What I'm actually doing

My situation going into this was messier than I'd admitted. I have two Claude accounts:

Account 1 (Cowork, chat, Dispatch, light coding): Pro annual, expires this month
Account 2 (coding only): Pro annual, 10+ months remaining

The instinct was to treat the second account as a backup. It functions that way. When Account 1 hits its 5-hour window cap, I switch. But that is a clunky way to buy capacity. Two Pro accounts at $17/mo each is $34/mo for throughput that resets separately and requires context-switching between sessions.

Piece	Monthly cost
Claude Max 5x (Account 1 upgrade)	$100
Claude Pro (Account 2, let it run out)	~$17, pre-paid
Cursor Pro, billed annually	$16
OpenRouter for overflow and xatacomb	~$30–40
Total (excluding the pre-paid Account 2)	~$146–156/mo

Account 1 becomes Max 5x at $100/mo. With the May 2026 session limit increase, Max 5x now covers a full heavy day of Claude Code. Account 2 runs out naturally around April 2027. Until then it's a real backup. If I hit Max 5x's window cap mid-session, I have somewhere to switch. After it expires, I'll have about ten months of actual consumption data to decide if Max 5x is enough or if I need 20x.

Cursor stays at Pro annual ($16/mo), not Ultra. The $1,920 upfront on Ultra is the problem. I don't know what my Composer consumption looks like post-grandfathering. I've never had to care before. Betting $1,920 on that is guessing. Pro keeps Composer and Tab running. If I consistently hit Pro limits after a few months, I'll know Ultra is worth it and upgrade then.

Gemini Advanced is canceled. It does nothing for coding.

ChatGPT stays canceled. Since April 2026, Codex bills against token usage, and a $20 Plus subscription doesn't buy enough capacity to matter. It comes back the month I actually want Codex for a second opinion.

The router I'm building at xatacomb.com routes each task to whichever model fits. Grok Code Fast ($0.20/M) for cheap iteration loops, Composer for in-editor edits, Opus or Fable only when the task actually needs it. Default cheap. Escalate deliberately. The subscription stack above sets the ceiling. The router determines how much of it I actually spend.
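To make "default cheap, escalate deliberately" concrete, here is a toy sketch of the decision. It is not the actual router. The tier labels are placeholders, not API model IDs, and the inputs (an in-editor flag, a failed-attempt count, a "looks hard" flag) are assumptions about what a router might know.

```python
from enum import Enum


class Tier(Enum):
    CHEAP = "grok-code-fast"  # cheap iteration loops
    EDITOR = "composer"       # in-editor edits
    FLAGSHIP = "opus"         # only when the task needs it


def pick_tier(
    in_editor: bool,
    failed_attempts: int,
    looks_hard: bool,
    max_cheap_attempts: int = 2,
) -> Tier:
    """Default to a cheap tier and escalate once a task looks hard or keeps failing."""
    if looks_hard or failed_attempts >= max_cheap_attempts:
        return Tier.FLAGSHIP
    if in_editor:
        return Tier.EDITOR
    return Tier.CHEAP


print(pick_tier(in_editor=False, failed_attempts=0, looks_hard=False))  # Tier.CHEAP
print(pick_tier(in_editor=True, failed_attempts=0, looks_hard=False))   # Tier.EDITOR
print(pick_tier(in_editor=False, failed_attempts=2, looks_hard=False))  # Tier.FLAGSHIP
```

The cap on cheap attempts is the design choice that matters: it stops a cheap model from burning turns on a task it is not going to finish.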

Prices in this post will rot. Fable 5 shipped two days before I wrote this, Copilot's billing flipped this month, Google's capacity cuts have no official number attached to them, and Anthropic doesn't publish the token budgets behind any of this. Verify against the actual pricing pages before signing anything annual.

Unit price is the least interesting number on the pricing page. Count turns.