State of the Art
The state of frontier AI,
with the receipts attached.
What every frontier model costs, how much it can read in one go, and how the curve has bent over the last eighteen months. Numbers come from a `model_snapshots` table refreshed against vendor docs, never hardcoded in the views.
Cost per million input tokens, over time
Capability went up. Price went down.
Each line is a vendor's frontier and budget tiers. Y axis is log scale (cost spans two orders of magnitude). Read the slope, not the absolute numbers.
Context window, current frontier
The page got bigger. A lot bigger.
What the model can hold in mind on a single request. Eighteen months ago, 32K was generous. Today, 1M is normal and 2M is in production.
Current frontier and tiers
Side-by-side, with the bill.
Input price, output price, cache-read price, context window, and a few public benchmarks where vendors publish them. Use this to pick the right model for the job, not the loudest one.
| Model | Vendor | Context | In / Mtok | Out / Mtok | Cache read | MMLU |
|---|---|---|---|---|---|---|
|
Claude Opus 4.7
claude-opus-4-7
|
anthropic | 1.0M | $15 | $75 | $1.5 | 92.1 |
|
Claude Sonnet 4.6
claude-sonnet-4-6
|
anthropic | 200K | $3 | $15 | $0.3 | 89.3 |
|
Claude Haiku 4.5
claude-haiku-4-5
|
anthropic | 200K | $0.8 | $4 | $0.08 | 82.1 |
|
Gemini 3 Pro
gemini-3-pro
|
2.0M | $1.25 | $5 | — | 90.8 | |
|
Gemini 3 Flash
gemini-3-flash
|
1.0M | $0.075 | $0.3 | — | 81.7 | |
|
GPT-5
gpt-5
|
openai | 256K | $2.5 | $10 | $0.25 | 91.4 |
|
GPT-5 Turbo
gpt-5-turbo
|
openai | 128K | $0.5 | $1.5 | $0.05 | 86.2 |
|
GPT-5 Mini
gpt-5-mini
|
openai | 128K | $0.15 | $0.6 | $0.015 | 78.4 |
The number that ate the bill
Prompt caching turns the system prompt into a 10% line item.
On the second turn of any conversation, the long static preamble (system prompt, persona, retrieved corpus) reads from the prompt cache at roughly 10% of fresh-input price. On a long session, that quietly cuts your input bill by 70 to 90 percent.
Read the full glossary entryWant to see what these numbers mean in practice?