Cached Token Pricing
(Pricing data as of December 2025)
Cached Token Pricing applies when providers receive the same prompt or input context and do not need to process those input tokens again from scratch.
Cached Input Pricing reflects reuse of repeated prompt or prefix tokens, where the initial input processing step (the prefill pass) has already been computed and does not need to be run again (for example, system prompts, instructions, or long shared context reused across requests).
Cached Token Pricing by Provider
Key Takeaways
- Cached input pricing converges around approximately 10% of standard input pricing across major providers.
- Cached output pricing is uncommon, with Anthropic the only major provider in this dataset offering it, reflecting reuse of previously generated completions for repeated requests.
- Caching primarily reduces the cost of repeated prefix context, not generation, because reused input does not need to be recomputed, lowering the compute required to serve the request.
Cached Token Pricing by Model
| Model | ||||||||
|---|---|---|---|---|---|---|---|---|
| Frontier High | OpenAI | gpt-5.2-pro | $21.00 | — | — | $168.00 | — | — |
| Frontier High | OpenAI | gpt-5-pro | $15.00 | — | — | $120.00 | — | — |
| Frontier High | OpenAI | gpt-5.2 | $1.75 | $0.17 | 10% | $14.00 | — | — |
| Frontier High | OpenAI | gpt-5.1 | $1.25 | $0.13 | 10% | $10.00 | — | — |
| Mid Value | OpenAI | gpt-5-mini | $0.25 | $0.03 | 10% | $2.00 | — | — |
| OSS Value | OpenAI | gpt-5-nano | $0.05 | $0.01 | 10% | $0.40 | — | — |
| Frontier High | Anthropic | Opus 4.5 | $5.00 | $0.50 | 10% | $25.00 | $6.25 | 25% |
| Frontier Value | Anthropic | Sonnet 4.5 | $3.00 | $0.30 | 10% | $15.00 | $3.75 | 25% |
| Frontier Value | Anthropic | Haiku 4.5 | $1.00 | $0.10 | 10% | $5.00 | $1.25 | 25% |
| Frontier High | Gemini 3 Pro | $2.00 | $0.20 | 10% | $12.00 | — | — | |
| Frontier Value | Gemini 2.5 Flash | $0.30 | $0.03 | 10% | $2.50 | — | — |
1) Pricing data as of December 2025. Values were taken from official provider pricing pages, including input, output, cached input tokens, and cached output tokens.
2) This dataset reflects public list prices only; enterprise and volume discounts are not included.
3) Where providers publish context-length tiered pricing, values shown reflect the lowest context tier (e.g., Anthropic Sonnet 4.5 prompts ≤ 200K tokens). Pricing for extended context lengths may differ.