Cached Token Pricing

(Pricing data as of December 2025)

Cached token pricing applies when a provider receives a prompt or input context it has already processed and does not need to process those input tokens again from scratch.

Cached input pricing reflects reuse of repeated prompt or prefix tokens, such as system prompts, instructions, or long shared context reused across requests. For those tokens the initial input processing step (the prefill pass) has already been computed and does not need to be run again.
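To make the effect concrete, the following minimal sketch estimates the cost of a single request in which part of the prompt is served from cache. It uses the Anthropic Sonnet 4.5 list prices from the model table in this dataset and assumes they are quoted in USD per 1 million tokens. The function name `request_cost` and the token counts are hypothetical, chosen for illustration only, and the sketch ignores any additional fees a provider may charge for writing to or storing a cache.

```python
# Sonnet 4.5 list prices as shown in the model table of this dataset.
# Assumption: prices are USD per 1 million tokens.
INPUT_PRICE = 3.00
CACHED_INPUT_PRICE = 0.30
OUTPUT_PRICE = 15.00
TOKENS_PER_UNIT = 1_000_000


def request_cost(prompt_tokens: int, cached_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request (hypothetical helper).

    cached_tokens is the part of the prompt served from cache; the rest
    of the prompt is billed at the standard input price.
    """
    if not 0 <= cached_tokens <= prompt_tokens:
        raise ValueError("cached_tokens must be between 0 and prompt_tokens")
    uncached_tokens = prompt_tokens - cached_tokens
    total = (
        uncached_tokens * INPUT_PRICE
        + cached_tokens * CACHED_INPUT_PRICE
        + output_tokens * OUTPUT_PRICE
    )
    return total / TOKENS_PER_UNIT


# Hypothetical request: a 50,000-token prompt, 40,000 tokens of which are
# a cached prefix, and 1,000 generated tokens.
with_cache = request_cost(50_000, 40_000, 1_000)
without_cache = request_cost(50_000, 0, 1_000)
print(f"with cache:    ${with_cache:.4f}")
print(f"without cache: ${without_cache:.4f}")
```

Under these assumptions the request costs roughly $0.057 with the cached prefix and $0.165 without it. The output portion ($0.015) is the same in both cases.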

Cached Token Pricing by Provider

[Chart: cached input as % of standard and cached output as % of standard, by provider]

Key Takeaways

Cached input pricing converges on roughly 10% of standard input pricing across major providers.
Cached output pricing is uncommon, with Anthropic the only major provider in this dataset offering it, reflecting reuse of previously generated completions for repeated requests.
Caching primarily reduces the cost of repeated prefix context, not generation: reused input does not need to be recomputed, which lowers the compute required to serve the request.
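One way to see how much a roughly 10% cached rate matters in practice is to blend the standard and cached input prices by the share of input tokens that hit the cache. The sketch below does that for the Opus 4.5 list prices from the model table. The function name `effective_input_price` and the hit fractions are hypothetical, and the calculation covers the input price only.

```python
def effective_input_price(standard: float, cached: float, hit_fraction: float) -> float:
    """Blended input price when hit_fraction of input tokens are cache hits.

    Hypothetical helper: a weighted average of the standard and cached prices.
    """
    if not 0.0 <= hit_fraction <= 1.0:
        raise ValueError("hit_fraction must be between 0 and 1")
    return (1.0 - hit_fraction) * standard + hit_fraction * cached


# Opus 4.5 list prices from the model table: $5.00 standard, $0.50 cached.
for hit in (0.0, 0.5, 0.9):
    price = effective_input_price(5.00, 0.50, hit)
    print(f"hit fraction {hit:.0%}: effective input price ${price:.2f}")
```

When the cached price is 10% of the standard price, the blended price falls linearly from the standard price at a 0% hit fraction to one tenth of it at 100%. The saving therefore depends on how much of the prompt is actually repeated.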

Cached Token Pricing by Model

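Tier	Provider	Model	Input	Cached input	Cached input as % of standard	Output	Cached output	Cached output as % of standard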
Frontier High	OpenAI	gpt-5.2-pro	$21.00	—	—	$168.00	—	—
Frontier High	OpenAI	gpt-5-pro	$15.00	—	—	$120.00	—	—
Frontier High	OpenAI	gpt-5.2	$1.75	$0.17	10%	$14.00	—	—
Frontier High	OpenAI	gpt-5.1	$1.25	$0.13	10%	$10.00	—	—
Mid Value	OpenAI	gpt-5-mini	$0.25	$0.03	10%	$2.00	—	—
OSS Value	OpenAI	gpt-5-nano	$0.05	$0.01	10%	$0.40	—	—
Frontier High	Anthropic	Opus 4.5	$5.00	$0.50	10%	$25.00	$6.25	25%
Frontier Value	Anthropic	Sonnet 4.5	$3.00	$0.30	10%	$15.00	$3.75	25%
Frontier Value	Anthropic	Haiku 4.5	$1.00	$0.10	10%	$5.00	$1.25	25%
Frontier High	Google	Gemini 3 Pro	$2.00	$0.20	10%	$12.00	—	—
Frontier Value	Google	Gemini 2.5 Flash	$0.30	$0.03	10%	$2.50	—	—

1) Pricing data as of December 2025. Values for input, output, cached input, and cached output tokens were taken from official provider pricing pages.

2) This dataset reflects public list prices only; enterprise and volume discounts are not included.

3) Where providers publish context-length tiered pricing, values shown reflect the lowest context tier (e.g., Anthropic Sonnet 4.5 prompts ≤ 200K tokens). Pricing for extended context lengths may differ.
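The percentage columns in the model table can be recomputed from the prices beside them. The sketch below does this for the three Anthropic rows, dividing each cached price by the corresponding standard price and rounding to a whole percent. The names `ROWS` and `percent_of_standard` are hypothetical, and the rounding rule is an assumption about how the table was produced.

```python
# (model, input, cached input, output, cached output), copied from the
# Anthropic rows of the model table.
ROWS = [
    ("Opus 4.5", 5.00, 0.50, 25.00, 6.25),
    ("Sonnet 4.5", 3.00, 0.30, 15.00, 3.75),
    ("Haiku 4.5", 1.00, 0.10, 5.00, 1.25),
]


def percent_of_standard(cached: float, standard: float) -> int:
    """Cached price as a whole-number percentage of the standard price."""
    return round(100 * cached / standard)


for model, inp, cached_inp, out, cached_out in ROWS:
    print(
        f"{model}: cached input {percent_of_standard(cached_inp, inp)}%, "
        f"cached output {percent_of_standard(cached_out, out)}%"
    )
```

Rows whose cached price is displayed rounded to the cent will not reproduce the listed figure exactly. For gpt-5-nano, the displayed $0.01 against $0.05 works out to 20%, so the listed 10% presumably derives from an unrounded price.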