Cached input

Input tokens already stored in the provider's prompt cache.

Cached input tokens are input tokens the provider recognizes as identical to a recent prior request and serves from its prompt cache at a steep discount: roughly 10% of the regular input price on Anthropic, 25-50% on OpenAI, and 25% on Google. Modern LLM pricing is effectively three-rate: regular input, cached input, and output. Any serious cost analysis needs the cached rate; pricing all input at the regular rate can overstate the real bill by 5-10x on production workloads with high cache hit rates.
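
The three-rate arithmetic can be sketched as a small cost function. The rates below are illustrative placeholders loosely modeled on Anthropic-style pricing (cached input at 10% of the regular input rate), not live price-sheet numbers:

```python
def request_cost(uncached_in, cached_in, out_tokens,
                 in_rate, cached_rate, out_rate):
    """Three-rate cost: regular input + cached input + output.

    Token counts are per request; rates are USD per million tokens
    (illustrative values, not current provider pricing).
    """
    return (uncached_in * in_rate
            + cached_in * cached_rate
            + out_tokens * out_rate) / 1_000_000

# Hypothetical rates: cached input billed at 10% of regular input.
IN, CACHED, OUT = 3.00, 0.30, 15.00

# 90% cache hit rate on a 100k-token prompt, 1k output tokens.
actual = request_cost(10_000, 90_000, 1_000, IN, CACHED, OUT)
naive  = request_cost(100_000, 0, 1_000, IN, CACHED, OUT)  # ignores the cache

print(f"actual: ${actual:.3f}")   # → $0.072
print(f"naive:  ${naive:.3f}")    # → $0.315, ~4.4x overstatement
```

At this hit rate the two-rate (naive) estimate is about 4.4x the real bill; with a 10x input/cached spread and near-total cache coverage, the gap approaches the 10x bound noted above.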