Prompt cache
Reusing a prompt prefix across calls for a big input-price discount.
Prompt caching stores a prompt prefix on the provider's servers so that subsequent requests sharing that exact prefix skip reprocessing it. Pricing differs by provider: Anthropic bills cache reads at 10% of the input price but charges 125% for the initial cache write; OpenAI applies its discount automatically, billing cached tokens at roughly 25-50% of the input price depending on model; Google's context caching bills cached tokens at 25% of the input price plus a per-token-hour storage fee. TTLs differ too: Anthropic's cache expires after 5 minutes of disuse (each hit refreshes the window), while Google's default is 1 hour. Caching pays off only when the same prefix is hit again before it expires, which makes it most useful for stable system prompts, tool definitions, and repeatedly-read documents. Under Anthropic's multipliers, even one reuse wins: 1.25x (write) + 0.10x (read) ≈ 1.35x of the prefix cost, versus 2x for two uncached calls.
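The break-even arithmetic can be sketched with a small cost model. The multipliers (1.25x write, 0.10x read) come from Anthropic's published pricing described above; the `$3 per million input tokens` base price and the function itself are illustrative assumptions, not a real API.

```python
def cost(calls: int, prefix_tokens: int, cached: bool,
         input_price: float = 3.0 / 1_000_000) -> float:
    """Input-token cost of `calls` requests sharing one prompt prefix.

    Assumes Anthropic-style multipliers: cache write = 1.25x the input
    price, cache read = 0.10x. Base price is an illustrative $3/MTok.
    """
    if not cached:
        return calls * prefix_tokens * input_price
    # First call writes the cache at 1.25x; every later call within the
    # TTL reads it back at 0.10x.
    write = prefix_tokens * input_price * 1.25
    reads = (calls - 1) * prefix_tokens * input_price * 0.10
    return write + reads

# A 10k-token system prompt reused across 20 calls within the TTL:
uncached = cost(20, 10_000, cached=False)
with_cache = cost(20, 10_000, cached=True)
print(f"uncached ${uncached:.4f}  cached ${with_cache:.4f}")
```

Because the write premium is only 25% while reads are 90% cheaper, the cache is already ahead on the first reuse; the savings then scale with every additional hit inside the TTL window.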