Claude Opus 4.7 vs GPT-5 vs Gemini 2.5 Pro: real cost math

2026-04-22 · Choppy Toast

Sticker prices are misleading because each provider's cached-input discount hits differently.

Per million tokens (April 2026):

- Claude Opus 4.7: $15 in / $75 out, cached $1.50
- GPT-5: $10 in / $30 out, cached $2.50
- Gemini 2.5 Pro: $1.25 in / $10 out, cached $0.31

Coding agent (12K input, 1.5K output, 500 req/mo, 50% cache hit): Opus $105.75, GPT-5 $60, Gemini $12.18. Gemini's roughly 9x lead here is almost entirely about cached input: the same repo gets re-read thousands of times per month.

Chatbot (600 input, 200 output, 10K req/mo, 60% cache hit): Opus $191.40, GPT-5 $93, Gemini $24.12. Same story.

Long-doc summarization (20K input, 500 output, 200 req/mo, no cache): Opus $67.50, GPT-5 $43, Gemini $6. Gemini's 2M context and cheap per-token input punish the others.
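All three scenarios reduce to one formula: monthly cost = fresh input × input price + cached input × cached price + output × output price, everything in millions of tokens. A minimal sketch you can rerun with your own traffic numbers (prices from the table above; the `monthly_cost` helper and its cache model, a flat fraction of input billed at the cached rate, are this post's simplification, not any provider's billing API):

```python
# Prices from the April 2026 table, $/1M tokens: (input, output, cached input)
PRICES = {
    "opus-4.7":       (15.00, 75.00, 1.50),
    "gpt-5":          (10.00, 30.00, 2.50),
    "gemini-2.5-pro": ( 1.25, 10.00, 0.31),
}

def monthly_cost(model, in_tok, out_tok, reqs, cache_hit):
    """Monthly bill in dollars; cache_hit is the fraction of input
    tokens billed at the cached rate."""
    in_price, out_price, cached_price = PRICES[model]
    total_in = in_tok * reqs / 1e6    # millions of input tokens/mo
    total_out = out_tok * reqs / 1e6  # millions of output tokens/mo
    fresh = total_in * (1 - cache_hit)
    cached = total_in * cache_hit
    return fresh * in_price + cached * cached_price + total_out * out_price

# Coding agent scenario: 12K in, 1.5K out, 500 req/mo, 50% cache hit
for model in PRICES:
    print(model, round(monthly_cost(model, 12_000, 1_500, 500, 0.5), 2))
```

Plugging in the long-doc scenario (20K in, 500 out, 200 req/mo, zero cache) reproduces the $67.50 / $43 / $6 figures above.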

But output quality isn't equal. Opus 4.7 still leads on SWE-bench and agentic coding. GPT-5 leads on GPQA-hard. Gemini 2.5 Pro leads on needle-in-haystack at 1M+ tokens.

The practical answer: route.

Use Opus or GPT-5 for the 5-10% of requests that need real intelligence. Route the other 90-95% to Gemini Flash or Haiku 4.5. Most production apps that do this cut their bill by 70-85% without a quality drop users can measure.
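A router can be embarrassingly simple. The sketch below is illustrative only: the keyword heuristic and model names stand in for whatever classifier and providers you actually use. The savings arithmetic is the point: if 10% of traffic stays on a frontier model and 90% moves to one ~20x cheaper per request, the blended bill is 0.10 + 0.90/20 = 14.5% of the original, an ~85% cut.

```python
# Hypothetical cost-based router: frontier model for hard requests,
# cheap model for everything else. is_hard() is a stand-in heuristic;
# production routers typically use a small classifier model instead.

HARD_HINTS = ("refactor", "prove", "debug", "architecture")

def is_hard(prompt: str) -> bool:
    # Placeholder difficulty check, not a real signal.
    return any(hint in prompt.lower() for hint in HARD_HINTS)

def pick_model(prompt: str) -> str:
    return "opus-4.7" if is_hard(prompt) else "gemini-flash"

def blended_cost_ratio(hard_share: float, cheap_factor: float) -> float:
    """Blended bill as a fraction of the all-frontier bill, assuming
    the cheap model costs 1/cheap_factor per request."""
    return hard_share + (1 - hard_share) / cheap_factor
```

With `blended_cost_ratio(0.10, 20.0)` you get 0.145, i.e. the ~85% reduction claimed above; the real ratio depends on your traffic mix and per-request token counts.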