Output token

Tokens the model generates. The expensive half of the bill.

Output tokens are what the model produces: its response. They typically cost more per token than input tokens because generation is computationally heavier: each output token is produced sequentially, whereas the input can be processed in parallel. Output prices range from $0.40/M (Gemini Flash-Lite) to $75/M (Claude Opus 4.7). For reasoning models (o3, Claude with extended thinking), the API may also bill internal reasoning tokens you never see as output tokens, which can multiply effective output cost by 5-20x. Always cap max_tokens so a runaway generation cannot run up the bill.
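The arithmetic behind those figures can be sketched in a few lines of Python. This is a minimal illustration, not any provider's billing logic: the function names `output_cost` and `worst_case_cost` are hypothetical, the token counts are made up, and it assumes that hidden reasoning tokens are billed at the same per-token rate as visible output and that max_tokens caps all billed output tokens (check your provider's documentation for both).

```python
def output_cost(visible_tokens: int, reasoning_tokens: int,
                price_per_million: float) -> float:
    """Dollar cost of billed output tokens.

    Assumption: hidden reasoning tokens are billed at the same rate
    as the visible response tokens.
    """
    billed = visible_tokens + reasoning_tokens
    return billed * price_per_million / 1_000_000


def worst_case_cost(max_tokens: int, price_per_million: float) -> float:
    """Upper bound on output cost for one request capped at max_tokens.

    Assumption: the cap applies to every billed output token.
    """
    return max_tokens * price_per_million / 1_000_000


if __name__ == "__main__":
    price = 75.0  # dollars per million output tokens (top of the range above)

    plain = output_cost(visible_tokens=500, reasoning_tokens=0,
                        price_per_million=price)
    with_reasoning = output_cost(visible_tokens=500, reasoning_tokens=5_000,
                                 price_per_million=price)

    print(f"visible only:   ${plain:.4f}")
    print(f"with reasoning: ${with_reasoning:.4f}")
    print(f"multiplier:     {with_reasoning / plain:.1f}x")
    print(f"cap at 4096:    ${worst_case_cost(4096, price):.4f}")
```

With these illustrative numbers, 5,000 hidden reasoning tokens on top of a 500-token answer give an 11x multiplier, inside the 5-20x range mentioned above. The second function shows why a max_tokens cap matters: it turns an open-ended cost into a known ceiling per request.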