Reasoning token

Hidden tokens a model uses for its internal thinking. Billed as output, but not part of the visible answer.

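Reasoning tokens (OpenAI) or extended thinking tokens (Anthropic) are output tokens the model generates privately to work through a problem before emitting its visible answer. They are not part of that answer: OpenAI's API reports only a count (and, optionally, a summary), while Anthropic returns thinking in separate content blocks that may be summarized. Either way, they are billed as output tokens and count against your output token limit (max_tokens on Anthropic, max_completion_tokens or max_output_tokens on OpenAI). On o3 or Claude with extended thinking, a 'short' 200-word answer can hide 5K-50K reasoning tokens. Budget accordingly: reasoning models often produce 10-50x more billable output than their visible response length suggests.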
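To make the budgeting arithmetic concrete, here is a minimal Python sketch. It makes no API calls. The class and function names, the token counts, and the $10 per million output tokens price are all hypothetical placeholders. In practice the reasoning count would come from the usage data in the API response (for example, OpenAI's Chat Completions usage object exposes completion_tokens_details.reasoning_tokens).

```python
from dataclasses import dataclass


@dataclass
class OutputUsage:
    """Token counts for one response (hypothetical helper, not a vendor type)."""
    visible_tokens: int    # tokens in the answer the user actually sees
    reasoning_tokens: int  # hidden reasoning / thinking tokens

    @property
    def billable_output_tokens(self) -> int:
        # Reasoning tokens are billed as output, so they add to the visible ones.
        return self.visible_tokens + self.reasoning_tokens

    @property
    def multiplier(self) -> float:
        # How many times larger the billed output is than the visible answer.
        if self.visible_tokens == 0:
            return float("inf")
        return self.billable_output_tokens / self.visible_tokens


def output_cost_usd(usage: OutputUsage, price_per_million_usd: float) -> float:
    """Output cost under an assumed flat price per million output tokens."""
    return usage.billable_output_tokens * price_per_million_usd / 1_000_000


def fits_limit(usage: OutputUsage, max_output_tokens: int) -> bool:
    """True if reasoning plus visible output fits under the output token limit."""
    return usage.billable_output_tokens <= max_output_tokens


# Illustrative numbers only: roughly a 200-word answer plus 6,000 reasoning tokens.
usage = OutputUsage(visible_tokens=300, reasoning_tokens=6_000)
cost = output_cost_usd(usage, price_per_million_usd=10.0)  # assumed price
ok_small = fits_limit(usage, max_output_tokens=4_096)
ok_large = fits_limit(usage, max_output_tokens=16_384)
```

With these assumed numbers, the billed output is 6,300 tokens, 21 times the visible answer, and a 4,096-token limit would be exhausted before the answer finished. A limit sized only for the visible answer is the usual way reasoning models get truncated.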