Rate limit
Max tokens or requests per minute you're allowed. Tier-based.
Most LLM API providers enforce tokens-per-minute (TPM) and requests-per-minute (RPM) limits that scale with your spend history. As a rough guide, Tier 1 (~$5-50 spent) gets ~50K TPM on most models, while Tier 5 (~$50K+ spent) gets 10M+ TPM; exact thresholds vary by provider and model. If your app spikes, you'll hit the limit before you hit the bill. Plan accordingly: retry logic, exponential backoff, and multi-provider fallback are table stakes for production LLM apps.
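A minimal Python sketch of the retry, exponential backoff, and multi-provider fallback pattern mentioned above. Everything named here is hypothetical: `RateLimitError` stands in for whatever exception your client library raises on an HTTP 429, `call_with_backoff` is an illustrative helper, and each provider is assumed to be a plain callable that takes a prompt and returns text. The delays and retry counts are placeholder values, not recommendations from any provider.

```python
import random
import time


class RateLimitError(Exception):
    """Hypothetical stand-in for a provider's rate-limit (HTTP 429) error."""


def call_with_backoff(providers, prompt, max_retries=4, base_delay=1.0,
                      max_delay=30.0, sleep=time.sleep):
    """Try each provider in order, retrying rate-limited calls with backoff.

    `providers` is a list of callables (assumed interface: prompt -> str).
    `sleep` is injectable so the wait can be stubbed out in tests.
    """
    for call in providers:
        for attempt in range(max_retries):
            try:
                return call(prompt)
            except RateLimitError:
                if attempt < max_retries - 1:
                    # Exponential backoff with full jitter: wait a random
                    # time in [0, min(max_delay, base_delay * 2**attempt)].
                    delay = min(max_delay, base_delay * 2 ** attempt)
                    sleep(random.uniform(0, delay))
        # Retries exhausted for this provider: fall back to the next one.
    raise RateLimitError("all providers are rate limited")
```

Usage would look like `call_with_backoff([primary, secondary], "Summarize this")`, where `primary` and `secondary` wrap your actual client calls and translate their rate-limit errors into `RateLimitError`. The jitter spreads retries out so that many clients hitting the limit at once do not all retry at the same instant.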