Calculate and compare API costs across major LLM providers including OpenAI, Anthropic, Google, Meta, Mistral, and DeepSeek. Input your token usage to see detailed cost breakdowns and find the most cost-effective model for your use case.
Choose from 17+ models across 7 providers. Each model shows input/output pricing per million tokens.
Input your expected tokens per request, or use a preset for common use cases like chatbots or code assistants.
View cost breakdowns by request, day, month, and year. Compare all models to find the best price/performance ratio.
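The breakdown above is simple arithmetic on per-million-token rates. A minimal sketch of that math, using hypothetical prices (real rates come from each provider's pricing page):

```python
def cost_breakdown(input_tokens, output_tokens, requests_per_day,
                   price_in_per_m, price_out_per_m):
    """Cost per request, day, month (30 days), and year, given
    per-million-token input/output prices in dollars."""
    per_request = (input_tokens * price_in_per_m +
                   output_tokens * price_out_per_m) / 1_000_000
    per_day = per_request * requests_per_day
    return {
        "request": per_request,
        "day": per_day,
        "month": per_day * 30,
        "year": per_day * 365,
    }

# Illustrative workload: 1,000 input + 500 output tokens per request,
# 10,000 requests/day, at $2.50/M input and $10.00/M output.
costs = cost_breakdown(1_000, 500, 10_000, 2.50, 10.00)
```

Comparing models is then just running the same workload through each model's price pair and sorting by the monthly figure.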
A rough rule of thumb is 1 token ≈ 4 characters or 0.75 words for English text. A typical ChatGPT-style conversation uses 500-2,000 input tokens (prompt + context) and 300-1,000 output tokens (response). Code tends to use more tokens due to formatting. Most providers offer tokenizer tools to count exact tokens.
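The 4-characters-per-token heuristic can be sketched in a few lines. This is only a rough estimator for English prose, not a tokenizer; use the provider's tokenizer tool for exact counts:

```python
def estimate_tokens(text: str) -> int:
    """Rough token estimate: ~4 characters per token for English text.
    Code and non-English text usually tokenize less efficiently,
    so treat this as a lower bound for those."""
    return max(1, round(len(text) / 4))

estimate_tokens("The quick brown fox jumps over the lazy dog.")
```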
Generating output tokens requires more compute than processing input tokens. When the model reads your prompt (input), it can process all of those tokens in parallel. When generating a response (output), it must produce tokens one at a time, each conditioned on everything generated so far. This sequential generation is more computationally intensive, which is why providers typically price output tokens several times higher than input tokens.
Some providers (like Anthropic and OpenAI) offer prompt caching, where repeated prompts with the same prefix are charged at reduced rates (typically 50-90% off). This calculator shows base pricing. If you have many requests with similar prompts, your actual costs may be lower due to caching.
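To see how much caching can matter, here is a simplified estimate comparing base input cost against cost with a cached shared prefix. The 90% discount is a hypothetical figure at the top of the typical range, and this ignores the cache-write premium some providers charge on the first request:

```python
def input_cost_with_caching(prefix_tokens, unique_tokens, requests,
                            price_in_per_m, cache_discount=0.9):
    """Compare base vs cached input cost for `requests` calls that
    share a common prompt prefix. cache_discount is the fraction
    off cached-prefix reads (hypothetical 90% here)."""
    base = (prefix_tokens + unique_tokens) * requests * price_in_per_m / 1e6
    cached = ((prefix_tokens * (1 - cache_discount) + unique_tokens)
              * requests * price_in_per_m / 1e6)
    return base, cached

# Illustrative: 2,000-token shared system prompt, 500 unique tokens
# per request, 1,000 requests at a hypothetical $3.00/M input rate.
base, cached = input_cost_with_caching(2_000, 500, 1_000, 3.00)
```

With a large shared prefix, most of the input bill is eligible for the discount, which is why caching-heavy workloads can come in well under the calculator's base figures.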
We update prices monthly and when major pricing changes are announced. LLM pricing has been trending downward as providers optimize inference. The current prices are from February 2025. Always verify with the provider's official pricing page before making budget decisions.
Not necessarily. Consider the quality-cost tradeoff for your use case. Cheaper models like GPT-4o Mini or Gemini 2.0 Flash are great for simple tasks, but complex reasoning or creative tasks may benefit from more capable (and expensive) models like Claude Opus 4 or GPT-4o. Test with your actual prompts to find the right balance.
OpenAI and Anthropic offer ~50% discounts for asynchronous batch processing where you submit requests and receive results within 24 hours. This is ideal for non-time-sensitive workloads like content generation, data processing, or evaluations. The calculator shows real-time API pricing; batch pricing would be roughly half.
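Applying the batch discount to the calculator's real-time figure is a one-liner; the ~50% rate is an approximation, so verify the current batch pricing before budgeting:

```python
def batch_cost(realtime_cost, batch_discount=0.5):
    """Estimate batch-API cost from a real-time cost figure,
    assuming an approximate 50% batch discount."""
    return realtime_cost * (1 - batch_discount)

# E.g. a workload costing $120/month at real-time rates.
estimated = batch_cost(120.0)
```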