Compare pricing, context windows, capabilities, and latency across major LLM providers. Find the right model for your use case.
Last updated: February 2025. Prices and specifications from official provider documentation.
| Model | Provider | Context | Input Price (per 1M tokens) | Output Price (per 1M tokens) | Latency | Capabilities | Best For |
|---|---|---|---|---|---|---|---|
| Claude 3.5 Haiku (2024-10) | Anthropic | 200K (max out: 8K) | $0.80 | $4.00 | ⚡ Very Fast (~200ms TTFT) | Vision, Reasoning | High-speed tasks, Cost efficiency, +1 more |
| Claude Opus 4 (2025-01) | Anthropic | 200K (max out: 32K) | $15.00 | $75.00 | ⏱️ Medium (~800ms TTFT) | Vision, Reasoning | Complex analysis, Research, +2 more |
| Claude Sonnet 4 (2025-01) | Anthropic | 200K (max out: 64K) | $3.00 | $15.00 | 🚀 Fast (~400ms TTFT) | Vision, Reasoning | Balanced tasks, Coding, +2 more |
| DeepSeek R1 (2025-01) | DeepSeek | 64K (max out: 8K) | $0.55 | $2.19 | ⏱️ Medium (~1500ms TTFT) | Vision, Reasoning | Complex reasoning, Math, +1 more |
| DeepSeek V3 (2024-12) | DeepSeek | 64K (max out: 8K) | $0.27 | $1.10 | 🚀 Fast (~300ms TTFT) | Vision, Reasoning | Coding, General tasks, +1 more |
| Gemini 1.5 Pro (2024-02) | Google | 2.0M (max out: 8K) | $1.25 | $5.00 | 🚀 Fast (~500ms TTFT) | Vision, Reasoning | Long documents, Video analysis, +1 more |
| Gemini 2.0 Flash (2024-12) | Google | 1.0M (max out: 8K) | $0.10 | $0.40 | ⚡ Very Fast (~150ms TTFT) | Vision, Reasoning | High-volume tasks, Multimodal, +1 more |
| Gemini 2.0 Flash Thinking (2025-01) | Google | 1.0M (max out: 8K) | $0.10 | $0.40 | ⏱️ Medium (~1000ms TTFT) | Vision, Reasoning | Reasoning tasks, Complex analysis, +1 more |
| GPT-4o (2024-05) | OpenAI | 128K (max out: 16K) | $2.50 | $10.00 | 🚀 Fast (~300ms TTFT) | Vision, Reasoning | General tasks, Vision analysis, +1 more |
| GPT-4o Mini (2024-07) | OpenAI | 128K (max out: 16K) | $0.15 | $0.60 | ⚡ Very Fast (~150ms TTFT) | Vision, Reasoning | Cost-effective tasks, High-volume applications, +1 more |
| Llama 3.2 90B Vision (2024-09) | Meta | 128K (max out: 8K) | $0.90 | $0.90 | 🚀 Fast (~500ms TTFT) | Vision, Reasoning | Vision tasks, Open-source multimodal, +1 more |
| Llama 3.3 70B (2024-12) | Meta | 128K (max out: 8K) | $0.50 | $0.75 | 🚀 Fast (~400ms TTFT) | Vision, Reasoning | Open-source deployment, Cost efficiency, +1 more |
| Mistral Large (2024-11) | Mistral | 128K (max out: 8K) | $2.00 | $6.00 | 🚀 Fast (~350ms TTFT) | Vision, Reasoning | Enterprise tasks, European compliance, +1 more |
| Mistral Small (2024-09) | Mistral | 32K (max out: 8K) | $0.20 | $0.60 | ⚡ Very Fast (~150ms TTFT) | Vision, Reasoning | Fast responses, Cost-sensitive tasks, +1 more |
| o1 (2024-12) | OpenAI | 200K (max out: 100K) | $15.00 | $60.00 | 🐢 Slow (~5000ms TTFT) | Vision, Reasoning | Complex reasoning, Math & science, +1 more |
| o1 Mini (2024-09) | OpenAI | 128K (max out: 66K) | $3.00 | $12.00 | ⏱️ Medium (~2000ms TTFT) | Vision, Reasoning | STEM tasks, Coding, +1 more |
| o3 Mini (2025-01) | OpenAI | 200K (max out: 100K) | $1.10 | $4.40 | ⏱️ Medium (~1500ms TTFT) | Vision, Reasoning | Cost-effective reasoning, Technical tasks, +1 more |

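
Because prices are quoted per 1M tokens, the cost of a single request is easier to judge once it is worked out from actual token counts. The following is a minimal Python sketch of that arithmetic. The `PRICES` dictionary copies three rows from the table above; the function names `request_cost` and `monthly_cost`, and the token counts and request volume in the usage lines, are illustrative assumptions rather than anything defined by a provider.

```python
# Prices in USD per 1M tokens as (input, output), copied from the table above.
# Only three models are included here to keep the sketch short.
PRICES = {
    "GPT-4o Mini": (0.15, 0.60),
    "Claude Sonnet 4": (3.00, 15.00),
    "o1": (15.00, 60.00),
}

TOKENS_PER_PRICE_UNIT = 1_000_000  # prices are quoted per 1M tokens


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one request for the given token counts."""
    input_price, output_price = PRICES[model]
    total = input_tokens * input_price + output_tokens * output_price
    return total / TOKENS_PER_PRICE_UNIT


def monthly_cost(model: str, input_tokens: int, output_tokens: int,
                 requests_per_day: int, days: int = 30) -> float:
    """Scale the per-request cost to a month of steady traffic (hypothetical load)."""
    return request_cost(model, input_tokens, output_tokens) * requests_per_day * days


if __name__ == "__main__":
    # Hypothetical workload: 2,000 input and 500 output tokens per request.
    for name in PRICES:
        per_request = request_cost(name, 2_000, 500)
        per_month = monthly_cost(name, 2_000, 500, requests_per_day=10_000)
        print(f"{name}: ${per_request:.4f} per request, ${per_month:,.2f} per month")
```

Output tokens are priced several times higher than input tokens for most models in the table, so workloads that generate long responses shift the comparison more than the input price alone suggests.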
GPT-4o Mini, Gemini 2.0 Flash, or DeepSeek V3 offer excellent price-to-performance ratios for high-volume tasks.
o1, Claude Opus 4, or DeepSeek R1 excel at multi-step reasoning, mathematics, and complex analysis tasks.
Gemini 1.5 Pro (2M context) and Claude models (200K) handle extensive documents and codebases effectively.
Claude Sonnet 4 and GPT-4o provide reliable function calling and tool use for autonomous agent applications.
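
The recommendations above amount to filtering the table by a hard requirement (context window, latency) and then ranking what remains by price. A rough Python sketch of that selection step follows. The `Model` dataclass, the `shortlist` function, and the choice to rank by the sum of input and output price are all assumptions made for illustration; the figures are copied from the table, with "200K" read as 200,000 tokens and "1.0M" as 1,000,000.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str
    context_tokens: int   # context window, in tokens
    input_price: float    # USD per 1M input tokens
    output_price: float   # USD per 1M output tokens
    ttft_ms: int          # approximate time to first token


# A subset of the table, transcribed by hand.
MODELS = [
    Model("GPT-4o Mini", 128_000, 0.15, 0.60, 150),
    Model("Gemini 2.0 Flash", 1_000_000, 0.10, 0.40, 150),
    Model("DeepSeek V3", 64_000, 0.27, 1.10, 300),
    Model("Claude Sonnet 4", 200_000, 3.00, 15.00, 400),
    Model("Gemini 1.5 Pro", 2_000_000, 1.25, 5.00, 500),
    Model("o1", 200_000, 15.00, 60.00, 5000),
]


def shortlist(min_context: int, max_ttft_ms: int) -> list[Model]:
    """Keep models that meet both limits, cheapest first.

    Ranking by input + output price is a simplification; a real workload
    would weight the two by its own input/output token ratio.
    """
    fits = [
        m for m in MODELS
        if m.context_tokens >= min_context and m.ttft_ms <= max_ttft_ms
    ]
    return sorted(fits, key=lambda m: m.input_price + m.output_price)


if __name__ == "__main__":
    # Hypothetical requirement: at least 150K tokens of context, TTFT under 1 second.
    for m in shortlist(min_context=150_000, max_ttft_ms=1_000):
        print(m.name, m.context_tokens, m.ttft_ms)
```

Capability columns such as tool use are not modeled here; adding them would mean one more field on `Model` and one more condition in the filter.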