Compare pricing, context windows, capabilities, and latency across major LLM providers. Find the right model for your use case.
Last updated: February 2025. Prices and specifications from official provider documentation.
| Model | Provider | Context | Input Price (per 1M tokens) | Output Price (per 1M tokens) | Latency | Capabilities | Best For |
|---|---|---|---|---|---|---|---|
| Claude 3.5 Haiku (2024-10) | Anthropic | 200K (max output: 8K) | $0.80 | $4.00 | ⚡ Very Fast (~200 ms TTFT) | Vision, Reasoning | High-speed tasks, Cost efficiency (+1 more) |
| Claude 4.5 Haiku (2025-05) | Anthropic | 200K (max output: 8K) | $0.80 | $4.00 | ⚡ Very Fast (~150 ms TTFT) | Vision, Reasoning | High-speed tasks, Cost efficiency (+2 more) |
| Claude Opus 4 (2025-05) | Anthropic | 200K (max output: 32K) | $15.00 | $75.00 | ⏱️ Medium (~800 ms TTFT) | Vision, Reasoning | Complex analysis, Research (+2 more) |
| Claude Sonnet 4 (2025-05) | Anthropic | 200K (max output: 64K) | $3.00 | $15.00 | 🚀 Fast (~400 ms TTFT) | Vision, Reasoning | Balanced tasks, Coding (+2 more) |
| DeepSeek R1 (2025-01) | DeepSeek | 128K (max output: 8K) | $0.55 | $2.19 | ⏱️ Medium (~1500 ms TTFT) | Vision, Reasoning | Complex reasoning, Math (+1 more) |
| DeepSeek V3 (2024-12) | DeepSeek | 128K (max output: 8K) | $0.27 | $1.10 | 🚀 Fast (~300 ms TTFT) | Vision, Reasoning | Coding, General tasks (+1 more) |
| Gemini 1.5 Pro (2024-02) | Google | 2.0M (max output: 8K) | $1.25 | $5.00 | 🚀 Fast (~500 ms TTFT) | Vision, Reasoning | Long documents, Video analysis (+1 more) |
| Gemini 2.0 Flash (2024-12) | Google | 1.0M (max output: 8K) | $0.10 | $0.40 | ⚡ Very Fast (~150 ms TTFT) | Vision, Reasoning | High-volume tasks, Multimodal (+1 more) |
| Gemini 2.0 Flash Thinking (2025-01) | Google | 1.0M (max output: 8K) | $0.10 | $0.40 | ⏱️ Medium (~1000 ms TTFT) | Vision, Reasoning | Reasoning tasks, Complex analysis (+1 more) |
| Gemini 2.5 Flash (2025-04) | Google | 1.0M (max output: 66K) | $0.15 | $0.60 | 🚀 Fast (~250 ms TTFT) | Vision, Reasoning | Cost-effective reasoning, High-volume tasks (+1 more) |
| Gemini 2.5 Pro (2025-03) | Google | 1.0M (max output: 66K) | $1.25 | $10.00 | ⏱️ Medium (~800 ms TTFT) | Vision, Reasoning | Complex reasoning, Coding (+2 more) |
| GPT-4.1 (2025-04) | OpenAI | 1.0M (max output: 33K) | $2.00 | $8.00 | 🚀 Fast (~300 ms TTFT) | Vision, Reasoning | General tasks, Coding (+2 more) |
| GPT-4.1 Mini (2025-04) | OpenAI | 1.0M (max output: 33K) | $0.40 | $1.60 | 🚀 Fast (~200 ms TTFT) | Vision, Reasoning | Cost-effective tasks, High-volume applications (+1 more) |
| GPT-4.1 Nano (2025-04) | OpenAI | 1.0M (max output: 33K) | $0.10 | $0.40 | ⚡ Very Fast (~100 ms TTFT) | Vision, Reasoning | Ultra-low-cost tasks, Classification (+2 more) |
| GPT-4o (2024-05) | OpenAI | 128K (max output: 16K) | $2.50 | $10.00 | 🚀 Fast (~300 ms TTFT) | Vision, Reasoning | General tasks, Vision analysis (+1 more) |
| GPT-4o Mini (2024-07) | OpenAI | 128K (max output: 16K) | $0.15 | $0.60 | ⚡ Very Fast (~150 ms TTFT) | Vision, Reasoning | Cost-effective tasks, High-volume applications (+1 more) |
| Llama 3.2 90B Vision (2024-09) | Meta | 128K (max output: 8K) | $0.90 | $0.90 | 🚀 Fast (~500 ms TTFT) | Vision, Reasoning | Vision tasks, Open-source multimodal (+1 more) |
| Llama 3.3 70B (2024-12) | Meta | 128K (max output: 8K) | $0.50 | $0.75 | 🚀 Fast (~400 ms TTFT) | Vision, Reasoning | Open-source deployment, Cost efficiency (+1 more) |
| Llama 4 Maverick (2025-04) | Meta | 1.0M (max output: 16K) | $0.50 | $0.75 | 🚀 Fast (~350 ms TTFT) | Vision, Reasoning | Open-source deployment, Multimodal tasks (+2 more) |
| Llama 4 Scout (2025-04) | Meta | 10.0M (max output: 16K) | $0.20 | $0.30 | 🚀 Fast (~300 ms TTFT) | Vision, Reasoning | Ultra-long context, Open-source deployment (+1 more) |
| Mistral Large (2024-11) | Mistral | 128K (max output: 8K) | $2.00 | $6.00 | 🚀 Fast (~350 ms TTFT) | Vision, Reasoning | Enterprise tasks, European compliance (+1 more) |
| Mistral Small (2024-09) | Mistral | 32K (max output: 8K) | $0.20 | $0.60 | ⚡ Very Fast (~150 ms TTFT) | Vision, Reasoning | Fast responses, Cost-sensitive tasks (+1 more) |
| o1 (2024-12) | OpenAI | 200K (max output: 100K) | $15.00 | $60.00 | 🐢 Slow (~5000 ms TTFT) | Vision, Reasoning | Complex reasoning, Math & science (+1 more) |
| o1 Mini (2024-09) | OpenAI | 128K (max output: 66K) | $3.00 | $12.00 | ⏱️ Medium (~2000 ms TTFT) | Vision, Reasoning | STEM tasks, Coding (+1 more) |
| o3 (2025-04) | OpenAI | 200K (max output: 100K) | $10.00 | $40.00 | 🐢 Slow (~5000 ms TTFT) | Vision, Reasoning | Complex reasoning, Math & science (+2 more) |
| o3 Mini (2025-01) | OpenAI | 200K (max output: 100K) | $1.10 | $4.40 | ⏱️ Medium (~1500 ms TTFT) | Vision, Reasoning | Cost-effective reasoning, Technical tasks (+1 more) |
| o4 Mini (2025-04) | OpenAI | 200K (max output: 100K) | $1.10 | $4.40 | ⏱️ Medium (~1200 ms TTFT) | Vision, Reasoning | Reasoning, Coding (+2 more) |

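
Because every price in the table is quoted per 1M tokens, the cost of a single request is simple arithmetic: multiply each token count by its price and divide by one million. The following is a minimal sketch of that calculation, not an official billing formula. The `PRICES` dictionary, the `request_cost` helper, and the 2,000-input / 500-output workload are illustrative assumptions; only the prices themselves are copied from the table.

```python
from decimal import Decimal

# (input price, output price) in USD per 1M tokens, copied from the table.
PRICES = {
    "GPT-4o Mini": (Decimal("0.15"), Decimal("0.60")),
    "Claude Sonnet 4": (Decimal("3.00"), Decimal("15.00")),
    "o1": (Decimal("15.00"), Decimal("60.00")),
}

TOKENS_PER_PRICE_UNIT = Decimal(1_000_000)


def request_cost(model: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Return the USD cost of one request at the listed per-1M-token prices."""
    input_price, output_price = PRICES[model]
    total = input_price * input_tokens + output_price * output_tokens
    return total / TOKENS_PER_PRICE_UNIT


if __name__ == "__main__":
    # Hypothetical workload: 2,000 input tokens and 500 output tokens.
    for model in PRICES:
        print(f"{model}: ${request_cost(model, 2000, 500):.4f} per request")
```

`Decimal` is used here instead of `float` so that small per-request amounts add up without binary rounding drift when summed over many requests. Actual bills may differ, since providers can apply caching discounts, batch pricing, or other adjustments that this sketch does not model.
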
For high-volume tasks, GPT-4o Mini ($0.15 input / $0.60 output per 1M tokens), Gemini 2.0 Flash ($0.10 / $0.40), and DeepSeek V3 ($0.27 / $1.10) offer excellent price-to-performance ratios.
o1, Claude Opus 4, or DeepSeek R1 excel at multi-step reasoning, mathematics, and complex analysis tasks.
Gemini 1.5 Pro (2M-token context) and the Claude models (200K tokens) handle long documents and large codebases effectively.
Claude Sonnet 4 and GPT-4o provide reliable function calling and tool use for autonomous agent applications.
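
The recommendations above amount to filtering the table on a few columns. As a rough sketch of how that selection could be automated, the snippet below keeps models whose context window covers a prompt and whose input price fits a budget, then sorts by price. The `Model` dataclass and `shortlist` function are hypothetical names introduced for illustration, and the context sizes treat "K" and "M" as decimal thousands and millions, which is an assumption about how the table rounds.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str
    context_tokens: int   # context window; K/M read as 1,000 / 1,000,000 (assumption)
    input_price: float    # USD per 1M input tokens, from the table
    output_price: float   # USD per 1M output tokens, from the table


# A subset of the table rows, chosen for illustration.
MODELS = [
    Model("GPT-4o Mini", 128_000, 0.15, 0.60),
    Model("Gemini 2.0 Flash", 1_000_000, 0.10, 0.40),
    Model("DeepSeek V3", 128_000, 0.27, 1.10),
    Model("Gemini 1.5 Pro", 2_000_000, 1.25, 5.00),
    Model("Claude Sonnet 4", 200_000, 3.00, 15.00),
]


def shortlist(models, min_context, max_input_price):
    """Return models that fit the context and budget limits, cheapest input first."""
    fits = [
        m for m in models
        if m.context_tokens >= min_context and m.input_price <= max_input_price
    ]
    return sorted(fits, key=lambda m: m.input_price)


if __name__ == "__main__":
    # Hypothetical requirement: a 500K-token prompt at no more than $2.00 per 1M input tokens.
    for m in shortlist(MODELS, min_context=500_000, max_input_price=2.00):
        print(m.name, m.context_tokens, m.input_price)
```

A filter like this only narrows the field on the numeric columns. Qualities such as reasoning strength or tool-use reliability still need to be checked against the workload itself.
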