LLM Model Comparison

Compare pricing, context windows, capabilities, and latency across major LLM providers. Find the right model for your use case.

Last updated: February 2025. Prices and specifications from official provider documentation.

Model	Provider	Context	Input Price	Output Price	Latency	Capabilities	Best For
Claude 3.5 Haiku (2024-10)	Anthropic	200K (max out: 8K)	$0.80 per 1M tokens	$4.00 per 1M tokens	⚡ Very Fast (~200ms TTFT)	Vision, Reasoning	High-speed tasks, Cost efficiency, +1
Claude 4.5 Haiku (2025-05)	Anthropic	200K (max out: 8K)	$0.80 per 1M tokens	$4.00 per 1M tokens	⚡ Very Fast (~150ms TTFT)	Vision, Reasoning	High-speed tasks, Cost efficiency, +2
Claude Opus 4 (2025-05)	Anthropic	200K (max out: 32K)	$15.00 per 1M tokens	$75.00 per 1M tokens	⏱️ Medium (~800ms TTFT)	Vision, Reasoning	Complex analysis, Research, +2
Claude Sonnet 4 (2025-05)	Anthropic	200K (max out: 64K)	$3.00 per 1M tokens	$15.00 per 1M tokens	🚀 Fast (~400ms TTFT)	Vision, Reasoning	Balanced tasks, Coding, +2
DeepSeek R1 (2025-01)	DeepSeek	128K (max out: 8K)	$0.55 per 1M tokens	$2.19 per 1M tokens	⏱️ Medium (~1500ms TTFT)	Vision, Reasoning	Complex reasoning, Math, +1
DeepSeek V3 (2024-12)	DeepSeek	128K (max out: 8K)	$0.27 per 1M tokens	$1.10 per 1M tokens	🚀 Fast (~300ms TTFT)	Vision, Reasoning	Coding, General tasks, +1
Gemini 1.5 Pro (2024-02)	Google	2.0M (max out: 8K)	$1.25 per 1M tokens	$5.00 per 1M tokens	🚀 Fast (~500ms TTFT)	Vision, Reasoning	Long documents, Video analysis, +1
Gemini 2.0 Flash (2024-12)	Google	1.0M (max out: 8K)	$0.10 per 1M tokens	$0.40 per 1M tokens	⚡ Very Fast (~150ms TTFT)	Vision, Reasoning	High-volume tasks, Multimodal, +1
Gemini 2.0 Flash Thinking (2025-01)	Google	1.0M (max out: 8K)	$0.10 per 1M tokens	$0.40 per 1M tokens	⏱️ Medium (~1000ms TTFT)	Vision, Reasoning	Reasoning tasks, Complex analysis, +1
Gemini 2.5 Flash (2025-04)	Google	1.0M (max out: 66K)	$0.15 per 1M tokens	$0.60 per 1M tokens	🚀 Fast (~250ms TTFT)	Vision, Reasoning	Cost-effective reasoning, High-volume tasks, +1
Gemini 2.5 Pro (2025-03)	Google	1.0M (max out: 66K)	$1.25 per 1M tokens	$10.00 per 1M tokens	⏱️ Medium (~800ms TTFT)	Vision, Reasoning	Complex reasoning, Coding, +2
GPT-4.1 (2025-04)	OpenAI	1.0M (max out: 33K)	$2.00 per 1M tokens	$8.00 per 1M tokens	🚀 Fast (~300ms TTFT)	Vision, Reasoning	General tasks, Coding, +2
GPT-4.1 Mini (2025-04)	OpenAI	1.0M (max out: 33K)	$0.40 per 1M tokens	$1.60 per 1M tokens	🚀 Fast (~200ms TTFT)	Vision, Reasoning	Cost-effective tasks, High-volume applications, +1
GPT-4.1 Nano (2025-04)	OpenAI	1.0M (max out: 33K)	$0.10 per 1M tokens	$0.40 per 1M tokens	⚡ Very Fast (~100ms TTFT)	Vision, Reasoning	Ultra-low-cost tasks, Classification, +2
GPT-4o (2024-05)	OpenAI	128K (max out: 16K)	$2.50 per 1M tokens	$10.00 per 1M tokens	🚀 Fast (~300ms TTFT)	Vision, Reasoning	General tasks, Vision analysis, +1
GPT-4o Mini (2024-07)	OpenAI	128K (max out: 16K)	$0.15 per 1M tokens	$0.60 per 1M tokens	⚡ Very Fast (~150ms TTFT)	Vision, Reasoning	Cost-effective tasks, High-volume applications, +1
Llama 3.2 90B Vision (2024-09)	Meta	128K (max out: 8K)	$0.90 per 1M tokens	$0.90 per 1M tokens	🚀 Fast (~500ms TTFT)	Vision, Reasoning	Vision tasks, Open-source multimodal, +1
Llama 3.3 70B (2024-12)	Meta	128K (max out: 8K)	$0.50 per 1M tokens	$0.75 per 1M tokens	🚀 Fast (~400ms TTFT)	Vision, Reasoning	Open-source deployment, Cost efficiency, +1
Llama 4 Maverick (2025-04)	Meta	1.0M (max out: 16K)	$0.50 per 1M tokens	$0.75 per 1M tokens	🚀 Fast (~350ms TTFT)	Vision, Reasoning	Open-source deployment, Multimodal tasks, +2
Llama 4 Scout (2025-04)	Meta	10.0M (max out: 16K)	$0.20 per 1M tokens	$0.30 per 1M tokens	🚀 Fast (~300ms TTFT)	Vision, Reasoning	Ultra-long context, Open-source deployment, +1
Mistral Large (2024-11)	Mistral	128K (max out: 8K)	$2.00 per 1M tokens	$6.00 per 1M tokens	🚀 Fast (~350ms TTFT)	Vision, Reasoning	Enterprise tasks, European compliance, +1
Mistral Small (2024-09)	Mistral	32K (max out: 8K)	$0.20 per 1M tokens	$0.60 per 1M tokens	⚡ Very Fast (~150ms TTFT)	Vision, Reasoning	Fast responses, Cost-sensitive tasks, +1
o1 (2024-12)	OpenAI	200K (max out: 100K)	$15.00 per 1M tokens	$60.00 per 1M tokens	🐢 Slow (~5000ms TTFT)	Vision, Reasoning	Complex reasoning, Math & science, +1
o1 Mini (2024-09)	OpenAI	128K (max out: 66K)	$3.00 per 1M tokens	$12.00 per 1M tokens	⏱️ Medium (~2000ms TTFT)	Vision, Reasoning	STEM tasks, Coding, +1
o3 (2025-04)	OpenAI	200K (max out: 100K)	$10.00 per 1M tokens	$40.00 per 1M tokens	🐢 Slow (~5000ms TTFT)	Vision, Reasoning	Complex reasoning, Math & science, +2
o3 Mini (2025-01)	OpenAI	200K (max out: 100K)	$1.10 per 1M tokens	$4.40 per 1M tokens	⏱️ Medium (~1500ms TTFT)	Vision, Reasoning	Cost-effective reasoning, Technical tasks, +1
o4 Mini (2025-04)	OpenAI	200K (max out: 100K)	$1.10 per 1M tokens	$4.40 per 1M tokens	⏱️ Medium (~1200ms TTFT)	Vision, Reasoning	Reasoning, Coding, +2

Understanding the Data

Context Window: Maximum tokens the model can process in a single request

Pricing: Cost per million tokens (input/output may differ)

TTFT: Time to first token, the delay between sending a request and receiving the first token of the response

Vision: Can process images and visual content

Reasoning: Extended thinking/chain-of-thought capabilities

Prices: As of April 2026, subject to change
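The TTFT figures in the table are approximate, so it may be worth measuring the value for your own workload. The sketch below times how long a streaming response takes to yield its first chunk. It assumes the response is exposed as a plain Python iterable of text chunks; `measure_ttft` and `fake_stream` are hypothetical names used for illustration, and `fake_stream` only simulates a provider rather than calling a real API.

```python
import time
from typing import Iterable, Iterator, Tuple


def measure_ttft(stream: Iterable[str]) -> Tuple[float, str]:
    """Return (seconds until the first chunk arrived, full concatenated text)."""
    start = time.perf_counter()
    ttft = None
    parts = []
    for chunk in stream:
        if ttft is None:
            # The first chunk has just arrived: record the elapsed time once.
            ttft = time.perf_counter() - start
        parts.append(chunk)
    if ttft is None:
        raise ValueError("stream produced no chunks")
    return ttft, "".join(parts)


def fake_stream(delay_s: float = 0.05) -> Iterator[str]:
    """Hypothetical stand-in for a provider's streaming response."""
    time.sleep(delay_s)  # simulated wait before the first token
    yield "Hello"
    yield ", world"


if __name__ == "__main__":
    ttft, text = measure_ttft(fake_stream())
    print(f"TTFT: {ttft * 1000:.0f} ms, response: {text!r}")
```

To use this with a real provider, replace `fake_stream` with whatever iterator that provider's client returns when streaming is enabled, and average over several requests since a single measurement can be noisy.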

Choosing the Right Model

For Cost-Sensitive Applications

GPT-4o Mini, Gemini 2.0 Flash, and DeepSeek V3 offer excellent price-to-performance ratios for high-volume tasks: each costs at most $0.27 per 1M input tokens and $1.10 per 1M output tokens.

GPT-4o Mini, Gemini 2.0 Flash, DeepSeek V3
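Because input and output tokens are priced separately, the cheapest model depends on the shape of the workload. The following is a minimal sketch of the arithmetic, using the per-1M-token prices listed in the table for these three models. The function names and the example workload (2,000 input tokens and 500 output tokens per request, 100,000 requests) are assumptions chosen for illustration.

```python
# Prices in USD per 1M tokens (input, output), copied from the comparison table.
PRICES = {
    "GPT-4o Mini": (0.15, 0.60),
    "Gemini 2.0 Flash": (0.10, 0.40),
    "DeepSeek V3": (0.27, 1.10),
}


def request_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD of one request at the listed per-1M-token prices."""
    in_price, out_price = PRICES[model]
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000


def workload_costs(input_tokens: int, output_tokens: int, requests: int) -> dict:
    """Total cost per model for `requests` identical requests, rounded to cents."""
    return {
        model: round(request_cost(model, input_tokens, output_tokens) * requests, 2)
        for model in PRICES
    }


if __name__ == "__main__":
    # Hypothetical workload: 2,000 input and 500 output tokens, 100,000 requests.
    for model, cost in workload_costs(2_000, 500, 100_000).items():
        print(f"{model}: ${cost:.2f}")
```

For this example workload the totals come to $60.00 for GPT-4o Mini, $40.00 for Gemini 2.0 Flash, and $109.00 for DeepSeek V3. The sketch ignores batch discounts, caching, and any other pricing tiers a provider may offer.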

For Complex Reasoning

o1, Claude Opus 4, and DeepSeek R1 excel at multi-step reasoning, mathematics, and complex analysis tasks.

o1, Claude Opus 4, DeepSeek R1

For Long Documents

Gemini 1.5 Pro (2M context) and Claude models (200K) handle extensive documents and codebases effectively.

Gemini 1.5 Pro, Claude Sonnet 4, Gemini 2.0 Flash
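A quick way to shortlist models for a long document is to compare its token count against each context window. The sketch below does this for the three models named here, treating the table's rounded figures (2.0M, 200K, 1.0M) as exact token limits. It also assumes that the space reserved for the response counts against the context window, which may not hold for every provider; the function names and the 8,000-token default reserve are illustrative choices.

```python
# Context windows in tokens, taken from the table's rounded figures.
CONTEXT_WINDOWS = {
    "Gemini 1.5 Pro": 2_000_000,
    "Claude Sonnet 4": 200_000,
    "Gemini 2.0 Flash": 1_000_000,
}


def fits(model: str, prompt_tokens: int, reserved_output_tokens: int) -> bool:
    """True if the prompt plus the reserved output space fits in the window.

    Assumption: output tokens share the context window with the prompt.
    """
    return prompt_tokens + reserved_output_tokens <= CONTEXT_WINDOWS[model]


def models_that_fit(prompt_tokens: int, reserved_output_tokens: int = 8_000) -> list:
    """Models whose context window can hold the request, in table order."""
    return [
        model
        for model in CONTEXT_WINDOWS
        if fits(model, prompt_tokens, reserved_output_tokens)
    ]


if __name__ == "__main__":
    # A hypothetical 500,000-token codebase rules out the 200K model.
    print(models_that_fit(500_000))
```

Token counts vary by tokenizer, so the prompt size should be measured with the tokenizer of the model under consideration rather than estimated from character counts.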

For Agentic Workflows

Claude Sonnet 4 and GPT-4o provide reliable function calling and tool use for autonomous agent applications.

Claude Sonnet 4, GPT-4o, Claude Opus 4

Disclaimer: Pricing and specifications are subject to change. Always verify current pricing on the official provider websites before making decisions. Some models may have additional costs for fine-tuning, batch processing, or enterprise features.
