AI Model Comparison 2026
Compare GPT-5, Claude 4.7, Gemini 2.5, Llama 4, and DeepSeek by context window, input/output pricing, modality (text/vision/audio/video), and thinking mode. Updated 2026-05.
- 💰 Cheapest: GPT-4o mini — $0.15 / $0.60 per 1M (OpenAI)
- 📦 Largest context: Gemini 2.5 Pro — 2M tokens (Google)
Prices updated 2026-05. Verify on provider pages before committing budget.
| Model | Provider | Context | Max output | Input $/1M | Output $/1M | Modality |
|---|---|---|---|---|---|---|
| Claude Opus 4.7 ⚡ | Anthropic | 1M | 64K | $15 | $75 | 📝 👁 |
| Claude Sonnet 4.6 ⚡ | Anthropic | 1M | 64K | $3 | $15 | 📝 👁 |
| Claude Haiku 4.5 | Anthropic | 200K | 8K | $0.80 | $4 | 📝 👁 |
| GPT-5 ⚡ | OpenAI | 400K | 16K | $5 | $20 | 📝 👁 🎙 |
| GPT-4o | OpenAI | 128K | 16K | $2.50 | $10 | 📝 👁 🎙 |
| GPT-4o mini | OpenAI | 128K | 16K | $0.15 | $0.60 | 📝 👁 |
| o3 ⚡ | OpenAI | 200K | 100K | $10 | $40 | 📝 👁 |
| Gemini 2.5 Pro ⚡ | Google | 2M | 64K | $1.25 | $10 | 📝 👁 🎙 🎥 |
| Gemini 2.5 Flash | Google | 1M | 64K | $0.30 | $2.50 | 📝 👁 🎙 🎥 |
| Llama 3.3 70B | Meta | 128K | 8K | $0.60 | $0.80 | 📝 |
| Llama 4 Maverick | Meta | 256K | 8K | $0.27 | $0.85 | 📝 👁 |
| DeepSeek V3 | DeepSeek | 128K | 8K | $0.27 | $1.10 | 📝 |
| DeepSeek R1 ⚡ | DeepSeek | 128K | 32K | $0.55 | $2.19 | 📝 |
| Grok 3 | xAI | 256K | 8K | $3 | $15 | 📝 👁 |
| Mistral Large 2 | Mistral | 128K | 8K | $2 | $6 | 📝 |
| Qwen 2.5 72B | Alibaba | 128K | 8K | $0.40 | $1.20 | 📝 👁 |
📝 = text · 👁 = vision · 🎙 = audio · 🎥 = video · ⚡ = thinking mode
How to pick a model
- High-throughput simple tasks (classification, extraction, FAQ bots): pick the cheapest — Haiku 4.5, GPT-4o mini, Gemini Flash, DeepSeek V3.
- Complex coding & reasoning: Claude Opus 4.7 or o3 (thinking mode).
- Long documents (books, full codebases): Gemini 2.5 Pro (2M context) or Claude (1M).
- Native multimodal incl. audio + video: Gemini 2.5 — the only family here that handles all four modalities.
- Self-host / on-prem: Llama 4, DeepSeek (open weights).
- EU/GDPR compliance: Mistral (EU-hosted).
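The decision list above can be sketched as a lookup table. The model names and rationale come from this page; the task categories and the routing function itself are illustrative assumptions, not a real API.

```python
# Hypothetical task-to-model routing based on the guidance above.
# Category keys are made up for this sketch; model names come from the table.
RECOMMENDATIONS = {
    "high_throughput": ["Claude Haiku 4.5", "GPT-4o mini", "Gemini 2.5 Flash", "DeepSeek V3"],
    "coding_reasoning": ["Claude Opus 4.7", "o3"],
    "long_context": ["Gemini 2.5 Pro", "Claude Opus 4.7"],
    "audio_video": ["Gemini 2.5 Pro", "Gemini 2.5 Flash"],
    "self_host": ["Llama 4 Maverick", "DeepSeek V3"],
    "eu_hosting": ["Mistral Large 2"],
}

def pick(task: str) -> list[str]:
    # Fall back to a cheap general-purpose model for unlisted tasks.
    return RECOMMENDATIONS.get(task, ["GPT-4o mini"])

print(pick("coding_reasoning"))  # → ['Claude Opus 4.7', 'o3']
```

In practice, teams often start with the cheapest model that clears their quality bar and escalate to a thinking-mode model only for requests that fail.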
Output vs input cost
Output tokens usually cost 4-5× input tokens. For a 1k-input / 1k-output call at a 4× ratio, ~80% of the bill comes from the response. Tip: ask for terse replies ("respond in under 100 words").
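The arithmetic behind that 80% figure, using GPT-5's rates from the table ($5 in, $20 out — a 4× ratio):

```python
def call_cost(input_tokens: int, output_tokens: int,
              in_price: float, out_price: float) -> float:
    """Cost in dollars; prices are quoted per 1M tokens."""
    return input_tokens / 1e6 * in_price + output_tokens / 1e6 * out_price

total = call_cost(1_000, 1_000, in_price=5.0, out_price=20.0)
output_share = (1_000 / 1e6 * 20.0) / total

print(f"${total:.4f} total, {output_share:.0%} from output")
# → $0.0250 total, 80% from output
```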
Pricing notes
- Volume discounts (OpenAI Tier 4-5, Anthropic Enterprise).
- Prompt caching: 50-90% cheaper on repeated prefixes.
- Batch API: 50% off, up to 24h delay.
- Open-weights via DeepInfra / Together / Fireworks can be even cheaper.
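A rough sketch of how the discounts above stack. The exact rates vary by provider; this assumes 90% off cached input tokens and a flat 50% Batch API discount, both within the ranges quoted in the list.

```python
def effective_input_cost(input_tokens: int, cached_fraction: float,
                         in_price: float, cache_discount: float = 0.9,
                         batch: bool = True) -> float:
    """Input cost in dollars after caching and batch discounts (illustrative rates)."""
    cached = input_tokens * cached_fraction
    fresh = input_tokens - cached
    cost = (fresh * in_price + cached * in_price * (1 - cache_discount)) / 1e6
    return cost * 0.5 if batch else cost

# 100k-token prompt, 80% of it a cached shared prefix,
# at Claude Sonnet's input rate ($3/1M) from the table:
print(f"${effective_input_cost(100_000, 0.8, 3.0):.4f}")  # → $0.0420
```

Without either discount the same prompt would cost $0.30, so the two mechanisms together cut this (deliberately cache-friendly) workload by roughly 7×.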
Related tools
See all tools →
🎯
Token Counter
Accurate token count for ChatGPT, Claude, Gemini, Llama. Live input cost.
💰
API Cost Calculator
Estimate monthly/yearly LLM API spend. Compare which model is cheapest.
✨
Prompt Builder
Compose well-structured prompts. 6 templates for common tasks.
📜
Markdown Preview
Render markdown live — paste ChatGPT/Claude output. GFM, tables, code blocks.