Interactive Explainer — Updated February 2025

Choosing the right Claude model

Three models. One family. From lightning-fast Haiku to deep-thinking Opus — an interactive guide to understanding what each model does best, what it costs, and when to use it.

3 model tiers: economy → balanced → frontier
200K-token context window across all models
50% savings with the Batch API

Meet the Family

Three models, each optimized for a different balance of speed, intelligence, and cost.

Think of it like choosing a vehicle.

Sometimes you need a quick scooter for short trips. Sometimes a reliable sedan for daily driving. And sometimes — a heavy-duty truck for the big jobs. Claude's three models work the same way: each is built for a different balance of intelligence, speed, and cost.

Most teams use 2 or 3 models together — routing simple tasks to the fast one and complex tasks to the smart one.

The scooter: Haiku 4.5

Haiku is the speed demon. At $0.80 per million input tokens, it's 18.75x cheaper than Opus. It won't write your PhD thesis, but it will classify 50,000 support tickets before lunch.

Best at: Classification, routing, real-time responses, high-volume processing.

This is why chatbots feel instant — they're almost certainly running on a model like Haiku.

The sedan: Sonnet 4.5

Sonnet is the workhorse. It's the model most teams default to — smart enough for coding assistants, content generation, and data analysis, but not so expensive that your CFO notices.

Best at: Code generation, content writing, data analysis, general-purpose production workloads.

If you're not sure which model to start with, Sonnet is almost always the right answer.

The truck: Opus 4.6

Opus is for when the stakes are high and the problem is hard. Legal analysis, research synthesis, strategic planning — tasks where being wrong costs more than the model. At $15/million input tokens, you pay for the best reasoning available.

Best at: Complex reasoning, research, high-stakes decisions, novel problem solving.

Opus scores 83.7% on MMLU Pro, a graduate-level reasoning benchmark spanning a broad range of academic subjects.
[Chart: The Claude Family. The three models plotted by intelligence and cost, from Haiku 4.5 through Sonnet 4.5 to Opus 4.6.]

The Intelligence Spectrum

How each model performs across industry-standard evaluations.

Numbers tell the story.

Imagine three students taking the same exam. One finishes first but gets a B+. Another takes a bit longer for an A. The third aces everything but needs the most time. That's essentially what's happening with benchmarks — standardized tests for AI models.

Higher scores mean the model gets more answers right — but scoring 5% higher often costs 5x more.

Can it actually code?

SWE-bench Verified tests whether a model can fix real bugs in production code. Opus solves 72.5% of issues — nearly 3x more than Haiku. This is why Opus powers tools like Claude Code for complex multi-file refactors.

This is why your AI coding assistant sometimes "just works" and sometimes needs hand-holding — model capability matters enormously.

Can it think abstractly?

ARC-AGI-2 is the hardest benchmark here — it tests whether a model can solve problems it has literally never seen before. Scores are low across the board, but Opus leads at 21.2%. This is the frontier of AI reasoning.

ARC-AGI-2 is designed to resist memorization — models can't just recall answers they've seen in training data.

Can it do a real job?

OSWorld and Finance Agent test whether a model can complete real tasks — using a computer GUI, managing spreadsheets, executing financial analysis workflows. These are the benchmarks that matter most for production: not "can it answer a trivia question?" but "can it do useful work?"

Agent benchmarks are the new frontier — they test the models on tasks that would actually save you time and money.
All Benchmarks

Benchmark            Haiku 4.5   Sonnet 4.5   Opus 4.6
SWE-bench Verified   25%         55%          72.5%
OSWorld              12%         28.5%        38.2%
ARC-AGI-2            4%          14%          21.2%
Finance Agent        48%         72%          85%

What It Costs

Pay per token, scale from zero to millions of requests.

You pay per word, not per month.

Think of it like electricity: you pay for what you use. Claude charges per token — a chunk of text roughly equal to ¾ of a word. Input tokens (what you send) and output tokens (what you get back) are priced separately.

The price spread is dramatic: Opus output costs 18.75x more than Haiku output.

A 1,000-word document is roughly 1,300 tokens. A typical API call uses 500-2,000 tokens.

The hidden cost multiplier.

Here's what catches people off guard: output tokens cost 5x more than input tokens across all models. A chatbot that generates long, verbose responses will cost significantly more than one with concise, focused answers — even on the same model.

For Sonnet: 1M input tokens = $3. But 1M output tokens = $15. That's the same content costing 5x more just because the model generated it instead of reading it.

This is why "be concise" in your system prompt isn't just a style preference — it's a cost optimization.

Three ways to slash your bill.

Smart teams don't just pick a model — they optimize how they use it. Prompt caching alone can cut costs by up to 90% for repeated system prompts. The Batch API gives an instant 50% discount for non-urgent work. And model routing — sending simple tasks to Haiku and complex ones to Opus — can reduce costs by 60-80%.

Combined, these strategies can reduce a $10,000/month bill to under $2,000.

The best AI teams don't spend the most — they spend the smartest.
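The first two levers live in the API call itself. Here's a minimal sketch, assuming the current Anthropic Python SDK; the model IDs, prompt text, and ticket data are illustrative placeholders, not a definitive implementation.

```python
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

LONG_SYSTEM_PROMPT = "..."  # e.g. a multi-thousand-token policy or style guide

# Lever 1: prompt caching. Marking the system prompt with cache_control lets
# repeated calls reuse it at a steep discount instead of paying full input
# price every time.
response = client.messages.create(
    model="claude-sonnet-4-5",  # illustrative model ID
    max_tokens=512,
    system=[{
        "type": "text",
        "text": LONG_SYSTEM_PROMPT,
        "cache_control": {"type": "ephemeral"},
    }],
    messages=[{"role": "user", "content": "Summarize this support ticket: ..."}],
)

# Lever 2: the Batch API. Non-urgent work submitted as a batch is processed
# asynchronously and billed at a 50% discount.
batch = client.messages.batches.create(
    requests=[
        {
            "custom_id": f"ticket-{i}",
            "params": {
                "model": "claude-haiku-4-5",  # illustrative model ID
                "max_tokens": 64,
                "messages": [{"role": "user", "content": ticket}],
            },
        }
        for i, ticket in enumerate(["Ticket text 1", "Ticket text 2"])
    ],
)
```

The third lever, model routing, is sketched at the end of the decision map section below.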
Price per 1M tokens

Model        Input     Output
Haiku 4.5    $0.80     $4.00
Sonnet 4.5   $3.00     $15.00
Opus 4.6     $15.00    $75.00

Cost Calculator

Think of this like an electricity bill estimator: plug in your usage and see what each model would cost.

Example scenario: 1,000 requests per day, each with 1,000 input tokens and 500 output tokens, over a 30-day month.

Monthly Cost Estimate
Haiku 4.5    $84/mo (cheapest)
Sonnet 4.5   $315/mo
Opus 4.6     $1,575/mo
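The same estimate takes a few lines of Python if you want to plug in your own numbers; the prices come straight from the table above.

```python
# Dollars per million tokens (input, output), from the pricing table above.
PRICES = {
    "Haiku 4.5": (0.80, 4.00),
    "Sonnet 4.5": (3.00, 15.00),
    "Opus 4.6": (15.00, 75.00),
}

def monthly_cost(model: str, requests_per_day: int,
                 input_tokens: int, output_tokens: int, days: int = 30) -> float:
    """Estimated monthly bill in dollars for a steady daily workload."""
    in_price, out_price = PRICES[model]
    per_call = (input_tokens * in_price + output_tokens * out_price) / 1_000_000
    return per_call * requests_per_day * days

# The scenario above: 1,000 requests/day, 1,000 input + 500 output tokens each.
for model in PRICES:
    print(f"{model}: ${monthly_cost(model, 1_000, 1_000, 500):,.0f}/mo")
# Haiku 4.5: $84/mo, Sonnet 4.5: $315/mo, Opus 4.6: $1,575/mo
```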

Where Each Model Shines

Like picking the right tool from a toolbox — each model excels at different jobs.

Grouped by model, here's where each one fits best.

Haiku 4.5
🎧 Customer Support: automated ticket routing and response generation
🛡️ Content Moderation: real-time content classification and filtering
📊 Data Extraction: structured data extraction from unstructured text

Sonnet 4.5
💻 Coding Assistant: code completion, debugging, and refactoring
✍️ Content Writing: blog posts, marketing copy, social media content
📈 Data Analysis: analyzing datasets, generating insights and reports
🔗 API Integration: building and testing API connections
🌍 Translation: multi-language translation with context preservation

Opus 4.6
🔬 Research Synthesis: multi-source research analysis and literature review
🧭 Strategic Planning: complex business analysis and strategic recommendations
⚖️ Legal Analysis: contract review, compliance checking, legal research
📝 Scientific Writing: academic papers, grant proposals, technical documentation



The Decision Map

Quick reference for choosing the right model by scenario.

A simple decision framework.

Think of choosing a model like choosing a restaurant. Fast food (Haiku) for quick meals. A solid bistro (Sonnet) for most occasions. A Michelin-star restaurant (Opus) when the experience really matters. The map below shows which model fits each scenario.

Most teams end up using 2-3 models, routing requests based on complexity — not locking into just one.

Haiku: when speed trumps depth.

High-volume classification, real-time responses, simple routing decisions. Haiku handles these at a fraction of the cost. A customer support system processing 50,000 tickets per day would cost $160/month on Haiku vs. $3,000/month on Opus.

This is why most chatbots feel instant — they're running on economy-tier models like Haiku.

Sonnet: the default choice.

If you're building a coding assistant, generating marketing content, or analyzing data — Sonnet is almost always the right starting point. It handles code generation with 93% accuracy on HumanEval and provides excellent reasoning at a fraction of Opus's cost.

When in doubt, start with Sonnet and only upgrade to Opus if you measurably need better results.

Opus: when accuracy is everything.

Legal contract review where a missed clause costs millions. Research synthesis where nuance matters. Strategic planning where the recommendation drives real decisions. Opus is for when the cost of being wrong exceeds the cost of the model.

Opus isn't expensive — being wrong is expensive. Opus is insurance.
Decision Map

Haiku Zone: high-volume, low-complexity tasks; real-time user-facing applications; classification and routing.
Sonnet Zone: general-purpose production workloads; code generation and review; content creation at scale.
Opus Zone: complex multi-step reasoning; research and analysis; high-stakes decision support; novel problem solving.

Rule of thumb: Start with Sonnet. Drop to Haiku for speed. Upgrade to Opus for depth.
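In code, that rule of thumb fits in one function. This is a hypothetical sketch: the model IDs are placeholders, and the complexity heuristic is a stand-in for whatever signal your application actually has (prompt length, task type, a classifier's output, and so on).

```python
# Hypothetical router implementing the rule of thumb: start with Sonnet,
# drop to Haiku for speed, upgrade to Opus for depth. The model IDs and
# the heuristic below are illustrative stand-ins.
MODELS = {
    "economy": "claude-haiku-4-5",
    "balanced": "claude-sonnet-4-5",
    "frontier": "claude-opus-4-6",
}

HIGH_VOLUME_TASKS = {"classify", "route", "moderate", "extract"}

def pick_model(task_type: str, high_stakes: bool = False) -> str:
    """Choose a model tier based on task type and stakes."""
    if high_stakes:                      # legal review, research synthesis, strategy
        return MODELS["frontier"]
    if task_type in HIGH_VOLUME_TASKS:   # fast, cheap, and good enough
        return MODELS["economy"]
    return MODELS["balanced"]            # the default for everything else

assert pick_model("classify") == "claude-haiku-4-5"
assert pick_model("codegen") == "claude-sonnet-4-5"
assert pick_model("contract_review", high_stakes=True) == "claude-opus-4-6"
```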
If this was useful, share it with someone choosing a Claude model.