LLM API Pricing

Claude API Pricing in 2026

Anthropic charges $3–$15 per million tokens depending on model. RunAPI mirrors every Claude model at half the official rate — same API, same output, 50% less on your invoice.

Get API key — free Read API docs

Updated June 18, 2026 RunAPI Editorial

At a glance

What does the Claude API cost right now?

Anthropic publishes per-token prices for three model tiers: Haiku for lightweight tasks, Sonnet for balanced workloads, and Opus for maximum capability. All prices below are per million tokens, the billing unit Anthropic uses.

Cheapest option

Haiku 4.5 at $1/M input and $5/M output through RunAPI. Official rate is $2/$10.

Most popular

Sonnet 4.6 at $3/M input and $15/M output through RunAPI. Official rate is $6/$30.

Maximum capability

Opus 4.7 at $5/M input and $25/M output through RunAPI. Official rate is $10/$50.

Cache discount

Cache reads cost 90% less than standard input tokens — $0.10/M for Haiku through RunAPI.

Model-by-model breakdown

How much does each Claude model cost per million tokens?

The table below shows official Anthropic pricing alongside RunAPI pricing. RunAPI applies a flat 50% discount across all Claude models. No volume commits, no subscriptions.

Model	Official input /M	Official output /M	RunAPI input /M	RunAPI output /M	Context window
Opus 4.8	$15.00	$75.00	$7.50	$37.50	200K
Opus 4.7	$10.00	$50.00	$5.00	$25.00	200K
Opus 4.6	$10.00	$50.00	$5.00	$25.00	200K
Sonnet 4.6	$6.00	$30.00	$3.00	$15.00	200K
Sonnet 4.5	$6.00	$30.00	$3.00	$15.00	200K
Haiku 4.5	$2.00	$10.00	$1.00	$5.00	200K

Anthropic official pricing ↗ RunAPI pricing ↗

Prompt caching

How do cache discounts reduce your Claude API bill?

Anthropic's prompt caching stores repeated prefixes and charges less when the cached version is reused. This matters for coding agents like Claude Code, which send the same system prompt and file context on every request.

Cache read

90% discount on input tokens. A Sonnet 4.6 cache read costs $0.30/M instead of $3.00/M through RunAPI.

Cache write (5 min TTL)

25% surcharge on input — $3.75/M for Sonnet 4.6 through RunAPI. The cached prefix stays available for 5 minutes.

Cache write (1 hour TTL)

2x the input rate — $6.00/M for Sonnet 4.6 through RunAPI. Useful for long coding sessions where context reuse is frequent.

Batch processing

Anthropic offers a 50% discount on all models for batch requests that tolerate up to 24-hour turnaround. RunAPI passes this discount through.

Real-world costs

What does the Claude API actually cost for real workloads?

Token costs look abstract until you attach them to real tasks. Below are five common developer workloads with estimated monthly costs at two usage levels.

Workload	Model	Light use (~50 tasks/day)	Heavy use (~200 tasks/day)	Monthly saving vs official
Vibe coding session (Claude Code)	Sonnet 4.6	$45/mo	$180/mo	$45–$180
PR code review agent	Opus 4.7	$75/mo	$300/mo	$75–$300
RAG-powered docs chatbot	Haiku 4.5	$12/mo	$48/mo	$12–$48
Content generation pipeline	Sonnet 4.6	$30/mo	$120/mo	$30–$120
Multi-agent orchestrator	Opus 4.7	$150/mo	$600/mo	$150–$600

Provider comparison

Is Claude API cheaper than OpenAI and Gemini?

Developers often compare Claude against GPT-5 and Gemini 2.5 Pro. Here is how the flagship models stack up on a per-million-token basis.

Provider	Flagship model	Input /M	Output /M	RunAPI rate
Anthropic	Claude Opus 4.7	$10.00	$50.00	$5.00 / $25.00
OpenAI	GPT-5.4	$2.50	$15.00	$1.25 / $7.50
Google	Gemini 2.5 Pro	$1.25	$10.00	$0.63 / $5.00

RunAPI applies a 50% discount on all providers listed above. Prices verified June 2026.

Subscription vs API

Is Claude API cheaper than a Claude Max subscription?

Claude Max costs $100/month for unlimited Claude Code usage (or $200 for the 5x plan). The API charges per token. For developers who run under 10 million output tokens per month on Sonnet 4.6, the RunAPI route costs less than the Max subscription — and there is no usage cap on any model tier.

Claude Max ($100/mo)

Unlimited usage of Sonnet and limited Opus in Claude Code. Fixed monthly cost. No API access.

Claude API via RunAPI

Pay per token with no monthly commitment. Sonnet 4.6 at $3/M input and $15/M output. $100 buys roughly 6.7 million output tokens — enough for most individual developers.

When Max wins

Heavy daily users who consistently exceed 10 million output tokens per month. The break-even point on Sonnet 4.6 through RunAPI is around 6.7M output tokens.

When API wins

Teams, CI pipelines, multi-model setups, and developers who want Opus or Haiku access alongside Sonnet. No cap, no waitlist, no subscription lock-in.

Getting started

How to access Claude API through RunAPI

Create a RunAPI account

Copy your API key

Go to Dashboard → API Keys. Create a key and save it — you will use this as your OpenAI API key.

Point your SDK to RunAPI

Set the base URL to https://api.runapi.ai/v1 and use your RunAPI API key. Any OpenAI-compatible client works — Python, Node.js, Go, Ruby, or curl.

Start making requests

Use claude-sonnet-4-6, claude-opus-4-7, or any Claude model ID in the model parameter. RunAPI handles routing and billing at 50% of the official rate.

Frequently asked questions

Claude API Pricing FAQ

Does RunAPI modify Claude's output?

No. RunAPI proxies requests directly to Anthropic's API. The model output, safety filters, and behavior are identical to calling Anthropic directly.

Why is RunAPI 50% cheaper than the official API?

RunAPI negotiates volume pricing with model providers and passes the savings to developers. There is no quality difference — same models, same API.

Can I use RunAPI with Claude Code?

Yes. Set ANTHROPIC_BASE_URL to https://api.runapi.ai and your RunAPI key as the API key. Claude Code works without modification.

What happens if Anthropic changes their pricing?

RunAPI adjusts within 24 hours. The 50% discount is maintained relative to Anthropic's published rates.

Is there a free tier?

Yes. New accounts get free credits to test any model. After that, billing is strictly pay-as-you-go with no minimum.

Does RunAPI support prompt caching?

Yes. Cache reads, 5-minute writes, and 1-hour writes are all supported at 50% of Anthropic's cache pricing.

Can I use the OpenAI SDK to call Claude through RunAPI?

Yes. RunAPI is OpenAI-compatible. Point any OpenAI client at api.runapi.ai/v1 and use Claude model IDs.

How does billing work?

Pay-as-you-go. You fund your account with a balance, and each API call deducts the token cost. No subscriptions, no invoices, no contracts.

Start using Claude at half price.

Create a free RunAPI account, get your API key, and start calling Claude Opus, Sonnet, or Haiku at 50% off official Anthropic pricing.

Get free API key Compare all model pricing