Cache read
90% discount on input tokens. A Sonnet 4.6 cache read costs $0.30/M instead of $3.00/M through RunAPI.
Anthropic charges $3–$15 per million tokens depending on model. RunAPI mirrors every Claude model at half the official rate — same API, same output, 50% less on your invoice.
Anthropic publishes per-token prices for three model tiers: Haiku for lightweight tasks, Sonnet for balanced workloads, and Opus for maximum capability. All prices below are per million tokens, the billing unit Anthropic uses.
Haiku 4.5 at $1/M input and $5/M output through RunAPI. Official rate is $2/$10.
Sonnet 4.6 at $3/M input and $15/M output through RunAPI. Official rate is $6/$30.
Opus 4.7 at $5/M input and $25/M output through RunAPI. Official rate is $10/$50.
Cache reads cost 90% less than standard input tokens — $0.10/M for Haiku through RunAPI.
The table below shows official Anthropic pricing alongside RunAPI pricing. RunAPI applies a flat 50% discount across all Claude models. No volume commits, no subscriptions.
| Model | Official input /M | Official output /M | RunAPI input /M | RunAPI output /M | Context window |
|---|---|---|---|---|---|
| Opus 4.8 | $15.00 | $75.00 | $7.50 | $37.50 | 200K |
| Opus 4.7 | $10.00 | $50.00 | $5.00 | $25.00 | 200K |
| Opus 4.6 | $10.00 | $50.00 | $5.00 | $25.00 | 200K |
| Sonnet 4.6 | $6.00 | $30.00 | $3.00 | $15.00 | 200K |
| Sonnet 4.5 | $6.00 | $30.00 | $3.00 | $15.00 | 200K |
| Haiku 4.5 | $2.00 | $10.00 | $1.00 | $5.00 | 200K |
Anthropic's prompt caching stores repeated prefixes and charges less when the cached version is reused. This matters for coding agents like Claude Code, which send the same system prompt and file context on every request.
90% discount on input tokens. A Sonnet 4.6 cache read costs $0.30/M instead of $3.00/M through RunAPI.
25% surcharge on input — $3.75/M for Sonnet 4.6 through RunAPI. The cached prefix stays available for 5 minutes.
2x the input rate — $6.00/M for Sonnet 4.6 through RunAPI. Useful for long coding sessions where context reuse is frequent.
Anthropic offers a 50% discount on all models for batch requests that tolerate up to 24-hour turnaround. RunAPI passes this discount through.
Token costs look abstract until you attach them to real tasks. Below are five common developer workloads with estimated monthly costs at two usage levels.
| Workload | Model | Light use (~50 tasks/day) | Heavy use (~200 tasks/day) | Monthly saving vs official |
|---|---|---|---|---|
| Vibe coding session (Claude Code) | Sonnet 4.6 | $45/mo | $180/mo | $45–$180 |
| PR code review agent | Opus 4.7 | $75/mo | $300/mo | $75–$300 |
| RAG-powered docs chatbot | Haiku 4.5 | $12/mo | $48/mo | $12–$48 |
| Content generation pipeline | Sonnet 4.6 | $30/mo | $120/mo | $30–$120 |
| Multi-agent orchestrator | Opus 4.7 | $150/mo | $600/mo | $150–$600 |
Developers often compare Claude against GPT-5 and Gemini 2.5 Pro. Here is how the flagship models stack up on a per-million-token basis.
| Provider | Flagship model | Input /M | Output /M | RunAPI rate |
|---|---|---|---|---|
| Anthropic | Claude Opus 4.7 | $10.00 | $50.00 | $5.00 / $25.00 |
| OpenAI | GPT-5.4 | $2.50 | $15.00 | $1.25 / $7.50 |
| Gemini 2.5 Pro | $1.25 | $10.00 | $0.63 / $5.00 |
RunAPI applies a 50% discount on all providers listed above. Prices verified June 2026.
Claude Max costs $100/month for unlimited Claude Code usage (or $200 for the 5x plan). The API charges per token. For developers who run under 10 million output tokens per month on Sonnet 4.6, the RunAPI route costs less than the Max subscription — and there is no usage cap on any model tier.
Unlimited usage of Sonnet and limited Opus in Claude Code. Fixed monthly cost. No API access.
Pay per token with no monthly commitment. Sonnet 4.6 at $3/M input and $15/M output. $100 buys roughly 6.7 million output tokens — enough for most individual developers.
Heavy daily users who consistently exceed 10 million output tokens per month. The break-even point on Sonnet 4.6 through RunAPI is around 6.7M output tokens.
Teams, CI pipelines, multi-model setups, and developers who want Opus or Haiku access alongside Sonnet. No cap, no waitlist, no subscription lock-in.
Sign up at runapi.ai. No credit card required for the free tier.
Go to Dashboard → API Keys. Create a key and save it — you will use this as your OpenAI API key.
Set the base URL to https://api.runapi.ai/v1 and use your RunAPI API key. Any OpenAI-compatible client works — Python, Node.js, Go, Ruby, or curl.
Use claude-sonnet-4-6, claude-opus-4-7, or any Claude model ID in the model parameter. RunAPI handles routing and billing at 50% of the official rate.
No. RunAPI proxies requests directly to Anthropic's API. The model output, safety filters, and behavior are identical to calling Anthropic directly.
RunAPI negotiates volume pricing with model providers and passes the savings to developers. There is no quality difference — same models, same API.
Yes. Set ANTHROPIC_BASE_URL to https://api.runapi.ai and your RunAPI key as the API key. Claude Code works without modification.
RunAPI adjusts within 24 hours. The 50% discount is maintained relative to Anthropic's published rates.
Yes. New accounts get free credits to test any model. After that, billing is strictly pay-as-you-go with no minimum.
Yes. Cache reads, 5-minute writes, and 1-hour writes are all supported at 50% of Anthropic's cache pricing.
Yes. RunAPI is OpenAI-compatible. Point any OpenAI client at api.runapi.ai/v1 and use Claude model IDs.
Pay-as-you-go. You fund your account with a balance, and each API call deducts the token cost. No subscriptions, no invoices, no contracts.
Create a free RunAPI account, get your API key, and start calling Claude Opus, Sonnet, or Haiku at 50% off official Anthropic pricing.