LLM API Pricing

OpenAI API Pricing in 2026

GPT-5.4 costs $2.50 input and $15 output per million tokens; GPT-5.5 costs $5 and $30. RunAPI mirrors every GPT model at half the official rate — same API, same output, 50% less on your invoice.

Updated June 18, 2026 RunAPI Editorial
At a glance

What does the OpenAI API cost right now?

OpenAI prices each GPT model per million tokens, with separate input and output rates and a cheaper cached-input rate. All figures below are per million tokens, the billing unit OpenAI uses.

Most popular

GPT-5.4 at $1.25/M input and $7.50/M output through RunAPI. Official rate is $2.50/$15.

Cheapest option

GPT-5.4-mini at a fraction of the flagship rate, billed at 50% off through RunAPI.

Maximum capability

GPT-5.5 at $2.50/M input and $15/M output through RunAPI. Official rate is $5/$30.

Cache discount

Cached input tokens cost a fraction of standard input — passed through at 50% on RunAPI.

Model-by-model breakdown

How much does each GPT model cost per million tokens?

The table shows official OpenAI pricing alongside RunAPI pricing. RunAPI applies a flat 50% discount across all GPT models. No volume commits, no subscriptions.

Model Official input /M Official output /M RunAPI input /M RunAPI output /M Context window
GPT-5.5 $5.00 $30.00 $2.50 $15.00 400K
GPT-5.4 $2.50 $15.00 $1.25 $7.50 400K
GPT-5.4-mini $0.25 $2.00 $0.13 $1.00 400K
GPT-5.3-codex $2.50 $15.00 $1.25 $7.50 400K
Cache and batch

How do cache and batch discounts cut your GPT bill?

OpenAI charges less for cached input tokens and offers a deep discount on batch requests that tolerate delayed turnaround. Both matter for repetitive workloads like coding agents and bulk processing.

Cached input

Repeated prompt prefixes are billed at a reduced input rate. RunAPI passes the discount through at 50% of OpenAI's cached rate.

Batch API (50% off)

Requests submitted to the Batch API run at half the standard rate with up to 24-hour turnaround. RunAPI passes this through on top of its own discount.

Reasoning effort

GPT-5 models let you set reasoning effort. Lower effort emits fewer reasoning tokens, directly reducing output cost on metered billing.

Output token control

Cap max output tokens per request to bound cost and avoid runaway generations on long agentic tasks.

Real-world costs

What does the GPT API cost for real workloads?

Token rates look abstract until attached to real tasks. Below are common developer workloads with estimated monthly costs at two usage levels, billed at RunAPI rates.

Workload Model Light use (~50 tasks/day) Heavy use (~200 tasks/day) Monthly saving vs official
Coding agent (Codex) GPT-5.3-codex $20/mo $80/mo $20–$80
Customer-support chatbot GPT-5.4-mini $6/mo $24/mo $6–$24
RAG knowledge assistant GPT-5.4 $18/mo $72/mo $18–$72
Content generation pipeline GPT-5.4 $25/mo $100/mo $25–$100
Multi-agent orchestrator GPT-5.5 $90/mo $360/mo $90–$360
Provider comparison

Is the OpenAI API cheaper than Claude and Gemini?

Developers weigh GPT against Claude and Gemini. Here is how the flagship models compare on a per-million-token basis, with RunAPI rates alongside.

Provider Flagship model Input /M Output /M RunAPI rate
OpenAI GPT-5.4 $2.50 $15.00 $1.25 / $7.50
Anthropic Claude Opus 4.7 $10.00 $50.00 $5.00 / $25.00
Google Gemini 2.5 Pro $1.25 $10.00 $0.63 / $5.00

RunAPI applies a 50% discount on all providers listed above. Prices verified June 2026.

Getting started

How to access the GPT API through RunAPI

1

Create a RunAPI account

Sign up at runapi.ai. No credit card required for the free tier.

2

Copy your API key

Go to Dashboard → API Keys. Create a key and save it — you will use this as your OpenAI API key.

3

Point your SDK to RunAPI

Set the base URL to https://api.runapi.ai/v1 and use your RunAPI key. Any OpenAI-compatible client works.

4

Start making requests

Use gpt-5.4, gpt-5.5, or any GPT model ID in the model parameter. RunAPI handles routing and billing at 50% of the official rate.

Frequently asked questions

OpenAI API Pricing FAQ

How much does the OpenAI GPT-5 API cost?

GPT-5.4 costs $2.50 per million input tokens and $15 per million output tokens officially. GPT-5.5 costs $5 and $30. Through RunAPI, every GPT model is billed at half those rates — GPT-5.4 runs $1.25 input and $7.50 output per million tokens.

Why is RunAPI 50% cheaper than OpenAI?

RunAPI negotiates volume pricing with model providers and passes the savings on to developers. Requests reach the same OpenAI models with identical output, safety filters, and behavior, so the only difference is the lower rate on your invoice. There is no quality trade-off and no separate billing tier — the discount applies automatically to every GPT model.

Does GPT-5 have cache pricing?

Yes. OpenAI bills repeated prompt prefixes at a reduced cached-input rate, which lowers cost for agents that resend the same context. RunAPI passes the cache discount through at 50% of OpenAI's cached rate, so caching savings stack with the base discount.

How does the OpenAI Batch API discount work?

The Batch API runs requests at 50% of the standard rate in exchange for up to 24-hour turnaround. It suits bulk jobs that do not need instant responses. RunAPI passes this discount through, so batch work is billed at half of the already-discounted rate.

Is GPT cheaper than Claude or Gemini?

On flagship input tokens, GPT-5.4 at $2.50 sits between Gemini 2.5 Pro at $1.25 and Claude Opus at $10. The cheapest choice depends on the model tier and workload. RunAPI halves the rate for all three, so the relative ranking stays the same.

Can I use the OpenAI SDK with RunAPI?

Yes. RunAPI is OpenAI-compatible. Point any OpenAI client at https://api.runapi.ai/v1, use your RunAPI key, and pass a GPT model ID. Existing code that already uses the OpenAI SDK works without any changes beyond the base URL and key, so migrating an established project takes about a minute.

Does RunAPI support GPT-5.3-codex for coding?

Yes. GPT-5.3-codex is available through RunAPI at 50% of the official rate, which is $1.25 input and $7.50 output per million tokens. It works with Codex and other OpenAI-compatible coding tools by overriding the base URL and key in their settings. Cached input and batch discounts also pass through, lowering the effective cost of repetitive coding sessions further.

Is there a free tier?

Yes. New RunAPI accounts receive free credits to test any GPT model before committing. After that, billing is strictly pay-as-you-go with no minimum spend, no subscription, and no monthly commitment — you fund a balance and each call deducts its token cost. You can top up any amount and watch usage per model in the dashboard.

Run GPT-5 at half price.

Create a free RunAPI account, get your API key, and call any OpenAI GPT model at 50% off official pricing.