GPT API
GPT API access for OpenAI's flagship LLM across chat, code generation, and multi-step reasoning tasks.
# Base URL
https://runapi.ai
# Endpoints
POST /v1/responses
POST /v1/chat/completions

    # cURL
    curl https://runapi.ai/v1/chat/completions \
      -H "Authorization: Bearer $RUNAPI_KEY" \
      -H "Content-Type: application/json" \
      -d '{
        "model": "gpt-5.5",
        "messages": [
          {
            "role": "user",
            "content": "Analyze this quarterly revenue data and produce a summary with key trends, anomalies, and three recommendations."
          }
        ]
      }'

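
    # Python, using the OpenAI SDK pointed at RunAPI
    import os

    from openai import OpenAI

    client = OpenAI(
        base_url="https://runapi.ai/v1",
        api_key=os.environ["RUNAPI_KEY"],  # same variable as the curl example
    )

    response = client.chat.completions.create(
        model="gpt-5.5",
        messages=[{"role": "user", "content": "Analyze this quarterly revenue data and produce a summary with key trends, anomalies, and three recommendations."}],
    )

    # Print the assistant's reply text
    print(response.choices[0].message.content)
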
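
    // JavaScript (Node.js, ES module), using the OpenAI SDK pointed at RunAPI
    import OpenAI from "openai";

    const client = new OpenAI({
      baseURL: "https://runapi.ai/v1",
      apiKey: process.env.RUNAPI_KEY, // same variable as the curl example
    });

    const response = await client.chat.completions.create({
      model: "gpt-5.5",
      messages: [{ role: "user", content: "Analyze this quarterly revenue data and produce a summary with key trends, anomalies, and three recommendations." }],
    });

    // Print the assistant's reply text
    console.log(response.choices[0].message.content);
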
GPT is OpenAI's flagship large language model series, with the GPT-5 family delivering state-of-the-art performance in chat, code generation, and multi-step reasoning. It is available in multiple capability tiers, from nano to pro.
- Multiple variants for different speed, quality, and cost tiers
- Model skill includes docs, schemas, pricing, and setup notes
- Works with Claude Code, Codex, Gemini CLI, Cursor, and VS Code
- Single API key and unified billing across all variants
- Async task management with polling and webhook callbacks
- Failed generations are not charged
Compare all API variants
| Variant | Billing unit | Starting price (USD) |
|---|---|---|
| codex-auto-review | 1K tokens | $0.150 |
| gpt-5.2 | 1K tokens | $0.070 |
| gpt-5.2-pro | 1K tokens | $0.840 |
| gpt-5.3-codex | 1K tokens | $0.070 |
| gpt-5.3-codex-spark | 1K tokens | $0.070 |
| gpt-5.4 | 1K tokens | $0.080 |
| gpt-5.4-mini | 1K tokens | $0.030 |
| gpt-5.4-nano | 1K tokens | $0.010 |
| gpt-5.4-pro | 1K tokens | $0.900 |
| gpt-5.5 | 1K tokens | $0.150 |
| gpt-5.5-pro | 1K tokens | $0.900 |
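
As a rough illustration of how the per-1K-token prices in the table translate into spend, the sketch below multiplies a token count by a variant's listed starting price. The `estimate_cost` helper and the `STARTING_PRICE_PER_1K` mapping are hypothetical names introduced here, not part of any RunAPI SDK. The prices are copied from the table, and because they are "from" prices, the real charge may be higher depending on the rates shown on each variant page.

```python
from decimal import Decimal

# Starting prices in USD per 1K tokens, copied from the variant table.
# These are "from" prices, so treat any result as a starting-point estimate.
STARTING_PRICE_PER_1K = {
    "gpt-5.4-nano": Decimal("0.010"),
    "gpt-5.4-mini": Decimal("0.030"),
    "gpt-5.4": Decimal("0.080"),
    "gpt-5.5": Decimal("0.150"),
    "gpt-5.5-pro": Decimal("0.900"),
}


def estimate_cost(variant: str, tokens: int) -> Decimal:
    """Return a rough USD estimate for `tokens` tokens on `variant` (hypothetical helper)."""
    if tokens < 0:
        raise ValueError("tokens must be non-negative")
    return STARTING_PRICE_PER_1K[variant] * Decimal(tokens) / Decimal(1000)


if __name__ == "__main__":
    for name in ("gpt-5.4-mini", "gpt-5.5"):
        print(f"{name}: ${estimate_cost(name, 12_000):.2f} for 12,000 tokens")
```

`Decimal` is used instead of `float` so that small per-token prices do not pick up binary rounding noise when summed across many calls.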
GPT API endpoints
Use the OpenAI or Anthropic SDK with your RunAPI key. No extra SDK required.
| Endpoint | Protocol |
|---|---|
| /v1/responses | OpenAI Responses |
| /v1/chat/completions | OpenAI compatible |
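
The examples earlier on this page only exercise `/v1/chat/completions`. The sketch below builds a request for `/v1/responses` using only the Python standard library. It assumes RunAPI accepts the OpenAI Responses request shape (a `model` plus an `input` field), as the "OpenAI Responses" label suggests. `build_responses_request` is a hypothetical helper, and `RUNAPI_KEY` is the environment variable used in the curl example.

```python
import json
import os
import urllib.request

BASE_URL = "https://runapi.ai"  # base URL listed at the top of this page


def build_responses_request(prompt: str, model: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a POST request for the /v1/responses endpoint."""
    # Assumption: the body follows the OpenAI Responses shape (`model` + `input`).
    body = json.dumps({"model": model, "input": prompt}).encode("utf-8")
    return urllib.request.Request(
        url=f"{BASE_URL}/v1/responses",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    key = os.environ.get("RUNAPI_KEY")
    if key:  # only touch the network when a key is configured
        req = build_responses_request("Summarize the key trends in one paragraph.", "gpt-5.5", key)
        with urllib.request.urlopen(req, timeout=60) as resp:
            print(json.dumps(json.load(resp), indent=2))
```

Separating request construction from sending keeps the payload easy to inspect or unit-test before any tokens are billed.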
From model skill to first result in four steps
Choose a model
Browse the model catalog and pick the model and variant that match your output type, quality bar, and latency target. Each variant page shows its model ID, pricing, and parameter constraints so you can compare before committing.
Configure
Set your RunAPI API key as an environment variable and install the model skill in your coding workspace. The skill loads docs, typed schemas, pricing notes, and setup steps so your agent has the right context from the start.
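
A minimal sketch of the environment-variable half of this step follows. It assumes the variable is named `RUNAPI_KEY`, as in the curl example, and `load_api_key` is a hypothetical helper. Installing the model skill is not shown because this page does not give the command for it.

```python
import os


def load_api_key(env_var: str = "RUNAPI_KEY") -> str:
    """Read the RunAPI key from the environment, failing early if it is unset."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set {env_var} before calling the API, e.g. `export {env_var}=...`")
    return key


if __name__ == "__main__":
    try:
        print(f"Loaded a key of length {len(load_api_key())}")
    except RuntimeError as exc:
        print(exc)
```

Failing at startup with a clear message tends to be easier to debug than an authentication error returned by the first API call.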
Call
Use the skill instructions to add the model feature inside your application. Send a POST request with your prompt, model ID, and parameters. RunAPI routes the request, manages the async lifecycle, and returns structured JSON.
Receive
Poll by task ID for completion, stream results end-to-end when supported, or configure a webhook callback URL to receive results automatically. The CLI provides a built-in wait command, and the SDKs offer both polling and callback patterns.
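
The polling pattern described here can be sketched as a small loop. This page does not document the task-status endpoint or its status values, so the status lookup is passed in as a callable, and the `"completed"` and `"failed"` values are placeholders assumed for illustration. `wait_for_task` is a hypothetical helper, not the CLI's built-in wait command.

```python
import time
from typing import Callable


def wait_for_task(
    fetch_status: Callable[[str], dict],
    task_id: str,
    interval_s: float = 2.0,
    timeout_s: float = 120.0,
) -> dict:
    """Poll `fetch_status(task_id)` until the task finishes or the timeout elapses."""
    deadline = time.monotonic() + timeout_s
    while True:
        task = fetch_status(task_id)
        # Assumption: the task payload carries a `status` field with these values.
        if task.get("status") in ("completed", "failed"):
            return task
        if time.monotonic() >= deadline:
            raise TimeoutError(f"task {task_id} did not finish within {timeout_s}s")
        time.sleep(interval_s)


if __name__ == "__main__":
    # Stand-in for a real status call: reports "running" twice, then "completed".
    replies = iter([
        {"status": "running"},
        {"status": "running"},
        {"status": "completed", "output": "done"},
    ])
    print(wait_for_task(lambda _task_id: next(replies), "task-123", interval_s=0.01))
```

A webhook callback avoids the loop entirely, at the cost of exposing an HTTP endpoint that RunAPI can reach.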
What is the GPT API?
GPT is OpenAI's frontier LLM family, spanning nano, mini, standard, pro, and Codex variants for different cost-performance trade-offs. Through RunAPI, all GPT models share the same API shape and billing.
Why route the GPT API through RunAPI
One auth, every provider
A single RunAPI API key unlocks the whole model catalog across all providers. No separate accounts to create, no API keys to rotate per integration, and no credential management overhead. Add a new model to your app by changing one parameter.
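
To illustrate the "changing one parameter" claim, the sketch below builds chat-completions request bodies for two variants from the pricing table. `chat_payload` is a hypothetical helper. The body shape is the one used in the examples at the top of this page.

```python
import json


def chat_payload(model: str, prompt: str) -> dict:
    """Build a chat-completions request body; only `model` varies between variants."""
    return {"model": model, "messages": [{"role": "user", "content": prompt}]}


if __name__ == "__main__":
    prompt = "Summarize this changelog in three bullet points."
    for model in ("gpt-5.4-mini", "gpt-5.5"):
        print(json.dumps(chat_payload(model, prompt)))
```

Everything except the `model` string is identical between the two printed bodies, so switching variants is a configuration change, not a code change.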
Unified pricing & billing
Per-call pricing in USD, billed monthly into a single invoice. No subscription tiers, no minimum spend, and failed generations are never charged. The pricing page and check_pricing API show exact costs before you commit to a model.
Schema-first SDK
Typed schemas, parameter constraints, and setup notes are packaged in the model skill so your implementation starts from the right contract. The skill loads into Claude Code, Codex, Gemini CLI, Cursor, and VS Code, so your agent knows the correct request shape before you write a line of code.
Common questions
Which variant should I start with?
Pick the cheapest variant that meets your quality bar. Most teams start on the fast variant and graduate to pro for production.
Is there a free tier?
New accounts get their first calls on every model free. After that, you pay per call.
Do you stream results?
Where streaming is available, RunAPI streams end-to-end.
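
This answer does not describe the wire format. Assuming RunAPI mirrors OpenAI's chat-completions streaming (requested with `"stream": true`, delivered as server-sent `data:` lines, and terminated by `data: [DONE]`), the incremental text could be reassembled roughly as follows. `collect_stream_text` is a hypothetical helper, and the sample lines are made up for illustration.

```python
import json
from typing import Iterable


def collect_stream_text(lines: Iterable[str]) -> str:
    """Join the text deltas from OpenAI-style `data:` server-sent event lines."""
    parts = []
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and other SSE fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # end-of-stream sentinel in the OpenAI format
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            text = (choice.get("delta") or {}).get("content")
            if text:
                parts.append(text)
    return "".join(parts)


if __name__ == "__main__":
    sample = [
        'data: {"choices": [{"delta": {"role": "assistant"}}]}',
        "",
        'data: {"choices": [{"delta": {"content": "Hel"}}]}',
        'data: {"choices": [{"delta": {"content": "lo"}}]}',
        "data: [DONE]",
    ]
    print(collect_stream_text(sample))  # prints "Hello"
```

In practice the OpenAI SDKs handle this parsing when `stream=True` is passed, so a hand-written parser is mainly useful for raw HTTP clients.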
How are failures billed?
Failed generations are not charged.
Are outputs cached?
Generated outputs are stored and retrievable by task ID. Inputs are not cached.
Can I use it commercially?
Yes. Commercial use is included for every variant unless a model license explicitly restricts it; any such restriction is called out on the variant page.
What about rate limits?
Per-key rate limits scale with your usage tier. See the pricing page for current limits.
Where can I report issues?
Open an issue on the public GitHub repo or email support.