GLM glm-4.5 API
Same API key, same model skill workflow — switch variants by changing one model ID.
# Base URL
https://runapi.ai
# Endpoints
POST /v1/chat/completions
curl https://runapi.ai/v1/chat/completions \
  -H "Authorization: Bearer $RUNAPI_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "glm-4.5",
    "messages": [
      {
        "role": "user",
        "content": "Read this multi-file repository, find the failing integration test, and propose a patch with an explanation of the root cause."
      }
    ]
  }'
# Python (OpenAI SDK)
from openai import OpenAI

# Point the OpenAI client at RunAPI instead of the default OpenAI endpoint.
client = OpenAI(
    base_url="https://runapi.ai/v1",
    api_key="your-runapi-key",
)

response = client.chat.completions.create(
    model="glm-4.5",
    messages=[{"role": "user", "content": "Read this multi-file repository, find the failing integration test, and propose a patch with an explanation of the root cause."}],
)
// JavaScript (OpenAI SDK)
import OpenAI from "openai";

// Point the OpenAI client at RunAPI instead of the default OpenAI endpoint.
const client = new OpenAI({
  baseURL: "https://runapi.ai/v1",
  apiKey: "your-runapi-key",
});

const response = await client.chat.completions.create({
  model: "glm-4.5",
  messages: [{ role: "user", content: "Read this multi-file repository, find the failing integration test, and propose a patch with an explanation of the root cause." }],
});
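The three snippets above send the same request. As a rough sketch of what goes over the wire, the following standard-library Python builds that request by hand and reads the reply text. It assumes the response follows the OpenAI chat completions shape (`choices[0].message.content`), which this page implies by calling the endpoint OpenAI compatible but does not spell out. `build_request`, `extract_text`, and `ask` are illustrative helper names, not part of any RunAPI SDK.

```python
import json
import os
import urllib.request

BASE_URL = "https://runapi.ai/v1"  # base URL plus /v1, as in the SDK examples


def build_request(prompt, model="glm-4.5", api_key=None):
    """Build (but do not send) a POST to /v1/chat/completions."""
    api_key = api_key or os.environ.get("RUNAPI_KEY", "your-runapi-key")
    payload = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    return urllib.request.Request(
        BASE_URL + "/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": "Bearer " + api_key,
            "Content-Type": "application/json",
        },
        method="POST",
    )


def extract_text(response_body):
    """Return the assistant text, assuming an OpenAI-style response body."""
    return response_body["choices"][0]["message"]["content"]


def ask(prompt):
    """Send the request and return the reply text (performs a network call)."""
    with urllib.request.urlopen(build_request(prompt)) as resp:
        return extract_text(json.load(resp))
```

Keeping request construction separate from sending makes the payload easy to inspect or log before any call is billed.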
glm-4.5 balances output quality and cost within the GLM family.
- Pay-per-call pricing in USD
- Failed generations not charged
- Streaming when supported by the model
- Schema-validated tool calls
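As one illustration of the tool-call bullet above, the sketch below attaches a tool definition to a chat request and checks the arguments of a returned tool call before acting on them. It assumes RunAPI accepts the OpenAI-style `tools` field and returns tool calls with JSON-encoded `arguments`; this page does not document either. The `get_ticket_status` tool and both helpers are hypothetical, and the local check is deliberately minimal (required keys only), not the validation RunAPI itself performs.

```python
import json

# Hypothetical tool, described in the OpenAI "tools" format (an assumption).
TICKET_TOOL = {
    "type": "function",
    "function": {
        "name": "get_ticket_status",
        "description": "Look up the status of a support ticket.",
        "parameters": {
            "type": "object",
            "properties": {"ticket_id": {"type": "string"}},
            "required": ["ticket_id"],
        },
    },
}


def build_tool_request(prompt, model="glm-4.5"):
    """Return a chat completions payload that offers one tool to the model."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "tools": [TICKET_TOOL],
    }


def parse_tool_call(tool_call, tool=TICKET_TOOL):
    """Decode a tool call's arguments and check that required keys are present."""
    schema = tool["function"]["parameters"]
    args = json.loads(tool_call["function"]["arguments"])
    missing = [key for key in schema["required"] if key not in args]
    if missing:
        raise ValueError("missing required arguments: " + ", ".join(missing))
    return args
```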
Technical details
| Field | Value |
|---|---|
| Model ID | glm-4.5 |
| Provider | Z.ai |
| Modality | text |
| Task type | synchronous |
| Billing unit | 1K tokens |
| API endpoint | /v1/chat/completions |
| Commercial license | Yes, included via API |
| Status | Operational |
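Because the billing unit is 1K tokens, a per-call cost estimate is the token count divided by 1,000 and multiplied by the per-1K rate. The sketch below shows that arithmetic. The rates are placeholders, not RunAPI prices. The split into separate input and output rates, and the assumption that usage is prorated rather than rounded up to a whole 1K, are both guesses.

```python
from decimal import Decimal


def estimate_cost_usd(input_tokens, output_tokens, input_rate_per_1k, output_rate_per_1k):
    """Estimate the USD cost of one call billed per 1K tokens."""
    input_cost = Decimal(input_tokens) / 1000 * Decimal(input_rate_per_1k)
    output_cost = Decimal(output_tokens) / 1000 * Decimal(output_rate_per_1k)
    return input_cost + output_cost


# Placeholder rates for illustration only; check the pricing page for real values.
cost = estimate_cost_usd(12_000, 800, "0.001", "0.002")
print(cost)
```

Passing rates as strings keeps the `Decimal` arithmetic exact, which avoids floating-point drift when many small per-call costs are summed.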
Model skill — glm-4.5
Install the skill once, then use the variant ID from this page while building.
| Endpoint | Protocol |
|---|---|
| /v1/chat/completions | OpenAI compatible |
Use glm-4.5 with a model skill
Install
Install the model skill for this model line.
Configure
Set the model field to the full model ID shown on this page.
Call
Use the skill instructions while adding prompt, input, and callback handling to your app.
Receive
Read the task response, webhook callback, or cached output URL from RunAPI.
What's different about glm-4.5
glm-4.5: 355B total / 32B active parameters; 128K context; flagship open-weight MoE baseline
Compared with other variants in the family:
- Lighter GLM-4.5 tier for fast, lower-cost everyday work
- 200K context; first GLM on Cambricon chips; sharper code generation
- 200K context; 73.8% SWE-bench; persistent thinking across turns
Best for
Customer support
Answer customer questions from a private knowledge base, reducing ticket volume.
Document analysis
Draft contract summaries and flag key clauses for attorney review.
Code generation
Auto-generate unit tests, code reviews, and refactoring suggestions in CI.
Frequently asked questions about glm-4.5
Is the model ID stable across versions?
RunAPI keeps the model ID stable and handles compatible version refreshes without changing your request shape.
What's the rate limit on this variant?
Per-key rate limits scale with usage tier. See the pricing page for current limits.
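Whatever the tier, clients typically back off and retry when a limit is hit. A minimal sketch of exponential backoff with jitter follows, assuming rate-limited requests fail with HTTP 429 (the usual convention, not confirmed on this page). `RateLimited`, `backoff_delays`, and `call_with_retry` are illustrative names, and `send` stands for whatever function your app uses to make the request.

```python
import random
import time


class RateLimited(Exception):
    """Raised by the caller's send function on an HTTP 429 (assumed status code)."""


def backoff_delays(retries, base=1.0, cap=30.0):
    """Return the maximum wait in seconds before each retry: base * 2**n, capped."""
    return [min(cap, base * 2 ** attempt) for attempt in range(retries)]


def call_with_retry(send, retries=4, sleep=time.sleep):
    """Call send(); on RateLimited, wait a jittered delay and try again."""
    for delay in backoff_delays(retries):
        try:
            return send()
        except RateLimited:
            sleep(random.uniform(0, delay))  # full jitter spreads out retries
    return send()  # final attempt; any error propagates to the caller
```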
Can I switch variants later?
Yes. The variant is selected by the model parameter, so switching only requires changing that value.
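In practice this usually means keeping the model ID in configuration rather than in code. A minimal sketch, assuming the ID is read from a hypothetical `RUNAPI_MODEL` environment variable; the second ID in the usage lines is a placeholder, not a confirmed RunAPI variant name.

```python
import os

DEFAULT_MODEL = "glm-4.5"


def chat_payload(prompt, model=None):
    """Build a chat payload; only the model field differs between variants."""
    model = model or os.environ.get("RUNAPI_MODEL", DEFAULT_MODEL)
    return {"model": model, "messages": [{"role": "user", "content": prompt}]}


a = chat_payload("Summarize this contract.", model="glm-4.5")
b = chat_payload("Summarize this contract.", model="another-variant-id")  # placeholder ID
# Collect the keys whose values differ between the two payloads.
changed = {key for key in a if a[key] != b[key]}
print(changed)
```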
Does it stream?
Where streaming is available, RunAPI streams end-to-end.
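For reference, a rough sketch of consuming a streamed reply. It assumes RunAPI follows the OpenAI streaming convention: `"stream": true` in the request, server-sent `data:` lines carrying JSON chunks with `choices[0].delta.content`, and a final `data: [DONE]` line. This page does not confirm those details. `iter_stream_text` is an illustrative helper, shown here against canned lines rather than a live response.

```python
import json


def iter_stream_text(lines):
    """Yield text fragments from OpenAI-style server-sent event lines."""
    for raw in lines:
        line = raw.decode("utf-8") if isinstance(raw, bytes) else raw
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and other SSE fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            text = (choice.get("delta") or {}).get("content")
            if text:
                yield text


# Example with canned lines instead of a live HTTP response.
sample = [
    b'data: {"choices": [{"delta": {"role": "assistant"}}]}\n',
    b"\n",
    b'data: {"choices": [{"delta": {"content": "Hel"}}]}\n',
    b'data: {"choices": [{"delta": {"content": "lo"}}]}\n',
    b"data: [DONE]\n",
]
reply = "".join(iter_stream_text(sample))
```

With the OpenAI SDK clients shown earlier, passing `stream=True` and iterating over the returned chunks does the equivalent parsing for you.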
Where do I report quality issues?
Open an issue on the public GitHub repo or email support.