Z.ai
Z.ai's GLM — MIT-licensed MoE LLMs from 128K to 200K context, top open-weight SWE-bench scores, via one RunAPI key.
Z.ai builds the GLM family of MIT-licensed Mixture-of-Experts language models for coding and agentic workflows. The line spans GLM-4.5 (355B / 32B active, 128K context) through GLM-5.1 (754B / 40B active, 200K context), which holds the top open-weight SWE-bench Pro score at 58.4%. All are available through RunAPI from the OpenAI and Anthropic SDKs with per-token billing.
- Single API key shared across providers
- Model skills carry docs and schemas into your workspace
- Per-call billing, no commitment
- Failed generations are not charged
What stands out
All models from Z.ai
Install a Z.ai model skill.
Pick a model and add its skill so your coding tool has docs, schemas, pricing notes, and setup steps.
# Base URL
https://runapi.ai
# Endpoints
POST /v1/chat/completions
curl https://runapi.ai/v1/chat/completions \
-H "Authorization: Bearer $RUNAPI_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "glm-5.1",
"messages": [
{
"role": "user",
"content": "Read this multi-file repository, find the failing integration test, and propose a patch with an explanation of the root cause."
}
]
}'
from openai import OpenAI
client = OpenAI(
base_url="https://runapi.ai/v1",
api_key="your-runapi-key"
)
response = client.chat.completions.create(
model="glm-5.1",
messages=[{"role": "user", "content": "Read this multi-file repository, find the failing integration test, and propose a patch with an explanation of the root cause."}]
)
import OpenAI from "openai";
const client = new OpenAI({
baseURL: "https://runapi.ai/v1",
apiKey: "your-runapi-key"
});
const response = await client.chat.completions.create({
model: "glm-5.1",
messages: [{ role: "user", content: "Read this multi-file repository, find the failing integration test, and propose a patch with an explanation of the root cause." }]
});
Every variant from Z.ai
Frequently asked questions about Z.ai
Is this an official Z.ai integration?
RunAPI exposes a managed API surface with transparent pricing, capability, and error behavior.
Do I need a Z.ai account?
No — your RunAPI key is enough for managed access.
What's the latency overhead from proxying?
Typically under 20 ms. RunAPI keeps the proxy layer close to model execution regions.
Are images / videos cached?
Generated outputs are stored and retrievable by task ID. Inputs are not cached.
Can I bring my own key?
Not currently — calls use RunAPI-managed access.