PROVIDER

Z.ai

Z.ai's GLM: MIT-licensed MoE LLMs with 128K to 200K context windows and top open-weight SWE-bench scores, available through one RunAPI key.

1 model · 7 variants · from $0.010 per 1K tokens
OVERVIEW

Z.ai builds the GLM family of MIT-licensed Mixture-of-Experts (MoE) language models for coding and agentic workflows. The line spans GLM-4.5 (355B total parameters, 32B active, 128K context) through GLM-5.1 (754B total parameters, 40B active, 200K context). GLM-5.1 holds the top open-weight SWE-bench Pro score at 58.4%. All variants are available through RunAPI via the OpenAI and Anthropic SDKs, with per-token billing.

  • Single API key shared across providers
  • Model skills carry docs and schemas into your workspace
  • Pay-as-you-go billing, no commitment
  • Failed generations are not charged
FEATURES

What stands out

MODELS

All models from Z.ai

QUICKSTART

Install a Z.ai model skill.

Pick a model and add its skill so your coding tool has docs, schemas, pricing notes, and setup steps.

# Base URL
https://runapi.ai

# Endpoints
POST /v1/chat/completions

# cURL
curl https://runapi.ai/v1/chat/completions \
  -H "Authorization: Bearer $RUNAPI_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "glm-5.1",
    "messages": [
      {
        "role": "user",
        "content": "Read this multi-file repository, find the failing integration test, and propose a patch with an explanation of the root cause."
      }
    ]
  }'
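
# Python (OpenAI SDK)
import os

from openai import OpenAI

# Point the OpenAI client at RunAPI instead of the default OpenAI endpoint.
client = OpenAI(
    base_url="https://runapi.ai/v1",
    api_key=os.environ["RUNAPI_KEY"],  # same variable the cURL example reads
)

response = client.chat.completions.create(
    model="glm-5.1",
    messages=[{"role": "user", "content": "Read this multi-file repository, find the failing integration test, and propose a patch with an explanation of the root cause."}],
)

# Print the assistant's reply.
print(response.choices[0].message.content)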
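
# JavaScript (OpenAI SDK)
import OpenAI from "openai";

// Point the OpenAI client at RunAPI instead of the default OpenAI endpoint.
const client = new OpenAI({
  baseURL: "https://runapi.ai/v1",
  apiKey: process.env.RUNAPI_KEY, // same variable the cURL example reads
});

const response = await client.chat.completions.create({
  model: "glm-5.1",
  messages: [{ role: "user", content: "Read this multi-file repository, find the failing integration test, and propose a patch with an explanation of the root cause." }],
});

// Print the assistant's reply.
console.log(response.choices[0].message.content);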
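
When calling POST /v1/chat/completions directly (for example with cURL), the reply arrives as raw JSON. The following is a minimal sketch of pulling out the useful fields. It assumes RunAPI returns the standard OpenAI chat-completions response shape, which this page does not confirm. The sample payload is invented for illustration, and `summarize_completion` is a hypothetical helper, not part of any SDK.

```python
import json

# Invented sample in the OpenAI chat-completions shape (an assumption);
# a real response may carry additional fields.
SAMPLE_RESPONSE = json.dumps({
    "model": "glm-5.1",
    "choices": [
        {
            "index": 0,
            "message": {"role": "assistant", "content": "The test fails because ..."},
            "finish_reason": "stop",
        }
    ],
    "usage": {"prompt_tokens": 42, "completion_tokens": 8, "total_tokens": 50},
})


def summarize_completion(raw_json: str) -> dict:
    """Extract reply text, finish reason, and token usage from a raw response body."""
    body = json.loads(raw_json)
    choices = body.get("choices") or []
    if not choices:
        raise ValueError("response contains no choices")
    first = choices[0]
    usage = body.get("usage") or {}
    return {
        "text": first["message"]["content"],
        "finish_reason": first.get("finish_reason"),
        "total_tokens": usage.get("total_tokens"),
    }


summary = summarize_completion(SAMPLE_RESPONSE)
print(summary["text"])
print(summary["total_tokens"])
```

Raising on an empty `choices` list keeps a malformed or failed reply from being silently treated as an empty answer.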
REFERENCE

Every variant from Z.ai

Full pricing table →
Model  Variant      Billing unit  From
GLM    glm-4.5      1K tokens     $0.020
GLM    glm-4.5-air  1K tokens     $0.010
GLM    glm-4.6      1K tokens     $0.020
GLM    glm-4.7      1K tokens     $0.020
GLM    glm-5        1K tokens     $0.020
GLM    glm-5-turbo  1K tokens     $0.020
GLM    glm-5.1      1K tokens     $0.030
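
The "From" prices in the table above can be turned into a rough lower-bound cost estimate. This is a sketch under a loud assumption: it applies the listed "From" rate to every token, whereas actual billing may price input and output tokens differently. `estimate_min_cost` is a hypothetical helper, not a RunAPI function.

```python
from decimal import Decimal

# "From" prices in USD per 1K tokens, copied from the variant table.
FROM_PRICE_PER_1K = {
    "glm-4.5": Decimal("0.020"),
    "glm-4.5-air": Decimal("0.010"),
    "glm-4.6": Decimal("0.020"),
    "glm-4.7": Decimal("0.020"),
    "glm-5": Decimal("0.020"),
    "glm-5-turbo": Decimal("0.020"),
    "glm-5.1": Decimal("0.030"),
}


def estimate_min_cost(variant: str, total_tokens: int) -> Decimal:
    """Lower-bound cost in USD, assuming the flat "From" rate applies to all tokens."""
    if total_tokens < 0:
        raise ValueError("total_tokens must be non-negative")
    if variant not in FROM_PRICE_PER_1K:
        raise KeyError(f"unknown variant: {variant}")
    return FROM_PRICE_PER_1K[variant] * Decimal(total_tokens) / Decimal(1000)


# Example: 12,000 tokens on the most and least expensive variants.
print(estimate_min_cost("glm-5.1", 12_000))
print(estimate_min_cost("glm-4.5-air", 12_000))
```

`Decimal` is used instead of floats so that small per-token prices do not pick up binary rounding error when summed across many calls.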
FAQ

Frequently asked questions about Z.ai

Is this an official Z.ai integration?

RunAPI exposes a managed API surface with transparent pricing, capability, and error behavior.

Do I need a Z.ai account?

No. Your RunAPI key is enough for managed access.

What's the latency overhead from proxying?

Typically under 20 ms. RunAPI keeps the proxy layer close to model execution regions.

Are images / videos cached?

Generated outputs are stored and retrievable by task ID. Inputs are not cached.

Can I bring my own key?

Not currently. All calls use RunAPI-managed access.

START NOW

Start building with Z.ai models.