VARIANT · Z.ai / GLM

GLM glm-5-turbo API

A model variant served through RunAPI's unified AI API.

Production-ready · text · Commercial use allowed

runapi.ai

# Base URL
https://runapi.ai

# Endpoints
POST /v1/chat/completions

# curl
curl https://runapi.ai/v1/chat/completions \
  -H "Authorization: Bearer $RUNAPI_KEY" \
  -H "Content-Type: application/json" \
  -d '{
  "model": "glm-5-turbo",
  "messages": [
    {
      "role": "user",
      "content": "Read this multi-file repository, find the failing integration test, and propose a patch with an explanation of the root cause."
    }
  ]
}'

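# Python
from openai import OpenAI

# Point the OpenAI SDK at RunAPI's OpenAI-compatible base URL.
client = OpenAI(
    base_url="https://runapi.ai/v1",
    api_key="your-runapi-key"  # replace with your RunAPI key
)

response = client.chat.completions.create(
    model="glm-5-turbo",
    messages=[{"role": "user", "content": "Read this multi-file repository, find the failing integration test, and propose a patch with an explanation of the root cause."}]
)

# Print the assistant's reply.
print(response.choices[0].message.content)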

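# JavaScript
import OpenAI from "openai";

// Point the OpenAI SDK at RunAPI's OpenAI-compatible base URL.
const client = new OpenAI({
  baseURL: "https://runapi.ai/v1",
  apiKey: "your-runapi-key" // replace with your RunAPI key
});

const response = await client.chat.completions.create({
  model: "glm-5-turbo",
  messages: [{ role: "user", content: "Read this multi-file repository, find the failing integration test, and propose a patch with an explanation of the root cause." }]
});

// Print the assistant's reply.
console.log(response.choices[0].message.content);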


Switch variant

glm-4.5 glm-4.5-air glm-4.6 glm-4.7 glm-5 glm-5.1

OVERVIEW

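glm-5-turbo offers the best balance of quality and cost in the GLM family.

Billed per request in USD
Failed generations are not charged
Streaming output where the model supports it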
Model skill setup

PRICING

Price

Failed generations are not charged

Chat completion

Input $0.60 / 1M tokens

Output $2.00 / 1M tokens

Cache read $0.12

Cache write 5m Free
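
To make the rates above concrete, here is a minimal sketch that estimates the cost of a single call from its token counts. It assumes the $0.12 cache-read rate is also per 1M tokens (the listing does not state the unit) and that cached input tokens are billed at the cache-read rate instead of the input rate. `estimate_cost_usd` and its parameters are illustrative names, not part of RunAPI's API.

```python
# Rates copied from the pricing listing, in USD per 1M tokens.
INPUT_PER_M = 0.60
OUTPUT_PER_M = 2.00
CACHE_READ_PER_M = 0.12  # assumption: the listing omits the unit; per 1M tokens assumed


def estimate_cost_usd(input_tokens: int, output_tokens: int, cached_tokens: int = 0) -> float:
    """Estimate the cost of one chat completion in USD.

    Assumes cached_tokens is the part of input_tokens served from cache and
    that it is billed at the cache-read rate instead of the input rate.
    """
    if not 0 <= cached_tokens <= input_tokens:
        raise ValueError("cached_tokens must be between 0 and input_tokens")
    fresh_tokens = input_tokens - cached_tokens
    total = (
        fresh_tokens * INPUT_PER_M
        + cached_tokens * CACHE_READ_PER_M
        + output_tokens * OUTPUT_PER_M
    )
    return total / 1_000_000


if __name__ == "__main__":
    # 200K input tokens (half of them cached) and 10K output tokens.
    print(f"${estimate_cost_usd(200_000, 10_000, cached_tokens=100_000):.4f}")
```

Treat the result as an estimate only; the invoice from RunAPI is the authoritative figure.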

Spec sheet

Technical details

Model ID	glm-5-turbo
Provider	Z.ai
Modality	text
Task type	synchronous
Billing unit	1K tokens
API endpoint	/v1/chat/completions
Commercial license	Yes, included with API access
Status	Production-ready

SKILLS

Quick start: glm-5-turbo

Same format · variant pinned in the model field

Endpoint	Protocol
/v1/chat/completions	OpenAI compatible
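
Because the endpoint is OpenAI compatible, a request can also be assembled without an SDK. The sketch below builds the same POST as the curl example using only Python's standard library. `build_chat_request` is a hypothetical helper, and the response shape read in the usage lines is assumed to follow the OpenAI chat-completions format.

```python
import json
import urllib.request

BASE_URL = "https://runapi.ai"
ENDPOINT = "/v1/chat/completions"


def build_chat_request(prompt: str, api_key: str, model: str = "glm-5-turbo") -> urllib.request.Request:
    """Build (but do not send) a chat-completions POST request."""
    body = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        BASE_URL + ENDPOINT,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    import os

    request = build_chat_request("Say hello in one sentence.", os.environ["RUNAPI_KEY"])
    with urllib.request.urlopen(request) as resp:
        reply = json.load(resp)
    # Assumes an OpenAI-style response body.
    print(reply["choices"][0]["message"]["content"])
```

Separating request construction from sending keeps the payload easy to inspect and log before any tokens are spent.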

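How it works

Use glm-5-turbo in four steps

01

Install

Install the model skill for this model line.

02

Configure

Set the model field to the full model ID shown on this page.

03

Call

Send a typed request with your prompt, inputs, and callback settings.

04

Receive

Read RunAPI's task response, webhook callback, or cached output URL.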

DIFFERENCES

What makes glm-5-turbo different

VS GLM-4.5

glm-5-turbo: Speed-optimized GLM-5 tier for lower latency

GLM-4.5: 355B / 32B active; 128K context; flagship open-weight MoE baseline

VS GLM-4.5-AIR

glm-5-turbo: Speed-optimized GLM-5 tier for lower latency

GLM-4.5-Air: Lighter GLM-4.5 tier for fast, lower-cost everyday work

VS GLM-4.6

glm-5-turbo: Speed-optimized GLM-5 tier for lower latency

GLM-4.6: 200K context; first GLM on Cambricon chips; sharper code generation

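Use cases

Best for

Customer support

Answer customer questions from a private knowledge base to reduce ticket volume.

Document analysis

Draft contract summaries and flag key clauses for attorney review.

Code generation

Automatically generate unit tests, code reviews, and refactoring suggestions in CI.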

FAQ

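Frequently asked questions about glm-5-turbo

Does the model ID stay stable across versions?

RunAPI keeps the model ID stable and handles compatible version updates without changing the request format.

What is the rate limit for this variant?

Per-key rate limits scale with your usage plan. See the pricing page for the current limits.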
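
Since limits vary by plan, client code usually needs to tolerate being throttled. Below is a generic retry-with-exponential-backoff sketch. It assumes throttling surfaces as an exception in your client (with the OpenAI Python SDK that would typically be `openai.RateLimitError`); how RunAPI actually signals a limit is not stated on this page. `RateLimited` and `call_with_backoff` are hypothetical names used for illustration.

```python
import random
import time


class RateLimited(Exception):
    """Hypothetical stand-in for whatever error your client raises when throttled."""


def call_with_backoff(fn, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call fn(), retrying on RateLimited with exponential backoff plus jitter."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except RateLimited:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            # Delay doubles each attempt (base, 2*base, 4*base, ...) plus up to 10% jitter.
            delay = base_delay * (2 ** attempt)
            sleep(delay + random.uniform(0, delay * 0.1))
```

The `sleep` parameter is injectable so the retry schedule can be exercised in tests without real waiting.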

Can I switch variants later?

Yes. The variant is just a flag: change the model parameter to switch.
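
As a small illustration of switching, the sketch below builds one request payload per variant and changes only the `model` field. The variant IDs are the ones listed on this page; `payload_for` is a hypothetical helper name.

```python
# Variant IDs as listed on this page.
GLM_VARIANTS = ["glm-4.5", "glm-4.5-air", "glm-4.6", "glm-4.7", "glm-5", "glm-5.1", "glm-5-turbo"]


def payload_for(model: str, prompt: str) -> dict:
    """Return a chat-completions payload; only `model` differs between variants."""
    if model not in GLM_VARIANTS:
        raise ValueError(f"unknown variant: {model}")
    return {"model": model, "messages": [{"role": "user", "content": prompt}]}


if __name__ == "__main__":
    prompt = "Summarize the root cause of this failing test."
    for variant in ("glm-5-turbo", "glm-4.5-air"):
        print(payload_for(variant, prompt)["model"])
```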

Does it support streaming?

Wherever the model supports streaming, RunAPI streams end to end.
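
A minimal streaming sketch with the OpenAI Python SDK follows. It assumes RunAPI relays OpenAI-style streaming chunks for this model, which this page does not confirm in detail. `join_deltas` and `stream_text` are hypothetical helper names.

```python
def join_deltas(chunks):
    """Yield the text fragments carried by OpenAI-style streaming chunks."""
    for chunk in chunks:
        # Some chunks carry no choices or an empty delta (for example the final one).
        if chunk.choices and chunk.choices[0].delta.content:
            yield chunk.choices[0].delta.content


def stream_text(client, prompt, model="glm-5-turbo"):
    """Request a streamed completion and yield text as it arrives."""
    stream = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        stream=True,  # standard OpenAI SDK flag for incremental output
    )
    yield from join_deltas(stream)


if __name__ == "__main__":
    import os

    from openai import OpenAI

    client = OpenAI(base_url="https://runapi.ai/v1", api_key=os.environ["RUNAPI_KEY"])
    for piece in stream_text(client, "Explain what a flaky test is in two sentences."):
        print(piece, end="", flush=True)
    print()
```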

Where should I report quality issues?

Open an issue in the public GitHub repo or email support.

Other GLM variants

glm-4.5-air Cheapest

$0.010 / 1K tokens

$0.020 / 1K tokens

$0.020 / 1K tokens

$0.020 / 1K tokens

$0.020 / 1K tokens

$0.030 / 1K tokens

Alternatives from other models

Anthropic's LLM for complex reasoning, code, analysis, and extended-context tasks.

Reasoning-first LLMs via RunAPI — flash for fast, low-cost work; pro for complex agentic tasks.

OpenAI text embeddings for semantic search, retrieval, clustering, and ranking workflows.

Get started now

Start building with GLM.

Create a free account · Read the quickstart →