Can I call Claude from Hermes Agent through RunAPI?

Yes. Configure RunAPI as a custom:runapi provider in Hermes Agent with base_url https://runapi.ai/v1 and api_mode chat_completions. Set model to claude-opus-4.8 or any other Claude variant. The same RUNAPI_API_KEY handles chat, image, video, and music models.

How much does the Claude API actually cost with prompt caching?

RunAPI charges 50% of Anthropic's official rate. Opus 4.8 is $7.50/$37.50 per million input/output tokens through RunAPI versus $15/$75 direct. With prompt caching enabled, cached input tokens cost even less. No subscription or volume commitment required.
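
As a rough worked example of the rates quoted above, the sketch below estimates the cost of one request from its token counts. The names estimate_cost_usd, RUNAPI_RATES, and DIRECT_RATES are illustrative and not part of any RunAPI SDK. The cached-input discount is not modeled, because the exact cached rate is not stated here.

```python
# Per-million-token rates for Opus 4.8 as quoted above (USD).
RUNAPI_RATES = {"input": 7.50, "output": 37.50}
DIRECT_RATES = {"input": 15.00, "output": 75.00}


def estimate_cost_usd(prompt_tokens: int, completion_tokens: int, rates: dict) -> float:
    """Estimate the cost of a single request from its token counts."""
    total = prompt_tokens * rates["input"] + completion_tokens * rates["output"]
    return total / 1_000_000
```

For a request with 24 prompt tokens and 87 completion tokens, estimate_cost_usd(24, 87, RUNAPI_RATES) comes to roughly $0.0034, half of the figure computed with DIRECT_RATES.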

Does switching between Claude models require reconfiguring Hermes Agent?

No. Change only the model parameter in your Hermes config or use the /model command during a session. The custom:runapi provider, base_url, and API key stay the same across all Claude variants -- Opus 4.8, Sonnet 4.6, Haiku 4.5, and dated snapshots.
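
To illustrate that only the model value changes, here is a minimal sketch that builds the same request body for several Claude variants. The helper build_chat_payload is hypothetical; it shows the shape of the OpenAI-compatible body rather than anything Hermes Agent does internally.

```python
import json

BASE_URL = "https://runapi.ai/v1"  # stays the same for every Claude variant


def build_chat_payload(model: str, prompt: str, max_tokens: int = 1024) -> str:
    """Return the JSON body for /chat/completions; only `model` varies."""
    return json.dumps({
        "model": model,
        "max_tokens": max_tokens,
        "messages": [{"role": "user", "content": prompt}],
    })


# One body per variant, all aimed at the same endpoint with the same key.
PAYLOADS = {
    name: build_chat_payload(name, "Hello")
    for name in ("claude-opus-4.8", "claude-sonnet-4.6", "claude-haiku-4.5")
}
```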

Can I use the native Anthropic Messages API from Hermes Agent?

RunAPI exposes both /v1/chat/completions (OpenAI-compatible, used by Hermes Agent's chat_completions mode) and /v1/messages (native Anthropic format). The native endpoint supports extended thinking and Anthropic-specific features. For Hermes Agent, the OpenAI-compatible path covers standard chat and streaming.
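
The two request shapes differ mainly in where the system prompt lives. The sketch below converts a chat-completions style body into the native Messages shape, assuming every message content is a plain string. The function name to_messages_body is hypothetical, and the authentication headers for /v1/messages are left out because they are not described here.

```python
def to_messages_body(chat_body: dict, default_max_tokens: int = 1024) -> dict:
    """Convert an OpenAI-style chat body into Anthropic Messages shape."""
    # The Messages format carries the system prompt as a top-level field,
    # not as a message with role "system".
    system_parts = [m["content"] for m in chat_body["messages"] if m["role"] == "system"]
    turns = [m for m in chat_body["messages"] if m["role"] != "system"]

    body = {
        "model": chat_body["model"],
        # max_tokens is mandatory in the Messages format.
        "max_tokens": chat_body.get("max_tokens", default_max_tokens),
        "messages": turns,
    }
    if system_parts:
        body["system"] = "\n\n".join(system_parts)
    return body
```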

How do I use prompt caching to reduce Claude API costs?

Include a cache_control breakpoint on your system prompt or large context blocks. Subsequent requests that share the same cached prefix pay a reduced input token rate. This is especially effective for agent loops where the system prompt and tool definitions repeat across many turns.
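
A minimal sketch of a cache breakpoint on the system prompt, written in the native Anthropic Messages shape, where the system field is a list of content blocks. The helper build_cached_body is hypothetical. Whether the same field is accepted on the OpenAI-compatible endpoint is not stated here, so treat that as an open assumption.

```python
def build_cached_body(model: str, system_prompt: str, user_text: str,
                      max_tokens: int = 1024) -> dict:
    """Build a Messages-style body with a cache breakpoint on the system prompt."""
    return {
        "model": model,
        "max_tokens": max_tokens,
        "system": [
            {
                "type": "text",
                "text": system_prompt,
                # Marks the end of the prefix that later requests can reuse.
                "cache_control": {"type": "ephemeral"},
            }
        ],
        "messages": [{"role": "user", "content": user_text}],
    }
```

In an agent loop, keep system_prompt byte-for-byte identical across turns and vary only the messages list; a changed prefix will not match the cached one.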

Can Hermes Agent use Claude's extended thinking mode through RunAPI?

Yes. Pass the extended thinking parameters in your request body. Hermes Agent forwards them to the RunAPI Claude endpoint, which supports the same extended thinking configuration as the direct Anthropic API.
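
As a sketch of what those parameters look like, the helper below adds a thinking block in the form used by the Anthropic Messages API. The function name with_extended_thinking is hypothetical, and the check that the budget stays below max_tokens reflects my understanding of the Anthropic rules rather than anything documented on this page.

```python
def with_extended_thinking(body: dict, budget_tokens: int) -> dict:
    """Return a copy of `body` with extended thinking enabled."""
    # The thinking budget is spent out of max_tokens, so it must be smaller.
    if budget_tokens >= body["max_tokens"]:
        raise ValueError("budget_tokens must be less than max_tokens")
    enabled = dict(body)  # shallow copy; the caller's dict is left untouched
    enabled["thinking"] = {"type": "enabled", "budget_tokens": budget_tokens}
    return enabled
```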

HERMES + CLAUDE

Use Claude with Hermes Agent.

Anthropic Claude comes in three tiers: Opus 4.8 for maximum capability (200K context, extended thinking), Sonnet 4.6 for balanced performance, and Haiku 4.5 for speed. Hermes Agent calls Claude through the custom:runapi provider at 50% of Anthropic's official per-token rate, using the same key and base_url you configured for chat.

Get an API key · Read the docs

One API key · OpenAI-compatible endpoint · Billing at 50% off per-token rates

Send a Claude chat completion request through Hermes Agent using RunAPI.

Requirements:
- Use the custom:runapi provider already configured in Hermes Agent
- Call the RunAPI chat completions endpoint https://runapi.ai/v1/chat/completions
- Set model to "claude-opus-4.8"
- The RUNAPI_API_KEY environment variable provides authentication
- The response is synchronous: the assistant message is returned directly in the response body
- For streaming, set "stream": true to receive server-sent events

curl -X POST https://runapi.ai/v1/chat/completions \
  -H "Authorization: Bearer $RUNAPI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-opus-4.8",
    "max_tokens": 1024,
    "messages": [
      {"role": "user", "content": "Explain the difference between a mutex and a semaphore in three sentences."}
    ]
  }'

{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "model": "claude-opus-4.8",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "A mutex is a locking mechanism that allows only one thread to access a resource at a time..."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 24,
    "completion_tokens": 87,
    "total_tokens": 111
  }
}


How it works

Three steps to use Claude with Hermes Agent

Configure RunAPI

Set the RUNAPI_API_KEY environment variable. If you already added RunAPI as a custom:runapi provider in Hermes Agent, the same key and base_url work for Claude — switch the model parameter to claude-opus-4.8 in your Hermes config or use the /model command.

export RUNAPI_API_KEY=runapi_xxx
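
For callers outside Hermes Agent, the following sketch reads the exported key and attaches it as a Bearer token using only the Python standard library. The helper names load_api_key, build_request, and send are illustrative and do not come from a RunAPI client library.

```python
import json
import os
import urllib.request

CHAT_URL = "https://runapi.ai/v1/chat/completions"


def load_api_key() -> str:
    """Read RUNAPI_API_KEY from the environment and fail early if it is missing."""
    key = os.environ.get("RUNAPI_API_KEY", "")
    if not key:
        raise RuntimeError("RUNAPI_API_KEY is not set")
    return key


def build_request(prompt: str, api_key: str,
                  model: str = "claude-opus-4.8") -> urllib.request.Request:
    """Build an authenticated POST request without sending it."""
    body = json.dumps({
        "model": model,
        "max_tokens": 1024,
        "messages": [{"role": "user", "content": prompt}],
    }).encode("utf-8")
    return urllib.request.Request(
        CHAT_URL,
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request: urllib.request.Request) -> dict:
    """Perform the HTTP call and decode the JSON response."""
    with urllib.request.urlopen(request) as resp:
        return json.load(resp)
```

A call would look like send(build_request("Hello", load_api_key())).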

Call Claude

Send a POST request to /v1/chat/completions with model set to claude-opus-4.8. Include a messages array with at least one user message. Set max_tokens to control response length. Add "stream": true for token-by-token SSE output in your Hermes session.

POST /v1/chat/completions
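
When "stream": true is set, the client has to reassemble the text from SSE lines. The sketch below assumes the chunks follow the usual OpenAI-compatible shape (choices[0].delta.content, ending with a [DONE] sentinel), which is an assumption about the wire format rather than something documented on this page. The function name iter_stream_text is hypothetical.

```python
import json
from typing import Iterable, Iterator


def iter_stream_text(lines: Iterable[str]) -> Iterator[str]:
    """Yield text deltas from the SSE lines of a streamed chat completion."""
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and SSE comment lines
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # assumed end-of-stream sentinel
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            text = choice.get("delta", {}).get("content")
            if text:
                yield text
```

Joining the yielded pieces with "".join(...) gives the full assistant message.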

Read the response

The endpoint returns the assistant message synchronously — no task polling needed. Hermes Agent displays the response inline. Token usage counts are included in the response for billing transparency. Streaming responses arrive as SSE events for real-time display.

usage.total_tokens: 111
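
A short sketch of reading the fields described above from a non-streaming response body. The helper summarize_completion is illustrative; it relies only on the fields shown in the sample response on this page.

```python
import json


def summarize_completion(raw: str) -> dict:
    """Extract the assistant text, finish reason, and token usage."""
    data = json.loads(raw)
    choice = data["choices"][0]
    return {
        "text": choice["message"]["content"],
        "finish_reason": choice["finish_reason"],
        "prompt_tokens": data["usage"]["prompt_tokens"],
        "completion_tokens": data["usage"]["completion_tokens"],
        "total_tokens": data["usage"]["total_tokens"],
    }
```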

Parameters

Claude API parameters (OpenAI-compatible)

Parameter	Type	Description
`model`	`string`	Required. claude-opus-4.8, claude-sonnet-4.6, claude-haiku-4.5, or any Claude variant listed in the RunAPI catalog.
`messages`	`array`	Required. Array of message objects with role (system, user, assistant) and content fields.
`max_tokens`	`integer`	Maximum number of tokens in the response. Defaults vary by model — set explicitly for predictable billing.
`stream`	`boolean`	When true, returns server-sent events with incremental token deltas instead of a single JSON response.
`temperature`	`float`	Sampling temperature between 0 and 1. Lower values produce more deterministic output.
`top_p`	`float`	Nucleus sampling cutoff. Alternative to temperature — use one or the other, not both.

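What is Claude in Hermes Agent?

Claude is Anthropic's LLM, and Hermes Agent calls it through the custom:runapi provider at half of Anthropic's official per-token price. All three tiers work with the same provider configuration: Opus 4.8 (200K context, extended thinking), Sonnet 4.6 (balanced speed and quality), and Haiku 4.5 (fast and low-cost). Change the model field to switch on a per-request basis; no reconfiguration is needed.

Claude use cases

Building AI agents with tool use and MCP

Use Claude's function calling and Model Context Protocol support in Hermes Agent to build multi-step automation workflows that read files, query databases, and take actions based on reasoning.

Code generation and review

Route coding tasks to Claude in Hermes Agent: use Opus 4.8 for complex architecture decisions and multi-file refactoring, and Sonnet 4.6 for routine PR reviews and test generation.

Content generation with prompt caching

Use prompt caching to generate marketing copy, documentation, and reports at scale, cutting costs when the system prompt and context stay the same across multiple requests.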

FAQ

Claude + Hermes Agent frequently asked questions

Hermes Agent basic setup

Not set up yet? Start with the RunAPI setup guide for Hermes Agent.

Hermes Agent setup guide →

Claude model catalog

See details for every Claude variant, including token pricing and context windows.

Claude models →

Try Claude in Hermes Agent now.

Get a free RunAPI key, configure the custom:runapi provider, and start using Claude at 50% of Anthropic's official rate.

Browse models →