HERMES + GPT

Use GPT in Hermes Agent.

GPT-5.5 is OpenAI's flagship LLM, available through RunAPI at half the official per-token price. Hermes Agent connects via the custom:runapi provider in chat_completions mode, so one config block covers the GPT variants (including 5.5, 5.4, 5.4-mini, and 5.3-codex) with streaming, function calling, and structured output.

one API key · OpenAI-compatible · streaming responses
Use RunAPI to call GPT-5.5 through the OpenAI-compatible Chat Completions endpoint.

Requirements:
- Read the API key from RUNAPI_API_KEY.
- Use the custom:runapi provider with base_url https://runapi.ai/v1.
- Call POST https://runapi.ai/v1/chat/completions.
- Set model to "gpt-5.5".
- Include a messages array with at least one user message.
- The response is synchronous — the completion arrives in the same HTTP response.
- For streaming, set "stream": true to receive server-sent events.
- For the Responses API, call POST https://runapi.ai/v1/responses instead.

Example request:

curl -X POST https://runapi.ai/v1/chat/completions \
  -H "Authorization: Bearer $RUNAPI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-5.5",
    "messages": [
      {"role": "system", "content": "You are a concise coding assistant."},
      {"role": "user", "content": "Write a Python function that merges two sorted lists in O(n) time."}
    ],
    "temperature": 0.3,
    "max_tokens": 1024
  }'

Example response:

{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "model": "gpt-5.5",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "def merge_sorted(a, b):\n    result = []\n    i = j = 0\n    while i < len(a) and j < len(b):\n        if a[i] <= b[j]:\n            result.append(a[i]); i += 1\n        else:\n            result.append(b[j]); j += 1\n    result.extend(a[i:])\n    result.extend(b[j:])\n    return result"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 38,
    "completion_tokens": 95,
    "total_tokens": 133
  }
}
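
The same request can be sketched in Python with only the standard library. This is a minimal sketch rather than an official client: the helper names build_request, extract_reply, and ask are hypothetical, and the response shape is assumed to match the example above.

```python
import json
import os
import urllib.request

# Endpoint taken from the requirements above.
RUNAPI_URL = "https://runapi.ai/v1/chat/completions"


def build_request(prompt, model="gpt-5.5", api_key=None):
    """Build (but do not send) a Chat Completions request."""
    # Fall back to the RUNAPI_API_KEY environment variable when no key is passed.
    api_key = api_key or os.environ["RUNAPI_API_KEY"]
    payload = {
        "model": model,
        "messages": [
            {"role": "system", "content": "You are a concise coding assistant."},
            {"role": "user", "content": prompt},
        ],
        "temperature": 0.3,
        "max_tokens": 1024,
    }
    return urllib.request.Request(
        RUNAPI_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def extract_reply(response_body):
    """Return (assistant text, finish_reason, total_tokens) from a parsed response."""
    choice = response_body["choices"][0]
    return (
        choice["message"]["content"],
        choice["finish_reason"],
        response_body["usage"]["total_tokens"],
    )


def ask(prompt):
    """Send the request and parse the reply; this performs a real network call."""
    with urllib.request.urlopen(build_request(prompt), timeout=60) as resp:
        return extract_reply(json.load(resp))
```

Keeping request construction and response parsing separate from the network call makes both halves easy to check offline before spending tokens.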
HOW IT WORKS

Use GPT in Hermes Agent in three steps

1. Add RunAPI as a custom provider

If the custom:runapi provider is already configured in Hermes Agent, the same key works for GPT. Otherwise, add a custom provider with base_url https://runapi.ai/v1, key_env set to RUNAPI_API_KEY, and api_mode set to chat_completions.

export RUNAPI_API_KEY=runapi_xxx
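
Before starting Hermes Agent, it can help to confirm that the three settings and the environment variable line up. The sketch below mirrors the settings named above in a plain Python dict; it does not read the actual Hermes Agent config file, whose layout may differ, and check_provider and chat_url are hypothetical helper names.

```python
import os

# Hypothetical mirror of the provider settings described above; the real
# Hermes Agent config file layout may differ.
RUNAPI_PROVIDER = {
    "base_url": "https://runapi.ai/v1",
    "key_env": "RUNAPI_API_KEY",
    "api_mode": "chat_completions",
}


def check_provider(settings, env=None):
    """Return a list of human-readable problems; an empty list means OK."""
    env = os.environ if env is None else env
    problems = []
    if not settings.get("base_url", "").startswith("https://"):
        problems.append("base_url should be an https:// URL")
    if settings.get("api_mode") != "chat_completions":
        problems.append("api_mode should be chat_completions")
    key_env = settings.get("key_env", "")
    if not env.get(key_env):
        problems.append(
            f"environment variable {key_env or '<unset>'} is empty or missing"
        )
    return problems


def chat_url(settings):
    """Join base_url and the Chat Completions path without doubling slashes."""
    return settings["base_url"].rstrip("/") + "/chat/completions"
```

Calling check_provider(RUNAPI_PROVIDER) after the export should return an empty list; any returned strings name the setting to fix.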
2. Select a GPT model

Set the default model to gpt-5.5 for the flagship, gpt-5.4 or gpt-5.4-mini for lower cost, or gpt-5.3-codex for code-heavy tasks. The /v1/chat/completions endpoint returns a standard OpenAI response with usage counts and finish_reason.

default: gpt-5.5
3. Use streaming or function calling

Hermes Agent forwards stream, tools, and response_format parameters through the custom:runapi provider. All standard OpenAI Chat Completions parameters work through RunAPI without modification.
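
With streaming enabled, the reply arrives as server-sent events instead of a single JSON body. The sketch below reassembles the text from those events, assuming RunAPI emits the usual OpenAI-style chunks (lines prefixed with data:, a delta object per choice, and a final data: [DONE] marker). The function name collect_stream is hypothetical.

```python
import json


def collect_stream(lines):
    """Concatenate content deltas from Chat Completions SSE lines (as str)."""
    parts = []
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and SSE comments
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # end-of-stream marker
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            delta = choice.get("delta", {})
            if delta.get("content"):
                parts.append(delta["content"])
    return "".join(parts)
```

When reading from a raw HTTP response, decode each line from bytes to str before passing it in.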

"stream": true
PARAMETERS

GPT Chat Completions parameters

Parameter | Type | Description
--- | --- | ---
model | string | Required. gpt-5.5, gpt-5.4, gpt-5.4-mini, gpt-5.4-nano, gpt-5.3-codex, or gpt-5.2.
messages | array | Required. Array of message objects with role (system, user, assistant) and content fields.
temperature | number | Optional. Sampling temperature between 0 and 2. Lower values produce more deterministic output. Defaults to 1.
max_tokens | integer | Optional. Maximum number of tokens to generate in the completion.
stream | boolean | Optional. When true, returns server-sent events with incremental token deltas. Defaults to false.
tools | array | Optional. Array of tool definitions for function calling. Each tool has a type, function name, description, and parameters schema.
response_format | object | Optional. Set type to "json_object" or "json_schema" for structured JSON output.
reasoning_effort | string | Optional. Controls thinking depth for supported models. Accepted values are low, medium, high.
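
To make the tools parameter concrete, here is a minimal sketch of a function-calling payload and of reading the model's tool request back out. The get_weather tool and both helper names are hypothetical; the tool and tool_calls shapes follow the OpenAI Chat Completions convention, which RunAPI is assumed to pass through unchanged.

```python
import json

# Hypothetical tool definition used only for illustration.
WEATHER_TOOL = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}


def build_tool_payload(question, model="gpt-5.5"):
    """Request body that offers the model one callable tool."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": question}],
        "tools": [WEATHER_TOOL],
    }


def parse_tool_calls(response_body):
    """Return (name, arguments_dict) pairs the model asked to call."""
    message = response_body["choices"][0]["message"]
    calls = []
    # tool_calls is absent or null when the model answers in plain text.
    for call in message.get("tool_calls") or []:
        fn = call["function"]
        # Arguments arrive as a JSON-encoded string, not a nested object.
        calls.append((fn["name"], json.loads(fn["arguments"])))
    return calls
```

An empty list from parse_tool_calls means the model replied with ordinary content and no tool needs to run.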

What is GPT on Hermes Agent?

GPT is OpenAI's LLM family, available through RunAPI's custom:runapi provider at half the official per-token cost. Hermes Agent connects using the standard chat_completions API mode, so you get GPT-5.5, 5.4, 5.4-mini, and 5.3-codex with streaming, function calling, structured JSON output, and vision input, all through the same provider config you use for Claude or Gemini.

GPT use cases

Agentic coding with Codex models

Use GPT-5.3-codex through Hermes Agent for code generation, refactoring, and automated PR workflows at lower per-token cost than the flagship models.

Batch processing with structured outputs

Process large document sets through GPT with json_schema response format, extracting structured data at scale for RAG pipelines, invoice parsing, or content classification.
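
As an illustration of the json_schema response format, the sketch below builds an extraction request for a single invoice and validates the reply. The invoice schema, field names, and helper names are hypothetical, and the json_schema envelope (name, strict, schema) follows the OpenAI convention that RunAPI is assumed to accept as-is.

```python
import json

# Hypothetical schema for illustration; adapt the fields to your documents.
INVOICE_SCHEMA = {
    "type": "object",
    "properties": {
        "invoice_number": {"type": "string"},
        "total": {"type": "number"},
        "currency": {"type": "string"},
    },
    "required": ["invoice_number", "total", "currency"],
    "additionalProperties": False,
}


def build_extraction_payload(document_text, model="gpt-5.4-mini"):
    """Request body asking for JSON that conforms to INVOICE_SCHEMA."""
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": "Extract the invoice fields as JSON."},
            {"role": "user", "content": document_text},
        ],
        "response_format": {
            "type": "json_schema",
            "json_schema": {
                "name": "invoice",
                "strict": True,
                "schema": INVOICE_SCHEMA,
            },
        },
    }


def parse_extraction(response_body):
    """Decode the JSON content and check that required fields are present."""
    record = json.loads(response_body["choices"][0]["message"]["content"])
    missing = [key for key in INVOICE_SCHEMA["required"] if key not in record]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    return record
```

For a batch job, call build_extraction_payload once per document and keep the required-field check as a cheap guard before loading results downstream.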

Dynamic model routing per task complexity

Route simple queries to GPT-5.4-mini for cost efficiency and complex reasoning tasks to GPT-5.5 for quality, all through the same custom:runapi provider and API key.
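
One way to sketch this routing is a small function that maps a request to a model ID. The heuristic here (a character threshold and a handful of keywords) is purely illustrative and not something RunAPI or Hermes Agent prescribes; pick_model and its parameters are hypothetical.

```python
def pick_model(prompt, code_heavy=False, long_prompt_chars=2000):
    """Choose a model ID with a deliberately crude, illustrative heuristic."""
    if code_heavy:
        return "gpt-5.3-codex"
    # Keywords that loosely suggest a reasoning-heavy request.
    reasoning_markers = ("prove", "step by step", "trade-off", "analyze")
    lowered = prompt.lower()
    if len(prompt) > long_prompt_chars or any(m in lowered for m in reasoning_markers):
        return "gpt-5.5"
    return "gpt-5.4-mini"
```

Because every model is reached through the same provider and key, the returned ID can be dropped straight into the model field of the request.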

FAQ

GPT + Hermes Agent questions

Hermes Agent general setup

Not configured yet? Start with the RunAPI setup guide for Hermes Agent.

Hermes Agent setup guide →

GPT model catalog

See all GPT variants, per-token pricing, and API docs.

GPT on RunAPI →

Try GPT-5.5 in Hermes Agent today.

Get a free RunAPI key, configure the custom:runapi provider, and call GPT-5.5 at half the official OpenAI token price — streaming, function calling, and structured output included.