Can I use GPT-5.5 in Hermes Agent through RunAPI?

Yes. Hermes Agent supports custom OpenAI-compatible providers. Add RunAPI as custom:runapi with base_url https://runapi.ai/v1, key_env set to RUNAPI_API_KEY, and api_mode set to chat_completions. Set the default model to gpt-5.5.

How does RunAPI GPT pricing compare to official OpenAI pricing?

RunAPI charges 50% of the official OpenAI per-token rate for all GPT models. The discount applies to both input and output tokens. Check the RunAPI pricing page for exact per-million-token rates.
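As a rough illustration of the 50% rule, the cost of a request can be estimated from its token counts. This is a minimal sketch: the per-million-token rates below are placeholder values chosen only to make the arithmetic easy to follow, not actual OpenAI or RunAPI prices, and `estimate_cost` is a hypothetical helper. Substitute the figures from the pricing page.

```python
RUNAPI_DISCOUNT = 0.5  # from the FAQ: RunAPI charges 50% of the official rate


def estimate_cost(prompt_tokens, completion_tokens,
                  official_input_per_million, official_output_per_million,
                  discount=RUNAPI_DISCOUNT):
    """Estimate a request's cost in the same currency as the given rates."""
    input_cost = prompt_tokens / 1_000_000 * official_input_per_million
    output_cost = completion_tokens / 1_000_000 * official_output_per_million
    # The discount applies to both input and output tokens.
    return (input_cost + output_cost) * discount


# Placeholder rates (hypothetical, for illustration only).
cost = estimate_cost(
    prompt_tokens=200_000,
    completion_tokens=50_000,
    official_input_per_million=2.0,
    official_output_per_million=8.0,
)
print(f"{cost:.2f}")
```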

Which GPT model should I use: 5.5, 5.4, mini, or codex?

Use GPT-5.5 for complex reasoning and hard problems, GPT-5.4 for everyday tasks at lower cost, GPT-5.4-mini for high-volume, low-cost work such as classification, and GPT-5.3-codex for code generation and editing. To switch between them, change only the model field; no provider reconfiguration is needed.

Does the Responses API work through RunAPI in Hermes Agent?

Yes. RunAPI also proxies the OpenAI Responses API at /v1/responses. If Hermes Agent supports the Responses API surface, set the endpoint to https://runapi.ai/v1/responses. The same API key and custom provider work for both endpoints.
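For a quick check outside Hermes Agent, a Responses API request can be built with the Python standard library. This is a minimal sketch, assuming RunAPI forwards the `input` field of OpenAI's Responses API unchanged; `build_responses_request` is a hypothetical helper and it only constructs the request, it does not send it.

```python
import json
import os
import urllib.request

RESPONSES_URL = "https://runapi.ai/v1/responses"


def build_responses_request(prompt, model="gpt-5.5", api_key=None):
    """Build (but do not send) a POST request for the Responses endpoint."""
    # Same key as the Chat Completions endpoint, read from RUNAPI_API_KEY.
    api_key = api_key or os.environ.get("RUNAPI_API_KEY", "")
    body = {"model": model, "input": prompt}
    return urllib.request.Request(
        RESPONSES_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Pass the returned object to `urllib.request.urlopen` to perform the call.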

How do I use structured outputs to guarantee valid JSON from GPT?

Set the response_format type to json_schema and include a schema definition in your request. GPT then constrains its output to match your schema exactly. RunAPI forwards the response_format parameter unchanged. This suits data extraction, form parsing, and any task that needs a predictable JSON structure.
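A request body with a schema might look like the sketch below. It follows the OpenAI Chat Completions shape for `response_format`; the invoice schema, its field names, and the `build_structured_request` helper are illustrative assumptions rather than anything defined by RunAPI or Hermes Agent.

```python
import json


def build_structured_request(text, model="gpt-5.5"):
    """Return a Chat Completions body that asks for schema-constrained JSON."""
    # Hypothetical schema for an invoice-parsing task.
    schema = {
        "type": "object",
        "properties": {
            "vendor": {"type": "string"},
            "total": {"type": "number"},
        },
        "required": ["vendor", "total"],
        "additionalProperties": False,
    }
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": "Extract the invoice fields."},
            {"role": "user", "content": text},
        ],
        "response_format": {
            "type": "json_schema",
            "json_schema": {"name": "invoice", "strict": True, "schema": schema},
        },
    }


body = build_structured_request("ACME Corp, total due 120.50 USD")
print(json.dumps(body["response_format"], indent=2))
```

The assistant message content in the response is then a JSON string that can be passed to `json.loads`.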

Can Hermes Agent switch between GPT models dynamically per request?

Yes. Set the model parameter per request. Hermes Agent can route simple tasks to GPT-5.4-mini for cost efficiency and complex reasoning to GPT-5.5 for quality, all through the same RunAPI provider.
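One way to picture per-request routing is a small function that picks the model before the request is built. This is only a sketch: the `pick_model` helper and its length-based heuristic are assumptions for illustration, not how Hermes Agent decides internally.

```python
def pick_model(prompt, needs_code=False, long_prompt_chars=2000):
    """Choose a GPT model name for a single request."""
    if needs_code:
        return "gpt-5.3-codex"   # code generation and editing
    if len(prompt) > long_prompt_chars:
        return "gpt-5.5"         # treat long prompts as complex reasoning
    return "gpt-5.4-mini"        # cheap default for simple tasks


def build_body(prompt, **routing):
    """Only the model field changes; the provider config stays the same."""
    return {
        "model": pick_model(prompt, **routing),
        "messages": [{"role": "user", "content": prompt}],
    }
```

In practice the routing signal could be a task label or a classifier result instead of prompt length.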

HERMES + GPT

Use GPT in Hermes Agent.

GPT-5.5 is OpenAI's flagship LLM, available through RunAPI at half the official per-token price. Hermes Agent connects via the custom:runapi provider using chat_completions mode — one config block unlocks every GPT variant (5.5, 5.4, 5.4-mini, 5.3-codex) with streaming, function calling, and structured output.


one API key · OpenAI-compatible · streaming responses

Use RunAPI to call GPT-5.5 through the OpenAI-compatible Chat Completions endpoint.

Requirements:
- Read the API key from RUNAPI_API_KEY.
- Use the custom:runapi provider with base_url https://runapi.ai/v1.
- Call POST https://runapi.ai/v1/chat/completions
- Set model to "gpt-5.5".
- Include a messages array with at least one user message.
- The response is synchronous — the completion arrives in the same HTTP response.
- For streaming, set "stream": true to receive server-sent events.
- For the Responses API, call POST https://runapi.ai/v1/responses instead.

curl -X POST https://runapi.ai/v1/chat/completions \
  -H "Authorization: Bearer $RUNAPI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-5.5",
    "messages": [
      {"role": "system", "content": "You are a concise coding assistant."},
      {"role": "user", "content": "Write a Python function that merges two sorted lists in O(n) time."}
    ],
    "temperature": 0.3,
    "max_tokens": 1024
  }'

{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "model": "gpt-5.5",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "def merge_sorted(a, b):\n    result = []\n    i = j = 0\n    while i < len(a) and j < len(b):\n        if a[i] <= b[j]:\n            result.append(a[i]); i += 1\n        else:\n            result.append(b[j]); j += 1\n    result.extend(a[i:])\n    result.extend(b[j:])\n    return result"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 38,
    "completion_tokens": 95,
    "total_tokens": 133
  }
}
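The same call can be made from Python with only the standard library. The sketch below mirrors the curl command and pulls the reply text, finish_reason, and token total out of a response shaped like the one above; `build_chat_request`, `extract_reply`, and `chat` are hypothetical helper names, and only `chat` performs a network call.

```python
import json
import os
import urllib.request

CHAT_URL = "https://runapi.ai/v1/chat/completions"


def build_chat_request(messages, model="gpt-5.5", api_key=None, **params):
    """Build (but do not send) a Chat Completions request."""
    api_key = api_key or os.environ.get("RUNAPI_API_KEY", "")
    # Extra keyword arguments (temperature, max_tokens, ...) go into the body.
    body = {"model": model, "messages": messages, **params}
    return urllib.request.Request(
        CHAT_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def extract_reply(response_body):
    """Return (text, finish_reason, total_tokens) from a parsed response."""
    choice = response_body["choices"][0]
    usage = response_body.get("usage", {})
    return (choice["message"]["content"],
            choice["finish_reason"],
            usage.get("total_tokens"))


def chat(messages, **kwargs):
    """Send the request and parse the reply (this makes a network call)."""
    request = build_chat_request(messages, **kwargs)
    with urllib.request.urlopen(request) as resp:
        return extract_reply(json.loads(resp.read().decode("utf-8")))
```

A finish_reason other than "stop" (for example "length") usually means the output was cut off by max_tokens.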


HOW IT WORKS

Use GPT in Hermes Agent in three steps

Add RunAPI as a custom provider

If the custom:runapi provider is already configured in Hermes Agent, the same key works for GPT. Otherwise, add a custom provider with base_url https://runapi.ai/v1, key_env set to RUNAPI_API_KEY, and api_mode set to chat_completions.

export RUNAPI_API_KEY=runapi_xxx

Select a GPT model

Set the default model to gpt-5.5 for the flagship, gpt-5.4 or gpt-5.4-mini for lower cost, or gpt-5.3-codex for code-heavy tasks. The /v1/chat/completions endpoint returns a standard OpenAI response with usage counts and finish_reason.

default: gpt-5.5

Use streaming or function calling

Hermes Agent forwards stream, tools, and response_format parameters through the custom:runapi provider. All standard OpenAI Chat Completions parameters work through RunAPI without modification.
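When calling the endpoint directly with streaming enabled, the body arrives as server-sent events. The sketch below assumes the standard OpenAI chunk format (lines of the form `data: {...}` ending with `data: [DONE]`); `iter_stream_text` is a hypothetical helper that accepts any iterable of lines, such as an open HTTP response.

```python
import json


def iter_stream_text(lines):
    """Yield text deltas from an iterable of SSE lines (str or bytes)."""
    for raw in lines:
        line = raw.decode("utf-8") if isinstance(raw, bytes) else raw
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank keep-alive lines and other SSE fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(data)
        for choice in chunk.get("choices", []):
            piece = choice.get("delta", {}).get("content")
            if piece:
                yield piece


# Hand-written sample events, for illustration only.
sample_events = [
    b'data: {"choices":[{"index":0,"delta":{"role":"assistant","content":""}}]}',
    b"",
    b'data: {"choices":[{"index":0,"delta":{"content":"Hel"}}]}',
    b'data: {"choices":[{"index":0,"delta":{"content":"lo"}}]}',
    b'data: {"choices":[{"index":0,"delta":{},"finish_reason":"stop"}]}',
    b"data: [DONE]",
]
print("".join(iter_stream_text(sample_events)))
```

The request itself only needs the flag shown next in its JSON body.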

"stream": true

PARAMETERS

GPT Chat Completions parameters

Parameter	Type	Description
`model`	`string`	Required. gpt-5.5, gpt-5.4, gpt-5.4-mini, gpt-5.4-nano, gpt-5.3-codex, or gpt-5.2.
`messages`	`array`	Required. Array of message objects with role (system, user, assistant) and content fields.
`temperature`	`number`	Optional. Sampling temperature between 0 and 2. Lower values produce more deterministic output. Defaults to 1.
`max_tokens`	`integer`	Optional. Maximum number of tokens to generate in the completion.
`stream`	`boolean`	Optional. When true, returns server-sent events with incremental token deltas. Defaults to false.
`tools`	`array`	Optional. Array of tool definitions for function calling. Each tool has a type, function name, description, and parameters schema.
`response_format`	`object`	Optional. Set type to "json_object" or "json_schema" for structured JSON output.
`reasoning_effort`	`string`	Optional. Controls thinking depth for supported models. Accepted values are low, medium, high.
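To show how the `tools` parameter fits together with the rest of the body, here is a small sketch in the OpenAI function-calling shape. The `get_weather` tool, its schema, and both helper functions are hypothetical examples, not part of RunAPI or Hermes Agent.

```python
import json


def build_tool_request(question, model="gpt-5.5"):
    """Return a Chat Completions body that offers one callable tool."""
    get_weather = {
        "type": "function",
        "function": {
            "name": "get_weather",  # hypothetical tool
            "description": "Look up the current weather for a city.",
            "parameters": {
                "type": "object",
                "properties": {"city": {"type": "string"}},
                "required": ["city"],
            },
        },
    }
    return {
        "model": model,
        "messages": [{"role": "user", "content": question}],
        "tools": [get_weather],
    }


def parse_tool_calls(message):
    """Return [(name, arguments_dict), ...] from an assistant message."""
    calls = []
    for call in message.get("tool_calls") or []:
        fn = call["function"]
        # Arguments arrive as a JSON-encoded string.
        calls.append((fn["name"], json.loads(fn["arguments"])))
    return calls
```

After running the tool, the result is sent back as a message with role `tool` and the matching `tool_call_id` so the model can finish its answer.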

What is GPT on Hermes Agent?

GPT is OpenAI's LLM family, available through RunAPI's custom:runapi provider at half the official per-token cost. Hermes Agent connects using the standard chat_completions API mode, so you get GPT-5.5, 5.4, 5.4-mini, and 5.3-codex with streaming, function calling, structured JSON output, and vision input, all through the same provider config you use for Claude or Gemini.

GPT use cases

Agentic coding with Codex models

Use GPT-5.3-codex through Hermes Agent for code generation, refactoring, and automated PR workflows at lower per-token cost than the flagship models.

Batch processing with structured outputs

Process large document sets through GPT with json_schema response format, extracting structured data at scale for RAG pipelines, invoice parsing, or content classification.

Dynamic model routing per task complexity

Route simple queries to GPT-5.4-mini for cost efficiency and complex reasoning tasks to GPT-5.5 for quality, all through the same custom:runapi provider and API key.

FAQ

GPT + Hermes Agent questions

Hermes Agent general setup

Not configured yet? Start with the RunAPI setup guide for Hermes Agent.

Hermes Agent setup guide →

GPT model catalog

See all GPT variants, per-token pricing, and API docs.

GPT on RunAPI →

Try GPT-5.5 in Hermes Agent today.

Get a free RunAPI key, configure the custom:runapi provider, and call GPT-5.5 at half the official OpenAI token price — streaming, function calling, and structured output included.
