HERMES + GPT IMAGE

Use GPT Image in Hermes Agent.

GPT Image 2 is OpenAI's dedicated image generation model — text-to-image and instruction-based image editing with up to 4K output resolution and transparent background support. Hermes Agent calls it through the same RunAPI custom provider and API key used for chat, with no ComfyUI or GPU setup needed.

one API key · text to image + edit image · up to 4K output
Use RunAPI to generate an image with OpenAI GPT Image 2 from Hermes Agent.

Requirements:
- Use the RunAPI API at https://runapi.ai/v1/text_to_image.
- Read the API key from the RUNAPI_API_KEY environment variable.
- Use the custom:runapi provider already configured in Hermes Agent.
- Set the model to "gpt-image-2-text-to-image".
- Write a descriptive prompt. GPT Image 2 follows natural language instructions closely — describe layout, style, text overlays, and transparency needs.
- Optionally set output_resolution to 1k, 2k, or 4k. Default is 1k.
- The response returns a task_id. Poll the task status endpoint until the task completes, then retrieve the output URL.

Example request:
curl -X POST https://runapi.ai/v1/text_to_image \
  -H "Authorization: Bearer $RUNAPI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-image-2-text-to-image",
    "prompt": "A product photo of a glass perfume bottle on a marble surface, transparent background, studio lighting, the label reads AURORA in gold serif font",
    "output_resolution": "2k",
    "aspect_ratio": "3:4"
  }'

Example response:
{
  "task_id": "tsk_abc123",
  "status": "pending",
  "model": "gpt-image-2-text-to-image"
}
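
The curl request above can also be sketched in Python using only the standard library. This is a minimal sketch, not an official client: the function names build_text_to_image_request and submit are hypothetical, while the URL, headers, and field names are taken from the curl example.

```python
import json
import urllib.request

# Endpoint taken from the requirements list above.
RUNAPI_TEXT_TO_IMAGE_URL = "https://runapi.ai/v1/text_to_image"


def build_text_to_image_request(api_key, prompt, output_resolution="1k", aspect_ratio=None):
    """Build (but do not send) the POST request for a GPT Image 2 generation task."""
    payload = {
        "model": "gpt-image-2-text-to-image",
        "prompt": prompt,
        "output_resolution": output_resolution,
    }
    # aspect_ratio is optional; leave it out so the API default applies.
    if aspect_ratio is not None:
        payload["aspect_ratio"] = aspect_ratio
    return urllib.request.Request(
        RUNAPI_TEXT_TO_IMAGE_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def submit(request):
    """Send the request and return the parsed JSON body (expected to include task_id)."""
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```

Building the request separately from sending it makes the payload easy to inspect before any image is billed.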
HOW IT WORKS

Use GPT Image in Hermes Agent in three steps

1

Configure RunAPI

Set the RUNAPI_API_KEY environment variable in your shell profile. If the custom:runapi provider is already configured in Hermes Agent for chat, the same key and base_url work for GPT Image — no additional setup needed.

export RUNAPI_API_KEY=runapi_xxx
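
If a script calls the API directly, it can read the same variable and fail early when it is missing. The helper below is a small sketch; load_runapi_key is a hypothetical name, and only the RUNAPI_API_KEY variable comes from this guide.

```python
import os


def load_runapi_key(environ=os.environ):
    """Return the RunAPI key, or raise a clear error if it is not set."""
    key = environ.get("RUNAPI_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "RUNAPI_API_KEY is not set; export it in your shell profile first"
        )
    return key
```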
2

Call GPT Image 2

Send a POST request to the text_to_image endpoint with model set to gpt-image-2-text-to-image. Include a descriptive prompt with layout and style instructions. Set output_resolution to 2k or 4k for higher detail. For editing existing images, use the edit_image endpoint with gpt-image-2-image-to-image and provide source_image_urls.

POST /v1/text_to_image
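
For the editing case, the request body might be assembled as in the sketch below. The model name and the source_image_urls field come from this page; the /v1/edit_image path is an assumption made by analogy with /v1/text_to_image, and build_edit_image_payload is a hypothetical helper, so check the RunAPI docs before relying on either.

```python
# Assumed path, by analogy with /v1/text_to_image; not confirmed by this page.
EDIT_IMAGE_PATH = "/v1/edit_image"


def build_edit_image_payload(prompt, source_image_urls, output_resolution="1k"):
    """Build the JSON body for an image edit with GPT Image 2."""
    urls = list(source_image_urls)
    if not urls:
        raise ValueError("source_image_urls must contain at least one URL")
    return {
        "model": "gpt-image-2-image-to-image",
        "prompt": prompt,
        "source_image_urls": urls,
        "output_resolution": output_resolution,
    }
```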
3

Get the result

The API returns a task_id immediately. Poll the task status endpoint until the status changes to completed, then retrieve the output image URL from the response. GPT Image 2 typically completes within 10–30 seconds depending on resolution.

task_id: tsk_abc123
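
The polling step can be sketched as a loop with a timeout. This page does not give the task status endpoint, so the sketch takes a caller-supplied fetch_status function instead of guessing a URL. The "pending" and "completed" values come from this page; the "failed" status is an assumption.

```python
import time


def poll_task(task_id, fetch_status, interval=2.0, timeout=120.0,
              sleep=time.sleep, clock=time.monotonic):
    """Call fetch_status(task_id) until the task completes, fails, or times out.

    fetch_status is supplied by the caller and must return the task as a dict
    with a "status" key. The completed task dict is returned unchanged so the
    caller can read the output URL from it.
    """
    deadline = clock() + timeout
    while True:
        task = fetch_status(task_id)
        status = task.get("status")
        if status == "completed":
            return task
        if status == "failed":  # assumed failure status, not confirmed by this page
            raise RuntimeError(f"task {task_id} failed: {task}")
        if clock() >= deadline:
            raise TimeoutError(f"task {task_id} still {status!r} after {timeout}s")
        sleep(interval)
```

Given the 10–30 second completion time mentioned above, a 2 second interval keeps the number of status calls small.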
PARAMETERS

GPT Image API parameters

Parameter | Type | Description
model | string | Required. gpt-image-2-text-to-image for generation, gpt-image-2-image-to-image for editing.
prompt | string | Required. Natural language description of the desired image. Supports detailed instructions for layout, text overlays, and style.
output_resolution | string | Optional. Output resolution: 1k (default), 2k, or 4k. Higher resolution costs more per image.
aspect_ratio | string | Optional. Defaults to auto. Supports 1:1, 3:2, 2:3, 4:3, 3:4, 16:9, 9:16, and more.
source_image_urls | array | Required for the edit_image endpoint. One or more URLs of source images to edit.
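
The parameter rules above can be checked on the client before a request is sent. The sketch below is illustrative only: validate_params is a hypothetical helper, it covers just the rules listed in the table, and it does not check aspect_ratio because the supported list is open-ended.

```python
ALLOWED_RESOLUTIONS = {"1k", "2k", "4k"}

# Model expected for each endpoint, per the table above.
MODEL_BY_ENDPOINT = {
    "text_to_image": "gpt-image-2-text-to-image",
    "edit_image": "gpt-image-2-image-to-image",
}


def validate_params(endpoint, params):
    """Return a list of problems; an empty list means the documented checks pass."""
    expected_model = MODEL_BY_ENDPOINT.get(endpoint)
    if expected_model is None:
        return [f"unknown endpoint: {endpoint}"]
    problems = []
    if params.get("model") != expected_model:
        problems.append(f"model should be {expected_model}")
    prompt = params.get("prompt")
    if not isinstance(prompt, str) or not prompt.strip():
        problems.append("prompt is required")
    # output_resolution is optional and defaults to 1k.
    if params.get("output_resolution", "1k") not in ALLOWED_RESOLUTIONS:
        problems.append("output_resolution must be 1k, 2k, or 4k")
    if endpoint == "edit_image" and not params.get("source_image_urls"):
        problems.append("source_image_urls is required for edit_image")
    return problems
```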

What is GPT Image on Hermes Agent?

GPT Image 2 treats prompts as production briefs rather than loose keyword lists. It includes a reasoning step before generating, which helps it follow structured instructions for layout, text placement, and composition. It tends to work best with simple, clearly structured prompts; complex multi-pass refinements can introduce noise patterns. Hermes Agent calls it through the RunAPI custom provider.

GPT Image use cases

Product photography with transparent backgrounds

Generate product shots on transparent backgrounds for compositing into marketing materials, catalogs, or e-commerce listings without manual masking.

Social media campaign graphics

Create social media visuals with embedded text, brand colors, and consistent styling across multiple campaign images. Specify the exact text in the prompt.

Cinematic stills for video conversion

Generate video-ready first frames and cinematic stills that can serve as keyframes for video generation workflows or standalone editorial illustrations.

Hermes Agent general setup

Not configured yet? Start with the RunAPI setup guide for Hermes Agent.

Hermes Agent setup guide →

GPT Image model catalog

See all GPT Image variants, pricing, and API docs.

GPT Image models →

Try GPT Image in Hermes Agent today.

Get a free RunAPI key, configure the custom:runapi provider, and start generating and editing images with OpenAI GPT Image 2.