---
title: "Use GPT Image in Hermes Agent via RunAPI: Image API Guide"
url: "https://runapi.ai/hermes-gpt-image.md"
canonical: "https://runapi.ai/hermes-gpt-image"
locale: "en"
model: "gpt-image"
---

# Use GPT Image in Hermes Agent

GPT Image 2 is OpenAI's dedicated image generation model. It supports text-to-image generation and instruction-based image editing, with output resolution up to 4K and transparent backgrounds. Hermes Agent calls it through the same RunAPI custom provider and API key used for chat, with no ComfyUI or GPU setup needed.

## API example

```bash
curl -X POST https://runapi.ai/v1/text_to_image \
  -H "Authorization: Bearer $RUNAPI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-image-2-text-to-image",
    "prompt": "A product photo of a glass perfume bottle on a marble surface, transparent background, studio lighting, the label reads AURORA in gold serif font",
    "output_resolution": "2k",
    "aspect_ratio": "3:4"
  }'
```

### Response

```json
{
  "task_id": "tsk_abc123",
  "status": "pending",
  "model": "gpt-image-2-text-to-image"
}
```

## How it works

1. **Configure RunAPI:** Set the `RUNAPI_API_KEY` environment variable in your shell profile. If the `custom:runapi` provider is already configured in Hermes Agent for chat, the same key and `base_url` work for GPT Image, with no additional setup.
2. **Call GPT Image 2:** Send a POST request to the `text_to_image` endpoint with `model` set to `gpt-image-2-text-to-image`. Include a descriptive prompt with layout and style instructions. Set `output_resolution` to `2k` or `4k` for higher detail. To edit existing images, use the `edit_image` endpoint with `gpt-image-2-image-to-image` and provide `source_image_urls`.
3. **Get the result:** The API returns a `task_id` immediately. Poll the task status endpoint until the status changes to `completed`, then retrieve the output image URL from the response. GPT Image 2 typically completes within 10–30 seconds, depending on resolution.
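
The polling step can be sketched in Python as below. This is a minimal illustration, not RunAPI's client: the guide does not name the task status endpoint, so `poll_task` (a hypothetical helper) takes a caller-supplied `fetch_status` function that performs the actual GET. The `completed` status comes from the steps above; the `failed` status is an assumption about how errors are reported.

```python
import time
from typing import Callable

# "completed" is the status named in this guide.
TERMINAL_OK = "completed"
# Assumption: a failed task reports "failed"; check the API docs for the real value.
TERMINAL_ERROR = "failed"


def poll_task(
    fetch_status: Callable[[str], dict],
    task_id: str,
    interval_s: float = 2.0,
    timeout_s: float = 120.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call fetch_status(task_id) until the task completes, fails, or times out."""
    deadline = time.monotonic() + timeout_s
    while True:
        task = fetch_status(task_id)
        status = task.get("status")
        if status == TERMINAL_OK:
            return task
        if status == TERMINAL_ERROR:
            raise RuntimeError(f"task {task_id} failed: {task}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"task {task_id} still {status!r} after {timeout_s}s")
        # Wait before asking again so the API is not hammered.
        sleep(interval_s)
```

Passing the fetch function in keeps the loop independent of the endpoint URL and makes it easy to exercise with a stub. With completion times of 10–30 seconds, a 2-second interval and a 120-second timeout are reasonable starting values rather than documented recommendations.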

## Parameters

| Parameter | Type | Description |
|-----------|------|-------------|
| `model` | `string` | Required. `gpt-image-2-text-to-image` for generation, `gpt-image-2-image-to-image` for editing. |
| `prompt` | `string` | Required. Natural language description of the desired image. Supports detailed instructions for layout, text overlays, and style. |
| `output_resolution` | `string` | Optional. Output resolution: `1k` (default), `2k`, or `4k`. Higher resolution costs more per image. |
| `aspect_ratio` | `string` | Optional. Defaults to `auto`. Supports `1:1`, `3:2`, `2:3`, `4:3`, `3:4`, `16:9`, `9:16`, and more. |
| `source_image_urls` | `array` | Required for the `edit_image` endpoint. One or more URLs of source images to edit. |
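
As a rough illustration of how these parameters fit together, the sketch below builds a request body and checks it against the table before anything is sent. `build_payload` is a hypothetical helper, not part of any RunAPI SDK. It leaves `aspect_ratio` unvalidated because the table's list of ratios is open-ended, and it assumes that supplying source images implies the editing model.

```python
from typing import Optional, Sequence

TEXT_MODEL = "gpt-image-2-text-to-image"
EDIT_MODEL = "gpt-image-2-image-to-image"
RESOLUTIONS = ("1k", "2k", "4k")


def build_payload(
    prompt: str,
    output_resolution: Optional[str] = None,
    aspect_ratio: Optional[str] = None,
    source_image_urls: Optional[Sequence[str]] = None,
) -> dict:
    """Return a request body; picks the editing model when source images are given."""
    if not prompt.strip():
        raise ValueError("prompt is required")
    if output_resolution is not None and output_resolution not in RESOLUTIONS:
        raise ValueError(f"output_resolution must be one of {RESOLUTIONS}")

    payload = {
        "model": EDIT_MODEL if source_image_urls else TEXT_MODEL,
        "prompt": prompt,
    }
    # Optional fields are omitted when unset so the API applies its own defaults.
    if output_resolution is not None:
        payload["output_resolution"] = output_resolution
    if aspect_ratio is not None:
        payload["aspect_ratio"] = aspect_ratio
    if source_image_urls:
        payload["source_image_urls"] = list(source_image_urls)
    return payload
```

For example, `build_payload("a cat", output_resolution="2k", aspect_ratio="3:4")` yields a generation body, while passing `source_image_urls` switches the `model` field to the editing variant.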

## FAQ

### Can I use GPT Image 2 in Hermes Agent?

Yes. Hermes Agent calls GPT Image 2 through the RunAPI `text_to_image` endpoint. Set the `model` field to `gpt-image-2-text-to-image` and send the request with the same `RUNAPI_API_KEY` you configured for the `custom:runapi` provider. No ComfyUI or GPU rental is required.

### What is the difference between GPT Image 2 and GPT-4o Image?

GPT Image 2 is OpenAI's dedicated image generation model with higher quality, 4K output, and transparent background support. GPT-4o Image generates images within a chat context but is limited to 1:1, 3:2, or 2:3 aspect ratios. Both are available through RunAPI: use `gpt-image-2-text-to-image` for standalone generation and `gpt-4o-image` for chat-integrated image output.

### How is GPT Image 2 priced differently from GPT-4o Image?

GPT Image 2 is billed per image by output resolution: 1k, 2k, or 4k. GPT-4o Image is billed per image by output count; generating 2 or 4 images in a single request costs more per image. Both use pay-as-you-go billing with no monthly minimum. Check the RunAPI pricing page for current rates.

### Can Hermes Agent edit images with GPT Image 2 instead of ComfyUI?

Yes. Use the `edit_image` endpoint with `model` set to `gpt-image-2-image-to-image`. Pass source images in `source_image_urls` and describe the edit in natural language, for example "remove the background," "add sunglasses," or "change the text to HELLO." No ComfyUI workflow graphs, GPU instances, or inpainting masks are needed.
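
An edit request might be assembled as in the following sketch, which uses only the Python standard library. The URL `https://runapi.ai/v1/edit_image` is an assumption made by analogy with the `text_to_image` URL in the API example, and `build_edit_request` is a hypothetical helper; confirm the real path in the API docs before relying on it.

```python
import json
import urllib.request

# Assumption: the edit endpoint mirrors the text_to_image URL from the API example.
EDIT_URL = "https://runapi.ai/v1/edit_image"


def build_edit_request(
    prompt: str, source_image_urls: list[str], api_key: str
) -> urllib.request.Request:
    """Build (but do not send) a POST request for an instruction-based edit."""
    body = {
        "model": "gpt-image-2-image-to-image",
        "prompt": prompt,  # the edit, described in natural language
        "source_image_urls": source_image_urls,
    }
    return urllib.request.Request(
        EDIT_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

To send it, pass the returned object to `urllib.request.urlopen(req, timeout=30)` with the key read from `RUNAPI_API_KEY`. The response is presumably a task record like the one returned for generation, though this guide only shows that shape for `text_to_image`.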

### Does quality improve or degrade through iterative editing passes?

It can degrade. Users report that repeated refinement passes sometimes introduce noise patterns or shading degradation. For best results, be specific in the first prompt rather than planning on iterative refinement. If you need multi-step editing, consider using Flux Kontext for the refinement stage.

### Can Hermes Agent combine GPT Image with other RunAPI models in a single workflow?

Yes. Hermes Agent can generate an image with GPT Image 2, then upscale it with Topaz or pass it to Flux Kontext for further editing. All models share the same RunAPI key, and the agent handles the chaining.


## Links

- [Hermes Agent setup guide →](https://runapi.ai/hermes-agent)
- [GPT Image models →](https://runapi.ai/models/gpt-image)
- [Model catalog](https://runapi.ai/models)
- [API docs](https://runapi.ai/docs)
