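Use GPT Image in Hermes Agent.
GPT Image 2 is OpenAI's dedicated image generation model. It supports text-to-image generation and instruction-based image editing, outputs at up to 4K resolution, and supports transparent backgrounds. Hermes Agent calls it through the same RunAPI custom provider and API key used for chat, with no ComfyUI or GPU setup required.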
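Use RunAPI to generate images with OpenAI GPT Image 2 through Hermes Agent.
Requirements:
- Use the RunAPI API at https://runapi.ai/v1/text_to_image.
- Read the API key from the RUNAPI_API_KEY environment variable.
- Use the custom:runapi provider already configured in Hermes Agent.
- Set model to "gpt-image-2-text-to-image".
- Write a descriptive prompt. GPT Image 2 follows natural-language instructions closely: describe layout, style, text overlays, and transparency needs.
- Optionally set output_resolution to 1k, 2k, or 4k. The default is 1k.
- The response returns a task_id. Poll the task status endpoint until the task completes, then retrieve the output URL.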
curl -X POST https://runapi.ai/v1/text_to_image \
-H "Authorization: Bearer $RUNAPI_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "gpt-image-2-text-to-image",
"prompt": "A product photo of a glass perfume bottle on a marble surface, transparent background, studio lighting, the label reads AURORA in gold serif font",
"output_resolution": "2k",
"aspect_ratio": "3:4"
}'
{
"task_id": "tsk_abc123",
"status": "pending",
"model": "gpt-image-2-text-to-image"
}
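The same request can be sketched in Python using only the standard library. This is a minimal sketch rather than an official client: the helper names `build_payload` and `submit_text_to_image` are hypothetical, while the endpoint, headers, and field names are copied from the curl example above.

```python
import json
import os
import urllib.request

API_URL = "https://runapi.ai/v1/text_to_image"  # endpoint from the curl example
ALLOWED_RESOLUTIONS = ("1k", "2k", "4k")


def build_payload(prompt, output_resolution="1k", aspect_ratio=None):
    """Build the JSON body for a text-to-image request (hypothetical helper)."""
    if not prompt.strip():
        raise ValueError("prompt must not be empty")
    if output_resolution not in ALLOWED_RESOLUTIONS:
        raise ValueError(f"output_resolution must be one of {ALLOWED_RESOLUTIONS}")
    payload = {
        "model": "gpt-image-2-text-to-image",
        "prompt": prompt,
        "output_resolution": output_resolution,
    }
    # Leave aspect_ratio out unless the caller sets it, so the API default applies.
    if aspect_ratio is not None:
        payload["aspect_ratio"] = aspect_ratio
    return payload


def submit_text_to_image(payload, api_key=None):
    """POST the payload and return the parsed JSON response (expected to hold a task_id)."""
    api_key = api_key or os.environ["RUNAPI_API_KEY"]
    request = urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `submit_text_to_image(build_payload("A glass perfume bottle on marble", "2k", "3:4"))` requires network access and a valid RUNAPI_API_KEY; `build_payload` on its own can be checked offline.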
Use GPT Image in Hermes Agent in three steps
Configure RunAPI
Set the RUNAPI_API_KEY environment variable in your shell profile. If the custom:runapi provider is already configured in Hermes Agent for chat, the same key and base_url work for GPT Image — no additional setup needed.
export RUNAPI_API_KEY=runapi_xxx
Call GPT Image 2
Send a POST request to the text_to_image endpoint with model set to gpt-image-2-text-to-image. Include a descriptive prompt with layout and style instructions. Set output_resolution to 2k or 4k for higher detail. For editing existing images, use the edit_image endpoint with gpt-image-2-image-to-image and provide source_image_urls.
POST /v1/text_to_image
Get the result
The API returns a task_id immediately. Poll the task status endpoint until the status changes to completed, then retrieve the output image URL from the response. GPT Image 2 typically completes within 10–30 seconds depending on resolution.
task_id: tsk_abc123
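The polling step can be sketched as a small loop. This guide does not give the task status endpoint URL or the name of the output URL field, so the sketch takes a caller-supplied `fetch_status` function instead of inventing either. The function name `poll_until_done` is hypothetical, and the `failed` status is an assumption; only `pending` and `completed` appear in the text above.

```python
import time


def poll_until_done(fetch_status, interval=2.0, timeout=120.0, sleep=time.sleep):
    """Call fetch_status() until the task is completed, failed, or timed out.

    fetch_status is supplied by the caller and must return the task's JSON as
    a dict with a "status" key. The status endpoint itself is not specified
    in this guide, so it is deliberately left to the caller.
    """
    deadline = time.monotonic() + timeout
    while True:
        task = fetch_status()
        status = task.get("status")
        if status == "completed":
            return task
        if status == "failed":  # assumed failure status, not confirmed by the guide
            raise RuntimeError(f"task failed: {task}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"task still {status!r} after {timeout} seconds")
        sleep(interval)
```

Given the 10 to 30 second completion time quoted above, a 2 second interval with a 120 second timeout leaves reasonable headroom; both values are illustrative defaults, not documented recommendations.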
GPT Image API parameters
| Parameter | Type | Description |
|---|---|---|
| model | string | Required. gpt-image-2-text-to-image for generation, gpt-image-2-image-to-image for editing. |
| prompt | string | Required. Natural language description of the desired image. Supports detailed instructions for layout, text overlays, and style. |
| output_resolution | string | Optional. Output resolution: 1k (default), 2k, or 4k. Higher resolution costs more per image. |
| aspect_ratio | string | Optional. Defaults to auto. Supports 1:1, 3:2, 2:3, 4:3, 3:4, 16:9, 9:16, and more. |
| source_image_urls | array | Required for edit_image endpoint. One or more URLs of source images to edit. |
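The parameter rules above can be sketched as a request-body builder that covers both generation and editing. This is an illustrative sketch: `build_image_request` is a hypothetical helper, and it only builds the JSON body, since the full URL of the edit_image endpoint is not spelled out in this guide. Optional fields are omitted when unset so that the API defaults apply.

```python
# Model name for each endpoint, as listed in the parameter table.
MODEL_BY_ENDPOINT = {
    "text_to_image": "gpt-image-2-text-to-image",
    "edit_image": "gpt-image-2-image-to-image",
}
ALLOWED_RESOLUTIONS = ("1k", "2k", "4k")


def build_image_request(endpoint, prompt, source_image_urls=None,
                        output_resolution=None, aspect_ratio=None):
    """Return the JSON body for a generation or edit request (hypothetical helper)."""
    if endpoint not in MODEL_BY_ENDPOINT:
        raise ValueError(f"unknown endpoint: {endpoint!r}")
    if not prompt:
        raise ValueError("prompt is required")
    if endpoint == "edit_image" and not source_image_urls:
        raise ValueError("source_image_urls is required for edit_image")
    if output_resolution is not None and output_resolution not in ALLOWED_RESOLUTIONS:
        raise ValueError(f"output_resolution must be one of {ALLOWED_RESOLUTIONS}")

    body = {"model": MODEL_BY_ENDPOINT[endpoint], "prompt": prompt}
    if source_image_urls:
        body["source_image_urls"] = list(source_image_urls)
    if output_resolution is not None:
        body["output_resolution"] = output_resolution
    if aspect_ratio is not None:
        body["aspect_ratio"] = aspect_ratio
    return body
```

Failing early on a missing `source_image_urls` avoids submitting an edit task that the table says cannot succeed.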
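What is GPT Image on Hermes Agent?
GPT Image 2 treats the prompt as a production brief rather than a keyword list. It includes a reasoning step before generation, which helps it follow structured instructions for typography, text placement, and composition. Hermes Agent calls it through the RunAPI custom provider.
GPT Image use cases
Batch image generation from structured prompts
Batch-process structured design briefs through Hermes Agent to generate images for product catalogs, marketing campaigns, or content series. GPT Image 2 follows the layout and style specifications in each brief closely.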
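One way to picture the batch workflow is a function that renders each structured brief into a natural-language prompt and wraps it in a request body. The brief schema used here (`subject`, `layout`, `style`, `text`, `transparent_background`) and the helper names are hypothetical, chosen only for illustration; the `model`, `prompt`, and `output_resolution` fields come from the parameter table.

```python
def brief_to_prompt(brief):
    """Render a structured brief (hypothetical schema) as a natural-language prompt."""
    parts = [brief["subject"].rstrip(".")]
    for label, key in (("Layout", "layout"), ("Style", "style"), ("Text overlay", "text")):
        value = brief.get(key)
        if value:
            parts.append(f"{label}: {value}")
    if brief.get("transparent_background"):
        parts.append("Transparent background")
    return ". ".join(parts) + "."


def briefs_to_payloads(briefs, output_resolution="1k"):
    """Turn a list of briefs into text-to-image request bodies, one per brief."""
    return [
        {
            "model": "gpt-image-2-text-to-image",
            "prompt": brief_to_prompt(brief),
            "output_resolution": output_resolution,
        }
        for brief in briefs
    ]
```

Each resulting body can then be submitted as its own task, and every task is polled separately.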
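Multimodal content pipelines
Chain GPT Image 2 with GPT text models: first use GPT to write a detailed design brief, then have GPT Image 2 generate from it, so the visual output stays closely aligned with the content strategy.
Brand kit generation with transparent assets
Generate brand assets with transparent backgrounds (icons, badges, UI elements) that can be composited directly in design workflows or in downstream Hermes Agent steps.
GPT Image + Hermes Agent FAQ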
Yes. Hermes Agent calls GPT Image 2 through the RunAPI text_to_image endpoint. Set the model field to gpt-image-2-text-to-image and send the request with the same RUNAPI_API_KEY you configured for the custom:runapi provider. No ComfyUI or GPU rental required.
GPT Image 2 is OpenAI's dedicated image generation model with higher quality, 4K output, and transparent background support. GPT-4o Image generates images within a chat context but is limited to 1:1, 3:2, or 2:3 aspect ratios. Both are available through RunAPI — use gpt-image-2-text-to-image for standalone generation and gpt-4o-image for chat-integrated image output.
GPT Image 2 is billed per image by output resolution: 1k, 2k, or 4k. GPT-4o Image is billed per image by output count — generating 2 or 4 images in a single request costs more per image. Both use pay-as-you-go billing with no monthly minimum. Check the RunAPI pricing page for current rates.
Yes. Use the edit_image endpoint with model set to gpt-image-2-image-to-image. Pass source images in source_image_urls and describe the edit in natural language — "remove the background," "add sunglasses," "change the text to HELLO." No ComfyUI workflow graphs, no GPU instance, no inpainting masks needed.
It can degrade. Users report that repeated refinement passes sometimes introduce noise patterns or shading degradation. For best results, be specific in the first prompt rather than planning on iterative refinement. If you need multi-step editing, consider using Flux Kontext for the refinement stage.
Yes. Hermes Agent can generate an image with GPT Image 2, upscale it with Topaz, or pass it to Flux Kontext for further editing. All models share the same RunAPI key and the agent handles the chaining.
Try GPT Image in Hermes Agent now.
Get a free RunAPI key, configure the custom:runapi provider, and start generating and editing images with OpenAI GPT Image 2.