Add Flux, Imagen 4, and GPT Image to OpenClaw — No Extra Skills
The same API key your OpenClaw agent uses for chat also calls Flux Kontext, Imagen 4, Seedream, and GPT Image 2 — 10+ image models, no extra skill to install. Image cost starts at 2 cents per generation, pay-as-you-go.
Install the RunAPI CLI and generate an image in this OpenClaw workspace.
- Run: npx runapi
- It will prompt you for an API key. Sign up at https://runapi.ai if you don't have one.
- Then run: npx runapi run text_to_image --model flux-kontext-pro --prompt "A red cube on a white table"
- Confirm the image URL appears in the output.
Available image models: flux-kontext-pro, imagen-4, seedream-5-lite-text-to-image, gpt-image-2-text-to-image.
Four image models, one API key, zero extra skills
Each model covers a different production need. Flux Kontext handles brand-consistent edits and text rendering. Imagen 4 produces photorealistic output with strong prompt adherence. Seedream 5 Lite runs fast at low cost for high-volume workflows. GPT Image 2 excels at instruction-following edits and compositing. All four are callable from the same RunAPI endpoint your OpenClaw agent already uses for chat.
Flux Kontext: text rendering, brand consistency, and in-context editing. Strong for marketing assets where text overlays and style references matter. From 2.5 cents per image.
Imagen 4: photorealistic generation with high prompt fidelity. Fast, standard, and ultra tiers let you trade speed for detail. From 2 cents per image.
Seedream 5 Lite: fast, low-cost generation for bulk workflows. Handles text-to-image and image-to-image at 2.75 cents per call, suited for prototyping and iteration.
GPT Image 2: instruction-driven editing and compositing. Best for tasks where the prompt describes a transformation, such as background removal, style transfer, or object placement.
Generate images in OpenClaw through RunAPI
Configure RunAPI in OpenClaw
If you have not set up RunAPI in OpenClaw yet, follow the OpenClaw setup guide. Add the RunAPI provider with baseUrl https://runapi.ai/v1 and your RUNAPI_API_KEY environment variable.
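As a minimal sketch of the two settings named above, the Python below reads `RUNAPI_API_KEY` from the environment and pairs it with the base URL. The function name `load_runapi_config` and the returned dictionary keys are illustrative assumptions, not OpenClaw's or RunAPI's actual configuration schema.

```python
import os

RUNAPI_BASE_URL = "https://runapi.ai/v1"  # base URL given in the setup step


def load_runapi_config(env=None):
    """Return provider settings, failing early if the key is missing.

    The dictionary keys are hypothetical; OpenClaw's real provider schema
    may use different names.
    """
    env = os.environ if env is None else env
    api_key = env.get("RUNAPI_API_KEY", "").strip()
    if not api_key:
        raise RuntimeError("RUNAPI_API_KEY is not set")
    return {"base_url": RUNAPI_BASE_URL, "api_key": api_key}
```

Failing at load time keeps a missing key from surfacing later as an opaque authentication error in the middle of an agent session.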
Send an image generation request
Use the RunAPI task endpoint from your agent code or a direct HTTP call. Set the model field to the image model slug such as flux-kontext-pro, imagen-4, or gpt-image-2. The request body follows the same JSON pattern as LLM calls.
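A direct HTTP call might be assembled as in the sketch below. It only builds the request and does not send it. The `/tasks` path and the Bearer authorization header are assumptions made for illustration, since this page names a task endpoint without giving its URL; check the RunAPI documentation for the real values.

```python
import json
import urllib.request

# Hypothetical path: the page mentions a "task endpoint" but not its URL.
TASKS_PATH = "/tasks"


def build_image_task_request(base_url, api_key, model, prompt):
    """Build (but do not send) a POST request for an image task."""
    body = json.dumps({"model": model, "prompt": prompt}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + TASKS_PATH,
        data=body,
        method="POST",
        headers={
            # Bearer auth is an assumption about how the key is sent.
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
```

Passing the returned object to `urllib.request.urlopen` would send it; keeping construction separate from sending makes the request easy to inspect or log first.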
Poll and retrieve the result
Image tasks return a task ID. Poll the task status endpoint or use a webhook callback. When the task completes, the response includes the generated image URL. RunAPI SDKs and the CLI handle polling automatically.
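If you poll by hand instead of using an SDK or the CLI, the loop could look like the sketch below. The caller supplies `fetch_status`, a function that returns the task as a dictionary. The `status` and `image_url` keys and the `completed` and `failed` values are assumed names, not confirmed RunAPI response fields.

```python
import time


def wait_for_image(fetch_status, task_id, interval=2.0, timeout=120.0,
                   sleep=time.sleep, clock=time.monotonic):
    """Poll until the task finishes and return the image URL.

    fetch_status(task_id) must return a dict. The "status" and "image_url"
    keys and the "completed"/"failed" values are assumed names.
    """
    deadline = clock() + timeout
    while True:
        task = fetch_status(task_id)
        if task.get("status") == "completed":
            return task["image_url"]
        if task.get("status") == "failed":
            raise RuntimeError(f"task {task_id} failed")
        if clock() >= deadline:
            raise TimeoutError(f"task {task_id} still pending after {timeout}s")
        sleep(interval)
```

The `sleep` and `clock` parameters exist only so the loop can be exercised in tests without real waiting.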
Flux Kontext vs Imagen 4 vs Seedream vs GPT Image 2
| Comparison point | Flux Kontext Pro | Imagen 4 | Seedream 5 Lite | GPT Image 2 |
|---|---|---|---|---|
| Best default use | Brand-consistent edits, text overlays, logo placement, and in-context image modification with style references. | Photorealistic generation from detailed prompts, product photography, and high-fidelity single-image output. | Fast bulk generation and prototyping where speed and cost matter more than maximum detail. | Instruction-driven compositing, background changes, style transfers, and multi-step image editing. |
| Input types | Text prompt, reference image for in-context editing, style references. | Text prompt. Fast, standard, and ultra quality tiers. | Text prompt, image-to-image with a source image URL. | Text prompt for generation, text plus image URL for editing and compositing. |
| Output quality | High detail with accurate text rendering. Strong at preserving brand elements across edits. | Photorealistic with strong prompt adherence. Ultra tier for maximum detail. | Good quality at speed. Suitable for drafts, thumbnails, and iteration loops. | High quality with strong instruction following. Best for edits that require understanding spatial relationships. |
| Speed | Standard generation speed. Suitable for interactive and batch workflows. | Fast tier available for near-instant output. Standard and ultra tiers trade speed for quality. | Fastest among the four. Optimized for high-volume pipelines. | Standard speed. Slightly slower for complex multi-step edits. |
| Cost per image | From 2.5 cents (Pro). Max tier at 5 cents for higher fidelity. | From 2 cents (Fast) to 6 cents (Ultra). Standard at 4 cents. | 2.75 cents per image for both text-to-image and image-to-image. | From 3 cents. Resolution-based pricing. |
| Best for OpenClaw agents | When the agent workflow involves brand assets, marketing images, or text-heavy visuals. | When the agent needs photorealistic output from a natural language description. | When the agent generates many images per session and cost or speed is the priority. | When the agent edits existing images based on user instructions. |
Marketing asset generation
OpenClaw agents can generate product images, social media visuals, and ad creatives by calling Flux Kontext or Imagen 4. The agent writes the prompt based on conversation context and retrieves the finished image in the same session.
Automated image editing
Pass an existing image URL to GPT Image 2 or Seedream with an editing instruction. The agent can remove backgrounds, swap styles, or composite elements without manual design tools.
High-volume image iteration
OpenClaw agents that prototype UI components, social ads, or product variations benefit from Seedream 5 Lite's speed and low per-image cost. Generate 20 drafts per session, then route the selected prompt to Flux Kontext Pro for the final output.
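To put a number on the draft-then-final pattern, the sketch below prices a session using the per-image rates quoted on this page (2.75 cents for Seedream 5 Lite, 2.5 cents for Flux Kontext Pro). The function name is illustrative, and the rates are the published figures at the time of writing, so treat the result as an estimate.

```python
from decimal import Decimal

# Per-image prices in cents, as quoted on this page at publication time.
PRICE_CENTS = {
    "seedream-5-lite-text-to-image": Decimal("2.75"),
    "flux-kontext-pro": Decimal("2.5"),
}


def session_cost_cents(drafts, finals):
    """Cost of `drafts` Seedream images plus `finals` Flux Kontext Pro images."""
    return (drafts * PRICE_CENTS["seedream-5-lite-text-to-image"]
            + finals * PRICE_CENTS["flux-kontext-pro"])


print(session_cost_cents(20, 1))  # prints 57.50
```

Twenty drafts plus one final render therefore come to 57.5 cents at these rates, before any resolution or tier adjustments.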
Generate images through the RunAPI task endpoint
Use the same RunAPI key and task lifecycle for every image model. Change the model slug to switch between Flux, Imagen, Seedream, and GPT Image. The response returns a task ID for polling.
{
  "model": "flux-kontext-pro",
  "prompt": "A futuristic city skyline at sunset, photorealistic, 8K detail"
}

{
  "model": "imagen-4",
  "prompt": "A golden retriever wearing an astronaut suit, studio lighting"
}

{
  "model": "gpt-image-2",
  "prompt": "Remove the background and add a tropical beach",
  "image_url": "https://example.com/photo.jpg"
}
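The three bodies share one shape, which the following sketch captures in a small helper. The helper name `image_payload` is hypothetical. The `model`, `prompt`, and `image_url` field names come from the examples above, and the sketch assumes `image_url` is simply omitted for pure text-to-image requests.

```python
import json


def image_payload(model, prompt, image_url=None):
    """Return a request body; image_url is included only for edit requests."""
    if not model or not prompt:
        raise ValueError("model and prompt are required")
    payload = {"model": model, "prompt": prompt}
    if image_url is not None:
        payload["image_url"] = image_url
    return payload


bodies = [
    image_payload("flux-kontext-pro", "A futuristic city skyline at sunset"),
    image_payload("imagen-4", "A golden retriever wearing an astronaut suit"),
    image_payload("gpt-image-2", "Remove the background and add a tropical beach",
                  image_url="https://example.com/photo.jpg"),
]
for body in bodies:
    print(json.dumps(body))  # one JSON body per line
```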
The same API key generates video and music
Video generation
Generate video clips with Kling 3.0, Veo 3, and Seedance 2.0. Text-to-video and image-to-video endpoints follow the same async task lifecycle as image generation.
Music creation
Create music tracks with Suno v4, v4.5, and v5. Describe the genre, mood, and lyrics in the prompt. The agent receives audio URLs when the task completes.
Image generation costs start at 2 cents per image
RunAPI uses pay-as-you-go pricing with no monthly subscription. Each image model has a per-generation cost based on the model tier and output resolution. Flux Kontext Pro starts at 2.5 cents, Imagen 4 Fast at 2 cents, Seedream 5 Lite at 2.75 cents, and GPT Image 2 at 3 cents. Check the live pricing page for current rates across all 113+ models.
Pricing methodology
Prices on this page reflect the RunAPI pay-as-you-go rates at the time of publication. RunAPI sets prices based on compute cost plus a transparent margin. Actual per-image cost may vary by resolution, quality tier, or model-specific options. Always confirm current pricing on the live pricing page before production deployment.
OpenClaw image generation FAQ
How do I choose between Flux Kontext and Imagen 4 for my OpenClaw workflow?
Use Flux Kontext Pro when the agent produces brand assets, marketing images, or visuals that include text — it preserves style and renders text accurately. Use Imagen 4 when the agent needs photorealistic output from a natural language description. Both are available from the same RunAPI endpoint.
Which image models work with OpenClaw through RunAPI?
Supported models include Flux Kontext Pro and Max; Imagen 4 in fast, standard, and ultra tiers; Seedream 5 Lite; GPT Image 2; Nano Banana; and several more. Over 10 image models across 6 providers are available through the same API key. The full list is on the RunAPI pricing page and updates automatically as new models are added.
How much does image generation cost through RunAPI?
Prices start at 2 cents per image with Imagen 4 Fast. Flux Kontext Pro costs 2.5 cents, Seedream 5 Lite costs 2.75 cents, and GPT Image 2 starts at 3 cents. All pricing is pay-as-you-go with no monthly minimum.
Does RunAPI charge separately for image calls versus LLM calls?
No. RunAPI uses a single credits balance across all modalities. Image, video, music, and LLM calls all draw from the same account balance. There is no per-modality subscription or minimum spend. You can monitor spending per model in the RunAPI dashboard.
How do I switch between image models in my OpenClaw workflow?
Change the model field in the request body. The endpoint, API key, task lifecycle, and polling pattern stay the same. Your agent can route to different models based on the task without changing any integration code. For example, route drafts to seedream-5-lite-text-to-image and final exports to flux-kontext-pro.
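A routing rule like the one in that example can be as small as the sketch below. The stage names `draft` and `final` and both function names are hypothetical; the two model slugs are the ones listed on this page.

```python
DRAFT_MODEL = "seedream-5-lite-text-to-image"
FINAL_MODEL = "flux-kontext-pro"


def pick_model(stage):
    """Map a workflow stage to a model slug; stage names are hypothetical."""
    routes = {"draft": DRAFT_MODEL, "final": FINAL_MODEL}
    try:
        return routes[stage]
    except KeyError:
        raise ValueError(f"unknown stage: {stage!r}") from None


def build_body(stage, prompt):
    # Only the model field changes between stages.
    return {"model": pick_model(stage), "prompt": prompt}
```

Calling `build_body("draft", prompt)` and later `build_body("final", prompt)` with the same prompt yields two bodies that differ only in the `model` field.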
Is RunAPI cheaper than fal.ai or Replicate for image generation?
At the time of publication, RunAPI image generation starts at 2 cents per image, compared with roughly 4 cents at fal.ai and roughly 3.5 cents at Replicate. Pricing varies by model and resolution, so check the RunAPI pricing page for current per-model rates.
Add image generation to OpenClaw in minutes.
One RunAPI key gives your OpenClaw agent access to Flux Kontext, Imagen 4, Seedream, GPT Image 2, and 100+ more models across image, video, music, and LLM. No extra skills, no extra billing accounts.