Which Wan endpoints can I call from OpenClaw?

All of them: text_to_video, image_to_video, speech_to_video, text_to_image (Wan 2.7 Image), edit_video, and animate. Each endpoint takes its own model slug, for example wan-2.7-text-to-video for video generation and wan-2.7-image for image generation up to 4K.

What is the difference between Wan 2.5, 2.6, and 2.7?

Wan 2.5 introduced 1080p output. Wan 2.6 added video editing (R2V) and flash variants for faster generation. Wan 2.7 added image generation (wan-2.7-image and wan-2.7-image-pro, up to 4K), video editing through wan-2.7-edit-video, and improved text-to-video quality that leads the Artificial Analysis leaderboard.

How does speech-to-video work with Wan?

Use wan-2.2-a14b-speech-to-video-turbo with source_audio_url (the audio file) and source_image_url (the face to animate). Wan generates a lip-synced video in which the face speaks the audio. Supported output resolutions are 480p, 580p, and 720p.
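
As a minimal sketch, the request body for this call could be assembled as shown here. The model slug, the two URL fields, and the three resolutions come from the answer above. The helper name, the 720p default, and the use of output_resolution as the field name on this endpoint (borrowed from text_to_video) are assumptions, not confirmed RunAPI behavior.

```python
SPEECH_MODEL = "wan-2.2-a14b-speech-to-video-turbo"
SPEECH_RESOLUTIONS = ("480p", "580p", "720p")


def build_speech_to_video_payload(audio_url, image_url, resolution="720p"):
    """Return a JSON-ready body for a speech-to-video task (hypothetical helper)."""
    if resolution not in SPEECH_RESOLUTIONS:
        raise ValueError(f"resolution must be one of {SPEECH_RESOLUTIONS}")
    return {
        "model": SPEECH_MODEL,
        "source_audio_url": audio_url,   # the audio file to lip-sync
        "source_image_url": image_url,   # the face to animate
        # Assumption: same field name as on the text_to_video endpoint.
        "output_resolution": resolution,
    }
```

Rejecting an unsupported resolution locally avoids spending a request on a task that would likely be refused.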

Can I generate images with Wan?

Yes. Wan 2.7 added text_to_image endpoints. Use wan-2.7-image for standard generation or wan-2.7-image-pro for higher quality. Both support aspect ratios from 1:1 to 21:9 and output resolutions of 1k, 2k, or 4k.

Is Wan open-source? Can I self-host it?

Yes. Wan is released under Apache 2.0 by Alibaba and the model weights are publicly available. Through RunAPI you skip the GPU setup — one API call generates video or images. If you need a self-hosted pipeline for privacy, the same weights run on your own infrastructure.

OPENCLAW + WAN

Use Wan in OpenClaw.

Wan is Alibaba's open-source video and image generation model, Apache 2.0 licensed and ranked #1 on the Artificial Analysis text-to-video leaderboard. It spans 20+ variants from Wan 2.2 through 2.7 — text-to-video, image-to-video, speech-to-video with lip-sync, video editing via R2V, and image generation up to 4K. OpenClaw agents call any Wan endpoint through the same RunAPI key used for chat.

one API key · 20+ Wan variants · Apache 2.0 open source

Use RunAPI to generate a video with Alibaba Wan 2.7.

Requirements:
- Call the RunAPI text_to_video endpoint at https://runapi.ai/api/v1/task/text_to_video.
- Set model to "wan-2.7-text-to-video".
- Read the API key from the RUNAPI_API_KEY environment variable.
- Set output_resolution to "1080p" for full HD output.
- Include a detailed prompt describing the scene, camera motion, and lighting.
- The request is asynchronous. Poll the task status endpoint with the returned task_id until status is "completed".
- When done, read the video URL from the response output.

curl -X POST https://runapi.ai/api/v1/task/text_to_video \
  -H "Authorization: Bearer $RUNAPI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "wan-2.7-text-to-video",
    "prompt": "A drone shot rising over terraced rice paddies at golden hour, mist rolling through the valleys, slow upward camera tilt",
    "output_resolution": "1080p"
  }'

{
  "task_id": "tsk_abc123",
  "status": "pending",
  "model": "wan-2.7-text-to-video"
}
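
For agents that run Python rather than a shell, the same submission can be sketched with the standard library alone. The URL, headers, body fields, and the task_id response field mirror the curl example above. The function names and the 30-second timeout are illustrative assumptions.

```python
import json
import os
import urllib.request

SUBMIT_URL = "https://runapi.ai/api/v1/task/text_to_video"


def build_submit_request(prompt, api_key, resolution="1080p"):
    """Build (but do not send) the POST request from the curl example."""
    body = {
        "model": "wan-2.7-text-to-video",
        "prompt": prompt,
        "output_resolution": resolution,
    }
    return urllib.request.Request(
        SUBMIT_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def submit(prompt):
    """Send the request and return the task_id from the JSON response."""
    request = build_submit_request(prompt, os.environ["RUNAPI_API_KEY"])
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)["task_id"]
```

Calling submit("A drone shot rising over terraced rice paddies at golden hour") should return an identifier such as the tsk_abc123 shown in the sample response, provided RUNAPI_API_KEY is set.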

HOW IT WORKS

Use Wan in OpenClaw in three steps

Configure RunAPI

Set RUNAPI_API_KEY in your environment. If you already configured RunAPI in OpenClaw for chat or image generation, the same key works for all Wan endpoints — no extra provider setup needed.

export RUNAPI_API_KEY=runapi_xxx

Call a Wan endpoint

Send a POST request to text_to_video with model set to wan-2.7-text-to-video and output_resolution set to 720p or 1080p. For image-to-video, use wan-2.7-image-to-video with a first_frame_image_url. For speech-driven video, use wan-2.2-a14b-speech-to-video-turbo with source_audio_url and source_image_url.

POST /api/v1/task/text_to_video
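
The three call types described in this step can be captured in a small lookup table, sketched here. The model slugs and input field names are taken from the text. The image_to_video and speech_to_video URL paths are assumed to follow the same pattern as text_to_video, and the helper name and short kind labels are hypothetical.

```python
BASE_URL = "https://runapi.ai/api/v1/task"

# kind -> (endpoint name, model slug, required input fields)
WAN_CALLS = {
    "text": ("text_to_video", "wan-2.7-text-to-video", ("prompt",)),
    "image": ("image_to_video", "wan-2.7-image-to-video", ("first_frame_image_url",)),
    "speech": (
        "speech_to_video",
        "wan-2.2-a14b-speech-to-video-turbo",
        ("source_audio_url", "source_image_url"),
    ),
}


def build_call(kind, **fields):
    """Return (url, json_body) for one call type, checking required inputs."""
    endpoint, model, required = WAN_CALLS[kind]
    missing = [name for name in required if name not in fields]
    if missing:
        raise ValueError(f"{kind} call is missing: {', '.join(missing)}")
    # Assumption: every endpoint lives under the same /api/v1/task/ prefix.
    return f"{BASE_URL}/{endpoint}", {"model": model, **fields}
```

Extra fields such as output_resolution pass straight through into the body, so one helper can serve all three request shapes.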

Poll for the result

The endpoint returns a task_id immediately. Poll the task status endpoint until the status is completed, then read the output video or image URL from the response. RunAPI SDKs and the CLI handle polling automatically.

GET /api/v1/task/text_to_video/tsk_abc123
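
When no SDK or CLI is available, the polling step can be sketched as a loop with a timeout. The status URL pattern and the "completed" status come from this page. The "failed" status name, the bearer header on the GET request, the 5-second interval, and the 10-minute limit are assumptions. The function returns the whole task object because the name of the output field is not specified here.

```python
import json
import os
import time
import urllib.request

STATUS_URL = "https://runapi.ai/api/v1/task/text_to_video/{task_id}"


def fetch_status(task_id):
    """GET the task status as a dict (assumes the same bearer token as the POST)."""
    request = urllib.request.Request(
        STATUS_URL.format(task_id=task_id),
        headers={"Authorization": f"Bearer {os.environ['RUNAPI_API_KEY']}"},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def wait_for_task(task_id, fetch=fetch_status, interval=5.0, timeout=600.0,
                  sleep=time.sleep):
    """Poll until the task reports "completed", then return the final task dict."""
    deadline = time.monotonic() + timeout
    while True:
        task = fetch(task_id)
        status = task.get("status")
        if status == "completed":
            return task
        if status == "failed":  # Assumption: name of the failure status.
            raise RuntimeError(f"task {task_id} failed: {task}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"task {task_id} still {status!r} after {timeout}s")
        sleep(interval)
```

The fetch and sleep parameters exist so the loop can be exercised with stand-ins instead of real network calls.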

PARAMETERS

Wan text_to_video API parameters

Parameter	Type	Description
`model`	`string`	Required. wan-2.7-text-to-video, wan-2.6-text-to-video, wan-2.5-text-to-video, wan-2.2-a14b-text-to-video-turbo, or wan-2.7-r2v.
`prompt`	`string`	Required. Text description of the desired video scene, including camera motion, lighting, and subject detail.
`output_resolution`	`string`	Optional. 720p or 1080p for Wan 2.5+. Wan 2.2 also accepts 480p and 580p. Defaults to 720p.
`aspect_ratio`	`string`	Optional. For wan-2.7-r2v only. Accepted values: 16:9, 9:16, 1:1, 4:3, 3:4.
`duration_seconds`	`integer`	Optional. For wan-2.7-r2v only. Video length in seconds, 2 to 10.
`seed`	`integer`	Optional. Reproducibility seed for deterministic output.
`callback_url`	`string`	Optional. Webhook URL that receives a POST when the task completes.
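
The constraints in this table can be checked before a request is sent. The sketch below encodes the rows as written; it reads "Wan 2.2 also accepts 480p and 580p" as an addition to 720p and 1080p, which is an interpretation, and the function name is hypothetical.

```python
R2V_MODEL = "wan-2.7-r2v"
MODELS = {
    "wan-2.7-text-to-video",
    "wan-2.6-text-to-video",
    "wan-2.5-text-to-video",
    "wan-2.2-a14b-text-to-video-turbo",
    R2V_MODEL,
}
ASPECT_RATIOS = {"16:9", "9:16", "1:1", "4:3", "3:4"}


def validate_text_to_video(params):
    """Return a list of problems; an empty list means the body fits the table."""
    problems = []
    model = params.get("model")
    if model not in MODELS:
        problems.append(f"unknown model: {model!r}")
    if not params.get("prompt"):
        problems.append("prompt is required")
    # 480p and 580p are listed for Wan 2.2 only.
    resolutions = {"720p", "1080p"}
    if isinstance(model, str) and model.startswith("wan-2.2"):
        resolutions |= {"480p", "580p"}
    if params.get("output_resolution", "720p") not in resolutions:
        problems.append("output_resolution is not accepted by this model")
    # aspect_ratio and duration_seconds are documented for wan-2.7-r2v only.
    for name in ("aspect_ratio", "duration_seconds"):
        if name in params and model != R2V_MODEL:
            problems.append(f"{name} applies to {R2V_MODEL} only")
    if params.get("aspect_ratio", "16:9") not in ASPECT_RATIOS:
        problems.append("aspect_ratio must be one of 16:9, 9:16, 1:1, 4:3, 3:4")
    duration = params.get("duration_seconds", 2)
    if not isinstance(duration, int) or not 2 <= duration <= 10:
        problems.append("duration_seconds must be an integer from 2 to 10")
    return problems
```

Returning every problem at once, rather than raising on the first, lets an agent correct a request in a single pass.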

What is Wan on OpenClaw?

Wan by Alibaba is an open-source (Apache 2.0) video model ranked at the top of the Artificial Analysis text-to-video leaderboard. It offers first-frame and last-frame control for endpoint-anchored generation, multi-shot video with character consistency, and native audio including lip-synced speech-to-video. OpenClaw agents access all 20+ Wan variants through RunAPI with a single API key.

Wan use cases

Storyboard-to-video workflow

Use first-frame and last-frame anchoring to turn storyboard panels into video sequences. Each clip starts and ends on your keyframes, maintaining visual continuity across a multi-shot project.

Virtual presenters and brand mascots

Generate talking-head video from a face image and audio file using Wan's speech-to-video endpoint. The model handles lip sync and head movement for consistent brand spokesperson content.

Multi-shot sequences with character consistency

Build dialogue-heavy or narrative content where the same character appears across multiple clips. Wan's temporal consistency keeps faces and outfits stable between shots.

FAQ

Wan + OpenClaw questions

OpenClaw general setup

Not configured yet? Start with the RunAPI setup guide for OpenClaw.

Wan model catalog

See all 20+ Wan variants, pricing tiers, and endpoint docs.

Try Wan in OpenClaw today.

Get a free RunAPI key, paste the prompt into OpenClaw, and generate video with the #1 ranked open-source model — text-to-video, image-to-video, or speech-to-video.
