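Use Wan in OpenClaw.
Wan is Alibaba's open-source video and image generation model, released under the Apache 2.0 license and ranked first on the Artificial Analysis text-to-video leaderboard. It spans more than 20 variants from Wan 2.2 to 2.7: text-to-video, image-to-video, speech-to-video with lip sync, video editing via R2V, and image generation at up to 4K. OpenClaw agents call any Wan endpoint with the same RunAPI key they use for chat.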
Use RunAPI to generate a video with Alibaba Wan 2.7.
Requirements:
- Call the RunAPI text_to_video endpoint at https://runapi.ai/api/v1/task/text_to_video.
- Set model to "wan-2.7-text-to-video".
- Read the API key from the RUNAPI_API_KEY environment variable.
- Set output_resolution to "1080p" for full HD output.
- Include a detailed prompt describing the scene, camera motion, and lighting.
- The response is async. Poll the returned task_id until status is "completed".
- When done, read the video URL from the response output.
curl -X POST https://runapi.ai/api/v1/task/text_to_video \
-H "Authorization: Bearer $RUNAPI_API_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "wan-2.7-text-to-video",
"prompt": "A drone shot rising over terraced rice paddies at golden hour, mist rolling through the valleys, slow upward camera tilt",
"output_resolution": "1080p"
}'
{
"task_id": "tsk_abc123",
"status": "pending",
"model": "wan-2.7-text-to-video"
}
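For agents that prefer a script over curl, the same request can be sketched in Python using only the standard library. This is a minimal illustration, not an official RunAPI client: the helper names `build_request` and `submit` are hypothetical, and the 30 second timeout is an arbitrary choice. The URL, headers, and body fields are copied from the curl example above.

```python
import json
import os
import urllib.request

RUNAPI_URL = "https://runapi.ai/api/v1/task/text_to_video"


def build_request(prompt, api_key, resolution="1080p"):
    """Build (but do not send) the POST request from the curl example."""
    payload = {
        "model": "wan-2.7-text-to-video",
        "prompt": prompt,
        "output_resolution": resolution,
    }
    return urllib.request.Request(
        RUNAPI_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def submit(prompt):
    """Send the request and return the parsed JSON body (task_id, status, model)."""
    # The key is read from the environment, as the requirements list asks.
    request = build_request(prompt, os.environ["RUNAPI_API_KEY"])
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```

Keeping request construction separate from sending makes it possible to inspect the headers and body without spending credits on a real generation.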
Use Wan in OpenClaw in three steps
Configure RunAPI
Set RUNAPI_API_KEY in your environment. If you already configured RunAPI in OpenClaw for chat or image generation, the same key works for all Wan endpoints — no extra provider setup needed.
export RUNAPI_API_KEY=runapi_xxx
Call a Wan endpoint
Send a POST request to text_to_video with model set to wan-2.7-text-to-video and output_resolution to 720p or 1080p. For image-to-video, use wan-2.7-image-to-video with a first_frame_image_url. For speech-driven video, use wan-2.2-a14b-speech-to-video-turbo with source_audio_url and source_image_url.
POST /api/v1/task/text_to_video
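The three request shapes described in this step can be sketched as one small Python helper. The model slugs and field names come from the paragraph above. The `image_to_video` and `speech_to_video` URL paths are assumptions made by analogy with the documented `text_to_video` path, and `build_wan_task` is a hypothetical name, not part of any RunAPI SDK.

```python
def build_wan_task(mode, **inputs):
    """Return (endpoint_path, payload) for a text, image, or speech driven video."""
    if mode == "text":
        return "/api/v1/task/text_to_video", {
            "model": "wan-2.7-text-to-video",
            "prompt": inputs["prompt"],
            "output_resolution": inputs.get("output_resolution", "720p"),
        }
    if mode == "image":
        # Assumed path, mirroring the text_to_video path.
        return "/api/v1/task/image_to_video", {
            "model": "wan-2.7-image-to-video",
            "first_frame_image_url": inputs["first_frame_image_url"],
        }
    if mode == "speech":
        # Assumed path, mirroring the text_to_video path.
        return "/api/v1/task/speech_to_video", {
            "model": "wan-2.2-a14b-speech-to-video-turbo",
            "source_audio_url": inputs["source_audio_url"],
            "source_image_url": inputs["source_image_url"],
        }
    raise ValueError(f"unknown mode: {mode!r}")
```

A call such as `build_wan_task("image", first_frame_image_url="https://example.com/frame.png")` returns the path and JSON body to POST. A missing required input raises `KeyError` before any request is made.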
Poll for the result
The endpoint returns a task_id immediately. Poll the task status endpoint until the status is completed, then read the output video or image URL from the response. RunAPI SDKs and the CLI handle polling automatically.
GET /api/v1/task/text_to_video/tsk_abc123
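When the SDK or CLI is not available, the polling step can be sketched by hand in Python. Only the `pending` and `completed` statuses appear in this guide. Treating `failed` or `error` as terminal states is an assumption, as are the 5 second interval and 10 minute timeout. The function names are hypothetical.

```python
import json
import os
import time
import urllib.request


def fetch_text_to_video_status(task_id):
    """GET the task status from the path shown above and parse the JSON body."""
    request = urllib.request.Request(
        f"https://runapi.ai/api/v1/task/text_to_video/{task_id}",
        headers={"Authorization": f"Bearer {os.environ['RUNAPI_API_KEY']}"},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def wait_for_task(fetch_status, task_id, interval_seconds=5.0,
                  timeout_seconds=600.0, sleep=time.sleep):
    """Poll fetch_status(task_id) until the task reports "completed"."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        task = fetch_status(task_id)
        status = task.get("status")
        if status == "completed":
            return task  # the caller reads the video URL from task["output"]
        if status in ("failed", "error"):  # assumed failure statuses
            raise RuntimeError(f"task {task_id} ended with status {status!r}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"task {task_id} still {status!r} after timeout")
        sleep(interval_seconds)
```

Typical use would be `wait_for_task(fetch_text_to_video_status, "tsk_abc123")`. Passing the fetcher and the sleep function as arguments keeps the loop testable without network access.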
Wan text_to_video API parameters
| Parameter | Type | Description |
|---|---|---|
| model | string | Required. wan-2.7-text-to-video, wan-2.6-text-to-video, wan-2.5-text-to-video, wan-2.2-a14b-text-to-video-turbo, or wan-2.7-r2v. |
| prompt | string | Required. Text description of the desired video scene, including camera motion, lighting, and subject detail. |
| output_resolution | string | Optional. 720p or 1080p for Wan 2.5+. Wan 2.2 also accepts 480p and 580p. Defaults to 720p. |
| aspect_ratio | string | Optional. For wan-2.7-r2v only. Accepted values: 16:9, 9:16, 1:1, 4:3, 3:4. |
| duration_seconds | integer | Optional. For wan-2.7-r2v only. Video length in seconds, 2 to 10. |
| seed | integer | Optional. Reproducibility seed for deterministic output. |
| callback_url | string | Optional. Webhook URL that receives a POST when the task completes. |
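The constraints in the parameter list can be checked on the client before a request is sent. The following Python sketch transcribes those rules. It is not RunAPI's own validation: matching "Wan 2.2" by the slug prefix `wan-2.2` and rejecting `aspect_ratio` or `duration_seconds` on models other than wan-2.7-r2v are interpretations of the list, and `validate_text_to_video` is a hypothetical name.

```python
MODELS = {
    "wan-2.7-text-to-video",
    "wan-2.6-text-to-video",
    "wan-2.5-text-to-video",
    "wan-2.2-a14b-text-to-video-turbo",
    "wan-2.7-r2v",
}
R2V_ASPECT_RATIOS = {"16:9", "9:16", "1:1", "4:3", "3:4"}


def validate_text_to_video(params):
    """Return a list of problems found in a text_to_video payload (empty if none)."""
    errors = []
    model = params.get("model")
    if model not in MODELS:
        errors.append("model is missing or not a known text_to_video slug")
    prompt = params.get("prompt")
    if not isinstance(prompt, str) or not prompt.strip():
        errors.append("prompt is required")

    # Wan 2.2 additionally accepts 480p and 580p; the default is 720p.
    allowed = {"720p", "1080p"}
    if isinstance(model, str) and model.startswith("wan-2.2"):
        allowed |= {"480p", "580p"}
    if params.get("output_resolution", "720p") not in allowed:
        errors.append(f"output_resolution must be one of {sorted(allowed)}")

    is_r2v = model == "wan-2.7-r2v"
    if "aspect_ratio" in params:
        if not is_r2v:
            errors.append("aspect_ratio applies to wan-2.7-r2v only")
        elif params["aspect_ratio"] not in R2V_ASPECT_RATIOS:
            errors.append("aspect_ratio is not an accepted value")
    if "duration_seconds" in params:
        duration = params["duration_seconds"]
        if not is_r2v:
            errors.append("duration_seconds applies to wan-2.7-r2v only")
        elif isinstance(duration, bool) or not isinstance(duration, int) or not 2 <= duration <= 10:
            errors.append("duration_seconds must be an integer from 2 to 10")
    return errors
```

Returning a list rather than raising on the first problem lets an agent report every issue with a payload in one pass.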
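What is Wan on OpenClaw?
Wan is Alibaba's open-source (Apache 2.0) video model, ranked first on the Artificial Analysis text-to-video leaderboard. It offers first-frame and last-frame control for generation anchored at both ends, multi-shot video with character consistency, and native audio support, including lip-synced speech-to-video. OpenClaw agents access all 20+ Wan variants through RunAPI with a single API key.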
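Wan use cases
Storyboard-to-video workflows
Use first-frame and last-frame anchoring to turn storyboard panels into video sequences. Each clip starts and ends on your keyframes, which keeps multi-shot projects visually continuous.
Virtual presenters and brand mascots
Use Wan's speech-to-video endpoint to generate talking-head videos from a face image and an audio file. The model handles lip sync and head motion, which suits consistent brand-spokesperson content.
Multi-shot sequences with character consistency
Build dialogue-heavy or narrative content in which the same characters appear across multiple clips. Wan's temporal consistency keeps faces and clothing stable from shot to shot.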
Wan + OpenClaw FAQ
All of them. text_to_video, image_to_video, speech_to_video, text_to_image (Wan 2.7 Image), edit_video, and animate. Each endpoint uses a different model slug — for example wan-2.7-text-to-video for video generation and wan-2.7-image for image generation up to 4K.
Wan 2.5 introduced 1080p output. Wan 2.6 added video editing (R2V) and flash variants for faster generation. Wan 2.7 adds image generation (wan-2.7-image, wan-2.7-image-pro up to 4K), video editing (wan-2.7-edit-video), and improved text-to-video quality that leads the Artificial Analysis leaderboard.
Use wan-2.2-a14b-speech-to-video-turbo with source_audio_url (the audio file) and source_image_url (the face to animate). Wan generates a lip-synced video where the face speaks the audio. Output resolution supports 480p, 580p, or 720p.
Yes. Wan 2.7 added text_to_image endpoints. Use wan-2.7-image for standard generation or wan-2.7-image-pro for higher quality. Both support aspect ratios from 1:1 to 21:9 and output resolutions of 1k, 2k, or 4k.
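As a rough illustration of the image case, the Python sketch below builds a payload for the two image models named above. The model slugs and the 1k, 2k, 4k values come from this answer. The field names `output_resolution` and `aspect_ratio` and the `text_to_image` URL path are assumptions borrowed from the text_to_video API, so check them against the RunAPI reference before relying on them. `build_image_task` is a hypothetical name.

```python
IMAGE_RESOLUTIONS = {"1k", "2k", "4k"}


def build_image_task(prompt, resolution, aspect_ratio="1:1", pro=False):
    """Return (endpoint_path, payload) for a Wan 2.7 image generation task."""
    if resolution not in IMAGE_RESOLUTIONS:
        raise ValueError(f"resolution must be one of {sorted(IMAGE_RESOLUTIONS)}")
    payload = {
        "model": "wan-2.7-image-pro" if pro else "wan-2.7-image",
        "prompt": prompt,
        # Assumed field names, mirroring the text_to_video parameters.
        "output_resolution": resolution,
        "aspect_ratio": aspect_ratio,
    }
    # Assumed path, mirroring /api/v1/task/text_to_video.
    return "/api/v1/task/text_to_image", payload
```

The aspect ratio is passed through unchecked because this guide gives only the range (1:1 to 21:9), not the exact list of accepted values.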
Yes. Wan is released under Apache 2.0 by Alibaba and the model weights are publicly available. Through RunAPI you skip the GPU setup — one API call generates video or images. If you need a self-hosted pipeline for privacy, the same weights run on your own infrastructure.