How does Kling per-second billing work on RunAPI?

Kling charges per second of generated video. The rate depends on output_resolution and whether enable_sound is on. A 5-second 720p clip without sound is the cheapest option; 1080p with sound costs roughly twice as much per second. Check the RunAPI pricing page for exact rates.

What happens when a Kling generation fails -- do I lose credits?

No. RunAPI only bills for completed generations. If the task fails or times out, the reserved credits are rolled back to your account balance.

Can Kling generate videos with sound?

Yes. Set enable_sound to true in the request body. Kling 3.0 generates synchronized audio matching the video content. Sound generation increases the per-second cost -- at 720p, sound adds about 3 cents per second.
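As a rough worked example of the figure above, the sketch below multiplies clip length by the approximate 720p sound surcharge. The function name is hypothetical and the 3 cents per second value is only the approximate figure quoted here; check the RunAPI pricing page for the exact rate.

```python
def sound_surcharge_cents(duration_seconds: int, cents_per_second: int = 3) -> int:
    """Approximate extra cost of enable_sound at 720p, in cents.

    cents_per_second defaults to the rough figure quoted in this FAQ;
    it is not an authoritative rate.
    """
    if not 3 <= duration_seconds <= 15:
        raise ValueError("Kling 3.0 accepts durations from 3 to 15 seconds")
    return duration_seconds * cents_per_second


for seconds in (5, 10, 15):
    extra = sound_surcharge_cents(seconds)
    print(f"{seconds}s clip at 720p: about {extra} cents extra for sound")
```

The surcharge is added to the base per-second video rate, which this sketch does not model.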

How long does Kling generation actually take?

Generation typically takes 30 to 120 seconds depending on duration and resolution. Longer clips at 1080p with sound take the most time. The API returns a task_id immediately so your agent can do other work while waiting.

Can I control camera motion in Kling videos?

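Yes, in two ways. Kling 3.0 has a separate motion_control endpoint at /api/v1/kling/motion_control that applies motion presets to a source image using a reference video. The text_to_video endpoint relies on the prompt for camera direction, so describe the move in the prompt itself, for example "a drone shot pulling back from a mountain lake".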

OPENCLAW + KLING

Use Kling in OpenClaw.

Kling 3.0 by Kuaishou generates video from text or images at up to 1080p with native audio, multi-shot scenes, and 3–15 second durations. OpenClaw agents call it through RunAPI with the same API key used for chat — send a prompt, poll the task, and receive a video URL.


one API key · text to video + image to video · per-second billing

Use RunAPI to generate a video with Kling 3.0.

Requirements:
- Call POST https://runapi.ai/api/v1/kling/text_to_video
- Set model to "kling-3.0"
- Read the API key from the RUNAPI_API_KEY environment variable
- Set duration_seconds to control length (3–15 seconds)
- Set aspect_ratio to "16:9" for landscape video
- Enable sound with enable_sound: true for native audio
- The response is async — poll the task status endpoint until the task completes, then retrieve the video URL

curl -X POST https://runapi.ai/api/v1/kling/text_to_video \
  -H "Authorization: Bearer $RUNAPI_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "kling-3.0",
    "prompt": "A drone shot pulling back from a mountain lake at sunrise, mist rising off the water, cinematic lighting",
    "duration_seconds": 5,
    "aspect_ratio": "16:9",
    "enable_sound": true,
    "output_resolution": "1080p"
  }'

{
  "task_id": "tsk_abc123",
  "status": "pending",
  "model": "kling-3.0"
}
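For agents that call the API from code rather than the shell, the same request can be sketched in Python with only the standard library. The URL, headers, and body fields mirror the curl command above; the function names build_request and submit are illustrative, not part of any RunAPI SDK.

```python
import json
import os
import urllib.request

API_URL = "https://runapi.ai/api/v1/kling/text_to_video"


def build_request(prompt: str, api_key: str) -> urllib.request.Request:
    """Build the same POST request as the curl example, without sending it."""
    payload = {
        "model": "kling-3.0",
        "prompt": prompt,
        "duration_seconds": 5,
        "aspect_ratio": "16:9",
        "enable_sound": True,
        "output_resolution": "1080p",
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def submit(prompt: str) -> dict:
    """Send the request and return the parsed JSON body (task_id, status, model)."""
    request = build_request(prompt, os.environ["RUNAPI_API_KEY"])
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


# Example (performs a real, billable API call when uncommented):
# task = submit("A drone shot pulling back from a mountain lake at sunrise")
# print(task["task_id"], task["status"])
```

Keeping build_request separate from submit lets you inspect or log the request body before anything is sent.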


HOW IT WORKS

Use Kling in OpenClaw in three steps

Configure RunAPI

Set the RUNAPI_API_KEY environment variable. If you already configured RunAPI as an OpenClaw provider for chat, the same key works for video generation — no extra setup needed.

export RUNAPI_API_KEY=runapi_xxx
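If your agent code is written in Python, a small helper can fail early when the variable is missing rather than sending an unauthenticated request. This is a minimal sketch; load_api_key is a hypothetical helper name, not a RunAPI or OpenClaw function.

```python
import os


def load_api_key(env=os.environ) -> str:
    """Return the RunAPI key from the environment, or raise if it is unset."""
    key = env.get("RUNAPI_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "RUNAPI_API_KEY is not set; export it before calling RunAPI"
        )
    return key
```

Passing the environment as a parameter keeps the helper easy to test without touching the real process environment.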

Call Kling text_to_video

Send a POST to /api/v1/kling/text_to_video with model set to kling-3.0. Include a prompt, duration_seconds (3–15), aspect_ratio, and optionally enable_sound for native audio. For image-driven generation, use /api/v1/kling/image_to_video with a first_frame_image_url instead.

POST /api/v1/kling/text_to_video
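For the image-driven variant, a request body might be assembled as in the sketch below. It assumes image_to_video accepts the same model and duration_seconds fields as text_to_video and treats prompt as optional; neither is confirmed here, so check the API docs before relying on it. The helper name and the example URL are hypothetical.

```python
IMAGE_TO_VIDEO_PATH = "/api/v1/kling/image_to_video"


def image_to_video_payload(image_url: str, prompt: str = "",
                           duration_seconds: int = 5) -> dict:
    """Assemble a request body for image_to_video (field set is an assumption)."""
    if not image_url.startswith(("http://", "https://")):
        raise ValueError("first_frame_image_url must be an http(s) URL")
    if not 3 <= duration_seconds <= 15:
        raise ValueError("Kling 3.0 accepts durations from 3 to 15 seconds")
    payload = {
        "model": "kling-3.0",
        "first_frame_image_url": image_url,
        "duration_seconds": duration_seconds,
    }
    if prompt:
        # Assumed optional: a text hint describing the desired motion.
        payload["prompt"] = prompt
    return payload


body = image_to_video_payload("https://example.com/product.jpg",
                              prompt="slow push-in on the product")
print(IMAGE_TO_VIDEO_PATH, body)
```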

Poll for the result

The endpoint returns a task_id immediately. Poll the task status endpoint until the status changes to completed, then retrieve the video URL from the response. Generation typically takes 30–120 seconds depending on duration and resolution.

GET /api/v1/kling/text_to_video/tsk_abc123
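The polling step can be sketched as a simple loop with a timeout. The status path and the "pending" and "completed" values come from this page; the "failed" status value is an assumption. The name of the field that carries the video URL is not specified here either, so the sketch returns the whole task body instead of guessing it. fetch_task and wait_for_video are illustrative names.

```python
import json
import os
import time
import urllib.request

BASE_URL = "https://runapi.ai"


def fetch_task(task_id: str) -> dict:
    """GET the task status for a text_to_video task."""
    request = urllib.request.Request(
        f"{BASE_URL}/api/v1/kling/text_to_video/{task_id}",
        headers={"Authorization": f"Bearer {os.environ['RUNAPI_API_KEY']}"},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def wait_for_video(task_id: str, fetch=fetch_task, interval: float = 5.0,
                   timeout: float = 300.0, sleep=time.sleep) -> dict:
    """Poll until the task completes, fails, or the timeout elapses."""
    deadline = time.monotonic() + timeout
    while True:
        task = fetch(task_id)
        status = task.get("status")
        if status == "completed":
            return task  # the video URL is somewhere in this body
        if status == "failed":  # assumed status value for failed tasks
            raise RuntimeError(f"task {task_id} failed: {task}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"task {task_id} still {status!r} after {timeout}s")
        sleep(interval)


# Example (needs a real task_id and RUNAPI_API_KEY):
# result = wait_for_video("tsk_abc123")
```

A 5-second interval with a 300-second ceiling leaves headroom over the typical 30 to 120 second generation time; tune both to your workload.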

PARAMETERS

Kling text_to_video API parameters

Parameter	Type	Description
`model`	`string`	Required. Use kling-3.0 for the latest version.
`prompt`	`string`	Video description. Required unless multi_shots is enabled.
`duration_seconds`	`integer`	Video length. Kling 3.0 supports 3–15 seconds. Older versions accept 5 or 10.
`aspect_ratio`	`string`	Output aspect ratio: 16:9, 9:16, or 1:1.
`output_resolution`	`string`	Resolution: 720p, 1080p, or 4k. Higher resolution costs more per second.
`enable_sound`	`boolean`	Generate native audio alongside video. Increases per-second cost.
`negative_prompt`	`string`	Elements to exclude from generation.
`first_frame_image_url`	`string`	Image URL to use as the opening frame (single-shot mode).
`cfg_scale`	`number`	Guidance scale (0–1). Higher values follow the prompt more closely.
`multi_shots`	`boolean`	Enable multi-shot scene generation with separate prompts per segment.
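The constraints in the table can be checked client-side before a request is sent, which avoids a round trip for obviously invalid input. The sketch below covers only the limits listed above for kling-3.0 (not the older versions that accept 5 or 10 seconds); validate_params is a hypothetical helper, and the server remains the authority on what is accepted.

```python
ASPECT_RATIOS = {"16:9", "9:16", "1:1"}
RESOLUTIONS = {"720p", "1080p", "4k"}


def validate_params(params: dict) -> list:
    """Return a list of problems found in a kling-3.0 text_to_video body."""
    errors = []
    if not params.get("model"):
        errors.append("model is required")
    if not params.get("prompt") and not params.get("multi_shots"):
        errors.append("prompt is required unless multi_shots is enabled")

    duration = params.get("duration_seconds")
    if duration is not None and not (isinstance(duration, int) and 3 <= duration <= 15):
        errors.append("duration_seconds must be an integer from 3 to 15")

    ratio = params.get("aspect_ratio")
    if ratio is not None and ratio not in ASPECT_RATIOS:
        errors.append("aspect_ratio must be 16:9, 9:16, or 1:1")

    resolution = params.get("output_resolution")
    if resolution is not None and resolution not in RESOLUTIONS:
        errors.append("output_resolution must be 720p, 1080p, or 4k")

    cfg = params.get("cfg_scale")
    if cfg is not None and not (isinstance(cfg, (int, float)) and 0 <= cfg <= 1):
        errors.append("cfg_scale must be between 0 and 1")
    return errors


problems = validate_params({"model": "kling-3.0", "duration_seconds": 20})
for problem in problems:
    print("invalid request:", problem)
```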

What is Kling on OpenClaw?

Kling 3.0 by Kuaishou is known for cinematic-quality video with strong cloth simulation, fluid dynamics, and motion physics. It generates clips of 3 to 15 seconds from text or images at up to 1080p, with native audio and multi-shot scenes. OpenClaw agents call it through the RunAPI endpoint with the same API key used for chat.

Kling use cases

B-roll and establishing shots

Generate short B-roll clips when deadlines are tight: nature shots, travel content, and environment footage where Kling's motion physics and cinematic lighting stand out.

Product lifestyle content

Create product videos for food, fashion, or lifestyle brands from a single image or text prompt, with natural camera movement and realistic material rendering.

Social media shorts

Produce short clips for TikTok, Reels, or YouTube Shorts with cinematic framing. Set duration_seconds to 5 or 10 for platform-ready lengths.

FAQ

Kling + OpenClaw questions

OpenClaw general setup

Not configured yet? Start with the RunAPI setup guide for OpenClaw.

OpenClaw setup guide →

Kling model catalog

See all Kling variants, pricing tiers, and API docs.

Kling models →

Try Kling in OpenClaw today.

Get a free RunAPI key, paste the prompt into OpenClaw, and start generating video with Kling 3.0.

Browse models →