Gemini Omni gemini-omni-text-to-video API

Same API, same SDK — switch variants by changing one parameter.

Operational · video · Commercial OK
# Works with Claude Code, Codex, Gemini CLI, Cursor, and 50+ agents
npx skills add runapi-ai/gemini-omni -g
The -g flag installs globally so every project picks it up.
Or paste this prompt to your AI agent:
Install the Gemini Omni skill for me:

1. Clone https://github.com/runapi-ai/gemini-omni
2. Copy the skills/gemini-omni/ directory into your
   user-level skills directory (e.g. ~/.claude/skills/
   for Claude Code, ~/.codex/skills/ for Codex).
3. Verify that SKILL.md is present.
4. Confirm the install path when done.
OVERVIEW

gemini-omni-text-to-video is the Gemini Omni variant for prompted video generation, positioned as a balance of quality and cost within the family.

  • Pay-per-call pricing in USD
  • Failed generations not charged
  • Streaming when supported by the model
  • Schema-validated tool calls
PRICING

Pricing

Failed generations are not charged
Text to video: $0.90 to $3.60 per video (USD)

Input mode   Duration (s)   720p    1080p   4k
video        any            $2.40   $2.40   $3.60
generated    4              $0.90   $0.90   $2.10
generated    6              $1.20   $1.20   $2.40
generated    8              $1.50   $1.50   $2.70
generated    10             $1.80   $1.80   $3.00
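
To estimate cost before submitting a job, the listed prices can be kept in a small lookup. The sketch below transcribes the prices from this page; the type names and the `estimatePrice` helper are hypothetical and not part of the RunAPI SDK, and the values would need updating if the published pricing changes.

```typescript
// Prices transcribed from the pricing list on this page (USD per video).
type InputMode = "video" | "generated";
type Resolution = "720p" | "1080p" | "4k";

// Input mode "video": one price per resolution, for any duration.
const VIDEO_INPUT_PRICES: Record<Resolution, number> = {
  "720p": 2.4,
  "1080p": 2.4,
  "4k": 3.6,
};

// Input mode "generated": price depends on duration (seconds) and resolution.
const GENERATED_PRICES: Record<number, Record<Resolution, number>> = {
  4: { "720p": 0.9, "1080p": 0.9, "4k": 2.1 },
  6: { "720p": 1.2, "1080p": 1.2, "4k": 2.4 },
  8: { "720p": 1.5, "1080p": 1.5, "4k": 2.7 },
  10: { "720p": 1.8, "1080p": 1.8, "4k": 3.0 },
};

// Hypothetical helper: returns the listed price, or undefined when the
// combination does not appear in the price list.
function estimatePrice(
  mode: InputMode,
  resolution: Resolution,
  durationSeconds?: number,
): number | undefined {
  if (mode === "video") return VIDEO_INPUT_PRICES[resolution];
  if (durationSeconds === undefined) return undefined;
  return GENERATED_PRICES[durationSeconds]?.[resolution];
}
```

For example, `estimatePrice("generated", "4k", 8)` looks up the 8-second 4k row, and a duration that is not listed (such as 5) yields `undefined` rather than a guessed price.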
SPEC SHEET

Technical details

Model ID:            gemini-omni-text-to-video
Provider:            Google
Modality:            video
Task type:           asynchronous
Billing unit:        call
API endpoint:        /api/v1/gemini_omni/text_to_video
Commercial license:  Yes, included via API
Status:              Operational
QUICKSTART

Quickstart — gemini-omni-text-to-video

runapi.ai
cURL

curl -X POST https://runapi.ai/api/v1/gemini_omni/text_to_video \
  -H "Authorization: Bearer $RUNAPI_KEY" \
  -H "Content-Type: application/json" \
  -d '{
  "model": "gemini-omni-text-to-video",
  "prompt": "Create a 1080p neon city tracking shot with a reusable character walking through rain while a calm narrator speaks."
}'

TypeScript

import { GeminiOmniClient } from "@runapi.ai/gemini-omni";

const client = new GeminiOmniClient();
const result = await client.textToVideo.run({
    model: "gemini-omni-text-to-video",
    prompt: "Create a 1080p neon city tracking shot with a reusable character walking through rain while a calm narrator speaks.",
});

Ruby

require "runapi/gemini_omni"

client = RunApi::GeminiOmni::Client.new
result = client.text_to_video.run(
    model: "gemini-omni-text-to-video",
    prompt: "Create a 1080p neon city tracking shot with a reusable character walking through rain while a calm narrator speaks."
)
@runapi.ai/gemini-omni v1
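
For environments where the SDK is not installed, the same request as the curl example can be assembled by hand. This is a minimal sketch: the URL, headers, and body fields mirror the curl example above, while `buildTextToVideoRequest` and `PreparedRequest` are hypothetical names introduced only for illustration.

```typescript
// Endpoint and model ID as shown in the curl example and spec sheet on this page.
const ENDPOINT = "https://runapi.ai/api/v1/gemini_omni/text_to_video";
const MODEL_ID = "gemini-omni-text-to-video";

interface PreparedRequest {
  url: string;
  init: { method: "POST"; headers: Record<string, string>; body: string };
}

// Hypothetical helper (not part of the SDK): builds the request without sending it,
// so the payload can be inspected or logged before any call is billed.
function buildTextToVideoRequest(apiKey: string, prompt: string): PreparedRequest {
  if (!apiKey) throw new Error("Missing API key");
  if (!prompt.trim()) throw new Error("Prompt must not be empty");
  return {
    url: ENDPOINT,
    init: {
      method: "POST",
      headers: {
        Authorization: `Bearer ${apiKey}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({ model: MODEL_ID, prompt }),
    },
  };
}

// Usage (performs a network call, so it is left commented out):
// const { url, init } = buildTextToVideoRequest(process.env.RUNAPI_KEY ?? "", "A rainy neon street");
// const response = await fetch(url, init);
// console.log(response.status, await response.text());
```

Separating request construction from sending keeps the payload easy to unit test. The shape of the response is not documented on this page, so the usage lines only print the status and raw body.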
HOW IT WORKS

Use gemini-omni-text-to-video in four steps

01 Install
Install the model SDK or agent skill for this model line.

02 Configure
Set the model field to the full model ID shown on this page.

03 Call
Send a typed request with your prompt, inputs, and callback settings.

04 Receive
Read the task response, webhook callback, or cached output URL from RunAPI.
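
Because the task type is asynchronous, a client that does not use webhooks has to wait for the task to finish. The sketch below shows one generic way to poll. The `TaskState` shape, its status values, and the `outputUrl` field are assumptions made for illustration, since this page does not document the task response format. The caller supplies `fetchState`, which would wrap whatever status lookup RunAPI actually exposes.

```typescript
// Hypothetical task shape: the real response fields are not documented on this page.
type TaskState =
  | { status: "pending" }
  | { status: "succeeded"; outputUrl: string }
  | { status: "failed"; error: string };

interface PollOptions {
  intervalMs: number; // delay between status checks
  maxAttempts: number; // give up after this many checks
}

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

// Calls the caller-supplied `fetchState` until the task leaves "pending"
// or the attempt budget is exhausted. Resolves with the output URL.
async function waitForTask(
  fetchState: () => Promise<TaskState>,
  { intervalMs, maxAttempts }: PollOptions,
): Promise<string> {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const state = await fetchState();
    if (state.status === "succeeded") return state.outputUrl;
    if (state.status === "failed") throw new Error(`Task failed: ${state.error}`);
    // Still pending: wait before the next check, unless this was the last attempt.
    if (attempt < maxAttempts) await sleep(intervalMs);
  }
  throw new Error(`Task still pending after ${maxAttempts} attempts`);
}
```

Injecting `fetchState` keeps the loop independent of any particular endpoint, so it can be exercised with a stub in tests. Where a webhook callback is configured, polling like this may be unnecessary.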

DIFFERENCES

What's different about gemini-omni-text-to-video

VS GEMINI-OMNI-AUDIO

gemini-omni-text-to-video: Prompted multimodal video with image, audio, character, and source-clip references

gemini-omni-audio: Synchronous reusable voice resource creation from preset voices

VS GEMINI-OMNI-CHARACTER

gemini-omni-text-to-video: Prompted multimodal video with image, audio, character, and source-clip references

gemini-omni-character: Synchronous reusable character resource creation from one reference image

USE CASES

Best for

Ad & social content

Generate product launch clips and short-form ads from a text brief, cutting production from weeks to hours.

E-learning

Convert lesson scripts into animated explainer videos at scale without a camera or crew.

Creator workflows

Produce short-form content for social platforms directly from a prompt.

FAQ

Frequently asked questions about gemini-omni-text-to-video

Is the model ID stable across versions?

RunAPI keeps the model ID stable and handles compatible version refreshes without changing your request shape.

What's the rate limit on this variant?

Per-key rate limits scale with your usage tier. See the pricing page for current limits.

Can I switch variants later?

Yes. The variant is selected by the model parameter, so switching means changing that one value.

Does it stream?

Where streaming is available, RunAPI streams end-to-end.

Where do I report quality issues?

Open an issue on the public GitHub repo or email support.

START NOW

Start building with Gemini Omni.