VARIANT · Google / Gemini Omni

gemini-omni-text-to-video API

Use gemini-omni-text-to-video from the Gemini Omni family via RunAPI. Per-call pricing, no subscription, and failed generations are never charged.

Operational · video · Commercial OK

curl -X POST https://runapi.ai/api/v1/gemini_omni/text_to_video \
  -H "Authorization: Bearer $RUNAPI_KEY" \
  -H "Content-Type: application/json" \
  -d '{
  "model": "gemini-omni-text-to-video",
  "prompt": "Create a 1080p neon city tracking shot with a reusable character walking through rain while a calm narrator speaks."
}'

import { GeminiOmniClient } from "@runapi.ai/gemini-omni";

const client = new GeminiOmniClient();
const result = await client.textToVideo.run({
    model: "gemini-omni-text-to-video",
    prompt: "Create a 1080p neon city tracking shot with a reusable character walking through rain while a calm narrator speaks.",
});

<?php

require __DIR__ . "/vendor/autoload.php";

use RunApi\GeminiOmni\GeminiOmniClient;

$client = new GeminiOmniClient();
$result = $client->textToVideo->run([
    'model' => 'gemini-omni-text-to-video',
    'prompt' => 'Create a 1080p neon city tracking shot with a reusable character walking through rain while a calm narrator speaks.',
]);

require "runapi/gemini_omni"

client = RunApi::GeminiOmni::Client.new
result = client.text_to_video.run(
    model: "gemini-omni-text-to-video",
    prompt: "Create a 1080p neon city tracking shot with a reusable character walking through rain while a calm narrator speaks."
)

npx skills add runapi-ai/gemini-omni -g

# Claude Code
claude mcp add runapi -s user -- npx -y @runapi.ai/mcp

# Codex
codex plugin install runapi-mcp@agents

# Cursor / Windsurf / VS Code
npx @runapi.ai/mcp init cursor

@runapi.ai/gemini-omni v1

OVERVIEW

gemini-omni-text-to-video is the Gemini Omni variant for prompted video generation, priced from $0.90 to $3.60 per video depending on input mode, duration, and output resolution. It shares the same API key, request shape, and async task lifecycle as other RunAPI models; switch to this variant by changing the model ID parameter. Install the Gemini Omni skill to load docs, typed schemas, pricing notes, and setup steps into your coding workspace. Billing is metered per completed request in USD with no subscription or minimum spend, and failed generations are never charged.

Pay-per-call pricing in USD with no subscription
Failed generations are never charged
Streaming when supported by the model
Schema-validated parameters and tool calls
Switch variant by changing one model ID parameter
Unified billing across all models and providers

PRICING

Failed generations are not charged

Text to video

$0.90-$3.60 / video

Input mode	Duration (seconds)	Output resolution	Price (USD / video)
video	any	720p	$2.40
video	any	1080p	$2.40
video	any	4k	$3.60
generated	4	720p	$0.90
generated	4	1080p	$0.90
generated	4	4k	$2.10
generated	6	720p	$1.20
generated	6	1080p	$1.20
generated	6	4k	$2.40
generated	8	720p	$1.50
generated	8	1080p	$1.50
generated	8	4k	$2.70
generated	10	720p	$1.80
generated	10	1080p	$1.80
generated	10	4k	$3.00
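
To budget a batch before submitting it, the price list above can be encoded as a small lookup. The following is a minimal TypeScript sketch: the prices are transcribed from this page and may change, and the type and function names (priceUsd, estimateBatchUsd) are illustrative, not part of any RunAPI SDK.

```typescript
// Price table transcribed from the list above (USD per video).
// All names here are illustrative and not part of a RunAPI SDK.
type Resolution = "720p" | "1080p" | "4k";
type Duration = 4 | 6 | 8 | 10;
type InputMode = "generated" | "video";

// "generated" input mode: keyed by duration in seconds, then resolution.
const GENERATED_PRICES: Record<Duration, Record<Resolution, number>> = {
  4: { "720p": 0.9, "1080p": 0.9, "4k": 2.1 },
  6: { "720p": 1.2, "1080p": 1.2, "4k": 2.4 },
  8: { "720p": 1.5, "1080p": 1.5, "4k": 2.7 },
  10: { "720p": 1.8, "1080p": 1.8, "4k": 3.0 },
};

// "video" input mode: the listed price does not depend on duration.
const VIDEO_INPUT_PRICES: Record<Resolution, number> = {
  "720p": 2.4,
  "1080p": 2.4,
  "4k": 3.6,
};

interface Job {
  inputMode: InputMode;
  resolution: Resolution;
  duration?: Duration; // required only for "generated"
}

function priceUsd(job: Job): number {
  if (job.inputMode === "video") {
    return VIDEO_INPUT_PRICES[job.resolution];
  }
  if (job.duration === undefined) {
    throw new Error("duration is required for generated input mode");
  }
  return GENERATED_PRICES[job.duration][job.resolution];
}

// Sum in whole cents to avoid floating-point drift across many jobs.
function estimateBatchUsd(jobs: Job[]): number {
  const cents = jobs.reduce(
    (sum, job) => sum + Math.round(priceUsd(job) * 100),
    0,
  );
  return cents / 100;
}
```

In the listed prices, 720p and 1080p cost the same, each additional 2 seconds of generated video adds $0.30, and 4k adds $1.20 over the lower resolutions. Since failed generations are not charged, an estimate like this is an upper bound for a batch.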

SPEC SHEET

Technical details

Model ID	gemini-omni-text-to-video
Provider	Google
Modality	video
Task type	asynchronous
Billing unit	call
API endpoint	/api/v1/gemini_omni/text_to_video
Commercial license	Yes — included via API
Catalog status	Operational

SKILLS

Model skill — gemini-omni-text-to-video

Install the skill once, then use the variant ID from this page while building.

# Install the model skill for app development workflows
npx skills add runapi-ai/gemini-omni -g

Installs docs, schemas, pricing context, and setup notes into your developer workspace.

Or use this setup request in your coding tool:

Install the Gemini Omni skill for this app:

1. Add runapi-ai/gemini-omni with the skills installer.
2. Load SKILL.md in this workspace.
3. Use its docs, schemas, pricing notes, and setup steps when adding model features.
4. Confirm the install path when done.

HOW IT WORKS

Use gemini-omni-text-to-video with a model skill

Install

Install the model skill for the Gemini Omni line. The skill loads docs, typed schemas, pricing notes, and setup steps into your coding workspace so your agent has the right context.

Configure

Set the model field to the full model ID shown on this page and configure your RunAPI API key as an environment variable. The same key works across all models and providers.
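
As a rough illustration of this step, the sketch below assembles the HTTP request without sending it. The endpoint, headers, body fields, and the RUNAPI_KEY variable name come from the curl example on this page; buildTextToVideoRequest and PreparedRequest are hypothetical names introduced here, not SDK exports.

```typescript
// Minimal sketch: prepare (but do not send) a request for this variant.
// buildTextToVideoRequest is an illustrative helper, not an SDK function.
const ENDPOINT = "https://runapi.ai/api/v1/gemini_omni/text_to_video";
const MODEL_ID = "gemini-omni-text-to-video";

interface PreparedRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

function buildTextToVideoRequest(
  env: Record<string, string | undefined>,
  prompt: string,
): PreparedRequest {
  // RUNAPI_KEY is the variable name used in the curl example on this page.
  const apiKey = env["RUNAPI_KEY"];
  if (!apiKey) {
    throw new Error("RUNAPI_KEY is not set");
  }
  if (prompt.trim() === "") {
    throw new Error("prompt must not be empty");
  }
  return {
    url: ENDPOINT,
    method: "POST",
    headers: {
      Authorization: `Bearer ${apiKey}`,
      "Content-Type": "application/json",
    },
    // The model field carries the full model ID shown on this page.
    body: JSON.stringify({ model: MODEL_ID, prompt }),
  };
}
```

In Node.js this could be called with process.env as the first argument, and the result passed to an HTTP client of your choice. Taking the environment as a parameter keeps the key out of source code and makes the helper easy to test.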

Call

Use the skill instructions while adding prompt, input, and callback handling to your application. RunAPI routes the request to the provider, manages the async lifecycle, and returns structured JSON.

Receive

Read the task response by polling the task ID, streaming when supported, or handling the webhook callback at your configured URL. Generated outputs are stored and retrievable by task ID.
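
The polling option can be sketched as a small loop. This page does not document the task-status endpoint or response fields, so everything about the task shape below is an assumption: the status values ("pending", "completed", "failed"), the TaskSnapshot interface, and the outputUrl field are hypothetical. The caller supplies fetchStatus, so no status URL is invented here.

```typescript
// Minimal polling sketch. Status values and the TaskSnapshot shape are
// assumptions for illustration; consult the RunAPI task schema for the
// real field names.
type TaskStatus = "pending" | "completed" | "failed";

interface TaskSnapshot {
  id: string;
  status: TaskStatus;
  outputUrl?: string; // hypothetical field
}

const sleepMs = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(() => resolve(), ms));

async function pollTask(
  fetchStatus: () => Promise<TaskSnapshot>,
  intervalMs: number = 5000,
  maxAttempts: number = 60,
  sleep: (ms: number) => Promise<void> = sleepMs,
): Promise<TaskSnapshot> {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const snapshot = await fetchStatus();
    // Any status other than "pending" is treated as terminal.
    if (snapshot.status !== "pending") {
      return snapshot;
    }
    // Wait before the next check, except after the final attempt.
    if (attempt < maxAttempts) {
      await sleep(intervalMs);
    }
  }
  throw new Error(`task still pending after ${maxAttempts} attempts`);
}
```

A webhook callback avoids the repeated requests entirely, so polling like this is best suited to scripts and local development where no public callback URL is available.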

DIFFERENCES

What's different about gemini-omni-text-to-video

VS GEMINI-OMNI-AUDIO

gemini-omni-text-to-video: Prompted multimodal video with image, audio, character, and source-clip references

gemini-omni-audio: Synchronous reusable voice resource creation from preset voices

VS GEMINI-OMNI-CHARACTER

gemini-omni-text-to-video: Prompted multimodal video with image, audio, character, and source-clip references

gemini-omni-character: Synchronous reusable character resource creation from one reference image

VS GEMINI-OMNI-FLASH-PREVIEW

gemini-omni-text-to-video: Prompted multimodal video with image, audio, character, and source-clip references

gemini-omni-flash-preview: Fast, conversational video generation for natural-language creative iteration

USE CASES

Best for

Ad & social content

Generate product launch clips and short-form ads from a text brief, cutting production from weeks to hours.

E-learning

Convert lesson scripts into animated explainer videos at scale without a camera or crew.

Creator workflows

Produce short-form content for social platforms directly from a prompt.

FAQ

Frequently asked questions about gemini-omni-text-to-video

Is the model ID stable across versions?

Yes. RunAPI keeps the model ID stable and handles compatible version refreshes without changing your request shape. You do not need to update your code when the provider releases a new compatible version.

What's the rate limit on this variant?

Per-key rate limits scale with your usage tier. The pricing page shows current limits. If you need higher throughput, contact support to discuss tier upgrades.
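
When a client runs into its tier's limit, retrying with exponential backoff is a common way to stay within it. The sketch below assumes the API signals rate limiting with HTTP 429, the conventional status code; this page does not confirm that. HttpResult, backoffDelayMs, and withRetry are illustrative names, and the caller supplies the function that sends the request.

```typescript
// Sketch of client-side retry pacing for rate-limited calls.
// Assumption: rate limiting is signalled with HTTP 429.
interface HttpResult<T> {
  status: number;
  body: T;
}

// attempt is 0-based: 1s, 2s, 4s, ... capped at capMs.
function backoffDelayMs(
  attempt: number,
  baseMs: number = 1000,
  capMs: number = 30000,
): number {
  return Math.min(capMs, baseMs * Math.pow(2, attempt));
}

const wait = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(() => resolve(), ms));

async function withRetry<T>(
  send: () => Promise<HttpResult<T>>,
  maxRetries: number = 5,
  sleep: (ms: number) => Promise<void> = wait,
): Promise<HttpResult<T>> {
  for (let attempt = 0; ; attempt++) {
    const result = await send();
    // Return anything that is not a rate-limit response,
    // or the last 429 once the retry budget is spent.
    if (result.status !== 429 || attempt >= maxRetries) {
      return result;
    }
    await sleep(backoffDelayMs(attempt));
  }
}
```

Adding random jitter to each delay is a common refinement when many workers share one key, since it keeps them from retrying in lockstep.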

Can I switch variants later?

Yes. The variant is selected by the model ID in the request. Switch by changing that ID; no other code changes, re-authentication, or separate billing setup are needed. All variants share the same API key and request shape.

Does it stream?

Not token by token. RunAPI streams end-to-end where streaming is available, and language models support token-level streaming. Media models such as this one deliver results through async task polling or webhook callbacks.

Where do I report quality issues?

Open an issue on the public GitHub repo or email RunAPI support. Include the task ID and model ID so the team can investigate the specific generation.

Do I need a separate provider account?

No. Your RunAPI API key is enough to access this variant and every other model in the catalog. You do not need accounts with the underlying provider.