PROVIDER

Google AI Models

Veo 3, Imagen 4, Nano Banana, and Gemini — Google's full stack from video generation to multimodal reasoning.

5 models · 19 variants · from $0.0000

All Google models available through RunAPI (5 models)

Gemini (Text): 6 variants · from $0.020
Gemini Omni (Video): 3 variants · from $0.0000
Imagen 4 (Image): 4 variants · from $0.040
Nano Banana (Image): 4 variants · from $0.040
Veo 3.1 (Video): 2 variants · from $0.600

OVERVIEW

Google offers frontier models spanning video (Veo 3.1 with synchronized audio), image (Imagen 4, Nano Banana), and language (Gemini multimodal LLMs). Through RunAPI, the entire Google AI stack shares a single API key.

Single API key shared across all providers
No separate Google account required
Model skills carry docs, schemas, and setup steps into your workspace
Per-call billing in USD, no subscription or minimum spend
Failed generations are never charged
Switch models by changing one parameter
Billing consolidated into one monthly invoice

FEATURES

What stands out

FASTEST

Gemini

P50 latency under 1 s

One of the most-used model APIs from Google, chosen by developers for its balance of output quality, speed, and pricing.

FRONTIER

Veo 3.1

Frontier tier

One of the most-used model APIs from Google, chosen by developers for its balance of output quality, speed, and pricing.

CHEAPEST

Gemini

from $0.020 / 1K tokens

Lowest starting price across the Google catalog, suited for high-volume workflows and cost-sensitive production pipelines.

MODELS

All Google models available through RunAPI

Gemini (Google · Text)
Gemini API access for Google's multimodal LLM across chat, code generation, reasoning, and long-context tasks.
Endpoint: /v1/chat/completions
From $0.020 / 1K tokens

Gemini Omni (Google · Video)
Gemini Omni API access for voice, character, and multimodal video resources in agent media workflows.
From $0.0000 / call

Imagen 4 (Google · Image)
Imagen 4 API access for photorealistic text-to-image, precise typography, broad styles, and up to 2K resolution.
From $0.040 / call

Nano Banana (Google · Image)
Nano Banana API access for fast text-to-image with accurate in-image text and multi-character consistency.
From $0.040 / call

Veo 3.1 (Google · Video)
Veo 3.1 API access for high-fidelity video generation up to 4K with synthesized dialogue, sound effects, and ambience.
From $0.600 / call

QUICKSTART

Install a Google model skill for your app.

Pick a model and add its skill so your coding tool has docs, schemas, pricing notes, and setup steps. Skills work with Claude Code, Codex, Gemini CLI, Cursor, and VS Code. Install once, then switch models by changing one parameter.

# Base URL
https://runapi.ai

# Endpoints
POST /v1/chat/completions
POST /v1beta/models/*:streamGenerateContent

# cURL (reads the API key from the RUNAPI_KEY environment variable)
curl https://runapi.ai/v1/chat/completions \
  -H "Authorization: Bearer $RUNAPI_KEY" \
  -H "Content-Type: application/json" \
  -d '{
  "model": "gemini-3.5-flash",
  "messages": [
    {
      "role": "user",
      "content": "Analyze this codebase and suggest three performance improvements with before/after examples."
    }
  ]
}'

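# Python (OpenAI SDK pointed at RunAPI)
import os

from openai import OpenAI

client = OpenAI(
    base_url="https://runapi.ai/v1",
    api_key=os.environ["RUNAPI_KEY"],  # same variable the cURL example reads
)

response = client.chat.completions.create(
    model="gemini-3.5-flash",
    messages=[{"role": "user", "content": "Analyze this codebase and suggest three performance improvements with before/after examples."}],
)

# Print the text of the first completion choice.
print(response.choices[0].message.content)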

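# Node.js (OpenAI SDK pointed at RunAPI)
// Top-level await requires an ES module (.mjs or "type": "module").
import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://runapi.ai/v1",
  apiKey: process.env.RUNAPI_KEY, // same variable the cURL example reads
});

const response = await client.chat.completions.create({
  model: "gemini-3.5-flash",
  messages: [{ role: "user", content: "Analyze this codebase and suggest three performance improvements with before/after examples." }]
});

// Print the text of the first completion choice.
console.log(response.choices[0].message.content);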
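
Where the OpenAI SDK is not available, the cURL call can be mirrored with Python's standard library. This is a minimal sketch: the URL, headers, and body shape are copied from the cURL example, the helper name `build_chat_request` is illustrative, and the request is only constructed here, with the network call left commented out.

```python
import json
import os
import urllib.request

BASE_URL = "https://runapi.ai"  # base URL from the quickstart


def build_chat_request(model: str, prompt: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a POST request to /v1/chat/completions."""
    body = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    return urllib.request.Request(
        url=f"{BASE_URL}/v1/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Falls back to a placeholder key so the sketch runs without configuration.
req = build_chat_request(
    "gemini-3.5-flash",
    "Say hello.",
    os.environ.get("RUNAPI_KEY", "your-runapi-key"),
)

# To send the request (assumes a JSON response body):
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp))
```

Keeping construction separate from sending makes the request easy to inspect or log before any billable call is made.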

REFERENCE

Every Google variant with pricing and model IDs

Model         Variant                     Billing         From
Gemini        gemini-2.5-flash            per 1K tokens   $0.020
Gemini        gemini-2.5-pro              per 1K tokens   $0.050
Gemini        gemini-3-flash-preview      per 1K tokens   $0.020
Gemini        gemini-3-pro-preview        per 1K tokens   $0.060
Gemini        gemini-3.1-pro-preview      per 1K tokens   $0.060
Gemini        gemini-3.5-flash            per 1K tokens   $0.050
Gemini Omni   gemini-omni-audio           per call        $0.0000
Gemini Omni   gemini-omni-character       per call        $0.0000
Gemini Omni   gemini-omni-text-to-video   per call        $3.60
Imagen 4      imagen-4                    per call        $0.080
Imagen 4      imagen-4-fast               per call        $0.040
Imagen 4      imagen-4-pro-remix-image    per call        $0.180
Imagen 4      imagen-4-ultra              per call        $0.120
Nano Banana   nano-banana                 per call        $0.040
Nano Banana   nano-banana-2               per call        $0.080
Nano Banana   nano-banana-edit            per call        $0.040
Nano Banana   nano-banana-pro             per call        $0.180
Veo 3.1       veo-3.1                     per call        $2.50
Veo 3.1       veo-3.1-fast                per call        $0.600
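
The two billing units in the table (per 1K tokens and per call) lead to two different cost calculations. The sketch below uses a few of the listed "From" prices as rough lower bounds. It assumes, purely for illustration, that the starting price applies uniformly to every token or call. Actual charges may depend on settings this page does not describe. The function names are illustrative.

```python
# "From" prices in USD, copied from the pricing table for a few variants.
PER_1K_TOKENS = {
    "gemini-2.5-flash": 0.020,
    "gemini-2.5-pro": 0.050,
    "gemini-3.5-flash": 0.050,
}
PER_CALL = {
    "imagen-4-fast": 0.040,
    "nano-banana-pro": 0.180,
    "veo-3.1-fast": 0.600,
}


def estimate_token_cost(variant: str, tokens: int) -> float:
    """Lower-bound cost for a token-billed variant: tokens / 1000 * price."""
    return tokens / 1000 * PER_1K_TOKENS[variant]


def estimate_call_cost(variant: str, calls: int) -> float:
    """Lower-bound cost for a call-billed variant: calls * price."""
    return calls * PER_CALL[variant]


# 250,000 tokens on gemini-2.5-flash: 250 * $0.020
print(f"${estimate_token_cost('gemini-2.5-flash', 250_000):.2f}")  # $5.00
# 10 calls to veo-3.1-fast: 10 * $0.600
print(f"${estimate_call_cost('veo-3.1-fast', 10):.2f}")  # $6.00
```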

FAQ

Frequently asked questions about Google

Is this an official Google integration?

RunAPI exposes a managed API surface with transparent per-call pricing, fully documented capability and parameters, and clear error behavior. You get the same model output quality without managing a direct provider relationship or provider-side account.

Do I need a Google account?

No. Your RunAPI API key is enough for managed access to all Google models. You do not need to create a separate account, manage provider-specific credentials, or handle provider-side billing.

What's the latency overhead from proxying through RunAPI?

Typically under 20 ms. RunAPI keeps the proxy layer close to model execution regions to minimize added latency. Media generation time is dominated by the model itself, not the proxy.

Are images / videos cached?

Generated outputs are stored and retrievable by task ID. You can fetch completed results at any time using the task status endpoint or the RunAPI dashboard. Output URLs remain accessible for the retention period shown in the API docs. Inputs are not cached or stored.
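
Retrieval by task ID usually means polling until the task finishes. This page does not give the task status endpoint's path or response fields, so the sketch below takes the lookup as a caller-supplied function (`fetch_status`) and treats the `status` values `"completed"` and `"failed"` and the `output_url` field as hypothetical placeholders. Check the API docs for the real schema.

```python
import time
from typing import Callable


def wait_for_task(
    task_id: str,
    fetch_status: Callable[[str], dict],
    interval_s: float = 2.0,
    timeout_s: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll fetch_status(task_id) until the task completes, fails, or times out.

    The "status" key and its values are assumptions for illustration only.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        task = fetch_status(task_id)
        if task["status"] == "completed":
            return task
        if task["status"] == "failed":
            raise RuntimeError(f"task {task_id} failed")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"task {task_id} not finished after {timeout_s}s")
        sleep(interval_s)


# Usage with a stand-in lookup that reports "pending" once, then "completed".
_fake_states = iter([
    {"status": "pending"},
    {"status": "completed", "output_url": "https://example.invalid/out.mp4"},
])
result = wait_for_task("task_123", lambda _id: next(_fake_states), sleep=lambda s: None)
print(result["output_url"])
```

Injecting the lookup and the sleep function keeps the loop testable without network access. The official SDKs are described as handling async task polling already, so a hand-written loop like this is mainly useful with raw HTTP.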

Can I bring my own key?

Not currently. Calls use RunAPI-managed access, which simplifies authentication and lets RunAPI handle rate limiting, retries, and billing consolidation on your behalf.

How is billing consolidated?

All API calls across all providers appear on a single monthly USD invoice. There is no per-provider billing, no subscription, and no minimum spend. Failed generations are never charged.
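
As a worked illustration of the billing rule, the sketch below totals a handful of calls and skips the failed one. The record layout (`model`, `price_usd`, `status`) is hypothetical and is not RunAPI's invoice schema. The prices are the "From" values listed for those variants.

```python
# Hypothetical call records; field names are illustrative only.
calls = [
    {"model": "veo-3.1-fast", "price_usd": 0.600, "status": "completed"},
    {"model": "imagen-4-fast", "price_usd": 0.040, "status": "completed"},
    {"model": "imagen-4-fast", "price_usd": 0.040, "status": "failed"},
]


def invoice_total(records: list[dict]) -> float:
    """Sum the price of completed calls; failed generations contribute nothing."""
    return sum(r["price_usd"] for r in records if r["status"] == "completed")


# 0.600 + 0.040 are billed; the failed imagen-4-fast call is excluded.
print(f"${invoice_total(calls):.2f}")  # $0.64
```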

What SDKs can I use with Google models?

Official SDKs are available for Python, Node.js, PHP, Java, Ruby, and Go. Each SDK handles authentication, async task polling, and typed responses. For LLM models, the OpenAI and Anthropic SDKs also work by pointing the base URL to RunAPI.

What are model skills and how do they work?

Model skills are installable packages that load a model's docs, typed schemas, pricing notes, and setup steps directly into your coding workspace. Install a skill with one command and your agent has the right context before you write integration code. Skills work with Claude Code, Codex, Gemini CLI, Cursor, and VS Code.

How do I switch between Google models?

Change the model parameter in your API request. All Google models share the same API key, the same request shape, and the same billing. No code changes, no re-authentication, and no separate billing setup are required when switching between models or between variants of the same model. You can also switch to models from other providers by changing the same parameter — the API surface is unified across the entire catalog.
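
For the chat endpoint, switching variants can be sketched as building the same request body with a different `model` value. The model IDs come from the pricing table. The helper name `chat_payload` is illustrative, and the body shape follows the quickstart's `/v1/chat/completions` example.

```python
def chat_payload(model: str, prompt: str) -> dict:
    """Request body for /v1/chat/completions; only `model` varies per variant."""
    return {"model": model, "messages": [{"role": "user", "content": prompt}]}


prompt = "Summarize this changelog in three bullet points."
fast = chat_payload("gemini-2.5-flash", prompt)
pro = chat_payload("gemini-2.5-pro", prompt)

# The set of keys whose values differ between the two request bodies.
changed = {key for key in fast if fast[key] != pro[key]}
print(changed)  # {'model'}
```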