LLM API Pricing

Gemini API Pricing in 2026

Google charges $1.25–$10 per million tokens for Gemini 2.5 Pro and far less for Flash. Gemini is the cheapest flagship among the big three. RunAPI mirrors every Gemini model at half the official rate.

Updated June 18, 2026 RunAPI Editorial
At a glance

What does the Gemini API cost right now?

Google publishes per-token prices for two main tiers: Flash for fast, cheap tasks and Pro for maximum capability. There is also a free tier with daily request limits. All prices below are per million tokens, the billing unit Google uses.

Cheapest option

Gemini 2.5 Flash at $0.08/M input and $0.30/M output through RunAPI. Official rate is $0.15/$0.60.

Maximum capability

Gemini 2.5 Pro at $0.63/M input and $5.00/M output through RunAPI. Official rate is $1.25/$10.

Free tier

Google offers a free tier with up to 500 requests per day on Flash, useful for prototyping before you pay.

Cheapest flagship

Gemini 2.5 Pro undercuts Claude Sonnet and GPT-5.4 on input price, making it the cheapest flagship of the big three.

Model-by-model breakdown

How much does each Gemini model cost per million tokens?

The table below shows official Google pricing alongside RunAPI pricing. RunAPI applies a flat 50% discount across all Gemini models. No volume commits, no subscriptions.

Model Official input /M Official output /M RunAPI input /M RunAPI output /M Context window
Gemini 2.5 Pro $1.25 $10.00 $0.63 $5.00 1M
Gemini 2.5 Flash $0.15 $0.60 $0.08 $0.30 1M
Gemini 2.5 Flash-Lite $0.10 $0.40 $0.05 $0.20 1M
Free tier

How does Google's Gemini free tier work?

Google offers a free tier so you can prototype before paying. It has daily request limits and lower rate limits than the paid tier. It suits testing, not production traffic.

Daily request cap

Up to 500 requests per day on Gemini 2.5 Flash through the free tier. Enough for prototyping and low-volume side projects.

Lower rate limits

The free tier caps requests per minute well below the paid tier. Bursty or production workloads will hit the limit quickly.

Data usage terms

Free-tier inputs may be used to improve Google's products. Paid-tier and RunAPI traffic is not used for training, which matters for sensitive data.

When to upgrade

Move to paid or RunAPI once you need steady throughput, higher rate limits, or stronger data handling. RunAPI charges 50% of the official paid rate with no daily cap.

Provider comparison

Is Gemini cheaper than Claude and GPT?

Developers often compare Gemini against Claude Sonnet and GPT-5.4. Here is how the flagship models stack up on a per-million-token basis through RunAPI.

Provider Flagship model Input /M Output /M RunAPI rate
Google Gemini 2.5 Pro $1.25 $10.00 $0.63 / $5.00
OpenAI GPT-5.4 $2.50 $15.00 $1.25 / $7.50
Anthropic Claude Sonnet 4.6 $6.00 $30.00 $3.00 / $15.00

RunAPI applies a 50% discount on all providers listed above. Gemini 2.5 Pro is the cheapest flagship on input price. Prices verified June 2026.

Real-world costs

What does the Gemini API actually cost for real workloads?

Token costs look abstract until you attach them to real tasks. Below are five common developer workloads with estimated monthly costs at two usage levels through RunAPI.

Workload Model Light use (~50 tasks/day) Heavy use (~200 tasks/day) Monthly saving vs official
Long-context document analysis Gemini 2.5 Pro $18/mo $72/mo $18–$72
High-volume classification Gemini 2.5 Flash $3/mo $12/mo $3–$12
RAG-powered docs chatbot Gemini 2.5 Flash $5/mo $20/mo $5–$20
Content generation pipeline Gemini 2.5 Pro $15/mo $60/mo $15–$60
Multi-agent orchestrator Gemini 2.5 Pro $60/mo $240/mo $60–$240
Getting started

How to access the Gemini API through RunAPI

1

Create a RunAPI account

Sign up at runapi.ai. No credit card required for the free tier.

2

Copy your API key

Go to Dashboard → API Keys. Create a key and save it — you will use this as your OpenAI API key.

3

Point your SDK to RunAPI

Set the base URL to https://api.runapi.ai/v1 and use your RunAPI API key. Any OpenAI-compatible client works — Python, Node.js, Go, Ruby, or curl.

4

Start making requests

Use gemini-2.5-pro, gemini-2.5-flash, or any Gemini model ID in the model parameter. RunAPI handles routing and billing at 50% of the official rate.

Frequently asked questions

Gemini API Pricing FAQ

How much does the Gemini API cost?

Official Google pricing for Gemini 2.5 Pro is $1.25/M input and $10/M output. Gemini 2.5 Flash is $0.15/M input and $0.60/M output. Through RunAPI every Gemini model is half that rate, with no subscription or volume commitment. You only pay for the tokens each request uses.

Is Gemini cheaper than Claude and GPT?

On input price, yes. Gemini 2.5 Pro at $1.25/M input undercuts GPT-5.4 ($2.50) and Claude Sonnet ($6) on the official rate, making it the cheapest flagship of the big three. RunAPI halves all three, so the gap holds.

Does Gemini have a free tier?

Yes. Google offers a free tier with up to 500 requests per day on Gemini 2.5 Flash and lower rate limits than the paid tier. It suits prototyping. Free-tier inputs may be used to improve Google's products, so avoid sensitive data.

Why is RunAPI 50% cheaper than the official API?

RunAPI negotiates volume pricing with model providers and passes the savings to developers. There is no quality difference — same models, same OpenAI-compatible API, same output. You only change the base URL and key, and your existing client code keeps working unchanged.

Can I use Gemini with the OpenAI SDK?

Yes. RunAPI is OpenAI-compatible. Point any OpenAI client at api.runapi.ai/v1 and use Gemini model IDs like gemini-2.5-pro. Existing OpenAI SDK code works without changes beyond the base URL and key.

Why is Gemini good for long-context tasks?

Gemini 2.5 Pro and Flash both offer a 1M-token context window, larger than most Claude and GPT models. Combined with low input pricing, this makes Gemini cost-effective for analyzing long documents, large codebases, or many files at once.

Is there a usage cap on the paid API?

The paid API has no fixed message cap. You pay per token and scale as needed, subject to rate limits on requests per minute. Through RunAPI there is no daily request cap like the free tier, so production traffic runs without throttling.

How does billing work?

Pay-as-you-go. You fund your account with a balance, and each API call deducts the token cost at half the official rate. No subscriptions, no invoices, no contracts. You can monitor spend per key from the RunAPI dashboard and set up alerts before the balance runs low.

Start using Gemini at half price.

Create a free RunAPI account, get your API key, and start calling Gemini 2.5 Pro or Flash at 50% off official Google pricing — the cheapest flagship of the big three.