Daily request cap
Up to 500 requests per day on Gemini 2.5 Flash through the free tier. Enough for prototyping and low-volume side projects.
Google charges $1.25–$10 per million tokens for Gemini 2.5 Pro and far less for Flash. Gemini is the cheapest flagship among the big three. RunAPI mirrors every Gemini model at half the official rate.
Google publishes per-token prices for two main tiers: Flash for fast, cheap tasks and Pro for maximum capability. There is also a free tier with daily request limits. All prices below are per million tokens, the billing unit Google uses.
Gemini 2.5 Flash at $0.08/M input and $0.30/M output through RunAPI. Official rate is $0.15/$0.60.
Gemini 2.5 Pro at $0.63/M input and $5.00/M output through RunAPI. Official rate is $1.25/$10.
Google offers a free tier with up to 500 requests per day on Flash, useful for prototyping before you pay.
Gemini 2.5 Pro undercuts Claude Sonnet and GPT-5.4 on input price, making it the cheapest flagship of the big three.
The table below shows official Google pricing alongside RunAPI pricing. RunAPI applies a flat 50% discount across all Gemini models. No volume commits, no subscriptions.
| Model | Official input /M | Official output /M | RunAPI input /M | RunAPI output /M | Context window |
|---|---|---|---|---|---|
| Gemini 2.5 Pro | $1.25 | $10.00 | $0.63 | $5.00 | 1M |
| Gemini 2.5 Flash | $0.15 | $0.60 | $0.08 | $0.30 | 1M |
| Gemini 2.5 Flash-Lite | $0.10 | $0.40 | $0.05 | $0.20 | 1M |
Google offers a free tier so you can prototype before paying. It has daily request limits and lower rate limits than the paid tier. It suits testing, not production traffic.
Up to 500 requests per day on Gemini 2.5 Flash through the free tier. Enough for prototyping and low-volume side projects.
The free tier caps requests per minute well below the paid tier. Bursty or production workloads will hit the limit quickly.
Free-tier inputs may be used to improve Google's products. Paid-tier and RunAPI traffic is not used for training, which matters for sensitive data.
Move to paid or RunAPI once you need steady throughput, higher rate limits, or stronger data handling. RunAPI charges 50% of the official paid rate with no daily cap.
Developers often compare Gemini against Claude Sonnet and GPT-5.4. Here is how the flagship models stack up on a per-million-token basis through RunAPI.
| Provider | Flagship model | Input /M | Output /M | RunAPI rate |
|---|---|---|---|---|
| Gemini 2.5 Pro | $1.25 | $10.00 | $0.63 / $5.00 | |
| OpenAI | GPT-5.4 | $2.50 | $15.00 | $1.25 / $7.50 |
| Anthropic | Claude Sonnet 4.6 | $6.00 | $30.00 | $3.00 / $15.00 |
RunAPI applies a 50% discount on all providers listed above. Gemini 2.5 Pro is the cheapest flagship on input price. Prices verified June 2026.
Token costs look abstract until you attach them to real tasks. Below are five common developer workloads with estimated monthly costs at two usage levels through RunAPI.
| Workload | Model | Light use (~50 tasks/day) | Heavy use (~200 tasks/day) | Monthly saving vs official |
|---|---|---|---|---|
| Long-context document analysis | Gemini 2.5 Pro | $18/mo | $72/mo | $18–$72 |
| High-volume classification | Gemini 2.5 Flash | $3/mo | $12/mo | $3–$12 |
| RAG-powered docs chatbot | Gemini 2.5 Flash | $5/mo | $20/mo | $5–$20 |
| Content generation pipeline | Gemini 2.5 Pro | $15/mo | $60/mo | $15–$60 |
| Multi-agent orchestrator | Gemini 2.5 Pro | $60/mo | $240/mo | $60–$240 |
Sign up at runapi.ai. No credit card required for the free tier.
Go to Dashboard → API Keys. Create a key and save it — you will use this as your OpenAI API key.
Set the base URL to https://api.runapi.ai/v1 and use your RunAPI API key. Any OpenAI-compatible client works — Python, Node.js, Go, Ruby, or curl.
Use gemini-2.5-pro, gemini-2.5-flash, or any Gemini model ID in the model parameter. RunAPI handles routing and billing at 50% of the official rate.
Official Google pricing for Gemini 2.5 Pro is $1.25/M input and $10/M output. Gemini 2.5 Flash is $0.15/M input and $0.60/M output. Through RunAPI every Gemini model is half that rate, with no subscription or volume commitment. You only pay for the tokens each request uses.
On input price, yes. Gemini 2.5 Pro at $1.25/M input undercuts GPT-5.4 ($2.50) and Claude Sonnet ($6) on the official rate, making it the cheapest flagship of the big three. RunAPI halves all three, so the gap holds.
Yes. Google offers a free tier with up to 500 requests per day on Gemini 2.5 Flash and lower rate limits than the paid tier. It suits prototyping. Free-tier inputs may be used to improve Google's products, so avoid sensitive data.
RunAPI negotiates volume pricing with model providers and passes the savings to developers. There is no quality difference — same models, same OpenAI-compatible API, same output. You only change the base URL and key, and your existing client code keeps working unchanged.
Yes. RunAPI is OpenAI-compatible. Point any OpenAI client at api.runapi.ai/v1 and use Gemini model IDs like gemini-2.5-pro. Existing OpenAI SDK code works without changes beyond the base URL and key.
Gemini 2.5 Pro and Flash both offer a 1M-token context window, larger than most Claude and GPT models. Combined with low input pricing, this makes Gemini cost-effective for analyzing long documents, large codebases, or many files at once.
The paid API has no fixed message cap. You pay per token and scale as needed, subject to rate limits on requests per minute. Through RunAPI there is no daily request cap like the free tier, so production traffic runs without throttling.
Pay-as-you-go. You fund your account with a balance, and each API call deducts the token cost at half the official rate. No subscriptions, no invoices, no contracts. You can monitor spend per key from the RunAPI dashboard and set up alerts before the balance runs low.
Create a free RunAPI account, get your API key, and start calling Gemini 2.5 Pro or Flash at 50% off official Google pricing — the cheapest flagship of the big three.