Google has made Gemini 3.5 Flash generally available at $1.50 per million input tokens, offering frontier-level reasoning at four times the speed of comparable models — putting direct price pressure on OpenAI's fast inference lineup.

Google's Gemini 3.5 Flash Is Now Generally Available at $1.50 Per Million Tokens

Q: What is Why This Matters?

For enterprises: The combination of frontier-quality reasoning and sub-$2 per million token pricing removes the economic barrier that has kept fast inference workloads from scaling. Real-time document review, agentic task orchestration, and high-volume classification jobs all become more viable.

By Hector Herrera | May 24, 2026 | News

Google has made Gemini 3.5 Flash generally available, offering frontier-level reasoning at four times the speed of comparable models, priced at $1.50 per million input tokens and $9 per million output tokens. The release puts direct cost-and-speed pressure on OpenAI's o-series fast inference models and marks Google's most aggressive pricing move yet to win enterprise AI workloads that demand real-time response.

For developers building production applications where latency matters — customer support agents, code assistants, document processing pipelines — Gemini 3.5 Flash is now a credible and significantly cheaper option at scale.

The Numbers That Matter

According to LLM Stats tracking of the model release, Gemini 3.5 Flash's pricing breaks down as:

Input: $1.50 per million tokens
Output: $9.00 per million tokens
Speed: Approximately 4x faster than comparable frontier-class models in its tier
Context window and benchmark performance: Not yet independently verified at publication

Note: This report originates from LLM Stats, a model-tracking aggregator. Google has not separately issued a standalone press release on the GA launch as of publication. Verify pricing via Google AI Studio before building pricing models into your infrastructure plans.

Gemini 3.5 Flash first appeared at Google I/O in May 2026, where it was announced alongside Gemini 2.5 Pro and a suite of agentic features. The GA launch moves it from preview access to full production availability.

Context: The Fast-Inference Race

The fastest-growing segment of the AI model market in 2026 is not the largest models — it is fast, cheap, high-quality inference models that can run at production scale without burning through API budgets. OpenAI's o4-mini and o3 models have dominated this tier since early 2026. Anthropic's Claude Haiku 4.5 is competitive in cost.

Google has historically led on raw infrastructure scale but lagged on enterprise developer mindshare. Gemini 3.5 Flash's $1.50 input price point is designed to change that calculation. At that price, a company processing 10 billion input tokens per month pays $15,000 — competitive with, and in some scenarios below, alternatives in the same capability tier.

Why This Matters

For enterprises: The combination of frontier-quality reasoning and sub-$2 per million token pricing removes the economic barrier that has kept fast inference workloads from scaling. Real-time document review, agentic task orchestration, and high-volume classification jobs all become more viable.

For the model market: Every time Google prices aggressively at GA, it compresses margins for the entire industry. OpenAI and Anthropic face pressure to match or differentiate on quality. This is good for buyers and tightens the economics for every lab competing at this tier.

For Google: Winning enterprise AI workloads through Google Cloud is strategically more valuable than headline benchmark wins. GA availability means Google Cloud enterprise sales teams can now quote Gemini 3.5 Flash to customers with service-level agreements rather than preview caveats — a meaningful shift in the sales motion.

What to Watch

Independent benchmarks from the ML research community on Gemini 3.5 Flash's actual performance against o4-mini and Claude Haiku 4.5 will determine whether the speed and pricing claims hold under real workload conditions. Also watch for Google I/O developer adoption numbers — early API call volumes will signal whether the GA launch is driving new usage or consolidating existing Gemini preview users.

Google's Gemini 3.5 Flash Is Now Generally Available at $1.50 Per Million Tokens

Google's Gemini 3.5 Flash Is Now Generally Available at $1.50 Per Million Tokens

The Numbers That Matter

Context: The Fast-Inference Race

Why This Matters

What to Watch

More from NexChron

DuckDuckGo's No-AI Search Hits Record Traffic After Google AI Mode Rollout

Daily AI Briefing — 2026-06-02

Daily AI Briefing — 2026-05-29