Google Gemini 3.5 Flash Announced: Fast AI Model Rivals GPT 5.5

Google launches Gemini 3.5 Flash, an agent-optimised model that matches GPT 5.5 on coding benchmarks while being dramatically more efficient and cost-effective.

Robson Pereira · May 31, 2026 · 5 min read

Google has unveiled Gemini 3.5 Flash, a new model optimised for agentic AI workflows that rivals OpenAI's GPT 5.5 on key benchmarks while being dramatically more efficient. Alongside the Flash release, Google previewed an "Omni" model that can natively process and generate multiple modalities.

Performance that punches above its weight

According to Google, Gemini 3.5 Flash substantially outperforms older Flash models on both Terminal Bench and SWE-Bench Pro coding evaluations. Crucially, it shows a small but measurable improvement versus Google's own Gemini 3.1 Pro, and scores in the same neighbourhood as OpenAI's much larger and more expensive GPT 5.5.

On OSWorld-Verified, which tests how models handle general tasks in real computing environments, the new Flash again substantially outperforms older Flash models and is slightly faster than Gemini 3.1 Pro. It's essentially tied with GPT 5.5.

Built for agentic workflows

The model is designed from the ground up for agentic AI—systems where AI takes action on behalf of users rather than just generating text. Tulsee Doshi, Google's head of product for Gemini, described agents as "a model plus a harness such that the combination can actually take action on your behalf."

Google says Gemini 3.5 Flash excels at UI control tasks that are critical for agents: "Certain things like UI control are expensive to do because the model has to search the page, it has to know where to click, it has to act through multiple steps. I think Flash is able to do that well because of that combination of quality and cost."
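To make the "model plus a harness" idea concrete, here is a minimal sketch of a harness loop in Python. It is an illustration only, not Google's implementation: `FakePage`, `Action`, `scripted_model`, and `run_agent` are all hypothetical names, and the "model" is a hard-coded policy standing in for a real model call.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Action:
    kind: str          # "type", "click", or "done"
    target: str = ""   # hypothetical UI element name
    text: str = ""     # text to enter for "type" actions


@dataclass
class FakePage:
    """Stand-in for a real UI: one search box and one button."""
    fields: dict = field(default_factory=lambda: {"search_box": ""})
    submitted: bool = False

    def observe(self) -> str:
        # A real harness might return a screenshot or DOM snapshot instead.
        return f"search_box={self.fields['search_box']!r} submitted={self.submitted}"

    def apply(self, action: Action) -> None:
        if action.kind == "type":
            self.fields[action.target] = action.text
        elif action.kind == "click" and action.target == "search_button":
            self.submitted = True


def scripted_model(goal: str, observation: str) -> Action:
    """Hypothetical policy standing in for a model call."""
    if "search_box=''" in observation:
        return Action("type", "search_box", goal)
    if "submitted=False" in observation:
        return Action("click", "search_button")
    return Action("done")


def run_agent(goal: str, page: FakePage,
              model: Callable[[str, str], Action], max_steps: int = 10) -> int:
    """Harness loop: observe, ask the model, apply the action.

    Returns the number of model calls made.
    """
    for step in range(1, max_steps + 1):
        action = model(goal, page.observe())
        if action.kind == "done":
            return step
        page.apply(action)
    return max_steps


page = FakePage()
calls = run_agent("gemma", page, scripted_model)
print(calls, page.submitted)
```

Even this toy task takes three model calls (type, click, confirm done). Because every step in the loop is a separate call, per-call cost and latency compound quickly on longer UI tasks, which is the trade-off the quote above points to.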

Where it's available

Gemini 3.5 Flash is rolling out across Google's ecosystem:

- **Gemini app** and **AI Studio** for developers
- **Android Studio** for mobile development
- **Antigravity IDE 2.0** with support for spawning parallel sub-agents
- **All Google enterprise products**

Google says a Pro variant of Gemini 3.5 is already in internal testing and expected next month.

The Omni model

Alongside 3.5 Flash, Google announced Omni Flash—a multimodal model that can natively generate text, images, and audio without relying on separate specialised models. "The vision for Gemini has always been that it would be multimodal in, multimodal out," Doshi said. "Omni is a step toward that vision."

An Omni Pro model is planned but has no timeline yet.

Implications for self-hosted AI

While Gemini 3.5 Flash is a cloud-first model, its efficiency gains are notable for the self-hosted community. The trend toward smaller, more capable models that can run on less hardware is what makes local AI deployment viable. Models like Phi-4, Gemma, and Llama 3 have shown that distilled and optimised architectures can deliver strong results on consumer hardware.

For teams looking to run Google's open-weight models locally, our Gemma 3 Local Setup guide covers practical deployment. The improvements in Gemini 3.5 Flash suggest future open-weight releases from Google may also benefit from similar architectural advances.

Source

**Ars Technica:** https://arstechnica.com/google/2026/05/google-announces-agent-optimized-gemini-3-5-flash-and-a-do-anything-model-called-omni/
