Google’s I/O 2026: Gemini 3.5 Flash and Gemini Omni raise the bar
New Gemini models pair frontier intelligence with action and turn any input into high-quality generated video.
Inteeka · 20 May 2026 · 4 min read

At Google I/O 2026 on 20 May, Google introduced two models that move the conversation on from what an AI can say to what it can actually do. Gemini 3.5 Flash leads a new series that, in Google’s words, combines frontier intelligence with action. Alongside it, Gemini Omni takes images, audio, video and text and turns them into new video. For businesses already weighing how to put generative AI to work, these are not incremental updates. They shift what is realistic to build, and at what cost.
What Google announced
Google described Gemini 3.5 Flash as the first in its latest series of models combining frontier intelligence with action. The headline is that it pairs strong reasoning with speed and low cost: Google says it delivers frontier-level intelligence at exceptional speed, often at less than half the cost of other frontier models. It is aimed squarely at long-horizon agentic tasks: rapidly planning, building and iterating to solve real-world problems, from developing applications and maintaining codebases to preparing financial documents. Google reports it outperforms Gemini 3.1 Pro on benchmarks including Terminal-Bench 2.1, GDPval-AA and MCP Atlas, and that it is available through Google Antigravity, the Gemini API, Google AI Studio and Android Studio.
Gemini Omni is the second strand. Google says it can create anything from any input (starting with video) by combining Gemini’s intelligence with generative media models. It accepts references across images, text, video and audio, and produces cohesive output from those mixed inputs. The notable advance is an improved understanding of physical forces like gravity, kinetic energy and fluid dynamics, which Google says enables more realistic scenes and better character consistency, so identity and voice are preserved across every shot. Generated content carries an imperceptible SynthID watermark that can be verified through the Gemini app, Chrome and Search.
Why it matters for businesses
The significance is not any single benchmark; it is the combination. The two constraints that have held back real deployments are cost and reliability at speed. A model that offers frontier-level reasoning while costing materially less changes the arithmetic of automating work that was previously too expensive to hand to an AI. And the explicit focus on action (agentic tasks that plan and iterate over many steps) points at the kind of work most businesses actually want done: not a single clever answer, but a job carried through to completion.
On the creative side, Gemini Omni lowers the barrier to producing video from assets a business already holds. The practical gains worth noting:
- Lower cost per task: work that once demanded a premium model can now run at a price that makes routine automation viable.
- Agentic reach: longer, multi-step tasks such as maintaining a codebase or preparing documents move within range of a single capable model.
- Multimodal content: turning existing images, audio and text into consistent video opens up marketing and product use cases without a production crew.
- Provenance built in: the SynthID watermark gives a verifiable signal of AI-generated media, which matters for trust and for emerging disclosure rules.
What to do about it
A new model is an opportunity, not a project. The temptation is to chase the announcement; the better move is to start from a specific job that a cheaper, faster, more capable model now makes worthwhile. If you already run a process on an older or pricier model, this is a moment to re-evaluate the trade-off rather than assume the original choice still holds. The cost difference alone can change whether a use case is worth shipping.
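One way to make that re-evaluation concrete is a back-of-envelope cost comparison. The sketch below is illustrative only: `ModelPricing`, `cost_per_task` and `monthly_saving` are hypothetical names, and every price and token count is a placeholder rather than a published rate, so substitute your provider's real pricing and your own measured usage.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelPricing:
    """Hypothetical per-million-token prices; replace with real rates."""
    name: str
    input_per_million: float
    output_per_million: float


def cost_per_task(pricing: ModelPricing, input_tokens: int, output_tokens: int) -> float:
    """Cost of one task, given how many tokens it reads and writes."""
    total = (input_tokens * pricing.input_per_million
             + output_tokens * pricing.output_per_million)
    return total / 1_000_000


def monthly_saving(current: ModelPricing, candidate: ModelPricing,
                   input_tokens: int, output_tokens: int,
                   tasks_per_month: int) -> float:
    """Monthly difference in spend if the same workload moves to the candidate."""
    per_task_delta = (cost_per_task(current, input_tokens, output_tokens)
                      - cost_per_task(candidate, input_tokens, output_tokens))
    return per_task_delta * tasks_per_month


# Placeholder figures, not real prices for any model.
current = ModelPricing("current-model", input_per_million=10.0, output_per_million=30.0)
candidate = ModelPricing("candidate-model", input_per_million=4.0, output_per_million=12.0)

saving = monthly_saving(current, candidate,
                        input_tokens=20_000, output_tokens=2_000,
                        tasks_per_month=5_000)
print(f"Estimated monthly saving: {saving:.2f}")
```

A price comparison like this only covers half of the trade-off: the saving is worth taking only if the cheaper model still completes the task to the standard you need.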
Treat agentic capability with the same care you would any system that acts on your behalf. Faster, cheaper reasoning is welcome, but a model that takes action still needs well-scoped tools, sensible limits and a way to measure whether it is getting the job right. For Gemini Omni, the near-term wins are concrete: product visuals, short-form video and repeated creative variations, provided you keep a human eye on the output and lean on the built-in watermark for transparency. The right posture is curiosity with discipline: test against your own tasks, measure the result, and only then widen the remit.
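As a rough illustration of "well-scoped tools, sensible limits and a way to measure", the sketch below wraps tool access in an allowlist with a call budget and scores an agent against a handful of your own test cases. Everything here is hypothetical: `ScopedTools`, `pass_rate` and `stand_in_agent` are invented for the example, and the agent is a plain function standing in for a real model-driven one, not a call to any Gemini API.

```python
from typing import Callable, Dict, List, Tuple


class ToolBudgetExceeded(RuntimeError):
    """Raised when an agent tries to exceed its allowed number of tool calls."""


class ScopedTools:
    """Expose only allowlisted tools and cap the total number of calls."""

    def __init__(self, tools: Dict[str, Callable[..., object]], max_calls: int) -> None:
        self._tools = dict(tools)
        self._max_calls = max_calls
        self.calls_made = 0

    def call(self, name: str, *args: object, **kwargs: object) -> object:
        if name not in self._tools:
            raise PermissionError(f"tool not allowed: {name}")
        if self.calls_made >= self._max_calls:
            raise ToolBudgetExceeded(f"limit of {self._max_calls} calls reached")
        self.calls_made += 1
        return self._tools[name](*args, **kwargs)


def pass_rate(agent: Callable[[str, ScopedTools], object],
              cases: List[Tuple[str, object]],
              make_tools: Callable[[], ScopedTools]) -> float:
    """Fraction of cases the agent gets right; guardrail violations count as failures."""
    if not cases:
        raise ValueError("need at least one test case")
    passed = 0
    for task, expected in cases:
        try:
            if agent(task, make_tools()) == expected:
                passed += 1
        except (PermissionError, ToolBudgetExceeded):
            pass  # the agent stepped outside its scope or budget
    return passed / len(cases)


def stand_in_agent(task: str, tools: ScopedTools) -> object:
    """Stand-in for a real agent: parses 'a+b' and uses only the 'add' tool."""
    a, b = (int(part) for part in task.split("+"))
    return tools.call("add", a, b)


def make_tools() -> ScopedTools:
    # A fresh toolbox per task, so each task gets its own call budget.
    return ScopedTools({"add": lambda a, b: a + b}, max_calls=3)


# The last case has a deliberately wrong expected value, to show a failure being counted.
cases = [("2+3", 5), ("10+5", 15), ("1+1", 3)]
rate = pass_rate(stand_in_agent, cases, make_tools)
print(f"Pass rate: {rate:.0%}")
```

The point of the design is that widening the remit becomes an explicit decision: you add a tool to the allowlist or raise the budget only after the pass rate on your own cases justifies it.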
The takeaway
Gemini 3.5 Flash and Gemini Omni are a useful marker of where the ground is shifting: intelligence that is cheaper and faster, paired with the ability to act, and generative media that can build coherent video from whatever you already have. The opportunity is real, but it rewards a clear head: pick a specific job, test it on your own data, measure the outcome, and grow from there. That is how a new model becomes business value rather than a headline.
Source: Google: Google launches Gemini 3.5 Flash and Gemini Omni at I/O 2026