Date: 2025-12-12

Google has updated Search Live with Gemini 2.5 Flash Native Audio, upgrading how voice functions inside Search while also extending the model’s use across translation and live voice agents. The update introduces more natural spoken responses in Search Live and reflects Google’s effort to treat voice as a core interface: a way for users to get everything they can get from regular search, ask questions about the physical world around them, and receive immediate voice translations between two people speaking different languages.

The updated voice capabilities, rolling out this week in the United States, make Google’s voice responses sound more natural, and the responses can even be slowed down for instructional content.

According to Google:

“When you go Live with Search, you can have a back-and-forth voice conversation in AI Mode to get real-time help and quickly find relevant sites across the web. And now, thanks to our latest Gemini model for native audio, the responses on Search Live will be more fluid and expressive than ever before.”

Broader Gemini Native Audio Rollout

This Search upgrade is part of a broader update to Gemini 2.5 Flash Native Audio rolling out across Google’s ecosystem, including Gemini Live (in the Gemini App), Google AI Studio, and Vertex AI. The model processes spoken audio in real time and produces fluid spoken responses, reducing friction in live conversation. Although Google’s announcement didn’t state that the model is a speech-to-speech model (as opposed to speech-to-text followed by text-to-speech), the update follows Google’s October announcement of Speech-to-Retrieval (S2R), which it described as “a neural network-based machine-learning model trained on large datasets of paired audio queries.”

These changes show Google treating native audio as a core capability across consumer-facing products, making it easier for users to ask and receive information about the physical world around them in a natural manner that wasn’t previously possible.

Improvements For Voice-Based Systems

For developers and enterprises building voice-based systems, Google says the updated model improves reliability in several areas. Gemini 2.5 Flash Native Audio more consistently triggers external functions during conversations, follows complex instructions, and maintains context across multiple turns. These improvements make live voice agents more dependable in real-world workflows, where misinterpreted instructions or broken conversational flow reduce usability.

Smooth Conversational Translation

Beyond Search and voice agents, the update introduces native support for “live speech-to-speech translation.” Gemini translates spoken language in real time, either by continuously translating ambient speech into a target language or by handling conversations between speakers of different languages in both directions. The system preserves vocal characteristics such as speech rhythm and emphasis, so the translation sounds smoother and more conversational.

Google highlights several capabilities supporting this translation feature, including broad language coverage, automatic language detection, multilingual input handling, and noise filtering for everyday environments. These features reduce setup friction and allow translation to occur passively during conversation rather than through manual controls. The result is a translation experience that behaves much like a human interpreter standing between two people.

Voice Search Realizing Google’s Aspirations

The update reflects Google’s continued iteration of voice search toward an ideal that was originally inspired by the science fiction voice interactions between humans and computers in the popular Star Trek television and movie series.

Featured Image by Shutterstock/Jackbin