Google expands Gemini: more human and more precise voice conversations

Google has announced an update to Gemini 2.5 Flash Native Audio, its advanced technology for voice assistants and conversational artificial intelligence. The update represents a leap in the naturalness, precision and efficiency of voice interactions, with the aim of delivering experiences closer to a real human conversation for both consumers and businesses.

The current version, integrated into products such as Google AI Studio, Vertex AI, Gemini Live and Search Live, introduces notable improvements in three key areas: more precise function calling, better instruction following, and smoother dialogue.

The model now recognizes in real time when it needs to fetch information and integrates that information into the conversation without losing coherence. This advancement is critical for complex workflows or conversations that require dynamic access to data, such as telephone customer support.
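As a rough illustration of the function-calling pattern described above, the sketch below shows how an application might run a call that a model requests mid-conversation and hand the result back. It is a minimal sketch under stated assumptions, not the Gemini API: the `get_order_status` tool, the `TOOLS` registry, the `dispatch` helper and the shape of the request dictionary are all hypothetical.

```python
from typing import Any, Callable, Dict


# Hypothetical tool: a stand-in for a real customer-support lookup.
def get_order_status(order_id: str) -> Dict[str, Any]:
    orders = {"A100": "shipped", "B200": "processing"}
    return {"order_id": order_id, "status": orders.get(order_id, "unknown")}


# Registry mapping the tool names a model may request to local functions.
TOOLS: Dict[str, Callable[..., Dict[str, Any]]] = {
    "get_order_status": get_order_status,
}


def dispatch(call: Dict[str, Any]) -> Dict[str, Any]:
    """Run the function named in a model-issued call request."""
    name = call["name"]
    if name not in TOOLS:
        # Report the problem back to the model instead of raising.
        return {"error": f"unknown tool: {name}"}
    return TOOLS[name](**call.get("args", {}))


# A call request as a model might emit it (the shape is an assumption).
request = {"name": "get_order_status", "args": {"order_id": "A100"}}
result = dispatch(request)
print(result)
```

In a real voice session the result would be passed back to the model, which would then weave it into its spoken reply.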

Internal and benchmark testing shows Gemini 2.5 Flash Native Audio leading the ComplexFuncBench Audio evaluation, with a 71.5% success rate in handling multi-step function calls. Additionally, the rate of adherence to instructions rose to 90%, improving user and developer satisfaction compared to the previous version.

Another notable advancement is context retrieval in multi-turn conversations, which allows the model to return to previous topics with a cohesion ever closer to that of a conversation between people.

Business applications of this model are already showing tangible results. Shopify reported that users often forget they are talking to an artificial intelligence during their first interaction with its Sidekick assistant.

In the financial sector, United Wholesale Mortgage (UWM) highlighted that it generated more than 14,000 loans thanks to Gemini's ability to handle complex calls.

For artificial intelligence solutions provider Newo.ai, the Gemini update on Vertex AI enables its virtual receptionists to identify the main speaker even in noisy environments and to switch languages during a conversation while maintaining natural expressiveness.

One of the most promising features is live voice translation. Gemini now supports simultaneous speech-to-speech translation, enabling both continuous listening and two-way conversations in real time. When using headphones, the system translates the speech around the user into a selected language without losing the intonation, rhythm or pitch of the original. It also enables smooth conversations between people who speak different languages, with the output language changing automatically depending on the speaker.

The tool supports more than 70 languages and 2,000 translation pairs, with multilingual input capabilities that allow it to understand and process multiple languages in a single session. Thanks to automatic detection, it recognizes the spoken language and starts translating without the need for manual configuration.
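As a quick sanity check on those figures, the short sketch below counts how many language pairs 70 languages could form. It assumes every language can be paired with every other, which the article does not state, and it uses 70 as a lower bound because the article says "more than 70".

```python
from math import comb

languages = 70  # lower bound taken from the article ("more than 70")

# Unordered pairs: English/Spanish counted once, regardless of direction.
unordered_pairs = comb(languages, 2)

# Ordered pairs: English-to-Spanish and Spanish-to-English counted separately.
ordered_pairs = languages * (languages - 1)

print(unordered_pairs, ordered_pairs)  # 2415 4830
```

Under that assumption, 2,415 unordered pairs is in line with the "2,000 translation pairs" quoted above. The small gap may mean the figure is rounded or that not every combination is offered.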

In addition, the model filters ambient noise, expanding its uses outdoors or in crowded environments while maintaining high audio quality.

This technology is now available in public beta via the Google Translate application on Android devices in the US, Mexico and India, with plans to expand into additional countries and iOS systems. Google expects to gradually integrate the experience into other platforms, including the Gemini API, throughout 2026.

As part of the competition to create increasingly intelligent and useful voice assistants, Google’s strategy with Gemini aims not only to enrich the user experience but also to open up new business and global communication applications.

More natural conversations, greater precision in following instructions and real-time translation that preserves authentic nuances position Gemini as a benchmark in the development of artificial intelligence for human interaction.