Google launches Gemini 3.1 Flash Live, a more human-like conversational voice model

Source: Ars Technica

Key Points

  • Google released Gemini 3.1 Flash Live, a real‑time conversational voice model.
  • The model scored 36.1 percent in Scale AI’s Audio MultiChallenge, below non‑conversational models that exceed 50 percent.
  • SynthID watermarks are embedded in the audio output; they are inaudible but can be detected for verification.
  • Early partners Home Depot and Verizon reported positive experiences with the model.
  • Developers can access the model via AI Studio, the Gemini API, and Gemini Enterprise for Customer Experience.
  • Gemini 3.1 Flash Live will appear in Gemini Live and Search Live (AI Mode) starting today.

Google introduced Gemini 3.1 Flash Live, a real‑time voice model designed to sound more like a person. In Scale AI’s Audio MultiChallenge the model scored 36.1 percent, trailing non‑conversational audio models that exceed 50 percent. The new system embeds SynthID watermarks that are inaudible to listeners but detectable for verification. Early partners, including Home Depot and Verizon, reported positive results. Developers can access the model via AI Studio, the Gemini API, and Gemini Enterprise for Customer Experience, and the technology is also appearing in the Gemini Live and Search Live features.

Google unveils Gemini 3.1 Flash Live

Google announced the rollout of Gemini 3.1 Flash Live, a conversational voice model that aims to make AI speech sound more human. The model is part of the Gemini family and is being integrated into several Google products, including Gemini Live and Search Live, a feature of AI Mode.

On Scale AI’s Audio MultiChallenge benchmark, Gemini 3.1 Flash Live scored 36.1 percent. That places the model ahead of many real‑time audio solutions, but it remains below non‑conversational audio models, which can score above 50 percent on the same test.

To help distinguish AI‑generated speech from real human voices, Google embedded SynthID watermarks into the output of Gemini 3.1 Flash Live. These watermarks are not audible to listeners but can be detected by tools designed to verify the source of the audio. Google indicated that the watermarks are intended to prevent misuse of the technology.

Early testing partners such as Home Depot and Verizon have shared positive feedback on the model’s performance. Their reports, highlighted in a Google blog post, describe the model’s ability to mimic human speech convincingly. The partners noted that the new voice capabilities could improve customer interactions across phone and digital channels.

Developers now have multiple ways to work with Gemini 3.1 Flash Live. The model is available through AI Studio, the Gemini API, and Gemini Enterprise for Customer Experience. The enterprise offering is positioned as a toolkit for “agentic shopping,” allowing businesses to build more natural‑sounding conversational experiences.

Google emphasized that the model will be most visible in Gemini Live and Search Live, where users can experience the enhanced voice interactions directly. The rollout begins today, marking the latest step in Google’s effort to make AI assistants sound more realistic.

