The focus of generative AI has largely been on text-based interfaces that generate text, images, and more. Voice is the next wave, and it’s coming fast. Google announced today that its Vertex AI platform will be adding Chirp 3, its speech-to-text and HD text-to-speech models, starting next week.
This week, Google quietly announced the addition of eight new voices to Chirp 3 for 31 languages. The platform can be used to create voice assistants, audiobooks, support agents, and voice-overs for video. The news was revealed at an event held at Google’s DeepMind office in London.
The company’s efforts come at a time when others are making great strides with voice AI. Sesame, the startup behind the viral, very realistic-sounding “Maya” and “Miles” AI apps, announced last week that it would be launching a model for developers who want to build customized apps and services based on its tech.
There will be usage restrictions for Chirp 3 in order to prevent misuse. Thomas Kurian, Google Cloud CEO, said at a press event today that “we’re just working on some of these things” with the safety team.
ElevenLabs, a major startup, has raised hundreds of millions of dollars to expand its work in AI voice services.
The launch will bring Chirp 3 in line with Google’s flagship LLM Gemini, newer versions of which are being tested, as well as Imagen, its image-generation tool, and Veo 2, its expensive video-generation tool.
Although it is yet to be confirmed, the Chirp 3 that Google releases may not sound as “realistic” as other AI-powered “human” voices (Sesame’s in particular). Demis Hassabis, the CEO of DeepMind, stressed that this is a marathon, not a sprint.
“I don’t think that the silver bullet [AI is] will be available in the next two years,” he said, adding that we are still a long way from AGI. “It will change things over the next decade. So the medium to long term.” It’s an interesting moment in time.
Google launched Vertex AI in 2021 as a platform that developers could use to build machine-learning services in the cloud. This was before the explosion of interest in AI, and specifically generative AI, that followed the launch of OpenAI’s GPT services.
The company has been focusing on Vertex AI as it tries to catch up. Other companies such as Microsoft and Amazon are also developing generative AI tools for developers. Vertex AI allows developers to build generative AI services on Gemini and also train models, classify data and create models for production. It will be interesting to see if Google expands its walled garden to include models other than its own.
Google’s “Chirp” voice services have been around for years. The name was the code name for its initial efforts to compete with Amazon’s Alexa.
Ingrid joined TechCrunch as a writer and editor in February 2012 and is based in London. She has also worked as a freelancer for publications like the Financial Times. Ingrid covers mobile media, digital advertising, and the intersections between these.
She is most comfortable in English, but she can also speak Russian and Spanish (in order of proficiency).