Business Standard

OpenAI starts rolling out Gemini Live-like Advanced Voice Mode for ChatGPT

ChatGPT's Advanced Voice Mode utilises the GPT-4o AI model's multimodal capability to offer more natural-sounding real-time conversations with the AI chatbot

ChatGPT Advanced Voice

ChatGPT Advanced Voice

Harsh Shivam New Delhi

Listen to This Article

Microsoft-backed artificial intelligence startup OpenAI has announced the rollout of the Advanced Voice Mode feature for all ChatGPT Plus and Team subscribers. This new feature will allow users to engage in more natural conversations with the ChatGPT AI chatbot. It will be available within the ChatGPT app for paid members by the end of this week. Enterprise and education customers will gain access at a later date.

OpenAI has also noted that Advanced Voice Mode is currently unavailable in the European Union and select regions, including the UK, Switzerland, Iceland, Norway, and Liechtenstein.
 

ChatGPT has also introduced five new voices—Arbor, Maple, Sol, Spruce, and Vale—bringing the total to nine voice options.

ChatGPT Advanced Voice Mode: What is it?

OpenAI introduced Advanced Voice Mode during the launch of its GPT-4o model in May this year. The company explained that this mode facilitates more natural, real-time conversations with the AI chatbot, stating that it will “allow you to interrupt at any time and senses and responds to your emotions.”
In the current version, the voice mode operates with latencies averaging 2.8 seconds on GPT-3.5 and 5.4 seconds on GPT-4 models. This latency results from a data processing pipeline involving three separate models: one for transcribing audio to text, GPT-3.5 or GPT-4 for text processing, and another for converting text back to audio. OpenAI noted that this multi-model process can lead to a significant loss of information for GPT-4.

The GPT-4o model addresses this issue by processing all inputs and outputs—text, vision, and audio—through the same neural network. This integration reduces latency, enhances the naturalness of conversations, and improves overall performance. Additionally, GPT-4o is better equipped to handle interruptions, manage group conversations, filter out background noise, and adapt to tone.
While announcing the rollout schedule for ChatGPT Advanced Voice Mode, OpenAI stated that users will be able to set custom instructions for the Advanced Voice. Furthermore, the company has improved conversational speed, smoothness, and accents in select foreign languages.

Don't miss the most important news and views of the day. Get them on our Telegram channel

First Published: Sep 25 2024 | 4:07 PM IST

Explore News