OpenAI rolls out advanced voice mode in ChatGPT for Plus users: What is it

Explaining advanced voice mode, OpenAI said that ChatGPT will offer more natural, real-time conversations, allow users to interrupt at any time, and sense and respond to their emotions

ChatGPT's advanced voice mode
Prakruti Mishra New Delhi
Last Updated: Jul 31 2024 | 12:28 PM IST
Microsoft-backed artificial intelligence startup OpenAI announced in a post on X that advanced voice mode is starting to roll out to a small group of ChatGPT Plus users. Announced in May, the feature was originally slated for release in June this year but was delayed because it needed more time to reach the company's launch standard. Here is all you need to know about OpenAI’s advanced voice mode for ChatGPT:

ChatGPT advanced voice mode: What is it

In September 2023, OpenAI announced support for voice and image capabilities in ChatGPT. This was followed in May this year by a new multimodal language model, dubbed GPT-4o, which the company said would enable advanced voice mode in ChatGPT.

Explaining the advanced capabilities in voice mode, OpenAI said that the feature offers “more natural, real-time conversations, allows you to interrupt anytime, and senses and responds to your emotions.”

The current voice mode in ChatGPT works with average latencies of 2.8 seconds (GPT-3.5) and 5.4 seconds (GPT-4). This latency results from a data processing pipeline of three separate models: one simple model transcribes audio to text, GPT-3.5 or GPT-4 takes in text and outputs text, and a third simple model converts that text back to audio. According to OpenAI, this process means that the main source of intelligence, GPT-4, loses a lot of information.
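
For readers who want to see why a three-model pipeline loses information, here is a minimal sketch in Python. It is an illustration only, not OpenAI's implementation: the `Audio` class and the functions `transcribe`, `language_model`, `synthesize`, and `cascaded_voice_mode` are hypothetical stand-ins for the three stages described above. The point it shows is that the middle model only ever receives text, so anything carried by the audio alone, such as tone or background sound, never reaches it.

```python
from dataclasses import dataclass


@dataclass
class Audio:
    """Toy stand-in for a voice clip (hypothetical, for illustration only)."""
    words: str        # what was said
    tone: str         # how it was said, e.g. "excited"
    background: str   # other sounds in the clip


def transcribe(audio: Audio) -> str:
    # Stage 1 (hypothetical): speech-to-text keeps only the words.
    # Tone and background noise are dropped at this point.
    return audio.words


def language_model(text: str) -> str:
    # Stage 2 (hypothetical placeholder for GPT-3.5 or GPT-4):
    # text goes in, text comes out. It never sees the original audio.
    return f"Reply to: {text}"


def synthesize(text: str) -> Audio:
    # Stage 3 (hypothetical): text-to-speech. With only text to work from,
    # this sketch always speaks in a fixed, neutral tone.
    return Audio(words=text, tone="neutral", background="")


def cascaded_voice_mode(audio: Audio) -> Audio:
    # The three stages run one after another, so their delays add up.
    return synthesize(language_model(transcribe(audio)))


user_clip = Audio(words="What a day", tone="excited", background="traffic")
reply = cascaded_voice_mode(user_clip)
print(reply.words)
print(reply.tone)
```

In this sketch the reply comes back in a neutral tone regardless of how the user spoke, because the tone was discarded at the first stage. Each stage also has to finish before the next one starts, which is one way to picture where the added delay comes from.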

With the GPT-4o model, which the company said is trained end-to-end across text, vision, and audio, all inputs and outputs are processed by the same neural network. This lowers latency, making conversations feel more natural, and improves results, since no information is lost between separate models. Additionally, OpenAI said that GPT-4o is better at handling interruptions, managing group conversations, filtering out background noise, and adapting to tone.

Essentially, the advanced voice mode enables conversational artificial intelligence in ChatGPT.

ChatGPT advanced voice mode: Availability

The advanced voice mode capability is currently being tested with a small batch of ChatGPT Plus users. OpenAI said the users selected for this alpha will receive an email with instructions and a message in their mobile app. OpenAI plans to add more people on a rolling basis and expects everyone on Plus to have access in the fall.

OpenAI said that what it learns from this alpha will help it make the advanced voice experience safer and more enjoyable for everyone. The startup plans to share a detailed report on GPT-4o’s capabilities, limitations, and safety evaluations in early August.


Topics: OpenAI, ChatGPT, Technology, Chatbot

First Published: Jul 31 2024 | 12:28 PM IST
