OpenAI announces GPT-4o, ChatGPT macOS app, conversational AI in Voice Mode

OpenAI said GPT-4o is its most advanced model and is trained end-to-end across text, vision, and audio, meaning that all inputs and outputs are processed by the same neural network

OpenAI's GPT-4o
Harsh Shivam New Delhi
5 min read Last Updated : May 14 2024 | 11:32 AM IST


OpenAI has announced GPT-4o, its first artificial intelligence model that can natively reason across audio, vision and text. OpenAI said the “o” in GPT-4o stands for “Omni”, since the model is much better at understanding and interpreting text, images and audio than its predecessor. Alongside, the company announced a ChatGPT application for Apple’s macOS-based desktops and previewed conversational AI in Voice Mode. Below are the details:

GPT-4o

OpenAI calls GPT-4o “a step towards much more natural human-computer interaction”. The new version of the company’s GPT-4 model accepts any combination of text, audio and image as input and can produce any combination of the three as output. GPT-4o can respond to audio inputs in as little as 232 milliseconds, which the company said is similar to a human’s response time in a conversation.

Compared with the existing GPT-4 Turbo model, another iteration of the company’s GPT-4 model, GPT-4o matches its performance on English text understanding and coding while significantly outperforming it in audio understanding. GPT-4o also brings significant improvements on text in non-English languages.

OpenAI said GPT-4o also brings significant improvements in understanding images. For example, with ChatGPT based on GPT-4o, users can share a photo of a menu in a different language and ask the chatbot to translate it, explain the history of a dish, and recommend what to order.

Voice Mode with GPT-4o

The talkback feature in Voice Mode already exists in ChatGPT across both free and paid tiers. However, OpenAI said the new GPT-4o model brings significant improvements to it. OpenAI said GPT-4o is its most advanced model and is trained end-to-end across text, vision, and audio, meaning that all inputs and outputs are processed by the same neural network. This lowers latency for a more natural conversational experience and improves results, since no information is lost between separate models.

Prior to GPT-4o, OpenAI said, users could talk to ChatGPT in Voice Mode with average latencies of 2.8 seconds (GPT-3.5) and 5.4 seconds (GPT-4). This latency is the result of a pipeline of three separate models: one simple model transcribes audio to text, GPT-3.5 or GPT-4 takes in text and outputs text, and a third simple model converts that text back to audio. According to OpenAI, this process means the main source of intelligence, GPT-4, loses a lot of information.
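
The three-stage pipeline described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not OpenAI's implementation: the names AudioClip, transcribe, language_model, synthesize and cascaded_voice_mode are hypothetical stand-ins, and tone of voice is used as one assumed example of information that does not survive the text-only middle stage.

```python
from dataclasses import dataclass


@dataclass
class AudioClip:
    """Hypothetical stand-in for recorded speech."""
    words: str  # what was said
    tone: str   # how it was said, e.g. "cheerful" or "sarcastic"


def transcribe(clip: AudioClip) -> str:
    # Stage 1: speech-to-text. Only the words survive; the tone is dropped here.
    return clip.words


def language_model(prompt: str) -> str:
    # Stage 2: placeholder for GPT-3.5 or GPT-4, which only ever sees text.
    return f"You said: {prompt}"


def synthesize(text: str) -> AudioClip:
    # Stage 3: text-to-speech with a fixed, neutral delivery.
    return AudioClip(words=text, tone="neutral")


def cascaded_voice_mode(clip: AudioClip) -> AudioClip:
    """Chain the three stages in the order the article describes."""
    return synthesize(language_model(transcribe(clip)))


reply = cascaded_voice_mode(AudioClip(words="great, another meeting", tone="sarcastic"))
print(reply.words)
print(reply.tone)
```

In this toy version the sarcastic tone of the input never reaches the middle stage, so the reply cannot reflect it. A single model trained end-to-end on audio would, in principle, receive the original clip directly.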

ChatGPT app for macOS

Expanding the ChatGPT app ecosystem, OpenAI launched the chatbot app for Apple’s macOS-based desktops. The ChatGPT app for macOS will have deeper integration into the platform. A keyboard shortcut (Option + Space) will take users straight to ChatGPT’s conversation page, where they can prompt the chatbot with a query.

OpenAI confirmed that it is working on a Windows version of the app, which will launch “later this year”.

The macOS app for ChatGPT is currently rolling out to Plus subscribers and will also be available to free-tier users in the coming weeks.

Expanding capabilities for free users

The new GPT-4o model is available on ChatGPT for free-tier users, but with a limit on the number of messages. This limit will depend on usage and demand at the time of use, and ChatGPT will automatically switch to GPT-3.5 once the limit is reached. However, while using ChatGPT with GPT-4o, a free-tier user will get access to some advanced features that were previously limited to paid-tier subscribers.

A free-tier user with GPT-4o can upload files and pictures for summarising, analysing, and more. With the new model, free users can use the “Memory” feature and ask ChatGPT to remember information for future conversations. Additionally, free-tier users will get access to the GPT Store for browsing and using custom bots. The GPT Store was launched earlier this year for paid subscribers, allowing users to create their own chatbots, called GPTs, and share them on the store with other users. While free-tier users will gain access to the GPT Store and custom GPTs, they cannot create or share their own.

What remains exclusive to paid-tier users

While free-tier users are getting features that were previously limited to the paid tier, the new Voice Mode with GPT-4o will remain exclusive to paid-tier subscribers. Voice Mode with GPT-4o support will roll out to ChatGPT Plus subscribers in the coming weeks and will soon be available to Team and Enterprise users. OpenAI is also rolling out the GPT-4o model to paid subscribers with “fewer limitations”.

The new model is rolling out to ChatGPT Plus and Team users and will be available to Enterprise users in the coming days. The company said Plus users will have a message limit up to five times higher than that of free users, while Team and Enterprise users will have even higher limits.

First Published: May 14 2024 | 11:32 AM IST