Business Standard

Google I/O 2024: Gemini AI gets new capabilities to rival OpenAI's ChatGPT

Google said Gemini AI is set to become the foundational model powering its services such as Search, Photos, Workspace, Android, and more

Google I/O 2024

Harsh Shivam New Delhi
Google I/O 2024 kicked off with a keynote address focused on Gemini, its artificial intelligence (AI) model that is set to get new capabilities to become the foundational model powering its services such as Search, Photos, Workspace, Android, and more. 

With Gemini, Google said, the goal is to make AI helpful for everyone. On that note, Google announced that it is expanding “AI overviews in Search” to everyone in the US this week and to more countries soon. While this was a long time coming, Google threw in a surprise with the Gemini-powered “Ask Photos” feature for Google Photos. It essentially lets you search your entire Google Photos library and follow up on the results with more complex prompts. More details on “Ask Photos” will be available later this year, when the feature is slated to roll out.

As for Gemini itself, the model has been updated with new capabilities, said Google. Called Gemini 1.5 Pro, the new and improved version will be available to all developers globally. In addition, Google announced that Gemini 1.5 Pro with a one-million-token context window is now directly available to consumers in Gemini Advanced. It can be used across 35 languages.

Here is a roundup of everything Google announced at the I/O 2024 keynote:

Gemini in Workspace

Google said that it is rolling out the Gemini 1.5 Pro model to its paid-tier customers with a new side panel in Workspace apps such as Gmail, Drive, Docs, Sheets and more. The side panel resembles Microsoft’s Copilot side panel on desktops and makes AI easier to reach from any Workspace app.

Another feature coming to Workspace is the new Gemini AI Teammate, which is essentially an AI-powered assistant for Workspace apps. The AI Teammate has its own Google Account and can be added to groups within Chat.

Google Project Astra

Google’s Project Astra is a multimodal AI agent with real-time spatial understanding. Google said that the AI agent is capable of understanding objects in a physical space and can process the data in real time. It can watch and remember what it sees through your device’s camera and respond to prompts based on it. Google said that the AI agent will be powering the company’s Gemini product starting later this year.

AI in Search

One of the biggest takeaways from Google’s announcement is AI in Search. The search engine will soon get the ability to analyse and search based on video inputs, similar to how it does with images using Google Lens. 

Google said that Search is backed by a custom Gemini AI model and gets improved contextual understanding. Search results get AI-powered overviews, which were previously part of the Search Generative Experience (SGE) and were available as an experimental feature. Leveraging Gemini, Google said, Search can also break longer queries into smaller parts for better understanding.

Circle to Search

Circle to Search for Android is set to get new features. Google said that the updated version of the feature will allow users to simply circle a mathematical problem, and Google’s AI will provide steps that should make it easier to solve.

Smarter Gemini Assistant for Android

Google said that the Gemini AI assistant for Android will soon be able to harness multimodality by understanding the video playing on the display and letting users ask questions based on it. The assistant will also gain the ability to answer the user's query based on a document such as a PDF file.

Gemini is also getting a new “Live” feature that will allow it to understand live video in real time and hold a more natural conversation with the user.

Gems: Custom Gemini chatbots

Google said that it will soon allow Gemini Advanced subscribers to create custom chatbots for carrying out a specific task. The feature is similar to custom GPTs on OpenAI’s ChatGPT. 

Scam call detection on Android

Using the on-device Gemini Nano model, select Android-powered smartphones will soon be able to detect whether an incoming phone call is a scam. Google said that the feature will understand the conversation pattern during the call and will notify the user if it thinks the ongoing call is a scam. According to Google, call data will be processed on-device for privacy and security.

Google Veo

Google is set to rival OpenAI’s Sora with its new generative AI model called Veo, which the company said will be able to generate videos in 1080p resolution. The model will generate videos based on text, image, and video-based prompts and will allow users to further edit the generated video with more prompts.

First Published: May 15 2024 | 9:49 AM IST
