Google enhances Gemini AI with new audio, video, agent features: Details
At Google I/O 2025, the tech giant announces major updates to Gemini AI, including real-time visual assistance, creative tools for professionals, and deeper integration with Google services
Gemini features introduced at Google I/O (Image: Google)
Last Updated: May 21, 2025 | 12:15 PM IST
At its annual developer conference, Google I/O 2025, the US-based software giant introduced a comprehensive set of upgrades to its Gemini AI platform. The updates aim to make Gemini more interactive, creative, and capable – extending its use beyond text to support image creation, video generation, research, and real-time assistance.
Here is a breakdown of what’s new in Gemini:
Gemini Live: Real-time visual help now on mobile
Gemini Live is now available for free on both Android and iOS, providing users with real-time assistance using the device’s camera and screen sharing features. In upcoming updates, Gemini Live will integrate more closely with Google services such as Calendar, Maps, and Tasks, allowing users to convert conversations into actionable items – like planning events, locating restaurants, or setting reminders.
Google said that users will retain complete control over app permissions and personal data through privacy settings. Gemini will also enable navigation with Maps and interaction with other Google services via its conversational interface.
Agent Mode: Smarter autonomy on the desktop
Gemini’s new Agent Mode is an experimental feature launching soon on desktop for users subscribed to the Ultra plan. According to Google, Agent Mode will offer intelligent automation by combining advanced tools such as live web browsing, deep research, and integration with services like Chrome.
Agent Mode is designed to autonomously handle tasks such as scheduling, information gathering, and workflow organisation – marking a step towards fully agentic AI experiences within Google’s ecosystem.
AI Ultra: Premium subscription for advanced users
Google has introduced AI Ultra, a new premium subscription plan priced at $249.99/month, offering access to its powerful AI tools. Designed for creative professionals, developers, and filmmakers, the plan includes higher usage limits and exclusive capabilities. Its benefits include:
Access to advanced image and video tools like Whisk
VIP-level support for complex or technical projects
YouTube Premium
Deep Research: Integrating public and personal data
Google said that the Deep Research feature allows users to generate detailed, personalised reports by merging public data with their own private files – such as PDFs and images. This hybrid approach will enable a more contextual understanding of complex topics by aligning individual insights with broader data trends.
Google will soon allow direct integration with Gmail and Google Drive, making research processes faster and more efficient.
Veo 3: AI video generation with audio and dialogue
Google’s latest AI video model, Veo 3, introduces native audio generation for the first time. This includes background sounds and character dialogue, alongside significant improvements in visual fidelity and text rendering.
The model supports photorealistic scene creation, lip synchronisation, and short-form storytelling, offering advanced capabilities for content creators. Veo 3 is available exclusively to Ultra plan subscribers.
Imagen 4: Next-generation AI image model
Google has also launched Imagen 4, calling it the most advanced image generation model to date. It offers improved text rendering and visual detail, enabling users to generate high-quality graphics for presentations, marketing, or social media content.