Google has announced its next-generation AI model, Gemini 1.5. According to the company, this multimodal large language model (MLLM) shows "dramatic improvements" across several areas. Google said the new model can achieve quality comparable to Gemini 1.0 Ultra, currently its most advanced AI model, while using less computation.
The first Gemini 1.5 model that the company is releasing for early testing is the Pro model. Gemini 1.5 Pro, a mid-size multimodal model, will be available to select developers and enterprise customers through AI Studio and Vertex AI in a private preview.
Google CEO Sundar Pichai, in a blog post, stated that the Gemini 1.5 Pro model can process more information compared to the previous generation. "We've been able to significantly increase the amount of information our models can process — running up to 1 million tokens consistently, achieving the longest context window of any large-scale foundation model yet," wrote Pichai.
Gemini 1.5: What is new
Google's Gemini 1.5 model is based on a Mixture-of-Experts (MoE) architecture. Whereas a traditional Transformer works as one large neural network, MoE models divide the network into smaller "experts", each specialised for a particular kind of task.
Depending on the type of input provided, these models selectively activate only the most relevant experts to carry out the task. This technique makes the model more efficient and also improves the quality of the output. The MoE architecture also allows the model to be trained to carry out more complex tasks.
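The routing idea described above can be illustrated with a toy sketch. This is not Google's implementation, whose details have not been published here; the function names (`softmax`, `route`), the gate weights and the three tiny "experts" are all hypothetical, chosen only to show how a gate can score the experts for a given input and run just the highest-scoring one.

```python
import math

def softmax(scores):
    """Turn raw gate scores into probabilities that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def route(token, gate_weights, experts, top_k=1):
    """Send `token` only to the `top_k` experts the gate rates highest.

    Hypothetical toy router: the gate score for each expert is a dot
    product between that expert's gate weights and the input vector.
    """
    scores = [sum(w * x for w, x in zip(row, token)) for row in gate_weights]
    probs = softmax(scores)
    # Indices of the experts with the highest gate probability.
    chosen = sorted(range(len(experts)), key=lambda i: probs[i], reverse=True)[:top_k]
    norm = sum(probs[i] for i in chosen)
    output = [0.0] * len(token)
    for i in chosen:
        expert_out = experts[i](token)   # only the chosen experts are run
        weight = probs[i] / norm         # renormalise over the chosen experts
        output = [o + weight * e for o, e in zip(output, expert_out)]
    return output, chosen

# Three made-up "experts", each a small function on a 2-number input.
experts = [
    lambda v: [2.0 * x for x in v],
    lambda v: [x + 1.0 for x in v],
    lambda v: [-x for x in v],
]
# Made-up gate weights, one row per expert.
gate_weights = [
    [1.0, 0.0],
    [0.0, 1.0],
    [-1.0, -1.0],
]

output, chosen = route([3.0, 0.5], gate_weights, experts)
print(chosen, output)
```

With `top_k=1`, two of the three experts are never evaluated for this input, which is where the saving in computation comes from in this simplified picture.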
Google said that the Gemini 1.5 model has a bigger "context window". The context window is made up of tokens, which can be pieces of words, images, video or code. The bigger the context window, the more information a model can take as input.
With Gemini 1.5 Pro, which is currently in testing, Google has increased the context window to 1 million tokens, up from 32,000 in Gemini 1.0. Google said that the new model can process one hour of video, 11 hours of audio and over 700,000 words in one go.
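The figures above allow a rough back-of-the-envelope check. The sketch below uses only the numbers quoted in this article; the tokens-per-word ratio is an assumption derived from "1 million tokens" and "over 700,000 words", and the helper `fits_in_context` is hypothetical. Real tokenizers do not convert words to tokens at a fixed rate, so treat the result as an estimate only.

```python
import math

# Figures quoted in the article.
CONTEXT_WINDOW_GEMINI_1_0 = 32_000
CONTEXT_WINDOW_GEMINI_1_5_PRO = 1_000_000
WORDS_IN_ONE_GO = 700_000  # "over 700,000 words", so this is a lower bound

# How many times larger the new context window is.
growth = CONTEXT_WINDOW_GEMINI_1_5_PRO / CONTEXT_WINDOW_GEMINI_1_0

# Assumed average tokens per word implied by the quoted figures.
# Because the word count is a lower bound, this ratio is an upper bound.
tokens_per_word = CONTEXT_WINDOW_GEMINI_1_5_PRO / WORDS_IN_ONE_GO

def fits_in_context(text, context_window, ratio=tokens_per_word):
    """Estimate whether `text` fits in a window of `context_window` tokens."""
    estimated_tokens = math.ceil(len(text.split()) * ratio)
    return estimated_tokens <= context_window, estimated_tokens

document = "word " * 50_000  # a stand-in for a 50,000-word document
print(growth)
print(fits_in_context(document, CONTEXT_WINDOW_GEMINI_1_0))
print(fits_in_context(document, CONTEXT_WINDOW_GEMINI_1_5_PRO))
```

Under these assumptions the window is about 31 times larger, and a 50,000-word document that would overflow a 32,000-token window fits comfortably within 1 million tokens.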