Google goes after OpenAI with Veo 2 video generation AI model: Details here
Built on the model Google introduced earlier this year, Veo 2 can generate videos at up to 4K resolution with lengths extending to several minutes, whereas OpenAI's Sora produces short videos at full-HD resolution
Harsh Shivam | New Delhi
Google has enhanced its artificial intelligence-powered image and video generation capabilities. The American tech giant has introduced its second-generation video generation model, Veo 2, alongside improvements to its existing Imagen 3 image-generation model, which now produces brighter and better-composed images. The company has also unveiled a new experimental tool, Whisk, which allows users to stylise and remix images for unique outputs.
Google Veo 2 model
Google said the Veo 2 model is designed to better understand real-world physics, human movement, and expression, enabling it to generate more realistic videos with finer detail. Google claims the model can handle complex requests, including specifications of genre, lens type, and cinematic effects. The new model can generate videos in resolutions up to 4K, with video lengths extending to several minutes.
Veo 2 is integrated into Google Labs' video generation tool, VideoFX. Users can visit Google Labs and join the waitlist for access to the new features. Google also plans to expand Veo 2 to YouTube Shorts and other products next year.
Imagen 3
Google has also enhanced its Imagen 3 image-generation model, which now offers the ability to render a wider variety of art styles with greater accuracy—from photo-realism and impressionism to abstract and anime. The update also improves the model’s ability to follow input prompts more closely and produce images with more detail and texture.
Like Veo 2, the updated Imagen 3 model will be available in Google Labs, in the image-generation tool called ImageFX.
Whisk
Google’s latest experimental tool, Whisk, combines the capabilities of Imagen 3’s image generation with Gemini’s visual understanding and description capabilities. Whisk allows users to input or create images according to their preferences and remix them for unique outputs. When a user inputs an image, Gemini automatically writes a detailed caption, which is then fed into Imagen 3. This process allows the model to generate images in different styles based on the input and description.