Gemma 3n: All about Google's open model for on-device AI on phones, laptops

Google says Gemma 3n makes use of a new technique called Per-Layer Embeddings (PLE), which allows the model to consume much less RAM than similarly sized models

Google Gemma 3n
Harsh Shivam New Delhi
Last Updated: May 22 2025 | 5:12 PM IST
At its annual Google I/O conference, Google unveiled Gemma 3n, a new addition to its Gemma 3 series of open AI models. The company said the model is designed to run efficiently on everyday devices such as smartphones, laptops, and tablets. Gemma 3n shares its architecture with the upcoming generation of Gemini Nano, the lightweight AI model that already powers several on-device AI features on Android devices, such as voice recorder summaries on Pixel smartphones.

Gemma 3n model: Details

Google says Gemma 3n makes use of a new technique called Per-Layer Embeddings (PLE), which allows the model to consume much less RAM than similarly sized models. Although the model comes in 5-billion and 8-billion parameter sizes (5B and 8B), this memory optimisation brings its RAM usage closer to that of a 2B or 4B model, respectively. In practical terms, this means Gemma 3n can run with just 2GB to 3GB of RAM, making it viable for a much wider range of devices.
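As a rough illustration of why the effective parameter count matters, memory for model weights scales with the number of parameters held in RAM multiplied by the bytes used per parameter. The sketch below is back-of-the-envelope arithmetic only, not Google's published method: the helper name `weight_memory_gb` and the 8-bit weight width are assumptions made for illustration, the article does not state which numeric precision Gemma 3n uses, and the estimate ignores activations, caches, and runtime overhead.

```python
def weight_memory_gb(params_billion: float, bits_per_weight: int) -> float:
    """Approximate memory for model weights alone, in gigabytes (10^9 bytes)."""
    bytes_total = params_billion * 1e9 * bits_per_weight / 8
    return bytes_total / 1e9


# Raw parameter counts from the article, paired with the effective
# footprint Google compares them to (2B and 4B).
# ASSUMPTION: 8 bits per weight, chosen only to make the arithmetic concrete.
BITS = 8
for label, raw_b, effective_b in [("5B", 5, 2), ("8B", 8, 4)]:
    raw_gb = weight_memory_gb(raw_b, BITS)
    eff_gb = weight_memory_gb(effective_b, BITS)
    print(f"{label}: raw ~{raw_gb:.1f} GB, effective ~{eff_gb:.1f} GB at {BITS} bits per weight")
```

Changing `BITS` shows how strongly the estimate depends on precision, which is why the figures printed here should not be read as the model's actual RAM requirement.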

Gemma 3n model: Key capabilities

  • Audio input: The model can process sound-based data, enabling applications like speech recognition, language translation, and audio analysis.
  • Multimodal input: With support for visual, text, and audio inputs, the model can handle complex tasks that involve combining different types of data.
  • Broad language support: Google said that the model is trained on over 140 languages.
  • 32K token context window: Gemma 3n supports input sequences of up to 32,000 tokens, allowing it to handle large chunks of data in one go, which is useful for summarising long documents or performing multi-step reasoning.
  • PLE caching: The model’s internal components (embeddings) can be stored temporarily in fast local storage (like the device’s SSD), helping reduce the RAM needed during repeated use.
  • Conditional parameter loading: If a task doesn’t require audio or visual capabilities, the model can skip loading those parts, saving memory and speeding up performance.
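The conditional parameter loading idea in the list above can be sketched as a toy loader that brings a component into memory only when a request needs it. This is a minimal illustration under stated assumptions, not Gemma 3n's actual loading mechanism: the class `ConditionalLoader`, the component names, and the sizes in megabytes are all hypothetical placeholders.

```python
from typing import Dict, Set

# HYPOTHETICAL component sizes in MB; placeholder numbers,
# not Gemma 3n's real figures.
COMPONENT_SIZES_MB: Dict[str, int] = {"text": 1000, "vision": 300, "audio": 200}


class ConditionalLoader:
    """Toy loader that only brings in the components a request needs."""

    def __init__(self, sizes_mb: Dict[str, int]) -> None:
        self.sizes_mb = sizes_mb
        self.loaded: Set[str] = set()

    def ensure_loaded(self, modalities: Set[str]) -> None:
        """Load the text core plus any extra modalities requested."""
        needed = {"text"} | modalities
        unknown = needed - self.sizes_mb.keys()
        if unknown:
            raise KeyError(f"unknown components: {sorted(unknown)}")
        # Components that are already resident are not loaded again.
        self.loaded |= needed

    def resident_mb(self) -> int:
        """Total size of the components currently held in memory."""
        return sum(self.sizes_mb[name] for name in self.loaded)


loader = ConditionalLoader(COMPONENT_SIZES_MB)
loader.ensure_loaded(set())        # text-only request: vision and audio are skipped
text_only_mb = loader.resident_mb()
loader.ensure_loaded({"audio"})    # a later request that includes audio input
with_audio_mb = loader.resident_mb()
print(text_only_mb, with_audio_mb)
```

In this sketch a text-only request never pays for the vision or audio components, which mirrors the memory saving the article describes.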

Gemma 3n model: Availability

As part of the Gemma open model family, Gemma 3n is provided with accessible weights and licensed for commercial use, allowing developers to tune, adapt, and deploy it across a variety of applications. Gemma 3n is now available as a preview in Google AI Studio. 

Topics: Google, Gemini AI, AI Models, Technology

First Published: May 22 2025 | 5:12 PM IST
