Inside Apple's new AI models: How they work, where their training data comes from

Apple has shared updates on its on-device and server foundation language models, offering a deep dive into the architecture, training methods, and privacy safeguards behind its new AI models.

Apple Intelligence
Harsh Shivam New Delhi
Last Updated: Jul 22 2025 | 1:21 PM IST
At WWDC 2025, Apple unveiled upgraded AI models designed to power Apple Intelligence features across iPhones, iPads, and Macs. These include both on-device and cloud-based models. Apple has now published a detailed technical report titled "Apple Intelligence Foundation Language Models Tech Report 2025", outlining how these models work and where their training data comes from. Here’s a breakdown of the key points:

What are Apple’s new AI models?

Apple introduced two foundation models that support Apple Intelligence across apps and services. The first is a compact 3-billion-parameter model designed to run directly on Apple devices powered by Apple Silicon. The second is a more powerful server-based model hosted on Apple’s Private Cloud Compute (PCC) infrastructure.

Both models are multilingual and multimodal, meaning they understand multiple languages and can process different types of content, including text and images. Apple said that they were trained using large-scale datasets sourced through licensed content, web crawling, and synthetic data generation.

How do these new AI models work?

On-device model:

Apple split the on-device model into two blocks to reduce memory usage and speed up responses:
  • Block 1: Handles most of the processing.
  • Block 2: Skips some operations to save memory and boost speed.
This architecture helps the model respond faster without losing quality, making it more suitable for real-time features like text suggestions or summarisation.
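
The memory saving from the second block can be made concrete with a little arithmetic. The sketch below assumes, following Apple's published description of the model and not this article, that Block 2 reuses the key-value (KV) cache built by Block 1 and stores none of its own. The 5:3 depth ratio between the blocks is likewise an assumption taken from that description, and the function name is hypothetical.

```python
def kv_cache_saving(block1_layers: int, block2_layers: int) -> float:
    """Fraction of KV-cache memory saved if block 2 keeps no cache of its own.

    Assumes every layer's cache is the same size, so memory scales with the
    number of layers that store one.
    """
    if block1_layers <= 0 or block2_layers < 0:
        raise ValueError("block1_layers must be positive, block2_layers non-negative")
    total_layers = block1_layers + block2_layers
    return block2_layers / total_layers


# Assumed 5:3 depth ratio between the two blocks (not stated in the article).
saving = kv_cache_saving(block1_layers=5, block2_layers=3)
print(f"KV-cache memory saved: {saving:.1%}")  # prints "KV-cache memory saved: 37.5%"
```

Under these assumptions, three of every eight layers no longer store a cache, which is where the reduced memory use would come from.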

Cloud-based model:

The cloud-based model uses a more advanced architecture called Parallel-Track Mixture-of-Experts (PT-MoE). Instead of running every input through the full model, PT-MoE routes it to the most relevant "experts", specialised sub-networks within the model. As a loose illustration, if you ask it to plan a vacation, only the experts best suited to that request are activated.
 
This not only speeds things up but also makes the model more efficient. Apple also created a new kind of Transformer (a type of neural network) that processes multiple parts of a request in parallel, reducing bottlenecks and improving performance.
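
To make the routing idea concrete, here is a minimal sketch of generic top-k expert routing, the mechanism commonly used in Mixture-of-Experts models. It is not Apple's PT-MoE implementation: the four toy experts, the fixed router scores, and the function names are all hypothetical, and in a real model the experts are neural sub-networks chosen by a learned router.

```python
import math
from typing import Callable, List, Sequence, Tuple


def softmax(values: Sequence[float]) -> List[float]:
    """Turn raw scores into weights that sum to 1."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def route_top_k(router_scores: Sequence[float], k: int) -> List[Tuple[int, float]]:
    """Pick the k highest-scoring experts and normalise their weights."""
    ranked = sorted(range(len(router_scores)),
                    key=lambda i: router_scores[i], reverse=True)
    chosen = ranked[:k]
    weights = softmax([router_scores[i] for i in chosen])
    return list(zip(chosen, weights))


def moe_forward(x: float,
                experts: Sequence[Callable[[float], float]],
                router_scores: Sequence[float],
                k: int = 2) -> float:
    """Run only the selected experts and blend their outputs."""
    return sum(weight * experts[index](x)
               for index, weight in route_top_k(router_scores, k))


# Hypothetical toy experts; real ones are neural sub-networks.
experts = [lambda x: x + 1.0, lambda x: 2.0 * x, lambda x: -x, lambda x: x * x]
scores = [0.1, 2.0, -1.0, 1.5]  # stand-in for a learned router's output

selected = route_top_k(scores, k=2)
print([index for index, _ in selected])  # prints [1, 3]
print(moe_forward(3.0, experts, scores))
```

Only two of the four experts run for this input; the other two are never called, which is the source of the efficiency gain in this kind of design.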

What are the benefits of these new AI models?

One of the biggest improvements is expanded multilingual support. Apple increased the portion of training data in non-English languages from 8 per cent to 30 per cent. It also expanded the model’s vocabulary from 100,000 to 150,000 tokens.
 
This means Apple Intelligence can now better understand and respond in more languages, with improved fluency and accuracy. Apple said that it also tested this using prompts written by native speakers, to check that the models perform well across different cultures and languages. With this, features like Writing Tools should now work more reliably outside of English.

Apple has also opened access to its on-device model for third-party developers, allowing them to use AI-powered features like summarisation or rewriting directly within their apps, without sending data off the device.

Where did Apple source its data?

Apple said that it trained its AI models using a wide range of high-quality data, but that it does not use your private information or personal device activity.
 
Instead, Apple said it relied on the following main sources:
  • Licensed content from publishers
  • Publicly available and open-source data
  • Web content collected by Applebot, Apple’s own web crawler
Apple said that its web crawler respects “robots.txt” rules that let websites opt out of being used for training. Publishers can also limit which pages are accessible and still have their sites included in Apple services like Siri and Spotlight search.
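
As a rough illustration of how such an opt-out looks in practice, the sketch below uses Python's standard urllib.robotparser to check a made-up robots.txt. The `Applebot-Extended` token comes from Apple's public Applebot documentation, not from this article, and is assumed here to be the one that governs training use. The example.com URLs and the rules themselves are hypothetical, and Python's parser only approximates how any real crawler interprets the file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: opt out of training use, hide drafts from the
# crawler, and leave everything else open.
ROBOTS_TXT = """\
User-agent: Applebot-Extended
Disallow: /

User-agent: Applebot
Disallow: /drafts/

User-agent: *
Allow: /
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text without making a network request."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)
article = "https://example.com/articles/1"
draft = "https://example.com/drafts/1"

print(parser.can_fetch("Applebot", article))           # prints True
print(parser.can_fetch("Applebot", draft))             # prints False
print(parser.can_fetch("Applebot-Extended", article))  # prints False
```

In this setup the article stays visible to the crawler, so it could still surface in search features, while the site as a whole is marked as off limits for training.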

Text data:

A big part of Apple’s training data came from websites. The company said that Applebot crawled billions of web pages across different topics and languages and used smart tools to load full pages, interact with dynamic content, and extract useful information. To ensure quality, Apple used AI-based filtering instead of rigid rules, which helped retain more relevant content while avoiding low-quality or inappropriate data.

Image data:

To help its models understand visuals, Apple trained them on image-text pairs. It used licensed images, public images with captions, and AI-generated image descriptions. Apple also included visual materials like infographics, tables, and charts. For example, it used AI to create sample data, generate charts, and then produce related questions for training.
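
The chart example describes a three-step pipeline: create data, turn it into a chart, then write questions about it. A toy version might look like the sketch below. It is only an illustration of the pipeline's shape: Apple says it used AI for these steps, whereas this sketch substitutes random numbers and a fixed question template, and every name in it is hypothetical.

```python
import random
from typing import Dict, List


def make_chart_example(rng: random.Random) -> Dict[str, object]:
    """Build one synthetic (chart spec, question, answer) training example."""
    quarters: List[str] = ["Q1", "Q2", "Q3", "Q4"]
    # Distinct values so the question has exactly one correct answer.
    sales = rng.sample(range(10, 100), k=len(quarters))
    top_quarter = quarters[sales.index(max(sales))]
    return {
        # A renderer would turn this spec into the chart image.
        "chart": {"type": "bar", "x": quarters, "y": sales},
        "question": "Which quarter has the highest sales?",
        "answer": top_quarter,
    }


example = make_chart_example(random.Random(0))
print(example["question"], "->", example["answer"])
```

Because the answer is computed from the same data that defines the chart, each generated question comes with a label that is correct by construction.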
Topics: Apple, artificial intelligence, AI models

First Published: Jul 22 2025 | 1:20 PM IST
