At WWDC 2025, Apple unveiled upgraded AI models designed to power Apple Intelligence features across iPhones, iPads, and Macs. These include both on-device and cloud-based models. Now, Apple has published a detailed technical report titled "Apple Intelligence Foundation Language Models Tech Report 2025", outlining how these models work and where their training data comes from. Here’s a breakdown of the key points:
What are Apple’s new AI models?
Apple introduced two foundation models that support Apple Intelligence across apps and services. The first is a compact 3-billion-parameter model designed to run directly on Apple devices powered by Apple Silicon. The second is a more powerful server-based model hosted on Apple’s Private Cloud Compute (PCC) infrastructure.
Both models are multilingual and multimodal, meaning they understand multiple languages and can process different types of content, including text and images. Apple said that they were trained using large-scale datasets sourced through licensed content, web crawling, and synthetic data generation.
How do these new AI models work?
On-device model:
Starting with the on-device AI model, Apple split it into two blocks to reduce memory usage and speed up responses:
- Block 1: Handles most of the processing.
- Block 2: Skips some operations to save memory and boost speed.
This architecture helps the model respond faster without losing quality, making it more suitable for real-time features like text suggestions or summarisation.
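The description above is necessarily loose; one way to picture the split is a pipeline in which the second block reuses intermediate results the first block has already computed, rather than recomputing them. Here is a minimal Swift sketch of that idea; the cache-sharing detail and all the names are illustrative assumptions, not Apple's actual implementation:

```swift
// Illustrative sketch only: a toy two-block pipeline in which the second block
// reuses intermediate results cached by the first instead of recomputing them.
// Real transformer internals are far more involved; this only shows the idea.

struct SharedCache {
    var cachedProjection: [Double] = []   // stands in for cached attention state
}

func runBlock1(_ input: [Double], cache: inout SharedCache) -> [Double] {
    // Block 1 performs the full computation and stores reusable results.
    cache.cachedProjection = input.map { $0 * 2 }
    return cache.cachedProjection
}

func runBlock2(_ hidden: [Double], cache: SharedCache) -> [Double] {
    // Block 2 skips recomputing that state and reads it from the shared cache,
    // saving memory and time at inference.
    return zip(hidden, cache.cachedProjection).map { $0.0 + $0.1 }
}

var cache = SharedCache()
let hidden = runBlock1([0.1, 0.2, 0.3], cache: &cache)
print(runBlock2(hidden, cache: cache))
```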
Cloud-based model:
The cloud-based model uses a more advanced architecture called Parallel-Track Mixture-of-Experts (PT-MoE). Instead of processing every task with the full model, PT-MoE routes each task to the most relevant "experts"—specialised mini-models trained for specific content types. So, for example, if you ask it to plan a vacation, only the travel-related experts get activated.
This not only speeds things up but also makes the model more efficient. Apple also created a new kind of Transformer (a type of neural network) that processes multiple parts of a request in parallel, reducing bottlenecks and improving performance.
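For a feel of how that routing works, here is a minimal, generic mixture-of-experts sketch in Swift. It is not Apple's code, and PT-MoE layers parallel tracks on top of this basic mechanism; the expert functions and gate scores below are made up purely for illustration.

```swift
import Foundation

// Generic mixture-of-experts routing, illustrative only.
// A small "gate" scores each expert for the incoming request, and only the
// top-scoring experts actually run, so most of the network stays idle.

typealias Expert = ([Double]) -> [Double]

// Hypothetical experts; a real model learns these rather than hand-writing them.
let experts: [Expert] = [
    { input in input.map { $0 * 1.1 } },   // e.g. a "travel" expert
    { input in input.map { $0 * 0.9 } },   // e.g. a "coding" expert
    { input in input.map { $0 + 0.5 } },   // e.g. a "cooking" expert
    { input in input.map { $0 - 0.5 } }    // e.g. a "sports" expert
]

func softmax(_ scores: [Double]) -> [Double] {
    let exps = scores.map { exp($0) }
    let total = exps.reduce(0, +)
    return exps.map { $0 / total }
}

// Run the input through only the k most relevant experts and blend their
// outputs, weighted by how confident the gate is in each of them.
func mixtureOfExperts(_ input: [Double], gateScores: [Double], k: Int) -> [Double] {
    let weights = softmax(gateScores)
    let topK = weights.enumerated().sorted { $0.element > $1.element }.prefix(k)
    var output = [Double](repeating: 0, count: input.count)
    for (index, weight) in topK {
        let expertOutput = experts[index](input)
        for i in 0..<output.count {
            output[i] += weight * expertOutput[i]
        }
    }
    return output
}

// Only the two top-scoring experts contribute to this result.
print(mixtureOfExperts([0.2, 0.4, 0.6], gateScores: [2.0, 0.1, 0.3, -1.0], k: 2))
```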
What are the benefits of these new AI models?
One of the biggest improvements is expanded multilingual support. Apple increased the portion of training data in non-English languages from 8 per cent to 30 per cent. It also expanded the model’s vocabulary from 100,000 to 150,000 tokens.
This means Apple Intelligence can now better understand and respond in more languages, with improved fluency and accuracy. Apple said it also evaluated the models using prompts written by native speakers to make sure they perform well across different languages and cultural contexts. With this, features like Writing Tools should now work more reliably outside of English.
Apple has also opened up its on-device model to third-party developers, allowing them to build AI-powered features like summarisation or rewriting directly into their apps, without sending data off the device.
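Developers reach this through Apple's new Foundation Models framework in Swift. As a rough sketch of what calling the on-device model can look like (names here are based on Apple's WWDC 2025 announcement and may differ in shipping SDKs):

```swift
import FoundationModels

// Ask the on-device Apple Intelligence model to summarise a note.
// The request is handled locally, so the text never leaves the device.
func summarise(_ note: String) async throws -> String {
    let session = LanguageModelSession()
    let response = try await session.respond(to: "Summarise in one sentence: \(note)")
    return response.content
}
```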
Where did Apple source its data from?
Apple said that it trained its AI models using a wide range of high-quality data, but it does not use your private information or personal device activity.
Instead, Apple said it relied on the following main sources:
- Licensed content from publishers
- Publicly available and open-source data
- Web content collected by Applebot, Apple’s own web crawler
Apple said that its web crawler respects “robots.txt” rules that let websites opt out of having their content used for training. Publishers can also control which pages the crawler can access, and opting out of training does not stop their pages from appearing in Apple services like Siri and Spotlight search.
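In practice, Apple documents a separate user agent, Applebot-Extended, for the AI-training opt-out, which is how a site can stay out of model training while remaining visible to Siri and Spotlight. A robots.txt along these lines (illustrative, not taken from any real site) expresses that choice:

```
# Allow Applebot to index the site for Siri and Spotlight search
User-agent: Applebot
Allow: /

# Opt out of Apple's AI model training
User-agent: Applebot-Extended
Disallow: /
```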
Text data:
A big part of Apple’s training data came from websites. The company said that Applebot crawled billions of web pages across different topics and languages, using tooling that can load full pages, interact with dynamic content, and extract the useful text. To ensure quality, Apple used AI-based filtering instead of rigid rules, which helped retain more relevant content while avoiding low-quality or inappropriate data.
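As a rough illustration of the difference, a rule-based filter drops any page that trips a fixed pattern, while a model-based filter scores each page and keeps whatever clears a quality threshold. The scoring function below is a hypothetical stand-in, not Apple's actual pipeline:

```swift
import Foundation

// Illustrative contrast only: a rigid rule-based filter versus a model-based
// quality score. qualityScore() is a hypothetical stand-in for a learned
// classifier, not Apple's actual filtering system.

func ruleBasedKeep(_ page: String) -> Bool {
    // Hard rules: drop anything too short or containing a banned marker.
    return page.count > 200 && !page.contains("lorem ipsum")
}

func qualityScore(_ page: String) -> Double {
    // Placeholder for a learned model predicting how useful a page is.
    return Double(min(page.count, 1000)) / 1000.0
}

func modelBasedKeep(_ page: String, threshold: Double = 0.5) -> Bool {
    // A soft score lets borderline but genuinely useful pages survive,
    // which is the "retain more relevant content" effect described above.
    return qualityScore(page) >= threshold
}
```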
Image data:
To help its models understand visuals, Apple trained them on image-text pairs. It used licensed images, public images with captions, and AI-generated image descriptions. Apple also included visual materials like infographics, tables, and charts. For example, it used AI to create sample data, generate charts, and then produce related questions for training.
