At WWDC on Monday, Apple unveiled Apple Intelligence, a set of features that bring generative AI tools like rewriting an email draft, summarizing notifications and creating custom emoji to the iPhone, iPad and Mac. Apple spent a significant portion of its keynote explaining how useful the tools will be, and an almost equal amount of time assuring customers how private the new AI system keeps their data.
That privacy is possible thanks to a two-pronged approach to generative AI that Apple began explaining in its keynote and detailed further in papers and presentations afterward. They show that Apple Intelligence is built with an on-device philosophy that can quickly perform the common AI tasks users want, like transcribing phone calls and organizing their schedules. However, Apple Intelligence can also reach out to cloud servers for more complex AI requests that involve sending personal context data, and making sure both deliver good results while keeping user data private is where Apple concentrated its efforts.
The big news is that Apple is using its own in-house AI models for Apple Intelligence. Apple notes that it does not train its models on private data or user interactions, which is unique compared to other companies. Instead, Apple uses licensed material and publicly available online data scraped by the company’s Applebot web crawler. Publishers must opt out if they don’t want their data ingested by Apple, which sounds similar to policies from Google and OpenAI. Apple also says it filters out social security and credit card numbers that circulate online, and ignores “profanity and other low-quality content.”
A big selling point for Apple Intelligence is its deep integration into Apple’s operating systems and apps, and how the company optimizes its models for power efficiency and size to fit on the iPhone. Keeping AI requests local is key to allaying many privacy concerns, but the trade-off is the use of smaller, less capable models on the device.
To make those local models useful, Apple uses fine-tuning, which trains models to get better at specific tasks like proofreading or summarizing text. The skills are packaged as “adapters,” which can be laid onto the foundation model and swapped out for the task at hand, similar to applying power-up attributes to your character in a role-playing game. Similarly, Apple’s diffusion model for Image Playground and Genmoji also uses adapters to produce different art styles like illustration or animation (making people and pets look like off-brand Pixar characters).
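The adapter idea can be sketched in a few lines of Python. This is a toy illustration rather than Apple’s implementation: it assumes each adapter is a small low-rank update added to a frozen base layer, and the class, task names and numbers are all made up for the example.

```python
# Toy illustration of swappable adapters on a frozen base layer.
# Assumption: each adapter is a small low-rank update (down then up projection).
# AdaptedLayer and the task names are hypothetical, not Apple's API.

def matvec(matrix, vector):
    """Multiply a matrix (a list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


class AdaptedLayer:
    def __init__(self, base_weights):
        self.base_weights = base_weights  # frozen and shared by every task
        self.adapters = {}                # task name -> (down, up) matrices
        self.active = None                # no adapter applied by default

    def register(self, task, down, up):
        """Store a small task-specific adapter next to the base weights."""
        self.adapters[task] = (down, up)

    def use(self, task):
        """Swap in the adapter for a task, or pass None for the bare base model."""
        if task is not None and task not in self.adapters:
            raise KeyError(task)
        self.active = task

    def forward(self, x):
        out = matvec(self.base_weights, x)
        if self.active is not None:
            down, up = self.adapters[self.active]
            delta = matvec(up, matvec(down, x))  # low-rank correction
            out = [o + d for o, d in zip(out, delta)]
        return out


layer = AdaptedLayer([[1.0, 0.0], [0.0, 1.0]])
layer.register("summarize", down=[[1.0, 1.0]], up=[[0.5], [0.0]])
layer.register("proofread", down=[[1.0, -1.0]], up=[[0.0], [2.0]])

x = [2.0, 3.0]
base_out = layer.forward(x)        # base model only
layer.use("summarize")
summarize_out = layer.forward(x)   # same base weights, summarization adapter
layer.use("proofread")
proofread_out = layer.forward(x)   # same base weights, proofreading adapter
```

The base weights never change; only the small adapter is swapped, which is what would let one on-device model serve several tasks without storing a full copy per task.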
Apple says it has optimized its models to cut the time between sending a request and receiving a response, and uses techniques such as “speculative decoding,” “context pruning” and “group query attention” to take advantage of Apple Silicon’s Neural Engine. Chipmakers have only recently started adding neural cores (NPUs) to the die, which helps relieve CPU and GPU bandwidth when processing machine learning and AI algorithms. It’s part of the reason that only Macs and iPads with M-series chips, and only the iPhone 15 Pro and Pro Max, support Apple Intelligence.
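Of the techniques Apple names, speculative decoding is the easiest to show in miniature. The sketch below is a simplified greedy variant, not Apple’s code: a cheap “draft” model guesses several tokens ahead, and the expensive “target” model only confirms or corrects them. Both models here are stand-in arithmetic functions invented for the example, and real systems verify the guesses in one batched pass rather than one call per token.

```python
# Minimal greedy speculative decoding with toy stand-in models.
# target_next and draft_next are hypothetical placeholders, not real models.

def target_next(tokens):
    """Stand-in for the large, slow model."""
    return sum(tokens) % 7


def draft_next(tokens):
    """Stand-in for the small, fast model: usually agrees with the target."""
    guess = target_next(tokens)
    return (guess + 1) % 7 if len(tokens) % 3 == 0 else guess


def greedy_decode(prompt, n_new):
    """Baseline: call the target model once per generated token."""
    tokens = list(prompt)
    for _ in range(n_new):
        tokens.append(target_next(tokens))
    return tokens


def speculative_decode(prompt, n_new, k=4):
    tokens = list(prompt)
    goal = len(prompt) + n_new
    while len(tokens) < goal:
        # 1. The draft model cheaply proposes up to k tokens.
        proposal = []
        for _ in range(min(k, goal - len(tokens))):
            proposal.append(draft_next(tokens + proposal))
        # 2. The target model checks each proposed position.
        for i, guess in enumerate(proposal):
            correct = target_next(tokens + proposal[:i])
            if guess != correct:
                # Keep the agreed prefix, then take the target's own token.
                tokens.extend(proposal[:i])
                tokens.append(correct)
                break
        else:
            tokens.extend(proposal)  # every guess was accepted
    return tokens


baseline = greedy_decode([1, 2], 10)
speculative = speculative_decode([1, 2], 10)
```

The output is identical to decoding with the target model alone; the speedup in practice comes from the target model checking a run of guesses at once instead of producing tokens one by one.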
The approach is similar to what we’re seeing in the Windows world: Intel launched its 14th-generation Meteor Lake architecture featuring a chip with an NPU, and Qualcomm’s new Snapdragon X chips built for Microsoft’s Copilot Plus PCs have them, too. As a result, many AI features in Windows are limited to new devices that can run jobs locally on these chips.
According to Apple’s research, out of 750 tested responses for text summarization, Apple’s on-device AI (with the right adapter) produced results that human graders found more appealing than those of Microsoft’s Phi-3-mini model. It sounds like a big achievement, but most chatbot services today use much larger models in the cloud to achieve better results, and this is where Apple is trying to tread carefully on privacy. To compete with larger models, Apple is devising a seamless process that sends complex requests to cloud servers while also trying to prove to users that their data remains private.
If a user request needs a more capable AI model, Apple sends the request to its Private Cloud Compute (PCC) servers. PCC runs on its own operating system based on “the basics of iOS” and has its own machine learning stack that powers Apple Intelligence. According to Apple, PCC has its own Secure Boot and Secure Enclave to hold encryption keys that only work with the requesting device, and a Trusted Execution Monitor makes sure that only signed and verified code runs.
Apple says the user’s device establishes an end-to-end encrypted connection with a PCC cluster before sending the request. Apple says it can’t access data on the PCC since it’s stripped of server management tools, so there’s no remote shell. Apple also does not give PCC any persistent storage, so the requests and any personal context data pulled from Apple Intelligence’s Semantic Index are apparently deleted in the cloud afterward.
Each PCC build will have a virtual build that the public or researchers can inspect, and only signed builds that are registered as inspected will go into production.
One of the big open questions is exactly what types of requests will go to the cloud. When processing a request, Apple Intelligence has a step called Orchestration, where it decides whether to proceed on the device or use PCC. We still don’t know what exactly constitutes a complex enough request to trigger a cloud process, and we probably won’t know until Apple Intelligence becomes available in the fall.
There’s another way Apple is dealing with privacy concerns: by making it someone else’s problem. Apple’s revamped Siri can send some questions to ChatGPT in the cloud, but only with permission after you ask some really tough questions. This process shifts the privacy question into the hands of OpenAI, which has its own policies, and the user, who must agree to offload their query. In an interview with Marques Brownlee, Apple CEO Tim Cook said ChatGPT will be invoked for requests involving “world knowledge” that are “outside the scope of personal context.”
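Put together, the flow described so far looks roughly like the sketch below. It is purely illustrative: Apple has not published how Orchestration decides, so the two flags are taken as inputs rather than computed, and the function, enum and “declined” outcome are hypothetical names for the example.

```python
# Illustrative routing flow only; Apple's actual Orchestration logic is not public.
# route_request, Route and the DECLINED outcome are hypothetical.
from enum import Enum


class Route(Enum):
    ON_DEVICE = "on_device"
    PRIVATE_CLOUD = "private_cloud_compute"
    CHATGPT = "chatgpt"
    DECLINED = "declined"


def route_request(needs_larger_model, needs_world_knowledge, ask_user):
    """Pick a destination for a request.

    needs_larger_model / needs_world_knowledge: decided elsewhere (criteria unknown).
    ask_user: callable returning True if the user agrees to send the query to ChatGPT.
    """
    if needs_world_knowledge:
        # A third-party hand-off happens only with explicit permission.
        return Route.CHATGPT if ask_user() else Route.DECLINED
    if needs_larger_model:
        return Route.PRIVATE_CLOUD
    return Route.ON_DEVICE


simple = route_request(False, False, ask_user=lambda: True)
complex_personal = route_request(True, False, ask_user=lambda: True)
trivia_allowed = route_request(False, True, ask_user=lambda: True)
trivia_refused = route_request(False, True, ask_user=lambda: False)
```

The point of the sketch is the ordering: personal-context requests stay on Apple’s own hardware, while anything bound for ChatGPT passes through a consent check first.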
Apple’s approach of splitting Apple Intelligence between the device and the cloud isn’t entirely new. Google has a Gemini Nano model that can run locally on Android devices alongside its Pro and Flash models that process in the cloud. Meanwhile, Microsoft’s Copilot Plus PCs can process AI requests locally as the company continues to lean on its deal with OpenAI and also builds its in-house MAI-1 model. None of Apple’s rivals, however, have emphasized their privacy commitments as thoroughly.
Of course, this all looks great in staged demos and edited papers. However, the real test will come later this year when we see Apple Intelligence in action. We’ll have to see if Apple can strike that balance between quality AI experiences and privacy, and keep building on it in the coming years.