Thoughts on the WWDC 2024 keynote on Apple Intelligence

10th June 2024

Today’s WWDC keynote finally revealed Apple’s new set of AI features. The AI section (Apple are calling it Apple Intelligence) started over an hour into the keynote—this link jumps straight to that point in the archived YouTube livestream, or you can watch it embedded here:

There’s also a detailed Apple newsroom post: Introducing Apple Intelligence, the personal intelligence system that puts powerful generative models at the core of iPhone, iPad, and Mac.

There are a lot of interesting things here. Apple have a strong focus on privacy, finally taking advantage of the Neural Engine accelerator chips in the A17 Pro chip on iPhone 15 Pro and higher and the M1/M2/M3 Apple Silicon chips in Macs. They’re using these to run on-device models—I’ve not yet seen any information on which models they are running and how they were trained.

On-device models that can outsource to Apple’s servers

Most notable is their approach to features that don’t work with an on-device model. At 1h14m43s:

When you make a request, Apple Intelligence analyses whether it can be processed on device. If it needs greater computational capacity, it can draw on Private Cloud Compute, and send only the data that’s relevant to your task to be processed on Apple Silicon servers.

Your data is never stored or made accessible to Apple. It’s used exclusively to fulfill your request.

And just like your iPhone, independent experts can inspect the code that runs on the servers to verify this privacy promise.

In fact, Private Cloud Compute cryptographically ensures your iPhone, iPad, and Mac will refuse to talk to a server unless its software has been publicly logged for inspection.

There’s some fascinating computer science going on here! I’m looking forward to learning more about this—it sounds like the details will be public by design, since that’s key to the promise they are making here.

Update: Here are the details, and they are indeed extremely impressive—more of my notes here.

An ethical approach to AI generated images?

Their approach to generative images is notable in that they’re shipping an on-device model in a feature called Image Playground, with a very important limitation: it can only output images in one of three styles: sketch, illustration and animation.

This feels like a clever way to address some of the ethical objections people have to this specific category of AI tool:

If you can’t create photorealistic images, you can’t generate deepfakes or offensive photos of people
By having obvious visual styles you ensure that AI generated images are instantly recognizable as such, without watermarks or similar
Avoiding the ability to clone specific artist’s styles further helps sidestep ethical issues about plagiarism and copyright infringement

The social implications of this are interesting too. Will people be more likely to share AI-generated images if there are no awkward questions or doubts about how they were created, and will that help it more become socially acceptable to use them?

I’ve not seen anything on how these image models were trained. Given their limited styles it seems possible Apple used entirely ethically licensed training data, but I’d like to see more details on this.

App Intents and prompt injection

Siri will be able to both access data on your device and trigger actions based on your instructions.

This is the exact feature combination that’s most at risk from prompt injection attacks: what happens if someone sends you a text message that tricks Siri into forwarding a password reset email to them, and you ask for a summary of that message?

Security researchers will no doubt jump straight onto this as soon as the beta becomes available. I’m fascinated to learn what Apple have done to mitigate this risk.

Integration with ChatGPT

Rumors broke last week that Apple had signed a deal with OpenAI to use ChatGPT. That’s now been confirmed: here’s OpenAI’s partnership announcement:

Apple is integrating ChatGPT into experiences within iOS, iPadOS, and macOS, allowing users to access ChatGPT’s capabilities—including image and document understanding—without needing to jump between tools.

Siri can also tap into ChatGPT’s intelligence when helpful. Apple users are asked before any questions are sent to ChatGPT, along with any documents or photos, and Siri then presents the answer directly.

The keynote talks about that at 1h36m21s. Those prompts to confirm that the user wanted to share data with ChatGPT are very prominent in the demo!

Animated screenshot. User says to Siri: I have fresh salmon, lemons, tomatoes. Help me plan a 5-course meal with a dish for each taste bud. Siri shows a dialog Do you want me to use ChatGPT to do that? User clicks Use ChatGPT and gets a generated response.

This integration (with GPT-4o) will be free—and Apple don’t appear to be charging for their other server-side AI features either. I guess they expect the supporting hardware sales to more than cover the costs of running these models.

Posted 10th June 2024 at 8:19 pm · Follow me on Mastodon, Bluesky, Twitter or subscribe to my newsletter

Simon Willison’s Weblog