Detailed analysis of "Apple Intelligence": Apple's self-developed models, GPT-4o, or Google Gemini?
ChatGPT is only a small part of the story. Apple's AI strategy rests on a three-tier model architecture: small on-device models, self-developed cloud models, and external large models. The focus is on weaving AI capabilities seamlessly into the operating system and into users' everyday scenarios.
After lagging behind its peers for nearly two years, Apple finally unveiled AI features at this week's WWDC conference.
"Not competing on AI technology, relying on OpenAI's large models, AI features that lack novelty..." Apple's attempt to "redefine AI" has clearly not satisfied everyone. Notably, Apple barely mentioned "artificial intelligence" at the event, branding its offering "Apple Intelligence" instead.
The capital market, however, voted with its wallet. Apple reversed its recent decline on Wednesday, surging 7% overnight and adding roughly $200 billion in market value.
So what made the market change its attitude? What does "Apple Intelligence" really mean? Can Apple's AI strategy rely solely on OpenAI?
AI Integrated into Apple's Ecosystem
Unlike the large-model companies rushing to launch standalone chatbots, Apple has chosen to embed AI features into the apps and products users already use, folding them into daily usage scenarios.
As a result, Apple's AI capabilities reach far beyond what a standalone chatbot can offer: intelligent photo editing, note and message summarization, automatic transcription of voice memos, and more.
Apple uses its own in-house models for relatively simple AI functions and relies on more powerful external models, such as GPT, for more advanced and complex ones. Take the demonstrations of the new Siri and the writing tools as examples:
New Siri: Apple demonstrated how Siri can help fill out PDF forms and find the user's driver's license photo, extract the number from the license, and input it into the form. In another demonstration, Siri can search for recipes that friends have sent in messages and emails.
What truly enhances Siri's capabilities is ChatGPT. When asked to do something it doesn't know how to handle, such as planning a dinner menu based on a recent shopping list, Siri will, with the user's permission, call the GPT interface and ask ChatGPT for advice. Users do not need a ChatGPT Plus subscription to use this feature.
Writing Tools: Apple will add AI functions for summarizing, rewriting, and proofreading to apps such as Notes, Mail, and Pages. The integrated writing tools can also suggest replies in different tones for messages and emails.
However, more creative tasks, such as writing a poem about the iPhone, will be handed off to ChatGPT. As with Siri, the writing tools will ask for the user's consent before consulting ChatGPT.
Apple announced that Apple Intelligence will arrive with iOS 18, iPadOS 18, and macOS Sequoia this fall, though some of the more powerful AI features may not be available until 2025.
Three-tier Large Model Architecture Behind the Scenes
The internal and external models behind the features above fall into three tiers: small on-device models (Apple On-Device), self-developed cloud models (Apple Server), and external large models such as ChatGPT.
The first tier is Apple's on-device small model, with about 3 billion parameters, capable of running directly on terminal devices such as the iPhone.
As mentioned in a previous article, this is the result of balancing running speed against available compute. Apple's on-device models are tuned for different tasks based on users' personal habits and data, so they respond very quickly to everyday requests.
The second tier is Apple's self-developed larger language model, which runs on Apple-silicon servers via private cloud computing.
In terms of performance, although the specific parameter count of this model has not been disclosed, some analyses suggest it is comparable to GPT-4: in human-rated prompt evaluations, the larger Apple Server model outperforms GPT-3.5-Turbo and slightly trails GPT-4-Turbo. This model is also fine-tuned for Apple users' daily behaviors.
Apple emphasizes privacy as a top priority, stating that data is neither stored nor accessible to anyone else while its internal models run.
As mentioned in a previous article, these servers are equipped with security tools written in the Swift language. Apple AI "only sends the relevant data needed to complete the task" to these servers, without granting full access to contextual information on the device.
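The data-minimization idea described above can be illustrated with a small sketch: rather than shipping the whole device context to the server, each task declares the minimal fields it needs, and only those fields ever leave the device. This is purely an illustrative sketch in Python, not Apple's implementation; the context fields, task names, and registry are all invented for the example.

```python
# Hypothetical on-device context; only a task's declared fields leave the device.
DEVICE_CONTEXT = {
    "clipboard": "meet at 6pm",
    "location": "Cupertino",
    "contacts": ["Alice", "Bob"],
    "recent_photos": ["IMG_001.jpg"],
}

# Each task declares the minimal fields it needs (names are illustrative).
TASK_FIELDS = {
    "summarize_clipboard": ["clipboard"],
    "suggest_meeting_place": ["clipboard", "location"],
}

def build_payload(task: str) -> dict:
    """Return only the context fields the task declares, nothing more."""
    return {field: DEVICE_CONTEXT[field] for field in TASK_FIELDS[task]}

# Summarizing the clipboard sends the clipboard text and nothing else;
# contacts and photos stay on the device.
print(build_payload("summarize_clipboard"))
```

The design point is that the allowlist lives with the task definition, so a new feature cannot silently widen what gets uploaded.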
The third tier is the partnership with OpenAI and access to the GPT large models.
According to Apple's demonstration, the GPT interface is called only when a more complex AI function is needed and the user consents.
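The three tiers can be summarized as a simple routing policy: simple personal tasks stay on device, heavier tasks go to Apple's private cloud, and open-ended tasks that need broad world knowledge go to an external model only with explicit user consent. The sketch below is a conceptual illustration, not Apple's actual logic; the complexity heuristic, the thresholds, and the consent flag are all assumptions made for the example.

```python
from enum import Enum

class Tier(Enum):
    ON_DEVICE = "Apple On-Device (~3B parameters)"
    PRIVATE_CLOUD = "Apple Server (private cloud compute)"
    EXTERNAL = "External model (e.g. ChatGPT)"

def route_request(complexity: int, needs_world_knowledge: bool,
                  user_consents_to_external: bool) -> Tier:
    """Pick a model tier for a request (hypothetical heuristic).

    complexity: rough task difficulty on an invented 1-10 scale.
    needs_world_knowledge: True for open-ended/creative tasks.
    user_consents_to_external: the consent prompt shown in Apple's demo.
    """
    if needs_world_knowledge:
        if user_consents_to_external:
            return Tier.EXTERNAL
        # Without consent, stay inside Apple's own stack.
        return Tier.PRIVATE_CLOUD
    if complexity <= 3:
        return Tier.ON_DEVICE
    return Tier.PRIVATE_CLOUD

# Summarizing a note is simple and personal, so it stays on device.
print(route_request(complexity=2, needs_world_knowledge=False,
                    user_consents_to_external=False).name)
```

The key property this sketch captures is that the external tier is never reached implicitly: consent is checked before any handoff to ChatGPT, matching the behavior Apple demonstrated.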
Moreover, OpenAI may not remain Apple's only external partner. Apple's Senior Vice President of Software Engineering, Craig Federighi, said:
Apple plans to let users choose their preferred large models in the future, including Google's Gemini.