Robin Li returns to center stage
Facing GPT-4 head-on.
Author: Zhou Zhiyu
Editor: Zhang Xiaoling
As a member of the old BAT (Baidu, Alibaba, Tencent), Baidu has been relatively low-key for a long time. However, this year, with the help of AI, Baidu has made a comeback, and its founder, Robin Li, has returned to the center stage of the technology industry. What is even more eye-catching than his handsome appearance is his newfound confidence.
In March, when Baidu announced testing of its Wenxin AI model, Robin Li showed it only through a demo. This time, he no longer appeared hesitant and instead exuded confidence.
At the Baidu World Conference on October 17th, Robin Li unveiled the Wenxin AI model 4.0 and boldly claimed that its comprehensive capabilities are on par with GPT-4.
In addition, Robin Li introduced more than ten AI-native applications, including new search and map features. He also shared his thoughts on how to find more commercial paths for large-scale AI models.
For the past decade, Baidu has invested billions in AI, and Robin Li and Baidu have always faced skepticism from investors about whether such investments are worthwhile. Robin Li compared Baidu's exploration of AI commercialization to groping in the dark. Now, a glimmer of light has appeared, and he is accelerating towards the light ahead.
Reconstruction
At the end of last year's Baidu all-hands meeting, when asked about his views on the popular ChatGPT, Baidu's founder, Robin Li, was both excited and somewhat uncertain.
He was excited because he believed that ChatGPT represented a new opportunity for AI technology after reaching a certain level of development. However, he was uncertain because "this opportunity is not yet clear."
Now, ten months later, that uncertainty seems to have been resolved.
Robin Li is no longer hiding his excitement.
During an hour-long live demonstration, Robin Li, as the main presenter on stage, showcased the capabilities of the Wenxin AI model 4.0 in understanding, generation, logic, and memory. He also introduced new applications such as search, document library, and maps. Through concrete examples, the audience saw firsthand what these applications can do once they have been rebuilt on large models.
"I want to buy a house in Chengde. Can I use my housing provident fund for a loan? What are the procedures? I work in Beijing." Although the question is loosely ordered and somewhat vague, Wenxin AI understood and answered it accurately by reorganizing its pieces of information according to context.
During the demonstration, Robin Li explained that although this sentence seems simple, it actually contains several small traps. The ultimate result is that Wenxin AI can accurately understand the implied meaning behind a sentence.
Accurate understanding of natural language instructions is a challenge in applying large Chinese-language models. Previously, Shang Guobin, Vice President of Baidu, demonstrated Baidu Maps as rebuilt with large models at the Wall Street Journal event. He emphasized that for applications like maps, accuracy is crucial: if accuracy cannot be guaranteed, users will not use the product.
For example, rebuilding the map as an AI-native application requires not only accurate route and navigation information but also the ability to understand what the user says and to offer appropriate solutions during the interaction. This means that Baidu Maps needs to be rebuilt with large models on top of its map knowledge graph system, which further raises the requirements for the models' comprehensive capabilities. Take the earlier example: someone from Chengde who works in Beijing and wants to buy a house back home could ask on Baidu Maps and get information about the relevant government agencies, their addresses, working hours, contact numbers, and even directions. They could even book a ride-hailing service or get suggestions for flights and hotels. This ability to call on thousands of plugins from a single sentence greatly improves the user's interaction and efficiency when using the application.
Robin Li emphasized that AI-native applications built on the understanding, generation, logic, and memory capabilities of large models will become more powerful and more widely used, leading to increased productivity and efficiency. This is a capability that past applications did not possess.
Lu Yi, Chief Analyst of Guojin Securities Media and Internet, believes that in the era of large models, the entry points for user traffic may change. As a high-demand industry with high barriers to entry, the map industry may further aggregate travel-related services and move deeper into the travel value chain.
Even outside the realm of information, the physical world will also be reshaped as large model applications are implemented. Robin Li believes that future AI-native applications will definitely be multimodal, and that autonomous driving is a typical case of large models reshaping the physical world. Large models will enable Baidu's autonomous driving capabilities to surpass experience-based systems, allowing for smarter handling of complex scenarios and broader spatiotemporal coverage.
In addition, Robin Li believes that a large number of AI-native applications will continue to emerge, and that digital technology will deeply integrate with the real economy. Large model technology has already been applied in manufacturing, energy, power, chemical, transportation, and other physical industries.
An era of AI-native applications is upon us.
Turning Point
Among the tech giants in the worldwide race to build large models, Baidu is the first to state publicly that its large model can compete directly with GPT-4, rather than merely describing a plan on paper.
Baidu and Robin Li, who have hit the jackpot, intend to demonstrate that they can compete with OpenAI on the world stage.
Over the past seven months, Baidu has indeed made rapid progress and done a great deal of work to catch up with GPT-4. Wang Haifeng, Baidu's Chief Technology Officer, revealed that the Wenxin AI model 4.0 entered small-scale testing in September, and that its comprehensive capabilities have improved by nearly 30% in the past month, with significant gains in logic and memory.
Whether in launching Baidu GBI, the first generative business intelligence product in China, at the conference, or in saving autonomous driving for a surprise announcement, Robin Li's thinking on the commercialization path of large models and AI-native applications has become clearer.
Robin Li's bet on AI, which was once seen as a beautiful vision of the future, has now become more practical. Institutions such as Morgan Stanley and Daiwa Securities see the implementation and commercialization of large models as key catalysts for Baidu's future performance growth.
Of course, the business model for large models is still in its early stages, and Baidu's management acknowledged on its earnings call that significant revenue contributions will take time. However, recent signs indicate that Baidu's AI turning point is not far away.
The recent success of the new M7 in the automotive industry also indicates that the second half of the intelligent electric vehicle era is accelerating. Baidu, which has been immersed in the field of autonomous driving for many years, can also benefit from this.
Sultan, General Manager of the Intelligent Automotive Business Unit of Baidu's Intelligent Driving Group (IDG), pointed out that the intelligent automotive market is reaching a turning point, and the penetration rate of intelligent driving may far exceed industry expectations.
He cited report data showing that intelligent driving accounted for only 1.2% of the factors buyers valued before purchasing a car in 2021, but that the figure had reached 10.3% by 2023 and is expected to exceed 30% by 2025.
This also reflects how AI takes hold at the application layer: after early difficulties, once a path to commercialization is gradually found, it enters a period of exponential growth.
This is similar to the predictions of institutions such as Sequoia Capital. Sequoia pointed out that large models are entering a second stage, in which the early hype will be replaced by true value and complete product experiences.
After the frenzy of "big model wars" and "chaotic model dances" subsides, only those who can find a path to commercialization on the application side can claim to have entered the gate of the big model battle.
Cutting-edge technological innovations such as large models also need scenarios with a clear return on investment. Otherwise, they will end up like AR/MR, the metaverse, and other technologies, with nothing left once the hype fades.
Technology companies including Baidu, Alibaba, and Huawei have all joined the race, trying to find solutions. Robin Li, who has been sprinting along the large model track this year, has also shown the outside world that Baidu's billion-dollar bet on AI over the past decade was correct.
At the meeting, Robin Li confidently stated, "We are about to enter an AI-native era, an era where humans and machines interact through prompts."
As large models gradually take hold at the application layer, they will have a real impact on people's daily lives. Large models will enter the battle of applications and, just as the iPhone changed the mobile internet ecosystem, reshape how people live. This prospect is undoubtedly exciting.