Bank of America AI Deep Dive Report: Where are the computing power opportunities in the AI era?
Bank of America notes that the computing power required to train AI models increases 275-fold every two years. The next generation of computers will include high-performance computing, edge computing, spatial computing, quantum computing, and biological computing.
Author: Li Xiaoyin
Source: Hard AI
In the post-Moore's Law era of exponential data growth, AI technology built on massive computing power is thriving, and demand for that computing power keeps climbing.
As the costs of AI training and inference continue to rise, the parameter counts of large language models (LLMs) have grown from 94 million in 2018 to 175 billion in the commercially available GPT-3, and GPT-4 is expected to exceed 1 trillion. The data show that the computing power required to train an AI model is expected to increase 275-fold every two years.
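A quick back-of-the-envelope check makes these growth figures concrete. The parameter counts and the 275x-per-2-years claim come from the report; the arithmetic below is purely illustrative.

```python
# Sanity-checking the growth figures quoted above. The model sizes and
# the 275x-every-2-years rate come from the article; the arithmetic is
# illustrative only.
params_2018 = 94e6    # 94 million parameters (2018-era LLM)
params_gpt3 = 175e9   # 175 billion parameters (GPT-3)

# Parameter counts grew roughly 1,860x between 2018 and GPT-3.
param_growth = params_gpt3 / params_2018

# A 275x jump every 2 years implies an annual factor of 275**0.5, ~16.6x.
annual_compute_factor = 275 ** 0.5

print(f"parameter growth: {param_growth:,.0f}x")
print(f"implied annual compute growth: {annual_compute_factor:.1f}x")
```

At roughly 16.6x per year, training compute outpaces even the historical Moore's Law cadence of about 2x every two years by a wide margin.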
Advances in data-processing technology have driven the evolution of computers, but traditional processing units and large computing clusters cannot break through the limits of computational complexity. Even as Moore's Law continues to advance, it cannot keep pace with the demand for faster, more powerful computing.
NVIDIA CEO Jensen Huang has gone so far as to declare that Moore's Law is dead. With computing power continuously pushing its boundaries, where do the opportunities for AI lie?
In a deep-dive report released on March 21, Bank of America Merrill Lynch argued that the next generation of computers will include: High-Performance Computing (HPC), Edge Computing, Spatial Computing, Quantum Computing, and Biological Computing.
High-Performance Computing (HPC)
High-Performance Computing refers to computing systems that use supercomputers and parallel computing clusters to solve advanced computing problems.
The report states that high-performance computing systems are typically over 1 million times faster than the fastest desktop, laptop, or server systems, and have wide applications in established and emerging fields such as autonomous driving, the Internet of Things, and precision agriculture.
Bank of America believes the growth of high-performance computing creates room for accelerators in hyperscale systems (including those running LLMs).
Although high-performance computing accounts for only a small portion (about 5%) of the total addressable market (TAM) for data centers, it tends to be a leading indicator for cloud and enterprise applications.
In particular, as LLMs' demand for computing power keeps rising, 19 of the 48 new systems on the latest TOP500 supercomputer list use accelerators, an attach rate of roughly 40%. The TOP500 data also suggest room for accelerator adoption to grow in hyperscale service systems, where only about 10% of servers are currently accelerated.
Bank of America points to another trend: with the help of coprocessors (auxiliary processors that take over tasks a CPU cannot perform, or performs inefficiently), computing will increasingly shift from serial to parallel.
The maturing of Moore's Law and serial computing is pushing more workloads toward parallel computing, carried out on discrete coprocessors/accelerators such as GPUs, application-specific integrated circuits (ASICs), and field-programmable gate arrays (FPGAs).
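The serial-to-parallel shift can be sketched with a toy example: the main thread hands independent work items to a pool of workers, much as a CPU offloads kernels to a GPU/ASIC/FPGA accelerator. This is only an analogy in standard-library Python, not an actual accelerator offload.

```python
# Illustrative sketch of serial vs. parallel execution of independent
# work items. A thread pool stands in for a coprocessor/accelerator;
# a real offload would go through CUDA, OpenCL, or a vendor SDK.
from concurrent.futures import ThreadPoolExecutor

def kernel(task_id: int) -> int:
    # Stand-in for an independent, compute-heavy work item.
    return sum(i * i for i in range(1_000))

def run_serial(n_tasks: int) -> list:
    # One task after another on a single thread (serial computing).
    return [kernel(t) for t in range(n_tasks)]

def run_parallel(n_tasks: int) -> list:
    # Tasks dispatched to a worker pool (parallel computing).
    with ThreadPoolExecutor(max_workers=4) as pool:
        return list(pool.map(kernel, range(n_tasks)))

# Both paths compute identical results; only the execution model differs.
assert run_serial(8) == run_parallel(8)
```

The point the report makes is exactly this separation: when work items are independent, throughput scales with the number of parallel execution units rather than with single-core clock speed.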
As of November 2023, 186 machines on the TOP500 supercomputer list used coprocessors, up from 137 systems five years earlier. Coprocessor/accelerator usage on the TOP500 is stable quarter-over-quarter and up about 5% year-over-year, and the list's total computing performance has risen to 7.0 exaflops, a 45% year-over-year increase.
Spatial Computing
Spatial computing refers to integrating the user's graphical interface into the real physical world through AR/VR technology, changing how humans and computers interact.
In fact, we are approaching a turning point in human-computer interaction: moving from the traditional keyboard-and-mouse setup to touch gestures, conversational AI, and visually enhanced interactions powered by edge computing.
Bank of America believes that, following PCs and smartphones, spatial computing has the potential to drive the next wave of disruptive change: making technology part of our daily behavior and connecting our physical and digital lives with real-time data and communication.
For example, Apple's Vision Pro has taken a crucial step.
Edge Computing
Compared with cloud computing, edge computing processes data physically closer to end devices, with advantages in latency, bandwidth, autonomy, and privacy. According to research firm Omdia, the "edge" refers to locations with a round-trip time of at most 20 milliseconds to end users.
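Omdia's 20 ms round-trip threshold is easy to express as a simple classification rule. The site names and RTT figures below are hypothetical, for illustration only.

```python
# Classifying locations under Omdia's working definition quoted above:
# a site counts as "edge" if round-trip time (RTT) to end users is at
# most 20 ms. Site names and RTT values are made up for illustration.
EDGE_RTT_MS = 20.0

sites = {
    "on-premises gateway": 2.0,      # RTT to end users, in ms
    "metro point of presence": 12.0,
    "regional data center": 35.0,
    "hyperscale cloud region": 80.0,
}

edge_sites = [name for name, rtt in sites.items() if rtt <= EDGE_RTT_MS]
print(edge_sites)  # only the first two locations qualify as "edge"
```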
Bank of America states that many companies are investing in edge computing and edge locations (from internal IT and OT to external, remote sites) to be closer to end users and where data is generated.
Tech giants such as Facebook, Amazon, Microsoft, Google, and Apple are all investing in edge computing, and the returns on these investments are expected to drive their stock performance over the next five years.
By 2025, it is estimated that 75% of enterprise-generated data will be created and processed at the edge.
According to research firm IDC, the market size of edge computing is expected to reach $404 billion by 2028, with a compound annual growth rate of 15% from 2022 to 2028.
The edge computing market is expected to develop along the following trajectory from 2022 to 2025:
Phase One (2022): Use Cases - highly customized; Phase Two (2023): Vertical - vertical suites/packages; Phase Three (2024): Horizontal - cross-vertical technologies; Phase Four (2025): IT Strategy - vertical strategy.
Looking ahead, Bank of America believes the opportunity for AI lies in inference, and for edge inference the CPU will be the best choice.
Unlike training, which runs in core data centers, inference requires a distributed, scalable, low-latency, low-cost model, which is exactly what edge computing provides. The industry currently diverges on whether to use CPUs or GPUs for edge inference; although all major vendors support both, the report argues the CPU is the best choice for edge inference.
Under the GPU model, only 6-8 requests can be processed at a time, whereas CPUs can partition servers by user, making them the more efficient processing system at the edge. CPUs also offer cost efficiency, scalability, and flexibility, allowing edge computing providers to layer proprietary software on top of the computation.
Fog Computing
In the field of edge computing, there is also a related concept: Fog Computing.
Fog computing is a network architecture that uses end devices to perform substantial amounts of edge computing, storage, communication, and data transfer locally.
Bank of America believes that fog computing and cloud computing are complementary, and in the future, a hybrid/multi-cloud deployment format may emerge.
As applications migrate to the cloud, hybrid/multi-cloud approaches are being deployed. Cloud computing and edge computing are complementary, and adopting a distributed approach can create value by addressing different needs in different ways.
An IDC survey shows that 42% of enterprise respondents struggle to design and implement key components, including infrastructure, connectivity, management, and security. In the long run, combining data aggregation and analysis at the edge with the scale of the cloud (for analytics and model training) will create a new economy built on digital edge interactions.
Quantum Computing
Quantum computing refers to using subatomic particles to store information and exploiting superposition to perform complex calculations.
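Superposition, the property underpinning this definition, can be illustrated with a few lines of plain Python: a single qubit is a pair of complex amplitudes, and measuring it yields 0 or 1 with probability equal to the squared magnitude of each amplitude. This is a toy model, not a real quantum SDK.

```python
# Toy illustration of superposition: a qubit state is a pair of
# amplitudes (alpha, beta), and measurement gives 0 or 1 with
# probability |alpha|^2 and |beta|^2 respectively.
import math

# Equal superposition of |0> and |1> (what a Hadamard gate produces
# from |0>): both outcomes are equally likely.
alpha = 1 / math.sqrt(2)  # amplitude of |0>
beta = 1 / math.sqrt(2)   # amplitude of |1>

p0 = abs(alpha) ** 2  # probability of measuring 0
p1 = abs(beta) ** 2   # probability of measuring 1

# A valid quantum state is normalized: probabilities sum to 1.
assert math.isclose(p0 + p1, 1.0)
```

Until it is measured, the qubit carries both amplitudes at once; n qubits carry 2**n amplitudes, which is the source of the parallelism the report describes.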
Bank of America believes the importance of quantum computing lies in its unique ability to solve problems that classical computers cannot, known as "quantum supremacy." However, the commercialization of quantum computing is still in its early stages.
Quantum computing can solve, almost instantaneously, problems that would take classical computers billions of years. We are at a very early stage of adoption, with only a few machines deployed in the cloud for commercial use, mainly for research. Still, commercialization is advancing rapidly. Bank of America believes quantum computing has broken the boundaries of computation, and that the combination of AI and quantum computing can fundamentally change the physical and mathematical worlds.
In the short to medium term, the life sciences, chemistry, materials, finance, and logistics industries will benefit most. In the long term, when AI reaches human-level cognition or even self-awareness, artificial general intelligence (AGI) will bring a fundamental technological transformation.
The report points out that quantum computers are not suitable for routine tasks such as using the internet, office tasks, or emails, but are suitable for complex big data calculations such as blockchain, machine learning, deep learning, or nuclear simulation. The combination of quantum computing and 6G mobile networks will change the rules of various industries.
Big Data Analysis: The untapped potential of big data is enormous, with the volume of data created annually expected to grow from 120 ZB in 2022 to 183 ZB by 2024.
IDC data show that, due to computational bottlenecks, we currently store, transmit, and use only 1% of global data. Quantum computing can change this and unleash real economic value: potentially putting 24% of global data to use and doubling global GDP.
Cybersecurity: With the parallel processing power of up to 1 trillion calculations per second (Bernard Marr), quantum computing can technically challenge all current encryption methods, including blockchain. This also opens the door to new encryption technologies based on quantum computing elements.
Artificial Intelligence and Machine Learning: The progress of machine learning and deep learning is limited by the speed of underlying data computation. Quantum computers can accelerate machine learning capabilities by using more data to solve complex data point connections faster.
Cloud: This may be one of the winners, as the cloud may be the platform where all data creation, sharing, and storage occur. Once the commercialization of quantum computing begins, cloud access will be needed, and data generation should grow exponentially. Therefore, cloud platforms will be the solution.
Autonomous Vehicle Fleet Management: A single connected autonomous vehicle generates as much data as 3,000 internet users; with two vehicles, the figure jumps to roughly 8,000-9,000 users' worth. Data growth from autonomous vehicles alone will therefore be exponential.
Brain-Computer Interface
A brain-computer interface (BCI) enables direct interaction between the brain's electrical activity, in humans or animals, and the external world.
Bank of America Merrill Lynch notes that startups such as Neuralink are researching human-machine collaboration through BCIs. Brain-wave-controlled devices have been demonstrated in animal experiments, and early human clinical trials are ongoing.
Both brain-computer interface (BCI) and computer-brain interface (CBI) technologies are under development, with working examples of controlling hand movements through thought.
Synchron's approach places a mesh tube fitted with sensors and electrodes inside a blood vessel supplying the brain, where it can pick up signals from neurons. The signals are transmitted to an external unit, translated, and relayed to computers. In clinical trials, paralyzed individuals have been able to send text messages and emails, bank online, and shop.
Neuralink's implants include neural threads, inserted into the brain by a neurosurgical robot to pick up neural signals. Clinical patients can now move a computer mouse just by thinking.