Amazon is stepping up its efforts to defend its position in the cloud market by upgrading its self-developed AI chips, launching the Q chatbot, and rolling out new services powered by the latest generation of NVIDIA's superchips.
Amazon is working to defend its leading position in cloud computing. On one hand, it is upgrading its self-developed cloud chips and launching its own GPT-style AI chatbot; on the other, it is deepening its cooperation with NVIDIA, launching new services based on NVIDIA chips and jointly developing supercomputers with the company.
Dave Brown, Vice President of Amazon, said that by focusing its chip designs on the workloads that matter most to customers, Amazon can offer them the most advanced cloud infrastructure. The newly launched Graviton4 is the fourth generation of the chip line in five years. With interest in generative AI surging, the second-generation AI chip, Trainium2, will help customers train machine learning models faster, at lower cost, and with better energy efficiency.
Graviton4 offers up to 30% improvement in computing performance compared to the previous generation
On Tuesday, November 28, Amazon's cloud computing business, Amazon Web Services (AWS), announced a new generation of self-developed chips. The general-purpose chip Graviton4 delivers up to 30% better compute performance than the previous generation, with 50% more cores and 75% more memory bandwidth, offering the best price-performance and energy efficiency on Amazon's cloud server service, Amazon Elastic Compute Cloud (EC2).
Graviton4 also enhances security by fully encrypting all high-speed physical hardware interfaces. Amazon stated that Graviton4 will be applied to memory-optimized Amazon EC2 R8g instances, allowing customers to improve the execution of high-performance databases, memory caches, and big data analytics workloads. R8g instances offer larger instance sizes, with up to three times more vCPUs and three times more memory than the previous R7g instances.
Instances powered by Graviton4 will become available in the coming months. Amazon stated that since the launch of the Graviton project about five years ago, it has produced over 2 million Graviton processors, and the top 100 users of Amazon EC2 have all chosen to use Graviton.
Trainium2 trains trillion-parameter models up to four times faster
Amazon's other new product is the next-generation AI chip, Trainium2, which is four times faster than its predecessor, Trainium1. It can be deployed in EC2 UltraClusters of up to 100,000 chips, letting users train foundation models (FM) and large language models (LLM) with trillions of parameters in a short period of time. Trainium2 also offers up to twice the energy efficiency of the previous generation. Trainium2 will power Amazon EC2 Trn2 instances, each containing 16 Trainium2 chips. Trn2 instances are designed to help customers scale up to 100,000 Trainium2 chips in next-generation EC2 UltraClusters, connected via the Amazon Elastic Fabric Adapter (EFA) petabit-scale network to deliver up to 65 exaflops of computing power.
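The cluster figures above imply a per-chip number that is easy to sanity-check. The sketch below is purely arithmetic on the quoted cluster size and aggregate compute; the derived per-chip and per-instance figures are illustrative, not numbers from AWS's spec sheets.

```python
# Rough arithmetic check on the quoted EC2 UltraCluster figures (illustrative only).
total_flops = 65e18      # 65 exaflops quoted for a full next-gen UltraCluster
num_chips = 100_000      # up to 100,000 Trainium2 chips per UltraCluster
chips_per_trn2 = 16      # 16 Trainium2 chips per Trn2 instance

per_chip = total_flops / num_chips            # implied throughput per chip
per_instance = per_chip * chips_per_trn2      # implied throughput per Trn2 instance

print(f"~{per_chip / 1e12:.0f} teraflops per Trainium2 chip")
print(f"~{per_instance / 1e15:.1f} petaflops per Trn2 instance")
```

This works out to roughly 650 teraflops per chip and about 10 petaflops per 16-chip instance, assuming the 65-exaflop figure refers to the fully built-out 100,000-chip cluster.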
Amazon stated that Trainium2 will be used to support new services starting next year.
AWS becomes the first major customer of the upgraded Grace Hopper, as DGX Cloud adopts GH200 NVL32 and the two companies plan the fastest GPU-driven AI supercomputer
In addition to its self-developed chips, Amazon also announced during the annual re:Invent conference on Tuesday that it is expanding its strategic partnership with NVIDIA to provide state-of-the-art infrastructure, software, and services to empower customers' generative AI innovations.
Amazon will become the first cloud service provider to offer NVIDIA's GH200 Grace Hopper superchips with the new multi-node NVLink technology, making Amazon the first major customer of the upgraded Grace Hopper.
NVIDIA's GH200 NVL32 multi-node platform connects 32 Grace Hopper superchips with NVLink and NVSwitch technologies into a single instance. The platform will be available in Amazon EC2 instances connected to the Amazon Elastic Fabric Adapter (EFA) and supported by advanced virtualization (Amazon Nitro System) and hyperscale clustering (Amazon EC2 UltraClusters), enabling Amazon and NVIDIA's joint customers to scale deployments to thousands of GH200 superchips.
NVIDIA and Amazon will jointly host NVIDIA's AI training-as-a-service DGX Cloud on Amazon, which will be the first DGX Cloud to adopt GH200 NVL32, providing developers with the largest shared memory in a single instance. Amazon's DGX Cloud will enhance the training of cutting-edge generative AI and large language models with over 10 trillion parameters.
NVIDIA and Amazon are also collaborating on Project Ceiba to design the world's fastest GPU-driven AI supercomputer: a large-scale system powered by GH200 NVL32 and interconnected via Amazon EFA, comprising 16,384 GH200 superchips and capable of 65 exaflops of AI processing power. NVIDIA will use the system for its own research and development to drive the next wave of generative AI innovation.
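The same back-of-the-envelope arithmetic applies to Ceiba. The per-chip figure below is derived only from the two numbers quoted in this article, not from NVIDIA's datasheets.

```python
# Implied per-chip AI throughput of the Ceiba design (illustrative arithmetic only).
total_flops = 65e18   # 65 exaflops of AI processing power quoted for Ceiba
num_chips = 16_384    # GH200 superchips in the system

per_chip = total_flops / num_chips
print(f"~{per_chip / 1e15:.2f} petaflops per GH200 superchip")
```

That comes to roughly 4 petaflops per superchip, which is in the range one would expect for low-precision AI throughput on this class of hardware.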
Amazon Q, a chatbot for enterprise customers, launches in preview to help develop apps on Amazon
Beyond chips and cloud services, Amazon has also launched a preview of its own AI chatbot, Amazon Q. It is a new type of digital assistant powered by generative AI that can work within an enterprise customer's business, helping users search for information, write code, and review business metrics.
Q has been partially trained on Amazon's internal code and documentation, and is available to developers on Amazon's cloud.
Developers can use Q to create apps on Amazon, research best practices, fix errors, and get help writing new features for their apps. Through Q's conversational question-and-answer interface, users can learn new topics and understand how to build apps on Amazon without diverting their attention from the Amazon console.
Q will also be added to Amazon's business intelligence software, call center service, and logistics management programs. Amazon says customers can customize Q with their company's data or personal profiles.
The conversational question-and-answer function of Q is currently available in preview in all regions where Amazon provides enterprise services.