NVIDIA's GTC AI conference opens: the most powerful AI chip, Blackwell, is here!
NVIDIA stated that Blackwell reduces cost and energy consumption by up to 25 times compared to the previous generation, making it the most powerful chip in the world. It consists of 208 billion transistors, is manufactured on TSMC's 4nm process, and supports AI training and real-time large language model (LLM) inference for models scaling up to 10 trillion parameters. The inference performance of GB200 NVL72 is up to 30 times that of the H100. Amazon, Microsoft, Google, and Oracle are among the first cloud service providers to support Blackwell. NVIDIA also launched Project GR00T, an AI project to empower humanoid robots; TSMC and Synopsys will adopt NVIDIA's computational lithography technology; and NVIDIA introduced new software, NIM, to make it easier for users to run AI inference on their existing NVIDIA GPUs.
Author: Li Dan
Source: Hard AI
NVIDIA's 2024 GTC AI Conference, touted as this year's top artificial intelligence (AI) developer event worldwide, opened on Monday, March 18, Eastern Time.
This year's GTC marks the event's return to an in-person format after five years, and it was widely seen as the AI event where NVIDIA would "bring out some real heavyweights."
On Monday afternoon local time, NVIDIA founder and CEO Jensen Huang delivered the keynote at the SAP Center in San Jose, California, at an event billed as the "#1 AI Conference for Developers."
Blackwell: Up to 25 Times Better Cost and Energy Consumption Than the Previous Generation, the World's Most Powerful Chip on TSMC's 4nm Process
Jensen Huang introduced a new generation of chips and software for running AI models. NVIDIA officially launched a new generation of AI graphics processing units (GPUs) called Blackwell, expected to ship later this year.
The Blackwell platform can build and run real-time generative AI on trillion-parameter large language models (LLMs) at up to 25 times lower cost and energy consumption than its predecessor.
NVIDIA stated that Blackwell has six revolutionary technologies that together support AI training and real-time LLM inference for models scaling up to 10 trillion parameters:
- World's most powerful chip: The Blackwell-architecture GPU consists of 208 billion transistors and is manufactured using a custom TSMC 4-nanometer (nm) process; two reticle-limit GPU dies are connected by a 10 TB/s chip-to-chip link to form a single unified GPU.
- Second-generation Transformer Engine: By combining Blackwell Tensor Core technology with the advanced dynamic-range management algorithms in the TensorRT-LLM and NeMo Megatron frameworks, Blackwell doubles the compute and model sizes it can serve through new 4-bit floating-point (FP4) AI inference capabilities (a toy quantization sketch follows this list).
- Fifth-generation NVLink: To accelerate performance for multitrillion-parameter and mixture-of-experts AI models, the latest NVIDIA NVLink delivers a breakthrough 1.8TB/s of bidirectional throughput per GPU, ensuring seamless high-speed communication among up to 576 GPUs for the most complex LLMs.
- RAS Engine: Blackwell-powered GPUs include a dedicated engine for reliability, availability, and serviceability (RAS). The Blackwell architecture also adds chip-level capabilities that use AI-based preventive maintenance to diagnose faults and forecast reliability issues. This maximizes system uptime and improves resiliency for large-scale AI deployments, enabling them to run uninterrupted for weeks or even months at a time while lowering operating costs.
- Secure AI: Advanced confidential-computing capabilities protect AI models and customer data without compromising performance, with support for new native interface encryption protocols, which is critical for privacy-sensitive industries such as healthcare and financial services.
- Decompression Engine: A dedicated decompression engine supports the latest formats, accelerating database queries to deliver the highest performance in data analytics and data science. In the coming years, data processing, on which enterprises spend hundreds of billions of dollars annually, will increasingly be GPU-accelerated.
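As a rough illustration of what 4-bit floating-point inference involves, the toy sketch below quantizes a tensor to a 4-bit value grid with per-block scaling. This is a simplified stand-in, not NVIDIA's actual FP4 format or the Transformer Engine's dynamic-range algorithms; the block size, grid values, and scaling rule are illustrative assumptions.

```python
import numpy as np

# Positive magnitudes representable in a toy 4-bit float format
# (1 sign bit, 2 exponent bits, 1 mantissa bit). Illustrative only;
# Blackwell's actual FP4 pipeline is more sophisticated than this.
FP4_GRID = np.array([0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0])

def quantize_fp4(x: np.ndarray, block_size: int = 32) -> np.ndarray:
    """Quantize a 1-D tensor to the toy FP4 grid with per-block scaling.

    Each block is rescaled so its largest magnitude lands at the top of
    the grid, a crude stand-in for the dynamic-range management the
    article attributes to the second-generation Transformer Engine.
    """
    out = np.empty_like(x, dtype=np.float32)
    for start in range(0, len(x), block_size):
        block = x[start:start + block_size]
        scale = max(np.max(np.abs(block)) / FP4_GRID[-1], 1e-12)
        # Snap each scaled value to the nearest representable magnitude.
        idx = np.abs(np.abs(block / scale)[:, None] - FP4_GRID[None, :]).argmin(axis=1)
        out[start:start + block_size] = np.sign(block) * FP4_GRID[idx] * scale
    return out

weights = np.random.randn(4096).astype(np.float32)
quantized = quantize_fp4(weights)
print("mean absolute quantization error:", np.mean(np.abs(weights - quantized)))
```

Halving the bits per weight roughly doubles how large a model fits in the same memory and how fast it streams through the chip, which is the intuition behind the "double the computation and model size" claim above.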
GB200 NVL72 Inference Performance Up to 30 Times Higher than H100
NVIDIA also introduced the GB200 Grace Blackwell Superchip, which connects two B200 Tensor Core GPUs to the NVIDIA Grace CPU over a 900GB/s ultra-low-power NVLink chip-to-chip interconnect.
For the highest AI performance, systems powered by GB200 can be connected to the NVIDIA Quantum-X800 InfiniBand and Spectrum-X800 Ethernet platforms announced on Monday, which can provide advanced networks with speeds of up to 800Gb/s.
GB200 is a key component of NVIDIA's GB200 NVL72, a multi-node, liquid-cooled, rack-scale system designed for the most compute-intensive workloads. It combines 36 Grace Blackwell Superchips, comprising 72 Blackwell GPUs and 36 Grace CPUs interconnected via fifth-generation NVLink. GB200 NVL72 also includes NVIDIA BlueField-3 data processing units, enabling cloud network acceleration, composable storage, zero-trust security, and GPU compute elasticity in hyperscale AI clouds.
Compared with the same number of H100 Tensor Core GPUs, GB200 NVL72 delivers up to a 30-fold performance improvement for LLM inference workloads while reducing cost and energy consumption by up to 25 times. The GB200 NVL72 platform acts as a single GPU with 1.4 exaflops of AI performance and 30TB of fast memory, and serves as a building block for the latest DGX SuperPOD.
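For a sense of scale, here is a back-of-the-envelope calculation using only the rack-level figures quoted above; these are marketing numbers, not official per-chip specifications, and the 30TB of fast memory is quoted at rack level, so it is not divided per GPU here.

```python
# Back-of-the-envelope arithmetic on the article's own figures.
rack_ai_flops = 1.4e18  # 1.4 exaflops of AI performance per GB200 NVL72 rack
num_gpus = 72           # Blackwell GPUs per rack

per_gpu_petaflops = rack_ai_flops / num_gpus / 1e15
print(f"~{per_gpu_petaflops:.1f} petaflops of AI performance per GPU")  # ~19.4
```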
NVIDIA has introduced the server motherboard HGX B200, which connects eight B200 GPUs via NVLink to support x86-based generative AI platforms. HGX B200 is supported by NVIDIA Quantum-2 InfiniBand and Spectrum-X Ethernet network platforms, enabling network speeds of up to 400Gb/s.
Amazon, Microsoft, Google, and Oracle Among the First Cloud Service Providers to Offer Blackwell Support
The Blackwell chip will serve as the foundation for new computers and other products deployed by global data center operators such as Amazon, Microsoft, and Google. Products based on Blackwell are expected to be available later this year.
NVIDIA stated that Amazon AWS, Google Cloud, Microsoft Azure, and Oracle Cloud Infrastructure will be among the first cloud service providers to offer Blackwell-supported instances. Members of the NVIDIA cloud partner program, including Applied Digital, CoreWeave, Crusoe, IBM Cloud, and Lambda, will also be among the first to offer Blackwell instances.
Sovereign AI clouds will also provide Blackwell-based cloud services and infrastructure; these include Indosat Ooredoo Hutchinson, Nebius, Nexgen Cloud, Oracle EU Sovereign Cloud, the Oracle US, UK, and Australian Government Clouds, Scaleway, Singtel, Northern Data Group's Taiga Cloud, Yotta Data Services' Shakti Cloud, and YTL Power International.
Jensen Huang said, "For thirty years, we have been pursuing accelerated computing, with the goal of achieving transformative breakthroughs such as deep learning and AI. Generative AI is the defining technology of our era. Blackwell is the engine driving this new industrial revolution. By working with the world's most dynamic companies, we will realize the promise of AI across every industry."
In its press release, NVIDIA listed organizations expected to adopt Blackwell, including Microsoft, Amazon, Google, Meta, Dell, OpenAI, Oracle, and Elon Musk's Tesla and xAI. In his keynote, Jensen Huang introduced additional partners, these companies among them.
AI Project GR00T Empowers Humanoid Robots
Jensen Huang revealed in his speech that NVIDIA has launched Project GR00T, a multimodal AI project to empower future humanoid robots. The project builds on a general-purpose foundation model that lets humanoid robots take text, speech, video, and even live demonstrations as input and use them to perform specific general-purpose actions.
Project GR00T was developed with tools from NVIDIA's Isaac robotics platform, including the new Isaac Lab for reinforcement learning.
Jensen Huang stated that robots powered by the Project GR00T platform will be designed to understand natural language and emulate movements by observing human actions, allowing them to quickly learn coordination, dexterity, and other skills so they can adapt to and interact with the real world, without any risk of a robot uprising.
Jensen Huang said:
"Building foundation models for general humanoid robots is one of the most exciting challenges in AI today. The enabling technologies are coming together for leading robotics experts around the world to take giant leaps toward artificial general robotics."
TSMC and Synopsys Adopt NVIDIA Lithography Technology
Jensen Huang also announced that TSMC and Synopsys will adopt NVIDIA's computational lithography technology by putting its computational lithography platform, cuLitho, into production.
The two companies have integrated cuLitho with their software and manufacturing processes, and they will also use NVIDIA's next-generation Blackwell GPUs for AI and HPC applications.
New Software NIM Makes It Easier for Users to Utilize Existing NVIDIA GPUs for AI Inference
NVIDIA also announced the launch of software called NVIDIA NIM, a set of optimized cloud-native microservices designed to shorten time to market for generative AI models and simplify their deployment in the cloud, in data centers, and on GPU-accelerated workstations.
NVIDIA NIM expands the developer pool by abstracting away the complexity of AI model development and production packaging behind industry-standard APIs. It is part of NVIDIA AI Enterprise and provides a streamlined path for developing AI-driven enterprise applications and deploying AI models in production.
NIM makes it easier for users to run inference and other AI software on older NVIDIA GPUs, allowing enterprise customers to keep using the NVIDIA GPUs they already own; inference requires far less computing power than training a new AI model from scratch. NIM also lets enterprises run their own AI models instead of buying AI capabilities as a service from companies like OpenAI.
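To make the "industry-standard APIs" point concrete, below is a minimal sketch of calling a self-hosted NIM microservice over an OpenAI-style chat-completions endpoint. The host, port, and model name are illustrative assumptions, not values from NVIDIA's documentation.

```python
import requests

# Illustrative endpoint for a locally deployed NIM container; the actual
# host, port, and model name depend on the deployment.
NIM_URL = "http://localhost:8000/v1/chat/completions"

response = requests.post(
    NIM_URL,
    json={
        "model": "example-llm",  # placeholder model identifier
        "messages": [
            {"role": "user", "content": "Summarize the Blackwell announcement."}
        ],
        "max_tokens": 200,
    },
    timeout=60,
)
response.raise_for_status()
print(response.json()["choices"][0]["message"]["content"])
```

Because the request follows the widely used chat-completions convention, existing client code can typically be pointed at a self-hosted endpoint with little more than a URL change.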
Customers using NVIDIA servers can use NIM by subscribing to NVIDIA AI Enterprise, with an annual license fee of $4,500 per GPU.
NVIDIA will collaborate with AI companies such as Microsoft and Hugging Face to ensure that their AI models can run on all compatible NVIDIA chips. Developers using NIM can efficiently run models on their own servers or cloud-based NVIDIA servers without lengthy configuration processes.
Commentators say that NIM makes AI deployment easier, not only generating software revenue for NVIDIA but also giving customers one more reason to continue using NVIDIA chips.