NVIDIA unveils monster AI chip as part of next-gen computing vision

Tech company NVIDIA Corp. today unveiled its vision for the next generation of computing. The plan shifts the focus of the global information economy from servers to a new class of powerful, flexible data centers.

In a keynote delivered in nine simultaneously released episodes recorded in the kitchen of his California home, NVIDIA founder and CEO Jensen Huang discussed the company’s recent acquisition of Mellanox, new products based on its much-anticipated NVIDIA Ampere GPU architecture and important new software technologies.

At the core of Huang’s talk was a vision for how data centers, the engine rooms of the modern global information economy, are changing, and how NVIDIA and Mellanox, acquired in a deal that closed last month, are together driving those changes.

“The data center is the new computing unit,” Huang said, adding that NVIDIA is accelerating performance gains from silicon, to the ways CPUs and GPUs connect, to the full software stack, and, ultimately, across entire data centers.

Systems for data center-scale computing

Huang announced that the new GPU architecture is optimized for this new kind of data center-scale computing, unifying AI training and inference and enabling flexible, elastic acceleration.

The NVIDIA A100, the first GPU based on the NVIDIA Ampere architecture, delivers “the greatest generational performance leap of NVIDIA’s eight generations of GPUs,” he said. Built for data analytics, scientific computing and cloud graphics, the A100 is in full production and shipping to customers worldwide.

The A100, and the NVIDIA Ampere architecture it’s built on, boosts performance by up to 20x over its predecessor, Huang said. He detailed five key features of the A100:

  • More than 54 billion transistors, making it the world’s largest 7-nanometer processor
  • Third-generation Tensor Cores with TF32, a new math format that accelerates single-precision AI training out of the box. NVIDIA’s widely used Tensor Cores are now more flexible, faster and easier to use (a usage sketch appears below this list)
  • Structural sparsity acceleration, a new efficiency technique harnessing the inherently sparse nature of AI math for higher performance
  • Multi-instance GPU, or MIG, allowing a single A100 to be partitioned into as many as seven independent GPUs, each with its own resources (see the second sketch below)
  • Third-generation NVLink technology, doubling high-speed connectivity between GPUs, allowing A100 servers to act as one giant GPU

The result of all this: 6x higher performance than NVIDIA’s previous-generation Volta architecture for training and 7x higher performance for inference.
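
Two of those features are directly visible to developers. The “out of the box” claim for TF32 means frameworks can route ordinary single-precision math through Tensor Cores without code changes. As a rough illustration, here is a minimal PyTorch sketch; the flag names are PyTorch’s own (added in version 1.7, after this announcement), not something taken from NVIDIA’s keynote:

    import torch

    # On Ampere-class GPUs such as the A100, PyTorch (1.7 and later) can
    # route ordinary FP32 matrix multiplies and convolutions through TF32
    # Tensor Cores. These flags control that behavior; their defaults have
    # changed across releases, so setting them explicitly is clearer.
    torch.backends.cuda.matmul.allow_tf32 = True
    torch.backends.cudnn.allow_tf32 = True

    device = "cuda" if torch.cuda.is_available() else "cpu"

    # An unmodified single-precision matmul: on an A100 with the flags
    # above, this runs on TF32 Tensor Cores with no code changes, which
    # is what "out of the box" acceleration refers to.
    a = torch.randn(4096, 4096, device=device)
    b = torch.randn(4096, 4096, device=device)
    c = a @ b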
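
MIG is similarly transparent to application code: each slice of a partitioned A100 appears to software as its own GPU. A minimal sketch, assuming an A100 already partitioned by an administrator and a driver that lists MIG device identifiers via nvidia-smi -L (the identifier below is a placeholder, not a real device):

    import os

    # Placeholder MIG device identifier; real values are listed by
    # `nvidia-smi -L` once an administrator has partitioned the A100.
    os.environ["CUDA_VISIBLE_DEVICES"] = "MIG-xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"

    # Import after setting the variable so the CUDA runtime sees only the
    # chosen MIG instance; the process then treats that slice as if it
    # were a whole GPU, with its own memory and compute resources.
    import torch

    print(torch.cuda.device_count())  # expected: 1, the single MIG slice

That isolation is what lets a single A100 serve as many as seven independent workloads at once.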


Image credit: NVIDIA

