Meta on way to building “world’s fastest” AI supercomputer

Meta (formerly Facebook) today announced that it has designed and built the AI Research SuperCluster (RSC), which it believes is among the fastest AI supercomputers running today and will be the fastest AI supercomputer in the world when it is fully built out in mid-2022.

Meta's researchers have already started using RSC to train large models in natural language processing (NLP) and computer vision, the company said on its official blog.

According to Meta, the AI supercomputer will help its researchers:

  • build new and better AI models that can learn from trillions of examples
  • build models that work across hundreds of different languages
  • build models that seamlessly analyze text, images, and video together
  • develop new augmented reality tools, and much more.

Our researchers will be able to train the largest models needed to develop advanced AI for computer vision, NLP, speech recognition, and more. We hope RSC will help us build entirely new AI systems that can, for example, power real-time voice translations to large groups of people, each speaking a different language, so they can seamlessly collaborate on a research project or play an AR game together. Ultimately, the work done with RSC will pave the way toward building technologies for the next major computing platform — the metaverse, where AI-driven applications and products will play an important role.

– Meta

To fully realize the benefits of self-supervised learning and transformer-based models, domains such as vision, speech, and language, as well as critical use cases like identifying harmful content, will require training increasingly large, complex, and adaptable models. High-performance computing infrastructure is a critical component in training such large models, and Meta’s AI research team has been building these high-powered systems for many years.

The first generation of this infrastructure, designed in 2017, has 22,000 NVIDIA V100 Tensor Core GPUs in a single cluster that performs 35,000 training jobs a day. Up until now, this infrastructure has set the bar for Meta’s researchers in terms of its performance, reliability, and productivity.

Meta added that development of RSC is ongoing. Once phase two of the build-out is complete, the company expects RSC to become the fastest AI supercomputer in the world, performing at nearly 5 exaflops of mixed-precision compute.

Through 2022, Meta plans to increase the number of GPUs from 6,080 to 16,000, which it says will increase AI training performance by more than 2.5x. The InfiniBand fabric will expand to support 16,000 ports in a two-layer topology with no oversubscription. The storage system will have a target delivery bandwidth of 16 TB/s and exabyte-scale capacity to meet increased demand.
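For readers who want to sanity-check these figures, here is a minimal back-of-the-envelope sketch in Python. It uses only the numbers quoted above. The function names are hypothetical, and the per-GPU values rest on two assumptions not stated in the announcement: that the "nearly 5 exaflops" peak and the 16 TB/s storage target both apply to the full 16,000-GPU configuration, and that they divide evenly across GPUs. Treat the results as rough upper bounds, not measured performance.

```python
# Back-of-the-envelope arithmetic using only figures quoted in the article.
# Assumption: peak compute and storage bandwidth refer to the full
# 16,000-GPU build-out and are split evenly across GPUs.

GPUS_PHASE_ONE = 6_080
GPUS_PHASE_TWO = 16_000
PEAK_EXAFLOPS = 5.0       # "nearly 5 exaflops", so this is an upper bound
STORAGE_TB_PER_S = 16.0   # target delivery bandwidth


def gpu_scale_factor(before: int, after: int) -> float:
    """Ratio of GPU counts after and before the expansion."""
    return after / before


def per_gpu_teraflops(total_exaflops: float, gpus: int) -> float:
    """Implied peak mixed-precision teraflops per GPU (1 exaflop = 1e6 teraflops)."""
    return total_exaflops * 1e6 / gpus


def per_gpu_storage_gb_per_s(total_tb_per_s: float, gpus: int) -> float:
    """Implied storage bandwidth per GPU in GB/s (decimal units, 1 TB = 1000 GB)."""
    return total_tb_per_s * 1000 / gpus


if __name__ == "__main__":
    scale = gpu_scale_factor(GPUS_PHASE_ONE, GPUS_PHASE_TWO)
    print(f"GPU count grows by about {scale:.2f}x")
    print(f"Implied peak per GPU: {per_gpu_teraflops(PEAK_EXAFLOPS, GPUS_PHASE_TWO):.1f} TFLOPS")
    print(f"Implied storage bandwidth per GPU: "
          f"{per_gpu_storage_gb_per_s(STORAGE_TB_PER_S, GPUS_PHASE_TWO):.1f} GB/s")
```

The GPU count grows by roughly 2.6x, which is consistent with the claimed training speedup of "more than 2.5x" if performance scales close to linearly with GPU count.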

Image credit: Meta
