The University of Tsukuba is creating Pegasus, an AI supercomputer with massive memory built on the Supermicro SuperBlade platform.


The Center for Computational Sciences at the University of Tsukuba is developing an HPC system with vast memory for researchers from various fields.

Introduction

The Center for Computational Sciences at the University of Tsukuba is a multidisciplinary hub that collaborates with numerous organizations on fundamental research across many domains. As part of an ongoing effort to provide cutting-edge computing to a wide range of researchers, the University of Tsukuba, working with NEC as the primary contractor, has built a supercomputer on the Supermicro SuperBlade that addresses both data processing and storage needs. The system is one of the first in the world to use NVIDIA H100 GPUs together with fourth-generation Intel® Xeon® Scalable processors.

Challenges

The Center for Computational Sciences at the University of Tsukuba identified the need for a new HPC system to meet the growing demands of scientists. As research in artificial intelligence, data science, and computational science relies on more and more new applications, a system with substantial memory per process was required. The prevailing trend, however, is that the number of cores per processor keeps rising while the amount of RAM per core shrinks. The University of Tsukuba therefore needed a solution that would meet several requirements:

  • High-speed CPU processing
  • Access to large memory
  • State-of-the-art GPUs for processing

Solution

The Center for Computational Sciences at the University of Tsukuba chose the Supermicro SuperBlade® solution for its new supercomputer, which meets the substantial memory demands and pairs them with the latest GPU technologies for HPC and AI applications.

In particular, the University of Tsukuba acquired 120 Supermicro SuperBlade servers housed in 6U enclosures (24 enclosures with 5 SuperBlades each). Each SuperBlade server node (SBI-611E-5T2N) is built around the fourth-generation Intel® Xeon® Platinum 8468 (350 W TDP) and contains one NVIDIA® H100 Tensor Core GPU with 80 GB of HBM2E memory and 128 GB of DDR5-4800 memory. Additionally, each node incorporates Intel® Optane™ persistent memory 300 series. Each blade connects through an NVIDIA® ConnectX®-7 HCA to an NVIDIA Quantum-2 InfiniBand NDR 400 Gb/s switch.

[Image: the SuperBlade used in the new Pegasus supercomputer at the University of Tsukuba]

Key Features:

  • Supermicro SuperBlade chassis
  • SBE-610J2-822 Enclosure
  • 5 Blades per enclosure
  • Ethernet switch
  • Redundant Titanium-level power supplies with 95% efficiency

Blade Specifications:

  • Model: SBI-611E-5T2N
  • 4th Gen Intel Xeon Scalable processor
  • 128 GB DDR5 RAM
  • 2048 GB Intel Optane persistent memory
  • NVIDIA H100 80 GB PCIe GPU
  • NVIDIA ConnectX-7 InfiniBand NDR HCA

The 120-node Pegasus supercomputer at the University of Tsukuba boasts a theoretical peak performance of 6.51 petaflops in double precision, with a measured LINPACK performance of 3.48 petaflops. The overall cluster ranks 190th on the latest (June 2023) Top500 list of the world's fastest supercomputers. To learn more about the overall system performance, visit the following link: https://www.top500.org/system/180170/. Furthermore, Pegasus holds the 12th spot on the Green500 list, demonstrating an impressive efficiency of 40.448 GFlops/W, establishing it as one of the most energy-efficient green supercomputers worldwide.
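The performance figures in this paragraph can be related to each other with a little arithmetic. The following Python sketch derives the LINPACK efficiency (measured performance over theoretical peak) and the power draw implied by the Green500 figure. The function names are illustrative, and the power estimate assumes that the Top500 and Green500 numbers describe the same benchmark run, which the article does not state.

```python
# Figures quoted in the article (Top500 / Green500, June 2023).
RPEAK_PFLOPS = 6.51             # theoretical peak, double precision
RMAX_PFLOPS = 3.48              # measured LINPACK performance
GREEN500_GFLOPS_PER_W = 40.448  # energy efficiency


def linpack_efficiency(rmax_pflops: float, rpeak_pflops: float) -> float:
    """Fraction of the theoretical peak that LINPACK actually achieved."""
    return rmax_pflops / rpeak_pflops


def implied_power_kw(rmax_pflops: float, gflops_per_watt: float) -> float:
    """Power implied by Rmax and the efficiency figure.

    Assumption: both numbers come from the same run. The officially
    reported power may differ from this derived value.
    """
    rmax_gflops = rmax_pflops * 1e6               # 1 PFlops = 1e6 GFlops
    return rmax_gflops / gflops_per_watt / 1e3    # W -> kW


if __name__ == "__main__":
    eff = linpack_efficiency(RMAX_PFLOPS, RPEAK_PFLOPS)
    power = implied_power_kw(RMAX_PFLOPS, GREEN500_GFLOPS_PER_W)
    print(f"LINPACK efficiency: {eff:.1%}")
    print(f"Implied power draw: {power:.1f} kW")
```

With the article's numbers this works out to a LINPACK efficiency of roughly 53% and an implied power draw on the order of 86 kW.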

The total memory in the Pegasus cluster amounts to 255 TB (15 TB of DDR5 plus 240 TB of persistent memory), backed by a 7.1 PB file system with an I/O throughput of 40 GB/s.
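The memory totals follow from the per-blade capacities listed under Blade Specifications. The short Python sketch below reproduces them. The constant and function names are illustrative, and the conversion assumes the binary convention (1 TB = 1024 GB), which is the convention under which the article's figures come out as round numbers.

```python
# Per-node figures taken from the Blade Specifications list.
ENCLOSURES = 24
BLADES_PER_ENCLOSURE = 5
DDR5_GB_PER_NODE = 128
PMEM_GB_PER_NODE = 2048

NODES = ENCLOSURES * BLADES_PER_ENCLOSURE  # 120 blades in total


def cluster_total_tb(per_node_gb: int, nodes: int = NODES) -> float:
    """Aggregate capacity in TB, assuming 1 TB = 1024 GB (binary convention)."""
    return per_node_gb * nodes / 1024


if __name__ == "__main__":
    ddr5_tb = cluster_total_tb(DDR5_GB_PER_NODE)
    pmem_tb = cluster_total_tb(PMEM_GB_PER_NODE)
    print(f"Nodes: {NODES}")
    print(f"DDR5: {ddr5_tb} TB, persistent memory: {pmem_tb} TB")
    print(f"Total: {ddr5_tb + pmem_tb} TB")
```

Under that assumption the sketch yields 15 TB of DDR5 and 240 TB of persistent memory, matching the 255 TB total quoted above.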

In addition to the hardware components, an extensive software ecosystem is being deployed on the servers, comprising:

  • Ubuntu
  • Intel oneAPI (C++/C/Fortran, oneMKL, MPI, VTune, Trace Analyzer and Collector)
  • NVIDIA HPC SDK (C++/C/Fortran/CUDA, cuBLAS, cuTENSOR, cuFFT, Open MPI, NVSHMEM, NCCL, profilers, debugger)
  • Open-source SDK (GNU compilers, Python, PMDK, Open MPI)
  • TensorFlow, Keras, PyTorch
  • JupyterHub, TensorBoard, Nextcloud, Gfarm

Benefits

The University of Tsukuba has experienced significant improvements in application performance due to four key factors:

  1. Enhanced performance with fourth-generation Intel Xeon Scalable processors.
  2. Use of the NVIDIA H100 Tensor Core 80 GB PCIe GPU.
  3. Extended memory capacity through Intel Optane persistent memory 300 series.
  4. Substantial energy savings, making it one of the most energy-efficient supercomputers in the world (40.448 GFlops/W).

"The Supermicro SuperBlade integrated into NEC solutions delivered an incredible HPC product lineup, achieving higher density in just five cabinets and integrated with top-tier performance and the latest generation of processors, persistent memory, graphics processing, and networking technologies. With this massive memory supercomputer, our university will be able to intensify our research in areas such as large-scale data analysis, new applications of AI in big data, and system software research. This high-density HPC system combines fourth-generation Intel Xeon Scalable processors, NVIDIA H100 Tensor Core 80 GB PCIe graphics processor, and high-speed InfiniBand NDR network, all within the Supermicro SuperBlade, providing us with a fantastic HPC system that we will use for years to come."

  • Prof. Taisuke Boku, Director of the Center for Computational Sciences at the University of Tsukuba.

Learn more about CCS at: https://www.ccs.tsukuba.ac.jp/eng/

Summary

Supermicro - Supermicro is a global leader in high-performance, eco-friendly server technologies and innovations. We deliver application-optimized server, blade, storage, and GPU systems and workstations to customers worldwide. Our products offer proven reliability, excellent design, and one of the industry's widest ranges of product configurations to meet all computational needs. For more information, visit www.supermicro.com.

NEC - NEC has solidified its leadership position in IT and networking technology integration while championing the brand motto "Orchestrating a brighter world." NEC empowers businesses and communities to adapt to rapid changes in society and the market, providing social values of security, protection, fairness, and efficiency to foster a more sustainable world where everyone has the opportunity to fully realize their potential. More information can be found at www.nec.com.
