AI fuels HPC: LRZ’s flagship supercomputer expands AI offering for Bavarian and German scientists

SNG-2 has recently been installed in the LRZ compute centre


At the Leibniz Supercomputing Centre (LRZ) of the Bavarian Academy of Sciences and Humanities, the expansion of its current leadership-class system has just been installed: SuperMUC-NG Phase 2, or SNG-2 for short, integrates and advances artificial intelligence (AI) in high-performance computing (HPC).
According to the latest Top500 list from November 2023, the machine is the fourth-fastest supercomputing system in Germany, offering vast high-performance AI resources to the broad national AI and HPC user base via the Gauss Centre for Supercomputing (GCS). “We are extending SuperMUC-NG with AI capabilities to empower ground-breaking scientific discoveries and drastically reduce the time to science for world-class use cases from Bavaria and Germany including life sciences, environmental sciences, material sciences and various other scientific domains”, comments Prof. Dr. Dieter Kranzmüller, Chairman of the Board of Directors at the LRZ.

SNG-2 has been designed to integrate AI methods into HPC workflows while also accelerating classic modelling and simulation tasks, keeping in mind the broad and diverse user community. It is particularly well suited to highly scalable, compute- and data-intensive workloads. The supercomputer was built and delivered by Intel and Lenovo and contains 240 compute nodes based on Lenovo’s ThinkSystem SD650-I V3 Neptune DWC servers. It features 480 Intel CPUs and 960 Intel GPUs (two CPUs and four GPUs per node), hot-water cooled by Lenovo’s Neptune technology. It further includes a Distributed Asynchronous Object Storage (DAOS) system, which leverages 3rd Gen Intel Xeon Scalable processors and Intel Optane persistent memory to accelerate access to large amounts of data.


Maximizing system usage

LRZ’s systems have traditionally seen extremely high usage by a very broad user community. To ensure the same for SNG-2 and to facilitate user uptake of the new system, LRZ has taken various measures. It has substantially expanded its AI training offering in recent years. In cooperation with Intel, further courses and workshops designed specifically for SNG-2 users are planned for early 2024. In addition, LRZ has expanded its Computational X team for user support and mentoring, as well as its Big Data and AI team, with expert staff for machine learning (ML), large language models (LLMs) and surrogate models.

Background

While the system is being put through its paces, preparations for the friendly-user phase are under way. General user operation is expected to begin in late spring 2024. Phase 1 and Phase 2 of SuperMUC-NG will both remain in service until the follow-up system is operational. SNG-2 was co-funded by the Federal Ministry of Education and Research (BMBF) and the Bavarian Ministry of Science and the Arts (StMWK). Compute time will be made available through calls issued by the Gauss Centre for Supercomputing (GCS). Interested users must comply with GCS’ specifications.

Visit this site for more tech specs, or check out some first visuals in this short clip.