2021-01-10: Strategy BEAST


"Stay up to date with new architectures and components"

Exploring and helping to shape the future of computers: The Leibniz Supercomputing Centre (LRZ) is implementing the high-tech offensive of the Bavarian State Government in its field and is launching the ambitious "Future Computing" program. The program includes a test environment with the latest computer technologies, the "Bavarian Energy, Architecture and Software Testbed", or BEAST for short. It also offers training for colleagues and young HPC professionals, and it sets out to explore and exploit innovative computer technology and systems for High Performance Computing (HPC) together with selected scientific partners. "We want to intensively research the latest computer systems and architectures, their energy requirements and mode of operation," explains the habilitated computer scientist Josef Weidendorfer, who heads the Future Computing initiative at the LRZ. Login and storage servers are already available in Garching, as are two AMD Rome systems and servers with Marvell ThunderX2 processors, both with graphics cards as accelerators. A Cray CS500 system, which works with Fujitsu A64FX processors, is currently being installed. Over the next few years, BEAST is to be continuously expanded, become an integral part of the research work at the LRZ, and serve to evaluate new computer architectures for Bavaria's largest scientific data center. In an interview, Josef Weidendorfer explains the strategy of Future Computing:

What is special about the Cray CS500 or the BEAST testbed? Dr. Josef Weidendorfer: The technology is completely new and contains, among other things, the same processors that are used in the Japanese supercomputer Fugaku, currently the fastest computer in the world. It is also more innovative: its main memory, for example, is four times faster than that of SuperMUC-NG. In a production system like SuperMUC-NG, which is constantly working on research projects, we cannot experiment with operating system configurations, different accelerators and other hardware adjustments. But that is exactly the plan with BEAST and Future Computing: to obtain the latest systems or hardware that we can push to their limits, test, combine according to our own ideas, configure and confront with different applications or codes in order to observe how and under which conditions they work. We want to research the latest computer systems and architectures, their energy requirements and mode of operation intensively, without disturbing the scientific work on the LRZ supercomputers. BEAST will contain two identical units of each piece of hardware so that possible applications can be compared.

Why is testing necessary? Weidendorfer: With BEAST, we are preparing for the challenges of the next generation of supercomputers and for the successors to SuperMUC-NG. We are investigating which architectures make sense for larger systems and parallelization. This is also important because computer technology is about to take the next development step towards the exascale era. Processing growing amounts of data and running applications such as machine learning and artificial intelligence require, among other things, new chip designs and different computer architectures. Conversely, these technologies are likely to establish themselves in the supercomputing systems of the near future, where they will optimize work or memory performance. BEAST will therefore soon also include prototypes of the latest technologies, which we will design and build together with the manufacturers wherever possible. If we experiment with the latest hardware and prototypes today, we can, first, formulate sound requirements and benchmarks for the supercomputers to come. Second, we can estimate much better which systems will satisfy our users and the scientific community, and how HPC-related services will develop and change. Third, the LRZ develops software itself: with the help of BEAST, we can better adapt our own creations, such as the monitoring tool DCDB or the control system Wintermute, and prepare them for other systems. Last but not least, BEAST enables us to support selected user groups in their basic research on modern computer architectures.

Who is allowed to try out the test environment at all? Weidendorfer: BEAST is not one of the classic services the LRZ offers. It is primarily available to our colleagues for experiments and their own research. They can use it to develop recommendations for future systems and their use, but also to gain experience with new architectures. We then open the test environment to selected researchers working on next-generation hardware. We accompany and support their work and, in doing so, stay up to date with new architectures and components. Crashes that require the hardware to be revived are to be expected. While the Linux cluster and SuperMUC-NG are administered by the LRZ, we want to allow more freedom on BEAST. User groups should be able to intervene in the operating system, configure processors themselves and make changes to the system that would otherwise be reserved for administrators. Working with the system will be much more demanding, but this is the only way for all participants to learn.

Will students also work with BEAST? Weidendorfer: Of course. With the testbed, we want to intensify the existing cooperation with the Ludwig-Maximilians-Universität (LMU) and the Technical University of Munich (TUM) and inspire students to write their theses on BEAST. To support their lectures on computer architectures, both Munich universities, together with the LRZ, are offering an internship for the first time in the winter semester 2020/21. About 30 students will be given direct access to BEAST systems to assess how well they suit sample codes and to tune those codes for the best possible performance. We also invite employees of the manufacturers to present the hardware and their working methods in more detail. The cooperation with universities and students also allows for in-depth investigations that help us at the LRZ to understand the latest technology even better.

Are manufacturers even interested in such tests? Weidendorfer: That is the long-term hope behind the LRZ's Future Computing program. In the medium to long term, manufacturers should no longer see us only as customers; we can and want to help shape new computer technology. We have already succeeded in doing this with hot water cooling. And with the contract awarded to Intel and Lenovo for the construction of SuperMUC-NG, we were able to deepen our cooperation with these manufacturers to such an extent that we now have access to prototypes and can influence them with our ideas and recommendations. As a service provider, we act as an interface between basic research and manufacturers: with the help of research results and our practical experience with components or the various fields of application of supercomputers, manufacturers can optimize products and get closer to their customers. Conversely, we can comprehensively explore and test the latest technologies while they are still in the experimental stage. This benefits everyone involved. (vs)