New types of processors in stress tests

Getting to know different computer architectures and processors: this is how BEAST (Bavarian Energy, Architecture and Software Testbed), the test environment at the Leibniz Supercomputing Centre (LRZ), sparks curiosity. During a training course, computer science students from Ludwig-Maximilians-Universität (LMU) and the Technical University of Munich (TUM) can get to know the AMD, Marvell and Fujitsu processors (CPUs) installed in BEAST, a system that also supports graphics processing units (GPUs) from AMD or NVIDIA for faster data processing. "BEAST offers the latest software and access to a wide variety of systems, which is very exciting for me," says Sergej Breiter, a master's student at LMU and a participant in the training course.

The latest supercomputers are currently being equipped with these processors, which is another reason why students are interested in the BEAST internship that TUM and LMU organized together with the LRZ in the winter term 2020/21. A total of 26 computer science students took part. In twelve training sessions from November to February, they solved practical tasks in groups on BEAST's new control units as well as on compute nodes of SuperMUC-NG. "The goal of the BEAST internship is to try out modern high-performance computing technologies, different computer architectures and memory hierarchies, to test them against programming languages and software, and to compare the experiences gained," explain the lecturers Dr. Karl Fürlinger (LMU) and Dr. Josef Weidendorfer (TUM), who leads the LRZ program "Future Computing". The internship also included presentations by manufacturers, who provided insights into their technical strategies and explained how their systems are built and how they work.

Getting to know hardware even better

This mixture of practical training and theory went down well with the computer science students. They examined the performance of the processors with programming extensions such as OpenMP, CUDA or other instructions recommended by manufacturers. They tried speeding up memory and computational functions on the systems, parallelizing programs, and using program libraries to solve larger vector problems and other mathematical equations. "The most fun part was researching hints on vendor sites or in communities about how best to program the systems and how to fix performance problems," Breiter says. "You learn a lot about the hardware in the process." Every intervention in the experiments had to be documented, and every step of the systems' operation measured and compared against various performance parameters. "The speed of the AMD Rome system was gigantic; it was already fast without optimizations to the codes, and we were able to speed it up even more significantly," says Ludwig Kratzl, an IT specialist in application development in his 5th semester of computer science at TUM. His fellow student Maximilian Bauregger adds, "I was quite surprised that none of us managed to call up the full performance of the very special Fujitsu processor."

Putting four systems through their paces with diverse variables is time-consuming and yields large quantities of measured values and data. Before each study session, working groups presented their experiences with the experiments and the systems, together with their measurements, for discussion. The random selection of presenters ensured active participation. In return, the participants gained exclusive experience with the latest technology: "Most of the time, we only work with Intel processors, because they're the most widely used," says Kratzl. "The BEAST internship broadened my horizons and gave me a lot of insight into how to better assess computer systems." Bauregger has also benefited from working with four different systems: "It is rather unlikely that I will get to know so many other systems in my professional life. The experience from the internship certainly helps to better question and solve performance criteria and problems."

Open programming languages push the systems

The systems to be tested were as diverse as the participants' prior knowledge. In the BEAST internship, master's and bachelor's students sat together. "All of them coped well with the rather high complexity," reports lecturer Fürlinger, clearly surprised. "This is new hardware with software that is not always fully developed, and not everything always works smoothly." As a result, not everything ran smoothly during the training course either: "As administrators, we didn't realize what was missing to accomplish some tasks until we were doing them." In addition, the instructors now know that they should reduce the number of experiments and practical assignments. And they now collect feedback from students to improve the practical parts of the course.

However, Fürlinger and Weidendorfer are also excited by their own observations on the use of the computer technology. Although the task was the same, the measured values sometimes differed significantly between the groups; it will be interesting to see whether this remains the case in upcoming courses. The fact that the Fujitsu processor, which is currently used in the fastest supercomputer, Fugaku in Kobe, Japan, takes users a long time to get used to is another detail that is likely to influence the planning of supercomputers. "It's interesting that for the BEAST systems, even with GPUs, the open standard OpenMP for programming often worked better than expected, even if the vendor-recommended programming models still got a bit more out of it," Weidendorfer reports. It's a finding that should interest not only instructors at the LRZ and the universities.

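The BEAST training course will be repeated in the coming semesters. Information is available on the websites of the Faculty of Mathematics, Informatics and Statistics (LMU: "Praktikum Quantitative Analyse von Hochleistungssystemen", a lab course on the quantitative analysis of high-performance systems) and the Chair of Computer Architecture and Parallel Systems (TUM: "Evaluierung moderner HPC-Architekturen und -Beschleuniger", on the evaluation of modern HPC architectures and accelerators).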