"Providing researchers with the resources that are best suited for their projects"


The Wafer Scale Engine powers the Cerebras CS-2 system and is currently the largest chip available

It's ready to go: Delivered to the Leibniz Supercomputing Centre (LRZ) in late summer, the new CS-2 system from Cerebras Systems with HPE Superdome Flex servers is now available to researchers. It is equipped with the largest chip currently available worldwide, which has 2.6 trillion transistors and on-chip memory with a capacity of up to 40 gigabytes. This makes the CS-2 system particularly suitable for artificial intelligence (AI) and machine learning workloads. It is already being used for its first research tasks. "Interest in using the system is high," observes Dr. Nicolay Hammer, head of the LRZ's Big Data & Artificial Intelligence (BDAI) team. "Providing scientists with the technological resources that are best suited for their projects is one of our most important tasks." The BDAI team therefore examines incoming concepts and recommends the appropriate resources to researchers: for AI methods, several DGX machines and computers with Intel Skylake processors and Intel graphics processing units (GPUs) are available in addition to the Cerebras CS-2. In the following interview, Hammer talks about research using AI and what scientists can expect from the LRZ.

The new Cerebras CS-2 system at the LRZ is going into operation: What is it capable of? Dr. Nicolay Hammer: The system works with the largest chip that is currently available. The Wafer Scale Engine 2, or WSE-2, contains 850,000 computing cores and integrated memory. This provides not only a lot of computing power, but also very high memory and interconnect bandwidths. Being able to move large amounts of data between the cores and to cache it allows machine learning workflows and AI processes to be executed more efficiently. The CS-2 system consequently enables and accelerates the training of AI models and neural networks.

For what kinds of research projects would it be useful? Hammer: We believe that applications in pattern recognition, computer vision or natural language processing (NLP for short) will benefit in particular from the high-performance capabilities of the WSE-2. This should be especially interesting for the humanities and linguistics, but also for a wide range of data-intensive projects in the environmental and life sciences, medicine, chemistry and physics. The LRZ also uses the Cerebras system to conduct its own basic research on machine learning and deep learning. Together with researchers at the LRZ, we want to analyse, for example, which AI methods the Cerebras system processes well and which ones it does not. We are also analysing how neural networks can be tailored to suit its requirements.

Can AI also support simulation, the classic application of supercomputing? Hammer: This is also an exciting question that we want to explore together with researchers. In fact, during the research into vaccines against Covid-19 we were able to observe that combining AI with supercomputing, or pattern recognition with computation, can accelerate the evaluation of vaccines and make it more precise. At the same time, another trend is emerging. There are still blank spots in modelling: some phenomena can only be described inadequately with existing formulas, and for others formulas are missing altogether. Here, AI methods can close gaps or, though certainly not with the desired accuracy, replace classical calculation methods in the model. This surrogate modelling could advance simulation enormously and offer many useful additions. Incidentally, we expect similar effects from quantum computing.

How can researchers use the CS-2 system? Hammer: There is a lot of interest in using the system. Initially, the new system will be used by those with whom we already cooperate and whose applications we are familiar with. This will also allow us to gain initial experience with use cases. Scientists who need high computing capacities for smart data analysis methods briefly describe their projects to us in a concept, and we then decide together whether the Cerebras system or one of our other AI resources is a good fit for their requirements. It is best to submit an enquiry via the service desk.

The LRZ offers other systems for AI applications. How do researchers get access? Hammer: Providing researchers with the technological resources that are best suited for their projects is one of our most important tasks. The LRZ supports the migration of programs and applications, helps improve AI models developed in-house and prepares data for analysis. In general, the LRZ AI systems, and soon the Cerebras system as well, are available to all researchers in Bavaria. Anyone who wants to use them needs access to the LRZ Linux Cluster and can then also use the LRZ AI systems.

What would be an ideal project for the BDAI team on the new Cerebras system? Hammer: At the moment there are no specific projects; we are mainly curious to see how the system performs in everyday research. For this purpose, we are in discussions with research teams and are collecting operating data in order to derive benchmarks and make valid statements about which applications the CS-2 system is useful for. The ideal project would be one that has a high demand for computing power and combines an exciting subject area, such as biomedicine or robotics, with innovative methods or research approaches. I am sure we will have such projects soon.

Could the Cerebras system be integrated into a supercomputer to speed it up? Hammer: It's worth a try, but that's a vision for the future. First, we want to get to know the system better. As a matter of fact, the interplay between very diverse computer systems and clusters is a subject of the LRZ’s basic research. The integration of the CS-2 into a supercomputer could influence the development of new high-performance computing systems for AI applications as well. We observe that AI is now increasingly being combined with classical HPC for simulations, i.e. that data is prepared or precalculated with the help of AI before the actual modelling on the supercomputer, and also that simulation results are further processed with AI methods. (interview: vs)

Dr. Nicolay Hammer, head of Big Data and AI at LRZ