PRACE Workshop: OpenMP Programming Workshop @ LRZ

Date:

Tuesday, February 11, 2020, 09:00 - Thursday, February 13, 2020, 16:00

Location: LRZ Building, Garching/Munich, Boltzmannstr. 1, Seminarraum 1
Contents:

With the increasing prevalence of multicore processors, shared-memory programming models are essential. OpenMP is a popular, portable, widely supported, and easy-to-use shared-memory model.

Since its advent in 1997, OpenMP has been a key driver of parallel programming for shared-memory architectures. Its powerful and flexible model has allowed researchers from various domains to introduce parallelism into their applications. Over more than two decades, OpenMP has tracked the evolution of hardware and the growing complexity of software to ensure that it stays as relevant to today's high-performance computing community as it was in 1997.

This workshop will cover a wide range of topics, from the basics of OpenMP programming using the "OpenMP Common Core" to advanced features. Each day, lectures will be mixed with hands-on sessions on the LRZ system IvyMUC.

Preliminary Agenda

 

Time          Day 1                                                        Day 2                    Day 3
------------  -----------------------------------------------------------  -----------------------  -------------------------------------
09:00-10:30   Introduction to the OpenMP Common Core                       Tasking                  Tools for Performance and Correctness
10:30-10:45   Coffee Break                                                 Coffee Break             Coffee Break
10:45-12:00   Decomposing code into patterns for parallelization           Tasking                  Offloading to Accelerators
12:00-13:00   Lunch Break                                                  Lunch Break              Lunch Break
13:00-14:45   Beyond the OpenMP Common Core with tasking and offloading    Host Performance: NUMA   Other Advanced Features of OpenMP 5.0
14:45-15:00   Coffee Break                                                 Coffee Break             Coffee Break
15:00-17:00   Hands-on time with Parallelware Trainer                      Host Performance: SIMD   Roadmap / Outlook (until 16:00)

17:00-18:00   Guided SuperMUC-NG Tour

19:00         Social Event (tbc.)

 

Day 1

The first day will cover basic parallel programming with OpenMP using the Parallelware Trainer software by Appentra Solutions (https://www.appentra.com/products/parallelware-trainer/).

We will present a productivity-oriented approach, introducing the tool through common motifs in scientific code and showing how each one is parallelized. This will enable attendees to focus on parallelizing individual components and on how components combine in real applications.

Attendees will learn actively through a carefully selected set of exercises, building knowledge of how to parallelize key motifs (e.g. matrix multiplication, map-reduce) that recur across scientific codes, from CFD to molecular simulation.

Appentra's Parallelware tools are based on over 10 years of research by co-founder and CEO Dr. Manuel Arenaz, who will be the lecturer of the first day. Parallelware identifies opportunities for parallelization and proposes appropriate parallelization methods using state-of-the-art industrial standards. Parallelware Trainer was developed specifically to improve the experience of HPC training: it provides an interactive learning environment built around examples that are the same as, or similar to, real codes. Parallelware Trainer supports OpenMP (including multi-threading, offloading, and tasking) and OpenACC (for offloading), giving users the opportunity to target GPUs with either standard.

Topics covered on Day 1 include:

  • The OpenMP Common Core
  • Beyond the OpenMP Common Core
  • Parallelization with the multi-threading, offloading, and tasking paradigms
  • Using Parallelware Trainer: a walk-through with the PI example (see the sketch after this list)
  • Practicals: example codes PI, MANDELBROT, HEAT, and LULESHmk
  • Worksheet: parallelizing PI and LULESHmk with OpenMP
  • Decomposing code into patterns for parallelization
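To give a flavor of the PI walk-through, below is a minimal sketch of the classic exercise (illustrative only; the actual training codes may differ): pi is approximated as the Riemann sum of 4/(1+x^2) over [0,1], and the loop is parallelized with the Common Core constructs parallel for and reduction.

    #include <stdio.h>

    /* Approximate pi as the Riemann sum of 4/(1+x^2) over [0,1].
     * A map-reduce motif: independent loop iterations plus a
     * scalar reduction over sum. */
    int main(void) {
        const long num_steps = 100000000;
        const double step = 1.0 / (double)num_steps;
        double sum = 0.0;

        #pragma omp parallel for reduction(+:sum)
        for (long i = 0; i < num_steps; ++i) {
            double x = ((double)i + 0.5) * step;
            sum += 4.0 / (1.0 + x * x);
        }

        printf("pi ~= %.15f\n", step * sum);
        return 0;
    }

Compiled with OpenMP support (e.g. gcc -fopenmp), the iterations are distributed across threads and the per-thread partial sums are combined safely by the reduction clause.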

Days 2 and 3

Days 2 and 3 will cover advanced topics such as:

  • Mastering Tasking with OpenMP
  • Host Performance
  • Vectorization / SIMD
  • NUMA Aware Programming, Thread Affinity
  • Tool Support for Performance and Correctness
  • Offloading to Accelerators
  • Other Advanced Features of OpenMP 5.0
  • Future Roadmap of OpenMP

Developers usually find OpenMP easy to learn. However, they are often disappointed with the performance and scalability of the resulting code. This disappointment stems not from shortcomings of OpenMP itself, but from the lack of depth with which it is employed. The lectures on Days 2 and 3 will address this critical need by exploring the implications of possible OpenMP parallelization strategies, in terms of both correctness and performance.

We cover tasking with OpenMP and host performance, with a focus on performance aspects such as data and thread locality on NUMA architectures, false sharing, and exploitation of vector units. Tools for performance analysis and correctness checking will also be presented.
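To give a flavor of the tasking material, below is a minimal sketch of recursive parallelism with OpenMP tasks (an illustrative example, not taken from the course material), including the serial cutoff commonly used to limit task-creation overhead:

    #include <stdio.h>

    /* Recursive Fibonacci with OpenMP tasks -- the classic teaching
     * example for task parallelism. Small subproblems run serially
     * so that task-creation overhead does not dominate. */
    long fib(int n) {
        if (n < 2) return n;
        if (n < 25) return fib(n - 1) + fib(n - 2);  /* serial cutoff */
        long x, y;
        #pragma omp task shared(x)
        x = fib(n - 1);
        #pragma omp task shared(y)
        y = fib(n - 2);
        #pragma omp taskwait  /* wait for the two child tasks */
        return x + y;
    }

    int main(void) {
        long result;
        #pragma omp parallel
        #pragma omp single  /* one thread spawns the task tree; all threads execute tasks */
        result = fib(35);
        printf("fib(35) = %ld\n", result);
        return 0;
    }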

Current trends in hardware bring co-processors such as GPUs into the fold. A modern platform is often a heterogeneous system with CPU cores, GPU cores, and other specialized accelerators. OpenMP has responded by adding directives that map code and data onto a device, the target directives. We will also explore these directives as they apply to programming GPUs.
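A minimal sketch of such a target construct follows (illustrative only; data mapping and device selection are covered in the lecture). With the default OpenMP behavior, the region falls back to the host if no device is available:

    #include <stdio.h>

    #define N 1000000

    /* Offload a vector update to an accelerator with the OpenMP
     * target directives: x is copied to the device, y is copied
     * to the device and back. */
    int main(void) {
        static float x[N], y[N];
        for (int i = 0; i < N; ++i) { x[i] = 1.0f; y[i] = 2.0f; }

        #pragma omp target teams distribute parallel for map(to: x) map(tofrom: y)
        for (int i = 0; i < N; ++i)
            y[i] = 2.0f * x[i] + y[i];

        printf("y[0] = %f\n", y[0]);  /* expect 4.0 */
        return 0;
    }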

OpenMP 5.0 features will be highlighted and the future roadmap of OpenMP will be presented.

All topics are accompanied by extensive case studies, and the corresponding language features are discussed in depth.

Topics may still be subject to change.

For the hands-on sessions, participants need to bring their own laptops with an SSH client installed.

The course is organized as a PRACE training event by LRZ in collaboration with Appentra Solutions, Intel and RWTH Aachen.

Lecturers

Dr. Manuel Arenaz is CEO at Appentra Solutions and professor of computer science at the University of A Coruña (Spain). He holds a PhD in advanced compiler techniques for the automatic parallelization of scientific codes. After 10+ years of teaching parallel programming at the undergraduate and PhD levels, he strongly believes that the next generation of STEM engineers needs to be educated in HPC technologies to address the challenge of the digital revolution. Recently, he co-founded Appentra Solutions to commercialize products and services that take advantage of Parallware, a new technology for semantic analysis of scientific HPC codes.

Dr. Michael Klemm holds an M.Sc. and a Doctor of Engineering degree from the Friedrich-Alexander-University Erlangen-Nuremberg, Germany. He is a Principal Engineer in the Compute Ecosystem Engineering organization of the Intel Architecture, Graphics, and Software group at Intel in Germany. His areas of interest include compiler construction, design of programming languages, parallel programming, and performance analysis and tuning. He joined the OpenMP organization in 2009 and was appointed CEO of the OpenMP ARB in 2016.

Dr. Christian Terboven is a senior scientist and leads the HPC group at RWTH Aachen University. His research interests center on parallel programming and related software engineering aspects. Dr. Terboven has been involved in the analysis, tuning, and parallelization of several large-scale simulation codes for various architectures. He is responsible for several research projects in the area of programming models and approaches to improve the productivity and efficiency of modern HPC systems.

Prerequisites: Basic C/C++ or Fortran knowledge.
Content Level:

The content level of the course is broken down as:

Beginner's content: 6.2 h (33%)
Intermediate content: 6.2 h (33%)
Advanced content: 6.2 h (33%)
Community-targeted content: 0.0 h (0%)
Language: English
Teachers: Dr. Manuel Arenaz (Appentra Solutions), Dr.-Ing. Michael Klemm (Intel Corporation), Dr. Christian Terboven (RWTH Aachen University)
Assistants: Dr. Michele Martone (LRZ)
Registration:

Via the PRACE webpage https://events.prace-ri.eu/event/947/

Contact: Dr. Volker Weinberg (LRZ)