Proposing Institution

FB 16, Research Group Programming Languages / Methodologies, University Kassel
Project Manager

Jonas Posner
Wilhelmshöher Allee 71-73
34121 Kassel
Abstract
Current research in the HPC area addresses programmer productivity, fault tolerance, and flexible resource management. To use multiple nodes with multicore CPUs efficiently, parallel programs should exploit node-internal shared-memory parallelism and inter-node distributed-memory parallelism at the same time. Fault tolerance is becoming increasingly important because the probability of node failures grows with cluster size. Fault tolerance at the application level may be more efficient than at the system level. Flexible resource management includes the ability of an application to incorporate new resources and to release existing ones when the system requests it, which can improve system utilization. Depending on the programming system used, implementing each of these three aspects can be time-consuming and error-prone. To reduce the programming effort, it is beneficial to build these techniques directly into programming systems. The APGAS library for Java is based on the well-known Partitioned Global Address Space (PGAS) model and adds asynchronism to it by adopting a task-based approach. Tasks are units of computation that are generated at runtime. APGAS supports fault tolerance and elasticity and is therefore well suited as a basis for combining all three aspects. The goal of our project is to develop a task pool with locality-flexible tasks that is both fault-tolerant and elastic. Locality-flexible tasks are subject to automatic load balancing across all resources of the system. We implement various task pool schemes directly in APGAS and compare them with each other. Users of the resulting APGAS variant do not have to deal with load balancing, fault tolerance, or elasticity, and can thus concentrate on writing sequential tasks that solve their actual problem. For validation and time measurements, experiments will use simple benchmarks and branch-and-bound methods.
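The idea of tasks that are generated at runtime and balanced automatically can be illustrated with a minimal sketch. It does not use the APGAS API. Instead it assumes a single JVM and relies on the work-stealing ForkJoinPool of the Java standard library as a stand-in for a task pool. The class TaskPoolSketch, the task CountTask, and the tree-counting workload are hypothetical names chosen only for this illustration. Fault tolerance, elasticity, and distribution across nodes are not covered.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

public class TaskPoolSketch {

    // Hypothetical task: counts the nodes of a complete tree by
    // spawning one child task per subtree at runtime.
    static final class CountTask extends RecursiveTask<Long> {
        private final int depth;
        private final int branching;

        CountTask(int depth, int branching) {
            this.depth = depth;
            this.branching = branching;
        }

        @Override
        protected Long compute() {
            if (depth == 0) {
                return 1L; // a leaf counts only itself
            }
            List<CountTask> children = new ArrayList<>();
            for (int i = 0; i < branching; i++) {
                CountTask child = new CountTask(depth - 1, branching);
                child.fork(); // queue the task; idle workers may steal it
                children.add(child);
            }
            long total = 1L; // this node
            for (CountTask child : children) {
                total += child.join(); // wait for the subtree result
            }
            return total;
        }
    }

    // Runs the computation on a pool with the given number of workers.
    // The caller writes only the sequential task body; the pool decides
    // which worker executes which task.
    public static long countNodes(int depth, int branching, int workers) {
        ForkJoinPool pool = new ForkJoinPool(workers);
        try {
            return pool.invoke(new CountTask(depth, branching));
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // Complete binary tree of depth 10: 2^11 - 1 nodes.
        System.out.println(countNodes(10, 2, 4)); // prints 2047
    }
}
```

The task body contains no placement or balancing logic, which mirrors the stated aim that users only write sequential tasks. In a distributed setting such as the one the project targets, the pool would additionally have to move tasks between nodes and cope with nodes joining or failing.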
