Recent Achievements

Performance analysis on the Tomahawk2

Published in ORCHESTRATION (RECENT ACHIEVEMENTS)

The Orchestration Path employs the Tomahawk2 MPSoC as a demonstrator platform for the hardware/software stack. Cfaed members ported several applications (databases, computational fluid dynamics, linear algebra) to the T2 and developed M3, an operating system for heterogeneous manycores. To enable graphical runtime performance analysis of these applications and of the operating system, the first event tracing infrastructure for the T2 has been developed at the Center for Information Services and High Performance Computing (ZIH). Various types of events, such as memory transfers, messages, system calls, and user-defined source code regions, are recorded both for applications based on taskC and M3 and for internals of the M3 operating system. The Vampir tool, developed at ZIH and well known in the high performance computing community, is used to visualize the performance data. Performance measurement and analysis on the Tomahawk2 makes it possible to check performance assumptions, enables targeted tuning, and aids design space exploration and performance modeling.
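The article does not name the instrumentation interface of the T2 tracing infrastructure. Purely as an illustration, the sketch below shows how a user-defined source code region is typically annotated with Score-P, an instrumentation framework from the Vampir tool ecosystem at ZIH; whether the T2 infrastructure uses this interface is an assumption, and the instrumented function and region name are made up.

```c
/* Illustration only: marking a user-defined source code region so that a
 * tracing tool (here: Score-P, whose traces Vampir can visualize) records
 * enter/exit events for it. Build with the Score-P compiler wrapper,
 * e.g. "scorep --user gcc example.c". The function solve_step() is
 * hypothetical and not part of the T2 software stack described above. */
#include <scorep/SCOREP_User.h>

void solve_step(double *field, int n)
{
    SCOREP_USER_REGION_DEFINE(solve_region)
    SCOREP_USER_REGION_BEGIN(solve_region, "solve_step",
                             SCOREP_USER_REGION_TYPE_COMMON)

    for (int i = 0; i < n; ++i)   /* the actual work being measured */
        field[i] *= 0.5;

    SCOREP_USER_REGION_END(solve_region)
}
```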

A first analysis framework to handle dynamic application behavior in dataflow applications

Published in ORCHESTRATION (RECENT ACHIEVEMENTS)

At the Chair for Compiler Construction, cfaed members affiliated with the Orchestration Path have devised a series of approaches to analyse and efficiently cope with dynamic software application behavior. The problem addressed is that current approaches for mapping and scheduling applications described as Kahn Process Networks (KPN) or Dynamic Data Flow (DDF) rely on assumptions about program behavior that are specific to one execution. Thus, a near-optimal mapping computed for a given input data set may become sub-optimal at run-time, namely when a different data set induces a significantly different behavior.

The approach to analysing and handling this behavior leverages inherent mathematical structures of the dataflow models of computation and of the hardware architectures. On the side of the dataflow models, it relies on the monoid structure of histories and traces. This structure helps formalize the behavior of multiple executions of a given dynamic application, and metrics on it define a formal framework for comparing executions. On the side of the hardware, the approach takes advantage of symmetries in the architecture to reduce the search space of the mapping problem.
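The paper's actual metric is not reproduced here. As a minimal sketch of the general idea of comparing executions through a distance on traces, the following treats two event traces as words over an event alphabet (elements of the free monoid over events) and compares them with the prefix-tree distance |u| + |v| - 2·|lcp(u, v)|; the event encoding and trace contents are made up.

```c
/* Minimal sketch: comparing two executions of a dataflow application via a
 * distance on their event traces. Traces are words over an event alphabet;
 * the distance used here is the prefix-tree distance |u|+|v|-2*|lcp(u,v)|.
 * This illustrates the idea only; it is not the metric defined in the paper. */
#include <stdio.h>
#include <stddef.h>

static size_t lcp_len(const int *u, size_t nu, const int *v, size_t nv)
{
    size_t i = 0;
    while (i < nu && i < nv && u[i] == v[i])
        ++i;
    return i;
}

static size_t trace_distance(const int *u, size_t nu, const int *v, size_t nv)
{
    return nu + nv - 2 * lcp_len(u, nu, v, nv);
}

int main(void)
{
    /* Hypothetical event IDs: 0 = read channel A, 1 = fire actor X, ... */
    int run1[] = { 0, 1, 2, 1, 3 };
    int run2[] = { 0, 1, 2, 2, 2, 3 };

    printf("distance = %zu\n", trace_distance(run1, 5, run2, 6)); /* 5 */
    return 0;
}
```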

 

A Programming Environment for Particle-based Numerical Simulations

Published in ORCHESTRATION (RECENT ACHIEVEMENTS)

Domain-specific languages (DSLs) are of utmost importance in scientific high-performance computing to reduce development costs, raise the level of abstraction and, thus, make the scientific programmer's life easier. The parallel particle-mesh environment (PPME) is a DSL and a programming environment for numerical simulations based on particle methods. The main goals of PPME are 1) to provide a high-level, easy-to-learn language for developing particle-based simulations and 2) to leverage the domain-specific knowledge encoded in PPME programs for optimizing the generated code. To achieve these goals, PPME relies on a projectional editing approach rather than a conventional text editor. On the one hand, this enables users to describe their problems using conventional mathematical notation; on the other hand, it provides a typed, graph-based data structure that paves the way for automatic analyses and optimizations. PPME follows a generative approach: it generates parallel Fortran code that links with the parallel particle-mesh library (PPM), which has been developed by the MOSAIC group. PPM provides efficient implementations of the particle and mesh abstractions, discrete numerics, as well as an abstraction layer over the underlying high-performance computing hardware.
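PPME programs are written in mathematical notation inside the projectional editor and are not reproduced here. Purely as an illustration of the kind of model shown in the screenshot below, the following plain-C sketch performs one explicit Euler step of the Gray-Scott reaction terms for concentrations u and v on a set of particles; the diffusion terms, particle-mesh interpolation and parallelization that PPM/PPME provide are omitted, and the parameters F and k are typical demo values, not taken from the article.

```c
/* Illustration only: one explicit Euler step of the Gray-Scott reaction terms
 *   du/dt = -u*v^2 + F*(1 - u)
 *   dv/dt = +u*v^2 - (F + k)*v
 * evaluated per particle. Diffusion, particle-mesh interpolation and
 * parallelization (handled by PPM/PPME) are deliberately left out. */
#include <stddef.h>

void gray_scott_reaction_step(double *u, double *v, size_t n_particles,
                              double F, double k, double dt)
{
    for (size_t p = 0; p < n_particles; ++p) {
        double uv2 = u[p] * v[p] * v[p];
        double du  = -uv2 + F * (1.0 - u[p]);
        double dv  =  uv2 - (F + k) * v[p];
        u[p] += dt * du;
        v[p] += dt * dv;
    }
}
/* Example call with typical demo parameters:
 *   gray_scott_reaction_step(u, v, n, 0.04, 0.06, 1.0); */
```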

Screenshot of a Gray-Scott reaction-diffusion system implemented in PPME

Recently, a first version of PPME has been released in collaboration with the Chair of Scientific Computing for Systems Biology and the MOSAIC group, both headed by Ivo Sbalzarini (Biological Systems Path).

 

Outstanding Paper Award at HPCS 2014

Published in ORCHESTRATION (RECENT ACHIEVEMENTS)

With their paper "Scalable High-Quality 1D Partitioning", Matthias Lieber and Wolfgang Nagel received the Outstanding Paper Award at the 2014 International Conference on High Performance Computing and Simulation (HPCS 2014).
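The award-winning algorithm itself is not reproduced here. To make the underlying problem concrete, the sketch below solves the generic 1D partitioning task (splitting a chain of weighted work items into at most P contiguous parts so that the maximum part load is minimized) with the textbook combination of bisection over the bottleneck value and a greedy feasibility check; this is not the authors' method, and the weights are made up.

```c
/* Generic 1D partitioning: split a chain of weighted items into at most
 * nparts contiguous parts minimizing the maximum part load. Textbook
 * "bisection over the bottleneck + greedy check", NOT the HPCS 2014 paper. */
#include <stdio.h>
#include <stddef.h>

/* Can the chain be split into <= nparts contiguous parts of load <= bound? */
static int feasible(const double *w, size_t n, size_t nparts, double bound)
{
    size_t parts = 1;
    double load = 0.0;
    for (size_t i = 0; i < n; ++i) {
        if (w[i] > bound)
            return 0;                 /* a single item already exceeds the bound */
        if (load + w[i] > bound) {    /* start a new part */
            ++parts;
            load = 0.0;
        }
        load += w[i];
    }
    return parts <= nparts;
}

/* Approximate the optimal bottleneck by bisection on the load bound. */
static double min_bottleneck(const double *w, size_t n, size_t nparts)
{
    double lo = 0.0, hi = 0.0;
    for (size_t i = 0; i < n; ++i)
        hi += w[i];                   /* the total load is always feasible */
    for (int iter = 0; iter < 60; ++iter) {
        double mid = 0.5 * (lo + hi);
        if (feasible(w, n, nparts, mid))
            hi = mid;
        else
            lo = mid;
    }
    return hi;
}

int main(void)
{
    double w[] = { 3, 1, 4, 1, 5, 9, 2, 6 };   /* made-up work item weights */
    printf("bottleneck for 3 parts: %.3f\n",
           min_bottleneck(w, 8, 3));           /* ~14.000 */
    return 0;
}
```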

Best Theory Paper Award at ETAPS 2014

Theory and Tool Support for Computing Conditional Probabilities in Markovian Models

Published in ORCHESTRATION (RECENT ACHIEVEMENTS)


EATCS Best Theory Paper Award: Christel Baier, Joachim Klein, Sascha Klüppelholz and Steffen Märcker. Computing Conditional Probabilities in Markovian Models Efficiently. In Proceedings of the 20th International Conference on Tools and Algorithms for the Construction and Analysis of Systems (TACAS'14), volume 8413 of Lecture Notes in Computer Science, pages 515-530, Springer, 2014. (An extended version is available.)
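The paper's efficient computation scheme is not reproduced here. As background only, the quantity in question is the conditional probability of one path event given another in a Markov chain, defined by the standard quotient whenever the condition has positive probability; the generic notation below (path events φ and ψ, e.g. reachability objectives) is not taken from the paper.

```latex
% Conditional probability of path event \varphi given \psi in a Markov chain M
% with initial state s, defined whenever the condition has positive probability.
\Pr^{M}_{s}(\varphi \mid \psi)
  \;=\;
  \frac{\Pr^{M}_{s}(\varphi \wedge \psi)}{\Pr^{M}_{s}(\psi)},
  \qquad \text{provided } \Pr^{M}_{s}(\psi) > 0 .
```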

Tomahawk 2 back in the lab

Published in ORCHESTRATION (RECENT ACHIEVEMENTS)

The Tomahawk 2 chip, which had its tape-out in March, has come back to our labs. The chip is a heterogeneous multiprocessor system-on-chip (MPSoC). A dynamic task scheduling unit, the CoreManager, sits at its center and distributes the computational load across numerous processing elements (PEs). A special extension to the programming language C, namely taskC, has been developed to provide an easy-to-use interface that abstracts the distributed parallel computation from the available hardware. In taskC the programmer only defines tasks that can potentially run in parallel with each other, and therefore does not have to know about the underlying hardware. The CoreManager knows the chip's topology and can place generated tasks onto available PEs at runtime, even if the topology (e.g. the number of PEs) changes during execution. Furthermore, it can even change the topology on its own by scaling PE frequencies or turning off PEs completely to save power.
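The taskC syntax itself is not shown in the article. As a purely hypothetical plain-C analogue of the programming model described above (the programmer declares independent tasks, a runtime decides where they run), the sketch below submits tasks to a made-up runtime interface; the names cm_submit and cm_wait_all are invented for this illustration and are not the real taskC or CoreManager API.

```c
/* Hypothetical plain-C analogue of the taskC programming model: the programmer
 * only declares independent tasks; a runtime (on Tomahawk 2, the CoreManager)
 * decides at run-time which processing element executes each one. The calls
 * cm_submit() and cm_wait_all() are invented for this sketch and are NOT the
 * real taskC or CoreManager interface. */
#include <stddef.h>

typedef void (*task_fn)(void *arg);

/* Trivial serial stand-ins so the sketch compiles; a real runtime would
 * dispatch each task to a free processing element and wait for completion. */
static void cm_submit(task_fn fn, void *arg) { fn(arg); }
static void cm_wait_all(void) { }

struct block { float *data; size_t len; };

static void scale_block(void *arg)        /* one independent unit of work */
{
    struct block *b = arg;
    for (size_t i = 0; i < b->len; ++i)
        b->data[i] *= 2.0f;
}

void scale_all(struct block *blocks, size_t nblocks)
{
    /* The programmer only expresses the available parallelism; which (and how
     * many) PEs run these tasks is decided by the scheduler at run-time. */
    for (size_t i = 0; i < nblocks; ++i)
        cm_submit(scale_block, &blocks[i]);
    cm_wait_all();
}
```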

From the perspective of the Orchestration Path, the Tomahawk architecture can also be viewed from a different angle. The tiled architecture consists of a number of tiles, each hosting one processor, and a network-on-chip that connects every tile and provides a scalable, uniform interconnect across the whole chip. Since the set of tiles is heterogeneous, the platform is a good playground for exploring how to deal with heterogeneity in system integration while new technologies from the other paths are not yet available.