Tutorial: "Algorithmic specification, tools and algorithms for programming heterogeneous platforms"

Collocated with the 24th International Conference on Parallel Architectures and Compilation Techniques (PACT), 2015 

October 18, 2015. The Sir Francis Drake Hotel on Union Square in San Francisco, CA USA 

Presentation slides available


Prof. Dr. Jeronimo Castrillon, TU Dresden, Germany, jeronimo.castrillon@tu-dresden.de

1. Overview

This tutorial demonstrates tools and compilers for parallel systems originating from the embedded domain. It offers a balanced mixture of established tools, such as LabVIEW, recent and promising tools from a startup, such as Silexica, and an academic discussion of the underlying methods and algorithms for mapping to parallel, heterogeneous hardware. The tutorial touches on several aspects of interest to the PACT community, namely parallel programming languages, computational models, runtime systems, compilers and tools for parallel computing.


Today, multicore and manycore systems with complex architectural features are widespread in all computing devices. Heterogeneous processing elements, memories, and reconfigurable hardware make programming these machines a cumbersome task. Many abstractions, in the form of libraries, directives and language extensions, have been proposed in the last 10 years to cope with this complexity. A common problem of these abstractions is that they incur a performance penalty. This tutorial presents methods and tools that aim at improving the productivity of software development while, at the same time, trying to retain high performance. The technologies presented in this tutorial focus on ways of extracting, expressing and exploiting parallelism, based on well-defined Models of Computation, covering dataflow and process networks.

The first presentation introduces LabVIEW, a graphical system development framework that enables domain experts to efficiently develop applications for distributed heterogeneous platforms (including processors and FPGAs). The system is demonstrated with use cases from the areas of control and communications. The second and third presentations discuss industrial and academic tools and methods that aim at automatically generating efficient implementations for heterogeneous embedded multi-processor platforms from applications described at a level similar to that of LabVIEW. In particular, the second talk will describe commercial tools that help parallelize sequential C code and then obtain a mapping of the parallel tasks to the processors of a multi-core platform. The third talk will focus on algorithms to optimize dataflow networks for heterogeneous platforms. It will also discuss a HW/SW stack to handle heterogeneity at different layers, developed within the German cluster of excellence Cfaed (Center for Advancing Electronics Dresden).
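To make the process network terminology above more concrete, the following is a minimal sketch of a three-stage network in Python. It is an illustration only and is not taken from any of the tools presented in the tutorial: the process names (`source`, `scale`, `sink`), the `END` token and the `run_network` helper are all hypothetical. Each process runs in its own thread and communicates only through FIFO channels with blocking reads, which is what makes the result independent of how the threads are scheduled.

```python
import threading
from queue import Queue

END = object()  # hypothetical end-of-stream token, used only in this sketch


def source(out_ch, values):
    """Produce one token per input value, then signal end of stream."""
    for v in values:
        out_ch.put(v)
    out_ch.put(END)


def scale(in_ch, out_ch, factor):
    """Read tokens one by one (blocking), multiply them, forward them."""
    while True:
        token = in_ch.get()  # blocks until a token is available
        if token is END:
            out_ch.put(END)
            return
        out_ch.put(token * factor)


def sink(in_ch, results):
    """Collect all tokens until end of stream."""
    while True:
        token = in_ch.get()
        if token is END:
            return
        results.append(token)


def run_network(values, factor):
    """Wire source -> scale -> sink with two FIFO channels and run it."""
    a, b = Queue(), Queue()
    results = []
    procs = [
        threading.Thread(target=source, args=(a, values)),
        threading.Thread(target=scale, args=(a, b, factor)),
        threading.Thread(target=sink, args=(b, results)),
    ]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    return results


if __name__ == "__main__":
    print(run_network([1, 2, 3], 10))
```

Because every channel has a single writer and a single reader and reads block until data arrives, the three processes can be placed on different cores, or in principle on different kinds of processing elements, without changing the computed stream.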


Intended Audience
This tutorial is intended for people in academia and industry interested in established and future methods to obtain efficient implementations of algorithms on complex programmable hardware platforms. This includes (i) software developers working with graphical programming tools, such as LabVIEW, who are interested in distributed heterogeneous computing and (ii) programmers working with heterogeneous multi-cores who are interested in tools and methodologies to automatically generate code from high-level descriptions.


Half-day event with three presentations, including two tool demonstrations

2. Detailed content

Slot 1: Platform-based design of heterogeneous multiprocessor systems for real-time signal processing applications

Hugo A. Andrade, National Instruments, Berkeley, CA USA, hugo.andrade@ni.com

Heterogeneous multiprocessor architectures are becoming the de facto standard for embedded, real-time signal processing systems. Driven by pressures of cost, power, “performance,” and time to market, embedded system designers are adopting heterogeneous computing architectures to meet both the technical and business challenges. Unfortunately, the tools that exist to program these complex computing architectures are not ideal, given the heterogeneity of the underlying system. In fact, the typical approach is to use one tool flow for each processing target, which increases the complexity, the learning curve, and the system integration time, potentially extending project duration and increasing development costs. To address the challenges of designing and developing heterogeneous multiprocessing systems, National Instruments developed a unified design flow. LabVIEW Communications System Design Suite employs this flow, which combines the design language of the user's choice, a common interconnect methodology, and a scalable hardware platform to expedite the design and implementation of complex heterogeneous multiprocessing systems.

Hugo A. Andrade is a Principal Architect, Software Marketing, at National Instruments, where he has been employed since 1989. Hugo earned BS degrees in ECE and CS and an MS degree in ECE from the University of Texas at Austin. While at National Instruments, among other projects, he has led standardization efforts on instrument control software, and led the research and early development of LabVIEW FPGA. He has been a visiting industrial fellow at the University of California, Berkeley (2007), and was the founding manager and tech lead of the NI Berkeley R&D site (LabVIEW Advanced Product Development Group).  Currently he focuses on identifying, evaluating and researching new technologies and advanced tools for system level design and synthesis for heterogeneous platforms, with applications in CPS and IoT, and serves as liaison to academic and industrial research labs in the area. Hugo holds over 60 patents in the areas of instrumentation software, hardware/software interfacing, reconfigurable computing, graphical programming, models of computation, and system level design, and is a Ph.D. student in the Electrical and Computer Engineering department at the University of Texas at Austin, focusing his research on dependability aspects of heterogeneous, reconfigurable platforms.

About NI
NI provides powerful, flexible technology solutions that accelerate productivity and drive rapid innovation. From daily tasks to grand challenges, NI helps engineers and scientists overcome complexity to exceed even their own expectations. Customers in nearly every industry—from healthcare and automotive to consumer electronics and particle physics—use NI’s integrated hardware and software platform to improve our world.

Slot 2: Performance and power aware C code parallelization for embedded devices

Prof. Dr. Rainer Leupers, Silexica Software Solutions GmbH, Germany, leupers@silexica.com

Rainer Leupers received the M.Sc. (Dipl.-Inform.) and Ph.D. (Dr. rer. nat.) degrees in Computer Science with honors from the Technical University of Dortmund, Germany, in 1992 and 1997. From 1997 to 2001 he was the chief engineer at the Embedded Systems chair at TU Dortmund. In 2002, Dr. Leupers joined RWTH Aachen University as a professor for Software for Systems on Silicon. Since then, he has also been a visiting faculty member at the ALaRI Institute in Lugano. His research and teaching activities comprise software development tools, processor architectures, and system-level electronic design automation, with a focus on application-specific multicore systems. He has published numerous books and technical papers, and he has served on committees of leading international conferences, including DAC, DATE, and ICCAD. He was a co-chair of the MPSoC Forum and SCOPES. Dr. Leupers has received several scientific awards, including Best Paper Awards at DATE 2000, 2008 and DAC 2002, as well as multiple industry awards. He holds several patents on processor design automation technologies and is a co-founder of LISATek (now with Synopsys) and Silexica. He has served as a consultant for various companies, as an expert for the European Commission, and on the management boards of compound research projects like UMIC, TETRACOM, HiPEAC, and ARTIST.

About Silexica
Silexica provides IDE solutions for efficient multicore software development. The SLX tools product line, consisting of SLX Parallelizer, SLX Mapper, SLX Generator, and SLX Explorer, hides the complexity of the target hardware from the software programmer. SLX software development tools automate task mapping, starting either from legacy C code or from parallel streaming-oriented application models. Advanced code analysis and compilation technologies enable automated and safe migration of legacy software to multicores as well as efficient implementation of new parallel applications or system software. Code optimization techniques target high performance, high hardware utilization, and compliance with real-time constraints. Power and energy consumption can be minimized, too. SLX tools address a variety of embedded computing application domains, such as wireless communications and automotive. Silexica also provides services and OEM support in embedded multicore programming.


Slot 3: Dataflow programming for heterogeneous computing systems 

Prof. Dr. Jeronimo Castrillon, TU Dresden, Germany, jeronimo.castrillon@tu-dresden.de

This talk starts by introducing new types of heterogeneous systems and their challenges for hardware/software programming stacks. These systems are currently investigated in the context of the German cluster of excellence Cfaed – “Center for Advancing Electronics Dresden”. We will then look at dataflow modeling concepts, with emphasis on the dynamic models that are needed to express today’s changing workloads. Finally, the talk will introduce methods and algorithms for mapping sets of applications modeled in this way to heterogeneous systems. 
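As a rough illustration of what "mapping to heterogeneous systems" means, the sketch below assigns the actors of a small dataflow graph to processing elements of different types. It is a simplified greedy heuristic written for this page, not the algorithm presented in the talk. The function name `greedy_map`, the example actors (`src`, `fft`, `filt`, `sink`), the execution costs and the two processing elements (`cpu0`, `dsp0`) are all hypothetical, and communication costs are ignored.

```python
def greedy_map(actors, edges, pes):
    """Greedy mapping sketch (illustrative, ignores communication cost).

    actors: {actor: {pe_type: execution_cost}}
    edges:  [(producer, consumer)] forming an acyclic graph
    pes:    {pe_name: pe_type}
    Returns (mapping, makespan).
    """
    preds = {a: [] for a in actors}
    for s, d in edges:
        preds[d].append(s)

    # Topological order of the actors (Kahn's algorithm).
    indeg = {a: len(preds[a]) for a in actors}
    ready = sorted(a for a in actors if indeg[a] == 0)
    order = []
    while ready:
        a = ready.pop(0)
        order.append(a)
        for s, d in edges:
            if s == a:
                indeg[d] -= 1
                if indeg[d] == 0:
                    ready.append(d)

    pe_free = {p: 0 for p in pes}  # time at which each PE becomes idle
    finish, mapping = {}, {}
    for a in order:
        data_ready = max((finish[p] for p in preds[a]), default=0)
        best = None
        for pe, pe_type in pes.items():
            if pe_type not in actors[a]:
                continue  # actor has no implementation for this PE type
            end = max(data_ready, pe_free[pe]) + actors[a][pe_type]
            if best is None or end < best[0]:
                best = (end, pe)
        if best is None:
            raise ValueError(f"no compatible processing element for {a}")
        end, pe = best
        mapping[a], finish[a], pe_free[pe] = pe, end, end
    return mapping, max(finish.values())


# Hypothetical example: costs are made-up numbers in arbitrary time units.
example_actors = {
    "src": {"cpu": 2},
    "fft": {"cpu": 8, "dsp": 3},
    "filt": {"cpu": 4, "dsp": 5},
    "sink": {"cpu": 1},
}
example_edges = [("src", "fft"), ("src", "filt"), ("fft", "sink"), ("filt", "sink")]
example_pes = {"cpu0": "cpu", "dsp0": "dsp"}

example_mapping, example_makespan = greedy_map(example_actors, example_edges, example_pes)
print(example_mapping, example_makespan)
```

In this example the heuristic places `fft` on the DSP, where it is cheaper, and keeps `filt` on the CPU so that both can run in parallel. Real mapping flows typically also have to account for communication, memory and, for dynamic workloads, several applications sharing the platform.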

Jeronimo Castrillon received the Electronics Engineering degree with honors from the Pontificia Bolivariana University in Colombia in 2004, the master's degree from the ALaRI Institute in Switzerland in 2006 and the Ph.D. degree (Dr.-Ing.) in Electrical Engineering and Information Technology with honors from RWTH Aachen University in Germany in 2013. From early 2009 to April 2013 Dr. Castrillon was the chief engineer of the chair for Software for Systems on Silicon at RWTH Aachen University, where he had been a member of the research staff since late 2006. From April 2013 to April 2014 Dr. Castrillon was senior scientific staff at the same institution. In June 2014, Dr. Castrillon joined the department of computer science of TU Dresden as professor for compiler construction in the context of the German excellence cluster “Center for Advancing Electronics Dresden” (CfAED). His research interests lie in methodologies, languages, tools and algorithms for programming complex computing systems. Dr. Castrillon has several international publications and has served as program chair, track chair and technical program committee member in international workshops and conferences (e.g., DATE, FPL, ViPES, EUC, MCSoC and Rapido).

3. Schedule

14:00 - 14:30

Platform-based design of heterogeneous multiprocessor systems for real-time signal processing applications

Hugo A. Andrade, National Instruments, Berkeley, CA USA

14:30 - 15:00

Performance and power aware C code parallelization for embedded devices

Prof. Dr. Rainer Leupers, Silexica Software Solutions GmbH, Germany

15:00 - 15:30

Dataflow programming for heterogeneous computing systems

Prof. Dr. Jeronimo Castrillon, TU Dresden, Germany

15:30 - 16:00  Coffee break

16:00 - 16:45

Tool demonstration: Programming with LabVIEW

Hugo A. Andrade, National Instruments, Berkeley, CA USA

16:45 - 17:30

Tool demonstration: Multi-core programming with the Silexica Toolsuite

Miguel A. Aguilar, Silexica Software Solutions GmbH, Germany


4. About PACT

PACT 2015 is the 24th International Conference on Parallel Architectures and Compilation Techniques, taking place October 18-21 in San Francisco, CA, USA. PACT aims to bring together researchers from architecture, compilers, applications and languages to present and discuss innovative research of common interest.