• Workshop on DevOps Support for Cloud FPGA platforms (cFDevOps)

Organizers: Chris Kachris (inaccel), Christoph Hagleitner (IBM Research Europe), Christian Plessl (Paderborn Center for Parallel Computing), Dionysios Diamantopoulos (IBM Research Europe), Burkhard Ringlein (IBM Research Europe)


With the slowdown of Moore's law as we know it, the Cloud is resorting to heterogeneous, accelerated computing to satisfy the ever-increasing demand for performance and power efficiency. In just a few years, FPGAs have emerged as compute accelerators next to GPUs and are part of the standard offerings from many Cloud vendors. However, the development environment, deployment procedures, security measures, and monitoring tools are different for each platform and the portability of the FPGA kernel designs remains limited.

In this workshop, leading platform providers, designers of development environments, and developers will present the state of the art for Cloud FPGA platforms and explore opportunities and directions for future improvements from the developer's point of view. Instead of focusing on the performance and optimization of a specific application, the goal of this workshop is to highlight the challenges that a Cloud application developer faces when designing, implementing, deploying, scaling, and debugging Cloud services on Cloud FPGA platforms. A focus area for this year's edition is end-to-end toolchains, compilation, and debugging tools for heterogeneous platforms.

Link to Workshop Program: cFDevOps21

Date: 30.08.2021
Time: 2:30 PM - 6:30 PM (CEST)


  • Building Shared Libraries using oneAPI Toolkits for Intel FPGAs

Organizer: Michael Spinali (Intel)


This presentation will demonstrate how to use Intel oneAPI toolkits to offload compute functions to Intel FPGAs from third-party programming tools, using FPGA bitstreams that are compiled as a shared library. The oneAPI application implements a radix-2, complex, variable-length FFT and is compiled into a C-standard shared library. This standardized shared library was used in a LabVIEW program to offload FFT processing in a typical test and measurement application to various Intel FPGA-based platforms. For context, LabVIEW is a programming tool developed by NI that is used to interface with test and measurement (T&M) equipment, and the Fast Fourier Transform (FFT) is a computationally efficient algorithm commonly used in wireless T&M applications to analyze the frequency-domain representation of a discrete signal.
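As background on the algorithm named above, here is a minimal pure-Python sketch of a radix-2, complex FFT that accepts any power-of-two length. It is an illustration only and not the presenters' oneAPI implementation; the function name `fft_radix2` is hypothetical, and the recursive decimation-in-time formulation is an assumption chosen for brevity.

```python
import cmath


def fft_radix2(x):
    """Recursive radix-2 decimation-in-time FFT (illustrative sketch).

    Accepts a sequence of real or complex samples whose length is a
    power of two and returns a list of complex frequency bins.
    """
    n = len(x)
    # A power of two has exactly one bit set, so n & (n - 1) == 0.
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    if n == 1:
        return [complex(x[0])]

    # Split into even- and odd-indexed samples and transform each half.
    even = fft_radix2(x[0::2])
    odd = fft_radix2(x[1::2])

    out = [0j] * n
    for k in range(n // 2):
        # Twiddle factor e^{-2*pi*i*k/n} applied to the odd half.
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t            # butterfly: upper output
        out[k + n // 2] = even[k] - t   # butterfly: lower output
    return out
```

For instance, `fft_radix2([1, 0, 0, 0])` returns four bins that each equal 1, the flat spectrum of a unit impulse. An FPGA implementation would typically unroll these butterflies into a pipeline rather than recurse.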

Date: 30.08.2021
Time: 5:00 PM - 6:00 PM (CEST)


  • Workshop on Reconfigurable Computing for Machine Learning – RC4ML’2021

Organizers: Christos-Savvas Bouganis (Imperial College London), Theocharis Theocharides (University of Cyprus), Christos Kyrkou (University of Cyprus), Nele Mentens (Leiden University and KU Leuven), Marco Domenico Santambrogio (Politecnico di Milano)


Machine Learning (ML), and especially Deep Learning (DL), has gained significant visibility recently as an Artificial Intelligence (AI) paradigm, with success in a wide range of applications such as image and speech recognition, autonomous systems, self-driving cars, cyber-physical systems, and many more. The objective of RC4ML is to promote discussion and stimulate research and ideas on the potential role of reconfigurable hardware in this important and fast-evolving domain.


The event is built around a number of keynotes from experts in the field, invited by the organising committee, with the aim of capturing the latest developments in this fast-evolving area.

Program:

Prof. Wayne Luk, Imperial College London


Prof. Deming Chen, Univ of Illinois, Urbana-Champaign


Coffee break


Prof. Vanderlei Bonato, The University of Sao Paulo


Prof. Viktor Prasanna, University of Southern California


Prof. Stjepan Picek, TU Delft


Closing remarks

Date: 31.08.2021
Time: 2:30 PM - 6:30 PM (CEST)



  • Introduction to the Xilinx Kria SOM with PYNQ

Organizers: Cathal McCabe (Xilinx), Mario Ruiz (Xilinx)


The Kria portfolio of adaptive system-on-modules (SOMs) consists of production-ready, small-form-factor embedded boards that enable rapid deployment in edge-based applications. Coupled with a complete software stack and pre-built, production-grade accelerated applications, Kria adaptive SOMs are a new method of bringing adaptive computing to AI and software developers. This tutorial will introduce the new Xilinx Kria portfolio and the Kria Vision kit development platform and demonstrate how it can be used with PYNQ, an open-source Python and Jupyter framework for Xilinx platforms.

Date: 30.08.2021
Time: 2:30 PM - 4:30 PM (CEST)


  • Software-Defined Hardware: Digital Design in the 21st Century with Chisel

Organizer: Martin Schoeberl (Technical University of Denmark)


To develop future, more complex digital circuits in less time, we need a better hardware description language than VHDL or Verilog. Chisel is a hardware construction language intended to speed up the development of digital hardware and hardware generators.

Chisel is implemented as a domain-specific language in Scala. Therefore, the full power of a modern programming language is available to describe hardware and, more importantly, hardware generators. Chisel was developed at UC Berkeley and has been used successfully for several RISC-V tape-outs by UC Berkeley students and for a tensor processing unit chip by Google. Here at the Technical University of Denmark, we use Chisel in the T-CREST project and in teaching digital electronics and advanced computer architecture.

In this tutorial we will give an overview of how to describe circuits in Chisel, show how to use the Chisel tester functionality to test and simulate digital circuits, present how to synthesize circuits for an FPGA, and introduce advanced Chisel functionality for describing circuit generators.

The aim of the course is for participants to gain a basic understanding of a modern hardware description language and to be able to describe simple circuits in Chisel. The course provides a basis for exploring more advanced concepts of circuit generators written in Chisel/Scala. The intended audience is hardware designers with some background in VHDL or Verilog, but Chisel is also a good entry language for software programmers entering hardware design (e.g., porting software algorithms to FPGAs for speedup).

Date: 31.08.2021
Time: 2:30 PM - 8:30 PM (CEST)


  • AI Optimized Intel® Stratix® 10 NX FPGA

Organizers: Eriko Nurvitadhi (Intel), Rama Venkata (Intel), Andrew M Boutros (Intel), Tim Vanderhoek (Intel)


The Intel® Stratix® 10 NX FPGA is Intel’s first AI-optimized FPGA. It introduces a new type of AI-optimized tensor arithmetic block called the AI Tensor Block and is designed for high-bandwidth, low-latency artificial intelligence (AI) applications. The Intel® Stratix® 10 NX FPGA delivers accelerated AI compute solutions with up to 143 INT8 TOPS at ~1 TOPS/W, in-package 3D-stacked HBM2 high-bandwidth DRAM, and up to 57.8G PAM4 transceivers. In this tutorial, we will first provide an overview of the Intel Stratix 10 NX FPGA, followed by example designs such as a text-to-speech application and a PE array design. We will also present an application evaluation and a comparison against GPUs. Using an approach such as the soft AI processor overlay we developed in our recently published research [FPT’20], we will show how the Intel Stratix 10 NX FPGA can be programmed purely in software to deliver excellent performance in real-time AI workloads.


      • Part 1: Introduction
      • Part 2: Stratix 10 NX FPGA
        • Overview & Platforms
        • Text-to-Speech Application Study
      • Part 3: PE Array Example Design for Stratix 10 NX
      • Part 4: AI Soft Processor on Stratix 10 NX
        • Intro and Motivation
        • Optimized AI Soft Processor for Stratix 10 NX
      • Part 5: Demo/Lab

Date: 31.08.2021
Time: 3:30 PM - 6:00 PM (CEST)