Domain-specific languages (DSLs) and tools

Domain-specific languages (DSLs) provide first-class domain-specific concepts and programming models, typically at the cost of the full computational power of general-purpose languages (GPLs). As a result, DSL programs are typically much more declarative than programs written in a GPL: they focus on the “what” of the problem to be solved rather than on the exact “how” in the form of a specific algorithm. Examples of DSLs in widespread use are the Structured Query Language (SQL), HTML5, and Modelica, a language for object-oriented physical modelling.
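The difference between the “what” and the “how” can be illustrated with SQL, one of the DSLs named above. The following is a minimal sketch, not taken from any of our projects: it answers the same question once with an explicit loop in Python and once with a declarative query run through Python's standard sqlite3 module. The table name, the column names and the sample data are hypothetical.

```python
import sqlite3

# Hypothetical sample data: (sensor name, temperature in degrees Celsius).
readings = [("a", 21.5), ("b", 19.0), ("c", 23.1), ("d", 18.2)]


def warm_imperative(rows, threshold):
    """Spell out the 'how': loop over the rows, filter, then sort."""
    result = []
    for name, temp in rows:
        if temp > threshold:
            result.append(name)
    result.sort()
    return result


def warm_declarative(rows, threshold):
    """State the 'what' in SQL; the engine picks the evaluation strategy."""
    con = sqlite3.connect(":memory:")
    try:
        con.execute("CREATE TABLE readings (name TEXT, temp REAL)")
        con.executemany("INSERT INTO readings VALUES (?, ?)", rows)
        cur = con.execute(
            "SELECT name FROM readings WHERE temp > ? ORDER BY name",
            (threshold,),
        )
        return [row[0] for row in cur]
    finally:
        con.close()
```

Both functions return the same names, but only the imperative version fixes the order in which rows are visited and compared. The SQL version leaves that choice to the query engine.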

Domain-specific languages generally offer two key benefits. First, domain experts can program in a suitable DSL without being trained in general programming and computer architecture. DSLs thus lower the entry barrier to programming for people not educated in computer science or a related subject. Second, the restriction to a particular domain makes highly specialized and efficient software tooling possible. For instance, compilers can automatically perform optimizations on DSL code that are not possible in GPLs: since additional information on the problem domain is available, search spaces may be pruned, and knowledge of the possible operations, their interactions, and their impact on data structures enables corresponding optimizations.
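As a rough illustration of the second point, consider a hypothetical matrix-expression DSL. Because its compiler knows what "transpose" and "identity" mean, it can apply algebraic rewrites that a GPL compiler, which only sees opaque function calls, generally cannot. The sketch below is a toy under these assumptions; all class and function names are invented for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Identity:
    pass


@dataclass(frozen=True)
class Transpose:
    arg: object


@dataclass(frozen=True)
class MatMul:
    lhs: object
    rhs: object


def simplify(expr):
    """Apply domain rules bottom-up: (A^T)^T = A, I^T = I, I*A = A, A*I = A."""
    if isinstance(expr, Transpose):
        inner = simplify(expr.arg)
        if isinstance(inner, Transpose):
            return inner.arg  # double transpose cancels
        if isinstance(inner, Identity):
            return inner  # the identity matrix is symmetric
        return Transpose(inner)
    if isinstance(expr, MatMul):
        lhs, rhs = simplify(expr.lhs), simplify(expr.rhs)
        if isinstance(lhs, Identity):
            return rhs
        if isinstance(rhs, Identity):
            return lhs
        return MatMul(lhs, rhs)
    return expr  # variables and the identity are already in simplest form
```

With these rules, the expression I * (A^T)^T collapses to A before any code is generated, so the multiplication and both transposes are never executed.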

At the Chair for Compiler Construction, we aim to leverage the additional domain-specific information provided by DSLs for new domain-specific optimizations. We further want to improve the DSL programming experience so that domain experts can get the best from their code with the least possible effort. Together with our project partners, we do DSL-related research in multiple domains from the areas of embedded computing as well as cloud and scientific computing. Some of these projects are discussed below.

Parallel Particle Mesh Environment

The "Parallel Particle Mesh Environment (PPME)" is a domain-specific language and development environment for implementing scientific simulations based on particle and mesh methods. It generates Fortran code that links with the PPM library, which is developed by the MOSAIC group. PPM already includes a preprocessor-based language to ease writing clients to the library. However, this language does not provide any syntactic checking, optimisations, editing or debugging capabilities.

To address these problems, the main objectives of PPME are:
•   providing a high-level and simple-to-use frontend for PPM client applications
•   avoiding the need for explicit parallelisation code
•   providing editing capabilities that guide the development, e.g., by checking constraints
•   using domain-specific knowledge to do advanced optimisations, e.g., to reduce the simulation time or to increase precision
•   providing sufficient extensibility to support different target languages and new features

An open source prototype of PPME is hosted at Bitbucket. It uses JetBrains' "Meta Programming System (MPS)" as its implementation platform. With MPS, we can support a very high level of specification and abstraction that comes close to natural mathematical notation. Current IDE features include static type checking and name resolution, automatic code generation, support for physical units, and automatic precision analysis of equations. The screenshot below gives a first impression of the features of PPME; it shows the implementation of a Gray-Scott reaction-diffusion system.
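For orientation, the Gray-Scott reaction-diffusion system mentioned above is commonly written as the following pair of coupled equations for two concentration fields. The symbol names (u, v, D_u, D_v, F, k) follow a widespread textbook convention and are an assumption here; the notation used in the PPME implementation may differ.

```latex
\begin{align}
  \frac{\partial u}{\partial t} &= D_u \nabla^2 u - u v^2 + F\,(1 - u), \\
  \frac{\partial v}{\partial t} &= D_v \nabla^2 v + u v^2 - (F + k)\,v .
\end{align}
```

Here D_u and D_v are the diffusion constants, F is the feed rate and k is the kill rate. A notation close to this mathematical form is what a particle-mesh DSL aims to let domain experts write directly.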

References

Sven Karol, Pietro Incardona, Yaser Afshar, Ivo Sbalzarini, Jeronimo Castrillon, "Towards a Next-Generation Parallel Particle-Mesh Language", Proceedings of the 3rd Workshop on Domain-Specific Language Design and Implementation (DSLDI), pp. 15–18, Jul 2015.
Sven Karol, Tobias Nett, Pietro Incardona, Nesrine Khouzami, Jeronimo Castrillon, Ivo F. Sbalzarini, "A Language and Development Environment for Parallel Particle Methods", Proceedings of the 5th International Conference on Particle-based Methods. Fundamentals and Applications PARTICLES 2017 (P. Wriggers and M. Bischoff and E. Oñate and D.R.J. Owen and T. Zohdi), Sep 2017.
Tobias Nett, Sven Karol, Jeronimo Castrillon, Ivo F. Sbalzarini, "A Domain-Specific Language and Editor for Parallel Particle Methods", In ACM Transactions on Mathematical Software (TOMS), pp. 33, 2018.

Tensors in Computational Fluid Dynamics

Numerical methods have tremendously accelerated the pace of science and engineering. The field of fluid dynamics, in particular, relies heavily on numerical and computational methods and has a wide range of applications, e.g., weather forecasting, climate simulation, and vehicle and aircraft design. While modern computers have enabled simulations of fluid flows with unprecedented precision and performance, the increasing complexity of parallel and heterogeneous architectures makes it difficult for numerical scientists and practitioners to use hardware platforms to their full potential.

This makes the domain of computational fluid dynamics (CFD) an ideal target for deploying DSLs. In a DSL, domain experts can express their problems at the right level of abstraction in a natural and concise way, leading to increased productivity. Moreover, a DSL compiler can exploit domain-specific knowledge to automatically generate highly efficient code for parallel or heterogeneous platforms from abstract expressions in the DSL.

At the Chair for Compiler Construction, we have developed the CFDlang DSL, aimed at fluid simulations that rely heavily on tensor expressions. CFDlang is easy to use and integrates well with existing numerical codes written in Fortran or C/C++. The code generated by the CFDlang compiler performs as well as, and often better than, optimized code that has been hand-tuned by a numerical expert in a laborious and time-consuming process.
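To give an idea of the kind of optimization that tensor expressions make possible, the sketch below evaluates a small contraction of the form v[i][j] = sum over l and m of A[i][l] * B[j][m] * u[l][m] in two ways: directly, and by factorizing it into two smaller contractions through a temporary. This is a plain Python illustration under simplifying assumptions (square n-by-n operands, hypothetical function names). It shows neither CFDlang syntax nor the algorithm the CFDlang compiler actually uses.

```python
def contract_naive(A, B, u):
    """Evaluate v[i][j] = sum_{l,m} A[i][l] * B[j][m] * u[l][m] directly."""
    n = len(u)
    v = [[0] * n for _ in range(n)]
    mults = 0  # number of scalar multiplications performed
    for i in range(n):
        for j in range(n):
            for l in range(n):
                for m in range(n):
                    v[i][j] += A[i][l] * B[j][m] * u[l][m]
                    mults += 2
    return v, mults


def contract_factorized(A, B, u):
    """Same result via t[i][m] = sum_l A[i][l] * u[l][m], then v = t * B^T."""
    n = len(u)
    t = [[0] * n for _ in range(n)]
    v = [[0] * n for _ in range(n)]
    mults = 0
    for i in range(n):
        for m in range(n):
            for l in range(n):
                t[i][m] += A[i][l] * u[l][m]
                mults += 1
    for i in range(n):
        for j in range(n):
            for m in range(n):
                v[i][j] += B[j][m] * t[i][m]
                mults += 1
    return v, mults
```

The direct version needs on the order of n^4 multiplications, the factorized one on the order of n^3. A compiler that understands the expression as a tensor contraction can pick the cheaper evaluation order automatically, whereas a GPL compiler only sees nested loops.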

In collaboration with researchers at MINES ParisTech and ENS Paris, we have also studied methods that are generally applicable to the problem of optimizing tensor-processing codes from different application domains. This has led us to the development of TeML, the Tensor optimizations Meta-Language. TeML has been shown to be capable of producing better optimization results than state-of-the-art tools such as Pluto.


Figure: (a) Simulated fluid flow (1000s of CPU hours). (b) Performance, more is better.

References

Adilla Susungi, Norman A. Rink, Jeronimo Castrillon, Immo Huismann, Albert Cohen, Claude Tadonki, Jörg Stiller, Jochen Fröhlich, "Towards Compositional and Generative Tensor Optimizations", Proceedings of the 16th ACM SIGPLAN International Conference on Generative Programming: Concepts and Experiences (GPCE 17), ACM, pp. 169-175, Oct 2017.
Norman A. Rink, Immo Huismann, Adilla Susungi, Jeronimo Castrillon, Jörg Stiller, Jochen Fröhlich, Claude Tadonki, "CFDlang: High-level code generation for high-order methods in fluid dynamics", Proceedings of the 3rd International Workshop on Real World Domain Specific Languages (RWDSL 2018), ACM, pp. 5:1-5:10, Feb 2018.
Til Jasper Ullrich, "Detection and exploitation of data-parallelism in assignments of multi-dimensional tensors", Bachelor's thesis, TU Dresden, Dresden, Germany, Aug 2018.
Adilla Susungi, Norman A. Rink, Albert Cohen, Jeronimo Castrillon, Claude Tadonki, "Meta-programming for cross-domain tensor optimizations", Proceedings of the 17th ACM SIGPLAN International Conference on Generative Programming: Concepts and Experiences (GPCE 18), ACM, pp. 79–92, Nov 2018.

MLIR-based DSL compilers

MLIR (Multi-Level Intermediate Representation) is a flexible, extensible framework designed to represent and transform code across different levels of abstraction within the compiler stack. Developed as part of the LLVM project, it provides a unified infrastructure for defining custom operations, types, and transformations that target various domains such as machine learning (torch-mlir), high-performance computing (GPU dialect), and hardware synthesis (CIRCT). MLIR achieves this by enabling developers to define dialects, each representing a distinct set of operations tailored for specific applications or hardware architectures.

The MLIR infrastructure supports structured transformations and optimizations across dialects, facilitating seamless lowering from high-level abstractions down to low-level machine code. This layered approach simplifies the development of compilers and domain-specific tools, making MLIR a powerful choice for complex compilation workflows, particularly those involving custom hardware or heterogeneous systems.
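The idea of lowering between dialects can be mimicked in a few lines of Python. The toy below is only a conceptual sketch and does not use the MLIR API: the two dialects "hi" and "lo", the operation names and the Op class are all invented for illustration. A single pass rewrites each high-level "hi.axpy" operation (a * x + y) into two lower-level operations.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Op:
    name: str  # "<dialect>.<operation>", e.g. "hi.axpy" (hypothetical)
    result: str  # name of the value this operation defines
    operands: tuple  # names of the values it consumes


def lower_hi_to_lo(ops):
    """Rewrite every 'hi.axpy' into 'lo.mul' followed by 'lo.add'."""
    lowered = []
    for op in ops:
        if op.name == "hi.axpy":
            a, x, y = op.operands
            tmp = f"{op.result}_tmp"  # fresh temporary for the product
            lowered.append(Op("lo.mul", tmp, (a, x)))
            lowered.append(Op("lo.add", op.result, (tmp, y)))
        else:
            lowered.append(op)  # operations from other dialects pass through
    return lowered


def dialects(ops):
    """Return the set of dialect prefixes used in a list of operations."""
    return {op.name.split(".")[0] for op in ops}
```

After the pass, a program that consisted only of "hi" operations contains only "lo" operations. In a real multi-level compiler, several such passes are chained until only operations of the target level remain.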

At the Chair for Compiler Construction, we use MLIR in several areas, including:

•   Numeric representation: base2 supports two numeric types, fixed-point and IEEE 754 floating-point numbers, along with arithmetic operations on them.
•   Data-flow graph representation: dfg-mlir is designed to describe Kahn process networks (KPNs) in MLIR. Nodes (processes/operators) and edges (channels) can be instantiated in graphs (regions). Several backends are supported: OpenMP for parallel execution on CPUs, a backend targeting Alveo FPGA (field-programmable gate array) cards with the help of Olympus and Bambu HLS, and a general FPGA backend based on CIRCT (soon to be deprecated).
•   CIM and CNM (compute in/near memory) technologies: TBD
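For readers unfamiliar with Kahn process networks, the following sketch shows the execution model in plain Python: processes run concurrently and communicate only through unbounded FIFO channels with blocking reads. It is a conceptual illustration using the standard queue and threading modules. It is unrelated to the dfg-mlir syntax and its backends, and all function names are hypothetical.

```python
import queue
import threading

DONE = object()  # sentinel that marks the end of a stream


def producer(out, values):
    """Source process: write all values to its output channel."""
    for value in values:
        out.put(value)
    out.put(DONE)


def scale(inp, out, factor):
    """Operator process: read one token at a time, write the scaled token."""
    while True:
        value = inp.get()  # blocking read, as required by the KPN model
        if value is DONE:
            out.put(DONE)
            return
        out.put(value * factor)


def run_network(values, factor):
    """Connect producer -> scale via FIFO channels and collect the output."""
    a, b = queue.Queue(), queue.Queue()  # unbounded FIFO channels
    threads = [
        threading.Thread(target=producer, args=(a, values)),
        threading.Thread(target=scale, args=(a, b, factor)),
    ]
    for t in threads:
        t.start()
    result = []
    while True:
        value = b.get()
        if value is DONE:
            break
        result.append(value)
    for t in threads:
        t.join()
    return result
```

Because each process only performs blocking reads on its inputs, the output stream does not depend on how the threads are scheduled. This determinism is what makes the model attractive for mapping the same graph to CPUs or FPGAs.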

References

Karl F. A. Friebel, Jiahong Bi, Jeronimo Castrillon, "BASE2: An IR for Binary Numeral Types", Proceedings of the 13th International Symposium on Highly-Efficient Accelerators and Reconfigurable Technologies (HEART 2023), Association for Computing Machinery, pp. 19–26, New York, NY, USA, Jun 2023.
Jiahong Bi, "A Lowering for High-Level Data Flows to Reconfigurable Hardware", Master's thesis, TU Dresden, Dec 2023.
Christian Pilato, Subhadeep Banik, Jakub Beránek, Fabien Brocheton, Jeronimo Castrillon, Riccardo Cevasco, Radim Cmar, Serena Curzel, Fabrizio Ferrandi, Karl F. A. Friebel, Antonella Galizia, Matteo Grasso, Paulo Silva, Jan Martinovic, Gianluca Palermo, Michele Paolino, Andrea Parodi, Antonio Parodi, Fabio Pintus, Raphael Polig, David Poulet, Francesco Regazzoni, Burkhard Ringlein, Roberto Rocco, Katerina Slaninova, Tom Slooff, Stephanie Soldavini, Felix Suchert, Mattia Tibaldi, Beat Weiss, Christoph Hagleitner, "A System Development Kit for Big Data Applications on FPGA-based Clusters: The EVEREST Approach", Proceedings of the 2024 Design, Automation and Test in Europe Conference (DATE), 6pp, Mar 2024.
Francesca Palumbo, Maria Katiuscia Zedda, Tiziana Fanni, Alessandra Bagnato, Luca Castello, Jeronimo Castrillon, Roberto Del Ponte, Yansha Deng, Bart Driessen, Mauro Fadda, Tristan Halna du Fretay, Julio de Oliveira Filho, Veena Rao, Francesco Regazzoni, Alfonso Rodríguez, Melanie Schranz, Giulia Sedda, "MYRTUS: Multi-layer 360 dYnamic orchestration and interopeRable design environmenT for compute-continUum Systems", Proceedings of the 21st ACM International Conference on Computing Frontiers (CF'24), Association for Computing Machinery (ACM), pp. 101–106, New York, NY, USA, May 2024.
Stephanie Soldavini, Felix Suchert, Serena Curzel, Michele Fiorito, Karl Friedrich Alexander Friebel, Fabrizio Ferrandi, Radim Cmar, Jeronimo Castrillon, Christian Pilato, "Etna: MLIR-Based System-Level Design and Optimization for Transparent Application Execution on CPU-FPGA Nodes", Proceedings of the 32nd IEEE International Symposium On Field-Programmable Custom Computing Machines (FCCM) (extended abstract), 1pp, May 2024.
Hamid Farzaneh, João Paulo Cardoso de Lima, Mengyuan Li, Asif Ali Khan, Xiaobo Sharon Hu, Jeronimo Castrillon, "C4CAM: A Compiler for CAM-based In-memory Accelerators", Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS'24), Volume 3, Association for Computing Machinery, pp. 164–177, New York, NY, USA, May 2024.
Jiahong Bi, Guilherme Korol, Jeronimo Castrillon, "Leveraging the MLIR infrastructure for the computing continuum", Proceedings of the CPS Workshop 2024, Sep 2024.
Asif Ali Khan, Hamid Farzaneh, Karl F. A. Friebel, Clément Fournier, Lorenzo Chelini, Jeronimo Castrillon, "CINM (Cinnamon): A Compilation Infrastructure for Heterogeneous Compute In-Memory and Compute Near-Memory Paradigms" (to appear), Proceedings of the 29th ACM International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS'25), Association for Computing Machinery, Mar 2025.
