Dr. Norman Rink

Dr. Norman Rink (2014 - 2020)
E-mail: norman.rink@tu-dresden.de

Publications

2020
Asif Ali Khan, Norman A. Rink, Fazal Hameed, Jeronimo Castrillon, "Optimizing Tensor Contractions for Embedded Devices with Racetrack and DRAM Memories", In ACM Transactions on Embedded Computing Systems (TECS), Association for Computing Machinery, vol. 19, no. 6, New York, NY, USA, Sep 2020.

Abstract
Tensor contraction is a fundamental operation in many algorithms with a plethora of applications ranging from quantum chemistry over fluid dynamics and image processing to machine learning. The performance of tensor computations critically depends on the efficient utilization of on-chip/off-chip memories. In the context of low-power embedded devices, efficient management of the memory space becomes even more crucial, in order to meet energy constraints. This work aims at investigating strategies for performance- and energy-efficient tensor contractions on embedded systems, using racetrack memory (RTM)-based scratch-pad memory (SPM) and DRAM-based off-chip memory. Compiler optimizations such as the loop access order and data layout transformations paired with architectural optimizations such as prefetching and preshifting are employed to reduce the shifting overhead in RTMs. Optimizations for off-chip memory such as memory access order, data mapping and the choice of a suitable memory access granularity are employed to reduce the contention in the off-chip memory. Experimental results demonstrate that the proposed optimizations improve the SPM performance and energy consumption by 32% and 73% respectively compared to an iso-capacity SRAM. The overall DRAM dynamic energy consumption improvements due to memory optimizations amount to 80%.
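As a rough illustration of why the loop access order mentioned in the abstract matters for racetrack memories, the following Python sketch counts port shifts for the reads of one operand of a matrix product, the simplest tensor contraction, under two loop orders. The single-track, single-port memory model, the row-major layout, and the name shift_count are assumptions made for this example; they are not taken from the paper.

```python
from itertools import product


def shift_count(n, loop_order):
    """Count shifts needed to read B in C[i][j] += A[i][k] * B[k][j].

    Toy model (an assumption, not the paper's simulator): B is an n x n
    matrix stored row-major on a single racetrack with one access port,
    and moving the port from cell p to cell q costs |q - p| shifts.
    loop_order is a permutation of "ijk", outermost loop first.
    """
    if sorted(loop_order) != ["i", "j", "k"]:
        raise ValueError("loop_order must be a permutation of 'ijk'")

    position = 0  # cell currently aligned with the access port
    shifts = 0
    # itertools.product varies the last index fastest, so the last letter
    # of loop_order plays the role of the innermost loop.
    for values in product(range(n), repeat=3):
        idx = dict(zip(loop_order, values))
        address = idx["k"] * n + idx["j"]  # row-major address of B[k][j]
        shifts += abs(address - position)
        position = address
    return shifts


if __name__ == "__main__":
    for order in ("ijk", "ikj"):
        print(order, shift_count(4, order))
```

In this toy model, shift_count(4, "ikj") gives 105 shifts and shift_count(4, "ijk") gives 369: with j innermost, B is read at consecutive addresses, whereas with k innermost every access jumps by a full row.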

Bibtex

@Article{khan_tecs20,
author = {Asif Ali Khan and Norman A. Rink and Fazal Hameed and Jeronimo Castrillon},
title = {Optimizing Tensor Contractions for Embedded Devices with Racetrack and DRAM Memories},
journal = {ACM Transactions on Embedded Computing Systems (TECS)},
year = {2020},
month = sep,
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
volume = {19},
number = {6},
issn = {1539-9087},
url = {https://doi.org/10.1145/3396235},
doi = {10.1145/3396235},
articleno = {44},
numpages = {26},
abstract = {Tensor contraction is a fundamental operation in many algorithms with a plethora of applications ranging from quantum chemistry over fluid dynamics and image processing to machine learning. The performance of tensor computations critically depends on the efficient utilization of on-chip/off-chip memories. In the context of low-power embedded devices, efficient management of the memory space becomes even more crucial, in order to meet energy constraints. This work aims at investigating strategies for performance- and energy-efficient tensor contractions on embedded systems, using racetrack memory (RTM)-based scratch-pad memory (SPM) and DRAM-based off-chip memory. Compiler optimizations such as the loop access order and data layout transformations paired with architectural optimizations such as prefetching and preshifting are employed to reduce the shifting overhead in RTMs. Optimizations for off-chip memory such as memory access order, data mapping and the choice of a suitable memory access granularity are employed to reduce the contention in the off-chip memory. Experimental results demonstrate that the proposed optimizations improve the SPM performance and energy consumption by 32\% and 73\% respectively compared to an iso-capacity SRAM. The overall DRAM dynamic energy consumption improvements due to memory optimizations amount to 80\%.},
}

Downloads

2009_Khan_TECS [PDF]

Related Paths
Orchestration Path

Permalink

https://cfaed.tu-dresden.de/publications?pubId=2649

2019
Sebastian Ertel, Justus Adam, Norman A. Rink, Andrés Goens, Jeronimo Castrillon, "STCLang: State Thread Composition as a Foundation for Monadic Dataflow Parallelism", Proceedings of the 12th ACM SIGPLAN International Symposium on Haskell, ACM, pp. 146–161, New York, NY, USA, Aug 2019.

Abstract
Dataflow execution models are used to build highly scalable parallel systems. A programming model that targets parallel dataflow execution must answer the following question: How can parallelism between two dependent nodes in a dataflow graph be exploited? This is difficult when the dataflow language or programming model is implemented by a monad, as is common in the functional community, since expressing dependence between nodes by a monadic bind suggests sequential execution.
Even in monadic constructs that explicitly separate state from computation, problems arise due to the need to reason about opaquely defined state. Specifically, when abstractions of the chosen programming model do not enable adequate reasoning about state, it is difficult to detect parallelism between composed stateful computations.
In this paper, we propose a programming model that enables the composition of stateful computations and still exposes opportunities for parallelization. We also introduce smap, a higher-order function that can exploit parallelism in stateful computations. We present an implementation of our programming model and smap in Haskell and show that basic concepts from functional reactive programming can be built on top of our programming model with little effort. We compare these implementations to a state-of-the-art approach using monad-par and LVars to expose parallelism explicitly and reach the same level of performance, showing that our programming model successfully extracts parallelism that is present in an algorithm. Further evaluation shows that smap is expressive enough to implement parallel reductions and our programming model resolves short-comings of the stream-based programming model for current state-of-the-art big data processing systems.

Bibtex

@InProceedings{ertel_haskell19,
author = {Ertel, Sebastian and Adam, Justus and Rink, Norman A. and Goens, Andr{\'e}s and Castrillon, Jeronimo},
title = {{STCLang}: State Thread Composition as a Foundation for Monadic Dataflow Parallelism},
booktitle = {Proceedings of the 12th ACM SIGPLAN International Symposium on Haskell},
year = {2019},
series = {Haskell 2019},
pages = {146--161},
address = {New York, NY, USA},
month = aug,
publisher = {ACM},
abstract = {Dataflow execution models are used to build highly scalable parallel systems. A programming model that targets parallel dataflow execution must answer the following question: How can parallelism between two dependent nodes in a dataflow graph be exploited? This is difficult when the dataflow language or programming model is implemented by a monad, as is common in the functional community, since expressing dependence between nodes by a monadic bind suggests sequential execution.
Even in monadic constructs that explicitly separate state from computation, problems arise due to the need to reason about opaquely defined state. Specifically, when abstractions of the chosen programming model do not enable adequate reasoning about state, it is difficult to detect parallelism between composed stateful computations.
In this paper, we propose a programming model that enables the composition of stateful computations and still exposes opportunities for parallelization. We also introduce smap, a higher-order function that can exploit parallelism in stateful computations. We present an implementation of our programming model and smap in Haskell and show that basic concepts from functional reactive programming can be built on top of our programming model with little effort. We compare these implementations to a state-of-the-art approach using monad-par and LVars to expose parallelism explicitly and reach the same level of performance, showing that our programming model successfully extracts parallelism that is present in an algorithm. Further evaluation shows that smap is expressive enough to implement parallel reductions and our programming model resolves short-comings of the stream-based programming model for current state-of-the-art big data processing systems.},
acmid = {3342600},
doi = {10.1145/3331545.3342600},
isbn = {978-1-4503-6813-1},
keywords = {conf},
location = {Berlin, Germany},
numpages = {16},
url = {http://doi.acm.org/10.1145/3331545.3342600}
}

Downloads

1908_Ertel_Haskell [PDF]

Related Paths
HAEC, Orchestration Path

Permalink

https://cfaed.tu-dresden.de/publications?pubId=2476

Asif Ali Khan, Norman A. Rink, Fazal Hameed, Jeronimo Castrillon, "Optimizing Tensor Contractions for Embedded Devices with Racetrack Memory Scratch-Pads", Proceedings of the 20th ACM SIGPLAN/SIGBED International Conference on Languages, Compilers, Tools and Theory of Embedded Systems (LCTES), ACM, pp. 5–18, New York, NY, USA, Jun 2019.

Abstract
Tensor contraction is a fundamental operation in many algorithms with a plethora of applications ranging from quantum chemistry over fluid dynamics and image processing to machine learning. The performance of tensor computations critically depends on the efficient utilization of on-chip memories. In the context of low-power embedded devices, efficient management of the memory space becomes even more crucial, in order to meet energy constraints. This work aims at investigating strategies for performance- and energy-efficient tensor contractions on embedded systems, using racetrack memory (RTM)-based scratch-pad memory (SPM). Compiler optimizations such as the loop access order and data layout transformations paired with architectural optimizations such as prefetching and preshifting are employed to reduce the shifting overhead in RTMs. Experimental results demonstrate that the proposed optimizations improve the SPM performance and energy consumption by 24% and 74% respectively compared to an iso-capacity SRAM.

Bibtex

@InProceedings{kahn_lctes19,
author = {Asif Ali Khan and Norman A. Rink and Fazal Hameed and Jeronimo Castrillon},
title = {Optimizing Tensor Contractions for Embedded Devices with Racetrack Memory Scratch-Pads},
booktitle = {Proceedings of the 20th ACM SIGPLAN/SIGBED International Conference on Languages, Compilers, Tools and Theory of Embedded Systems (LCTES)},
series = {LCTES 2019},
pages = {5--18},
numpages = {14},
isbn = {978-1-4503-6724-0/19/06},
doi = {10.1145/3316482.3326351},
url = {http://doi.acm.org/10.1145/3316482.3326351},
acmid = {3326351},
year = {2019},
month = jun,
location = {Phoenix, AZ, USA},
publisher = {ACM},
address = {New York, NY, USA},
abstract = {Tensor contraction is a fundamental operation in many algorithms with a plethora of applications ranging from quantum chemistry over fluid dynamics and image processing to machine learning. The performance of tensor computations critically depends on the efficient utilization of on-chip memories. In the context of low-power embedded devices, efficient management of the memory space becomes even more crucial, in order to meet energy constraints. This work aims at investigating strategies for performance- and energy-efficient tensor contractions on embedded systems, using racetrack memory (RTM)-based scratch-pad memory (SPM). Compiler optimizations such as the loop access order and data layout transformations paired with architectural optimizations such as prefetching and preshifting are employed to reduce the shifting overhead in RTMs. Experimental results demonstrate that the proposed optimizations improve the SPM performance and energy consumption by 24\% and 74\% respectively compared to an iso-capacity SRAM.},
}

Downloads

1906_Khan_LCTES [PDF]

Related Paths
Orchestration Path

Permalink

https://cfaed.tu-dresden.de/publications?pubId=2419

Norman A. Rink, Jeronimo Castrillon, "TeIL: a type-safe imperative Tensor Intermediate Language", Proceedings of the 6th ACM SIGPLAN International Workshop on Libraries, Languages, and Compilers for Array Programming (ARRAY), ACM, pp. 57–68, New York, NY, USA, Jun 2019.

Abstract
Each of the popular tensor frameworks from the machine learning domain comes with its own language for expressing tensor kernels. Since these tensor languages lack precise specifications, it is impossible to understand and reason about tensor kernels that exhibit unexpected behaviour. In this paper, we give examples of such kernels.
The tensor languages are superficially similar to the well-known functional array languages, for which formal definitions often exist. However, the tensor languages are inherently imperative. In this paper we present TeIL, an imperative tensor intermediate language with precise formal semantics. For the popular tensor languages, TeIL can serve as a common ground on the basis of which precise reasoning about kernels becomes possible. Based on TeIL's formal semantics we develop a type-safety result in the Coq proof assistant.

Bibtex

@InProceedings{rink_array19,
author = {Norman A. Rink and Jeronimo Castrillon},
title = {{TeIL}: a type-safe imperative {Tensor Intermediate Language}},
booktitle = {Proceedings of the 6th ACM SIGPLAN International Workshop on Libraries, Languages, and Compilers for Array Programming (ARRAY)},
year = {2019},
series = {ARRAY 2019},
pages = {57--68},
address = {New York, NY, USA},
month = jun,
publisher = {ACM},
doi = {10.1145/3315454.3329959},
url = {http://doi.acm.org/10.1145/3315454.3329959},
acmid = {3329959},
isbn = {978-1-4503-6717-2/19/06},
location = {Phoenix, AZ, USA},
numpages = {12},
abstract = {Each of the popular tensor frameworks from the machine learning domain comes with its own language for expressing tensor kernels. Since these tensor languages lack precise specifications, it is impossible to understand and reason about tensor kernels that exhibit unexpected behaviour. In this paper, we give examples of such kernels.
The tensor languages are superficially similar to the well-known functional array languages, for which formal definitions often exist. However, the tensor languages are inherently imperative. In this paper we present TeIL, an imperative tensor intermediate language with precise formal semantics. For the popular tensor languages, TeIL can serve as a common ground on the basis of which precise reasoning about kernels becomes possible. Based on TeIL's formal semantics we develop a type-safety result in the Coq proof assistant.},
}

Downloads

1906_Rink_Array [PDF]

Related Paths
Orchestration Path

Permalink

https://cfaed.tu-dresden.de/publications?pubId=2449

Sebastian Ertel, Justus Adam, Norman A. Rink, Andrés Goens, Jeronimo Castrillon, "Category-Theoretic Foundations of 'STCLang: State Thread Composition as a Foundation for Monadic Dataflow Parallelism'", In CoRR, vol. abs/1906.12098, Jun 2019.

Bibtex

@Article{ertel_haskellsup19,
author = {Sebastian Ertel and Justus Adam and Norman A. Rink and Andr{\'{e}}s Goens and Jeronimo Castrillon},
title = {Category-Theoretic Foundations of ``STCLang: State Thread Composition as a Foundation for Monadic Dataflow Parallelism''},
journal = {CoRR},
year = {2019},
volume = {abs/1906.12098},
month = jun,
archiveprefix = {arXiv},
biburl = {https://dblp.org/rec/bib/journals/corr/abs-1906-12098},
eprint = {1906.12098},
url = {http://arxiv.org/abs/1906.12098}
}

Downloads

1906_Ertel_Haskellsupp [PDF]

Related Paths
Orchestration Path

Permalink

https://cfaed.tu-dresden.de/publications?pubId=2486

2018
Adilla Susungi, Norman A. Rink, Albert Cohen, Jeronimo Castrillon, Claude Tadonki, "Meta-programming for Cross-Domain Tensor Optimizations", Proceedings of 17th ACM SIGPLAN International Conference on Generative Programming: Concepts and Experiences (GPCE'18), ACM, pp. 79–92, New York, NY, USA, Nov 2018.

Bibtex

@InProceedings{rink_gpce18,
author = {Adilla Susungi and Norman A. Rink and Albert Cohen and Jeronimo Castrillon and Claude Tadonki},
title = {Meta-programming for Cross-Domain Tensor Optimizations},
booktitle = {Proceedings of 17th ACM SIGPLAN International Conference on Generative Programming: Concepts and Experiences (GPCE'18)},
year = {2018},
series = {GPCE 2018},
pages = {79--92},
numpages = {14},
address = {New York, NY, USA},
month = nov,
publisher = {ACM},
keywords = {conf},
location = {Boston, MA, USA},
isbn = {978-1-4503-6045-6},
url = {http://doi.acm.org/10.1145/3278122.3278131},
doi = {10.1145/3278122.3278131},
acmid = {3278131},
}

Downloads

1811_Rink_GPCE [PDF]

Related Paths
Orchestration Path

Permalink

https://cfaed.tu-dresden.de/publications?pubId=2206

Norman A. Rink, Immo Huismann, Adilla Susungi, Jeronimo Castrillon, Jörg Stiller, Jochen Fröhlich, Claude Tadonki, "CFDlang: High-level Code Generation for High-order Methods in Fluid Dynamics", Proceedings of the 3rd International Workshop on Real World Domain Specific Languages (RWDSL 2018), ACM, pp. 5:1–5:10, New York, NY, USA, Feb 2018.

Abstract
Numerical simulations continue to enable fast and enormous progress in science and engineering. Writing efficient numerical codes is a difficult challenge that encompasses a variety of tasks from designing the right algorithms to exploiting the full potential of a platform's architecture. Domain-specific languages (DSLs) can ease these tasks by offering the right abstractions for expressing numerical problems. With the aid of domain knowledge, efficient code can then be generated automatically from abstract expressions. In this work, we present the CFDlang DSL for expressing tensor operations that constitute the performance-critical code sections in a class of real numerical applications from fluid dynamics. We demonstrate that CFDlang can be used to generate code automatically that performs as well, if not better, than carefully hand-optimized code.

Bibtex

@InProceedings{rink_rwdsl18,
author = {Norman A. Rink and Immo Huismann and Adilla Susungi and Jeronimo Castrillon and J{\"o}rg Stiller and Jochen Fr{\"o}hlich and Claude Tadonki},
title = {CFDlang: High-level Code Generation for High-order Methods in Fluid Dynamics},
booktitle = {Proceedings of the 3rd International Workshop on Real World Domain Specific Languages (RWDSL 2018)},
year = {2018},
series = {RWDSL2018},
pages = {5:1--5:10},
address = {New York, NY, USA},
month = feb,
publisher = {ACM},
abstract = {Numerical simulations continue to enable fast and enormous progress in science and engineering. Writing efficient numerical codes is a difficult challenge that encompasses a variety of tasks from designing the right algorithms to exploiting the full potential of a platform's architecture. Domain-specific languages (DSLs) can ease these tasks by offering the right abstractions for expressing numerical problems. With the aid of domain knowledge, efficient code can then be generated automatically from abstract expressions. In this work, we present the CFDlang DSL for expressing tensor operations that constitute the performance-critical code sections in a class of real numerical applications from fluid dynamics. We demonstrate that CFDlang can be used to generate code automatically that performs as well, if not better, than carefully hand-optimized code.},
acmid = {3183900},
articleno = {5},
doi = {10.1145/3183895.3183900},
isbn = {978-1-4503-6355-6},
location = {Vienna, Austria},
numpages = {10},
url = {http://doi.acm.org/10.1145/3183895.3183900}
}

Downloads

1802_Rink_RWDSL [PDF]

Related Paths
Orchestration Path

Permalink

https://cfaed.tu-dresden.de/publications?pubId=2074

Norman A. Rink, "Modeling of languages for tensor manipulation", In CoRR, vol. abs/1801.08771, 2018.

Bibtex

@article{Rink18,
author = {Norman A. Rink},
title = {Modeling of languages for tensor manipulation},
journal = {CoRR},
volume = {abs/1801.08771},
year = {2018},
url = {http://arxiv.org/abs/1801.08771},
archivePrefix = {arXiv},
eprint = {1801.08771},
biburl = {https://dblp.org/rec/bib/journals/corr/abs-1801-08771},
bibsource = {dblp computer science bibliography, https://dblp.org}
}

Downloads

No Downloads available for this publication

Related Paths
Orchestration Path

Permalink

https://cfaed.tu-dresden.de/publications?pubId=2287

2017
Adilla Susungi, Norman A. Rink, Jeronimo Castrillon, Immo Huismann, Albert Cohen, Claude Tadonki, Jörg Stiller, Jochen Fröhlich, "Towards Compositional and Generative Tensor Optimizations", Proceedings of 16th ACM SIGPLAN International Conference on Generative Programming: Concepts and Experiences (GPCE'17), ACM, pp. 169–175, New York, NY, USA, Oct 2017.

Bibtex

@InProceedings{rink_gpce17,
author = {Adilla Susungi and Norman A. Rink and Jeronimo Castrillon and Immo Huismann and Albert Cohen and Claude Tadonki and J{\"o}rg Stiller and Jochen Fr{\"o}hlich},
title = {Towards Compositional and Generative Tensor Optimizations},
booktitle = {Proceedings of 16th ACM SIGPLAN International Conference on Generative Programming: Concepts and Experiences (GPCE'17)},
series = {GPCE 2017},
year = {2017},
pages = {169--175},
month = oct,
isbn = {978-1-4503-5524-7},
location = {Vancouver, BC, Canada},
numpages = {7},
url = {http://doi.acm.org/10.1145/3136040.3136050},
doi = {10.1145/3136040.3136050},
acmid = {3136050},
publisher = {ACM},
address = {New York, NY, USA},
}

Downloads

1710_Rink_GPCE [PDF]

Related Paths
Orchestration Path

Permalink

https://cfaed.tu-dresden.de/publications?pubId=1574

Norman A. Rink, Jeronimo Castrillon, "Extending a Compiler Backend for Complete Memory Error Detection", In Proceedings: Lecture Notes in Informatics: Automotive - Safety & Security 2017 (Peter Dencker and Herbert Klenk and Hubert Kelle and Erhard Plödereder), pp. 61–74, May 2017. (Best paper award)

Abstract
Technological advances drive hardware to ever smaller feature sizes, causing devices to become more vulnerable to faults. Applications can be protected against errors resulting from faults by adding error detection and recovery measures in software. This is popularly achieved by applying automatic program transformations. However, transformations applied to intermediate program representations are fundamentally incapable of protecting against vulnerabilities that are introduced during compilation. In particular, the compiler backend may introduce additional memory accesses. This report presents an extended compiler backend that protects these accesses against faults in the memory system. It is demonstrated that this enables the detection of all single bit flips in memory. On a subset of SPEC CINT2006 the runtime overhead caused by the extended backend amounts to 1.50x for the 32-bit processor architecture i386, and 1.13x for the 64-bit architecture x86_64.

Bibtex

@InProceedings{rink_automotive17,
author = {Norman A. Rink and Jeronimo Castrillon},
title = {Extending a Compiler Backend for Complete Memory Error Detection},
booktitle = {Lecture Notes in Informatics: Automotive - Safety \& Security 2017},
editor = {Peter Dencker and Herbert Klenk and Hubert Kelle and Erhard Pl{\"o}dereder},
year = {2017},
pages = {61--74},
month = may,
abstract = {Technological advances drive hardware to ever smaller feature sizes, causing devices to become more vulnerable to faults. Applications can be protected against errors resulting from faults by adding error detection and recovery measures in software. This is popularly achieved by applying automatic program transformations. However, transformations applied to intermediate program representations are fundamentally incapable of protecting against vulnerabilities that are introduced during compilation. In particular, the compiler backend may introduce additional memory accesses. This report presents an extended compiler backend that protects these accesses against faults in the memory system. It is demonstrated that this enables the detection of all single bit flips in memory. On a subset of SPEC CINT2006 the runtime overhead caused by the extended backend amounts to 1.50x for the 32-bit processor architecture i386, and 1.13x for the 64-bit architecture x86\_64.},
isbn = {978-3-88579-663-3},
issn = {1617-5468},
url = {https://dl.gi.de/bitstream/handle/20.500.12116/147/paper04.pdf?sequence=1&isAllowed=y},

}

Downloads

1705_rink_automotive [PDF]

Related Paths
Orchestration Path, Resilience Path

Permalink

https://cfaed.tu-dresden.de/publications?pubId=1322

Norman A. Rink, Jeronimo Castrillon, "Trading Fault Tolerance for Performance in AN Encoding", Proceedings of the ACM International Conference on Computing Frontiers (CF'17), ACM, pp. 183–190, New York, NY, USA, May 2017.

Bibtex

@InProceedings{rink_cf17,
author = {Norman A. Rink and Jeronimo Castrillon},
title = {Trading Fault Tolerance for Performance in {AN} Encoding},
booktitle = {Proceedings of the ACM International Conference on Computing Frontiers (CF'17)},
year = {2017},
isbn = {978-1-4503-4487-6},
location = {Siena, Italy},
pages = {183--190},
numpages = {8},
url = {http://doi.acm.org/10.1145/3075564.3075565},
doi = {10.1145/3075564.3075565},
acmid = {3075565},
publisher = {ACM},
address = {New York, NY, USA},
month = may,
}

Downloads

1705_Rink_cf [PDF]

Related Paths
Orchestration Path, Resilience Path

Permalink

https://cfaed.tu-dresden.de/publications?pubId=1421

Norman A. Rink, Jeronimo Castrillon, "flexMEDiC: flexible Memory Error Detection by Combined data encoding and duplication", Proceedings of the 2nd International Workshop on Resiliency in Embedded Electronic Systems (REES), co-located with DATE 2017, pp. 15–22, Mar 2017.

Abstract
Errors in memory are known to be a major cause of system failures. Moreover, it has recently been found that single-error correcting, double-error detecting (SECDED) codes, which are widely used in ECC memory modules, are incapable of handling large fractions of errors that occur in practice. This calls for more powerful error detection measures. However, the higher the number of bit flips that can still be detected as an error, the larger the memory overhead. Cost considerations and the varying needs for reliability of different applications may not always warrant laying down extra hardware to accommodate overheads. Software-implemented error detection offers a flexible alternative. In this work we propose the software-implemented flexMEDiC scheme for detecting errors in the memory system, including main memory, on-chip caches, and load-store queues. It is shown that single and double bit flips are detected by flexMEDiC, and evidence is given that suggests that up to five bit flips within a single data word can still be detected as errors. The average runtime overhead incurred by flexMEDiC is 1.55x.

Bibtex

@InProceedings{rees:2017,
author = {Norman A. Rink and Jeronimo Castrillon},
title = {{flexMEDiC}: flexible {M}emory {E}rror {D}etection by Combined data encoding and duplication},
booktitle = {Proceedings of the 2nd International Workshop on Resiliency in Embedded Electronic Systems (REES), co-located with DATE 2017},
year = {2017},
month = mar,
pages = {15--22},
abstract = {Errors in memory are known to be a major cause of system failures. Moreover, it has recently been found that single-error correcting, double-error detecting (SECDED) codes, which are widely used in ECC memory modules, are incapable of handling large fractions of errors that occur in practice. This calls for more powerful error detection measures. However, the higher the number of bit flips that can still be detected as an error, the larger the memory overhead. Cost considerations and the varying needs for reliability of different applications may not always warrant laying down extra hardware to accommodate overheads. Software-implemented error detection offers a flexible alternative. In this work we propose the software-implemented flexMEDiC scheme for detecting errors in the memory system, including main memory, on-chip caches, and load-store queues. It is shown that single and double bit flips are detected by flexMEDiC, and evidence is given that suggests that up to five bit flips within a single data word can still be detected as errors. The average runtime overhead incurred by flexMEDiC is 1.55x.},
}

Downloads

1703_Rink_REES [PDF]

Related Paths
Orchestration Path

Permalink

https://cfaed.tu-dresden.de/publications?pubId=1321


2016
Norman A. Rink, Jeronimo Castrillon, "Comprehensive Backend Support for Local Memory Fault Tolerance", Technical report, Technische Universität Dresden, pp. 11, Dec 2016. [Bibtex & Downloads]

Comprehensive Backend Support for Local Memory Fault Tolerance

Reference

Norman A. Rink, Jeronimo Castrillon, "Comprehensive Backend Support for Local Memory Fault Tolerance", Technical report, Technische Universität Dresden, pp. 11, Dec 2016.

Bibtex

@TechReport{rink_techrep16,
author = {Norman A. Rink and Jeronimo Castrillon},
title = {Comprehensive Backend Support for Local Memory Fault Tolerance},
institution = {Technische Universit{\"a}t Dresden},
year = {2016},
month = dec,
issn = {1430-211X},
pages = {11},
url = {https://cfaed.tu-dresden.de/files/user/nrink/tech-report-ro.pdf}
}

Downloads

tech-report-ro [PDF]

Related Paths
Orchestration Path

Permalink

https://cfaed.tu-dresden.de/publications?pubId=1308

Sven Karol, Norman A. Rink, Bálint Gyapjas, Jeronimo Castrillon, "Fault Tolerance with Aspects: a Feasibility Study", Proceedings of the 15th International Conference on Modularity, ACM, pp. 66–69, New York, NY, USA, Mar 2016. [doi] [Bibtex & Downloads]

Fault Tolerance with Aspects: a Feasibility Study

Reference

Sven Karol, Norman A. Rink, Bálint Gyapjas, Jeronimo Castrillon, "Fault Tolerance with Aspects: a Feasibility Study", Proceedings of the 15th International Conference on Modularity, ACM, pp. 66–69, New York, NY, USA, Mar 2016. [doi]

Bibtex

@inproceedings{karol2016faulttolerance,
author={Karol, Sven and Rink, Norman A. and Gyapjas, B\'{a}lint and Castrillon, Jeronimo},
title={Fault Tolerance with Aspects: a Feasibility Study},
booktitle={Proceedings of the 15th International Conference on Modularity},
series={MODULARITY 2016},
year={2016},
pages={66--69},
address={New York, NY, USA},
month={mar},
publisher={ACM},
doi={10.1145/2889443.2889453},
isbn={978-1-4503-3995-7/16/03},
location={M{\'a}laga, Spain},
}

Downloads

1603_Karol_Modularity_preprint [PDF]

Related Paths
Orchestration Path

Permalink

https://cfaed.tu-dresden.de/publications?pubId=651


2015
Norman A. Rink, Jeronimo Castrillon, "Improving Code Generation for Software-based Error Detection", Proceedings of the 1st International Workshop on Resiliency in Embedded Electronic Systems (REES), co-located with ESWEEK 2015, pp. 16–30, Amsterdam, The Netherlands, Oct 2015. ([link]) [Bibtex & Downloads]

Improving Code Generation for Software-based Error Detection

Reference

Norman A. Rink, Jeronimo Castrillon, "Improving Code Generation for Software-based Error Detection", Proceedings of the 1st International Workshop on Resiliency in Embedded Electronic Systems (REES), co-located with ESWEEK 2015, pp. 16–30, Amsterdam, The Netherlands, Oct 2015. ([link])

Bibtex

@InProceedings{rink_ress15,
Title={Improving Code Generation for Software-based Error Detection},
Author={Rink, Norman A. and Castrillon, Jeronimo},
Booktitle={Proceedings of the 1st International Workshop on Resiliency in Embedded Electronic Systems (REES), co-located with ESWEEK 2015},
Year={2015},
Series={REES 2015},
Address={Amsterdam, The Netherlands},
Month=oct,
Pages={16--30},
}

Downloads

1510_Rink_REES [PDF]

Related Paths
Orchestration Path, Resilience Path

Permalink

https://cfaed.tu-dresden.de/publications?pubId=464

Norman A. Rink, Dmitrii Kuvaiskii, Jeronimo Castrillon, Christof Fetzer, "Compiling for Resilience: the Performance Gap", Chapter in Parallel Computing: On the Road to Exascale (ParCo 2015). Extended from Proceedings of the Mini-Symposium on Energy and Resilience in Parallel Programming (ERPP 2015) (Gerhard R. Joubert and Hugh Leather and Mark Parsons and Frans Peters and Mark Sawyer), IOS Press, vol. 27, pp. 721–730, Edinburgh, Scotland, Sep 2015. [doi] [Bibtex & Downloads]

Compiling for Resilience: the Performance Gap

Reference

Norman A. Rink, Dmitrii Kuvaiskii, Jeronimo Castrillon, Christof Fetzer, "Compiling for Resilience: the Performance Gap", Chapter in Parallel Computing: On the Road to Exascale (ParCo 2015). Extended from Proceedings of the Mini-Symposium on Energy and Resilience in Parallel Programming (ERPP 2015) (Gerhard R. Joubert and Hugh Leather and Mark Parsons and Frans Peters and Mark Sawyer), IOS Press, vol. 27, pp. 721–730, Edinburgh, Scotland, Sep 2015. [doi]

Abstract
In order to perform reliable computations on unreliable hardware, software-based protection mechanisms have been proposed. In this paper we present a compiler infrastructure for software-based code hardening based on encoding. We analyze the trade-off between performance and fault coverage. We look at different code generation strategies that improve the performance of hardened programs by up to 2x while incurring little fault coverage degradation.

Bibtex

@InCollection{rink_erpp2015,
author={Rink, Norman A. and Kuvaiskii, Dmitrii and Castrillon, Jeronimo and Fetzer, Christof},
title={Compiling for Resilience: the Performance Gap},
booktitle={Parallel Computing: On the Road to Exascale (ParCo 2015). Extended from Proceedings of the Mini-Symposium on Energy and Resilience in Parallel Programming (ERPP 2015)},
publisher={IOS Press},
year={2015},
editor={Gerhard R. Joubert and Hugh Leather and Mark Parsons and Frans Peters and Mark Sawyer},
volume={27},
series={ParCo 2015},
pages={721--730},
address={Edinburgh, Scotland},
month=sep,
abstract={In order to perform reliable computations on unreliable hardware, software-based protection mechanisms have been proposed. In this paper we present a compiler infrastructure for software-based code hardening based on encoding. We analyze the trade-off between performance and fault coverage. We look at different code generation strategies that improve the performance of hardened programs by up to 2x while incurring little fault coverage degradation.},
doi={10.3233/978-1-61499-621-7-721},
}

Downloads

No Downloads available for this publication

Related Paths
Orchestration Path, Resilience Path

Permalink

https://cfaed.tu-dresden.de/publications?pubId=782


Theoretical Physics publications

D. Dorigoni, N. A. Rink. "A ladder of topologically non-trivial non-BPS states", J. Geom. Phys. 86, pp. 31–42, 2014.

10.1016/j.geomphys.2014.06.006 (arXiv:1404.6053)

N. A. Rink. "Vortices and the Abel-Jacobi map", J. Geom. Phys. 76, pp. 242–255, 2014.

10.1016/j.geomphys.2013.10.017 (arXiv:1304.3386)

N. A. Rink. "Non-abelian vortices on CP¹ and Grassmannians", J. Math. Phys. 54, 043503, 2013.

10.1063/1.4798468 (arXiv:1211.1662)

N. S. Manton, N. A. Rink. "Geometry and energy of non-abelian vortices", J. Math. Phys. 52, 043511, 2011.

10.1063/1.3574357 (arXiv:1012.3014)

N. S. Manton, N. A. Rink. "Vortices on hyperbolic surfaces", J. Phys. A 43, 434024, 2010.

10.1088/1751-8113/43/43/434024 (arXiv:0912.2058)
