Publications

2017
Fazal Hameed, Christian Menard, Jeronimo Castrillon, "Efficient STT-RAM Last-Level-Cache Architecture to replace DRAM Cache", Proceedings of the International Symposium on Memory Systems (MemSys'17), ACM, pp. 141–151, New York, NY, USA, Oct 2017.

Bibtex

@InProceedings{hameed_memsys17,
author = {Fazal Hameed and Christian Menard and Jeronimo Castrillon},
title = {Efficient STT-RAM Last-Level-Cache Architecture to replace DRAM Cache},
booktitle = {Proceedings of the International Symposium on Memory Systems (MemSys'17)},
series = {MEMSYS '17},
year = {2017},
month = oct,
isbn = {978-1-4503-5335-9},
location = {Alexandria, Virginia},
pages = {141--151},
numpages = {11},
url = {http://doi.acm.org/10.1145/3132402.3132414},
doi = {10.1145/3132402.3132414},
acmid = {3132414},
publisher = {ACM},
address = {New York, NY, USA},
}

Downloads

1710_Hameed_Memsys [PDF]

Related Paths
Orchestration Path

Permalink

https://cfaed.tu-dresden.de/publications?pubId=1476

Adilla Susungi, Norman A. Rink, Jeronimo Castrillon, Immo Huismann, Albert Cohen, Claude Tadonki, Jörg Stiller, Jochen Fröhlich, "Towards Compositional and Generative Tensor Optimizations", Proceedings of 16th ACM SIGPLAN International Conference on Generative Programming: Concepts and Experiences (GPCE'17), ACM, pp. 169–175, New York, NY, USA, Oct 2017.

Bibtex

@InProceedings{rink_gpce17,
author = {Adilla Susungi and Norman A. Rink and Jeronimo Castrillon and Immo Huismann and Albert Cohen and Claude Tadonki and J{\"o}rg Stiller and Jochen Fr{\"o}hlich},
title = {Towards Compositional and Generative Tensor Optimizations},
booktitle = {Proceedings of 16th ACM SIGPLAN International Conference on Generative Programming: Concepts and Experiences (GPCE'17)},
series = {GPCE 2017},
year = {2017},
pages = {169--175},
month = oct,
isbn = {978-1-4503-5524-7},
location = {Vancouver, BC, Canada},
numpages = {7},
url = {http://doi.acm.org/10.1145/3136040.3136050},
doi = {10.1145/3136040.3136050},
acmid = {3136050},
publisher = {ACM},
address = {New York, NY, USA},
}

Downloads

1710_Rink_GPCE [PDF]

Related Paths
Orchestration Path

Permalink

https://cfaed.tu-dresden.de/publications?pubId=1574

Sebastian Ertel, Justus Adam, Jeronimo Castrillon, "POSTER: Towards Fine-grained Dataflow Parallelism in Big Data Systems", Proceedings of the 30th International Workshop on Languages and Compilers for Parallel Computing (LCPC 2017) (Lawrence Rauchwerger), Springer, Cham, pp. 281–282, Oct 2017.

Bibtex

@InProceedings{ertel_lcpc17,
author = {Sebastian Ertel and Justus Adam and Jeronimo Castrillon},
title = {POSTER: Towards Fine-grained Dataflow Parallelism in Big Data Systems},
booktitle = {Proceedings of the 30th International Workshop on Languages and Compilers for Parallel Computing (LCPC 2017)},
year = {2017},
editor = {Lawrence Rauchwerger},
publisher = {Springer},
address = {Cham},
location = {Texas A{\&}M University, College Station, Texas},
month = oct,
isbn = {978-3-030-35224-0},
pages = {281--282},
doi = {10.1007/978-3-030-35225-7},
url = {https://link.springer.com/book/10.1007%2F978-3-030-35225-7},
}

Downloads

1710_Ertel_LCPC [PDF]

Related Paths
HAEC

Permalink

https://cfaed.tu-dresden.de/publications?pubId=1584

Sven Karol, Tobias Nett, Pietro Incardona, Nesrine Khouzami, Jeronimo Castrillon, Ivo F. Sbalzarini, "A Language and Development Environment for Parallel Particle Methods", Proceedings of the 5th International Conference on Particle-based Methods. Fundamentals and Applications PARTICLES 2017 (P. Wriggers and M. Bischoff and E. Oñate and D.R.J. Owen and T. Zohdi), Sep 2017.

Bibtex

@InProceedings{karol_particles17,
author = {Sven Karol and Tobias Nett and Pietro Incardona and Nesrine Khouzami and Jeronimo Castrillon and Ivo F. Sbalzarini},
title = {A Language and Development Environment for Parallel Particle Methods},
booktitle = {Proceedings of the 5th International Conference on Particle-based Methods. Fundamentals and Applications PARTICLES 2017},
year = {2017},
editor = {P. Wriggers and M. Bischoff and E. O{\~n}ate and D.R.J. Owen and T. Zohdi},
url = {https://www.semanticscholar.org/paper/A-Language-and-Development-Environment-for-Paralle-Karol-Nett/2b79bd3836aeb8e2fb2a2b5d9949f9efb1bdfab7?tab=abstract},
month = sep,
}

Downloads

1709_Karol_particles [PDF]

Related Paths
Biological Systems Path, Orchestration Path

Permalink

https://cfaed.tu-dresden.de/publications?pubId=1472

Jeronimo Castrillon, Tei-Wei Kuo, Heike E. Riel, Matthias Lieber, "Wildly Heterogeneous Post-CMOS Technologies Meet Software (Dagstuhl Seminar 17061)", In Dagstuhl Reports (Jerónimo Castrillón-Mazo and Tei-Wei Kuo and Heike E. Riel and Matthias Lieber), Schloss Dagstuhl–Leibniz-Zentrum fuer Informatik, vol. 7, no. 2, pp. 1–22, Dagstuhl, Germany, Aug 2017.

Bibtex

@Article{castrillnmazo_et_al:DR:2017:7349,
author = {Jeronimo Castrillon and Tei-Wei Kuo and Heike E. Riel and Matthias Lieber},
title = {Wildly Heterogeneous Post-CMOS Technologies Meet Software (Dagstuhl Seminar 17061)},
journal = {Dagstuhl Reports},
year = {2017},
volume = {7},
number = {2},
month = aug,
pages = {1--22},
address = {Dagstuhl, Germany},
annote = {Keywords: 3D integration, compilers, emerging post-CMOS circuit materials and technologies, hardware/software co-design, heterogeneous hardware, nanoelectronics},
doi = {10.4230/DagRep.7.2.1},
editor = {Jer{\'o}nimo Castrill{\'o}n-Mazo and Tei-Wei Kuo and Heike E. Riel and Matthias Lieber},
issn = {2192-5283},
publisher = {Schloss Dagstuhl--Leibniz-Zentrum fuer Informatik},
url = {http://drops.dagstuhl.de/opus/volltexte/2017/7349},
urn = {urn:nbn:de:0030-drops-73499}
}

Downloads

No Downloads available for this publication

Related Paths
Orchestration Path

Permalink

https://cfaed.tu-dresden.de/publications?pubId=1575

Andrés Goens, Sergio Siccha, Jeronimo Castrillon, "Symmetry in Software Synthesis", In ACM Transactions on Architecture and Code Optimization (TACO), ACM, vol. 14, no. 2, pp. 20:1–20:26, New York, NY, USA, Jul 2017.

Abstract
With the surge of multi- and manycores, much research has focused on algorithms for mapping and scheduling on these complex platforms. Large classes of these algorithms face scalability problems. This is why diverse methods are commonly used for reducing the search space. While most such approaches leverage the inherent symmetry of architectures and applications, they do it in a problem-specific and intuitive way. However, intuitive approaches become impractical with growing hardware complexity, like Network-on-Chip interconnect or heterogeneous cores. In this paper, we present a formal framework that can determine the inherent symmetry of architectures and applications algorithmically and leverage these for problems in software synthesis. Our approach is based on the mathematical theory of groups and a generalization called inverse semigroups. We evaluate our approach in two state-of-the-art mapping frameworks. Even for the platforms with a handful of cores of today and moderate-size benchmarks, our approach consistently yields reductions of the overall execution time of algorithms, accelerating them by a factor up to 10 in our experiments, or improving the quality of the results.

Bibtex

@article{goens_taco17symmetry,
author = {Goens, Andr{\'e}s and Siccha, Sergio and Castrillon, Jeronimo},
title = {Symmetry in Software Synthesis},
journal = {ACM Transactions on Architecture and Code Optimization (TACO)},
issue_date = {July 2017},
volume = {14},
number = {2},
month = jul,
year = {2017},
issn = {1544-3566},
pages = {20:1--20:26},
articleno = {20},
numpages = {26},
url = {http://doi.acm.org/10.1145/3095747},
doi = {10.1145/3095747},
acmid = {3095747},
publisher = {ACM},
address = {New York, NY, USA},
keywords = {Scalability, automation, clusters, design-space exploration, group theory, heterogeneous, inverse-semigroups, mapping, metaheuristics, network-on-chip, symmetry},
eprint = "arXiv:1704.06623",
abstract = {With the surge of multi- and manycores, much research has focused on algorithms for mapping and scheduling on these complex platforms. Large classes of these algorithms face scalability problems. This is why diverse methods are commonly used for reducing the search space. While most such approaches leverage the inherent symmetry of architectures and applications, they do it in a problem-specific and intuitive way. However, intuitive approaches become impractical with growing hardware complexity, like Network-on-Chip interconnect or heterogeneous cores. In this paper, we present a formal framework that can determine the inherent symmetry of architectures and applications algorithmically and leverage these for problems in software synthesis. Our approach is based on the mathematical theory of groups and a generalization called inverse semigroups. We evaluate our approach in two state-of-the-art mapping frameworks. Even for the platforms with a handful of cores of today and moderate-size benchmarks, our approach consistently yields reductions of the overall execution time of algorithms, accelerating them by a factor up to 10 in our experiments, or improving the quality of the results.}
}

Downloads

1704_Goens_TACO-arxiv [PDF]

Related Paths
Orchestration Path

Permalink

https://cfaed.tu-dresden.de/publications?pubId=1453

Christian Menard, Matthias Jung, Jeronimo Castrillon, Norbert Wehn, "System Simulation with gem5 and SystemC: The Keystone for Full Interoperability", Proceedings of the IEEE International Conference on Embedded Computer Systems Architectures Modeling and Simulation (SAMOS), pp. 62–69, Jul 2017.

Abstract
SystemC TLM based virtual prototypes have become the main tool in industry and research for concurrent hardware and software development, as well as hardware design space exploration. However, there exists a lack of accurate, free, changeable and realistic SystemC models of modern CPUs. Therefore, many researchers use the cycle accurate open source system simulator gem5, which has been developed in parallel to the SystemC standard. In this paper we present a coupling of gem5 with SystemC that offers full interoperability between both simulation frameworks, and therefore enables a huge set of possibilities for system level design space exploration. Furthermore, we show that the coupling itself only induces a relatively small overhead to the total execution time of the simulation.

Bibtex

@InProceedings{menard_samos17,
author = {Christian Menard and Matthias Jung and Jeronimo Castrillon and Norbert Wehn},
title = {System Simulation with gem5 and SystemC: The Keystone for Full Interoperability},
booktitle = {Proceedings of the IEEE International Conference on Embedded Computer Systems Architectures Modeling and Simulation (SAMOS)},
year = {2017},
month = jul,
location = {Pythagorion, Greece},
pages = {62--69},
organization = {IEEE},
doi = {10.1109/SAMOS.2017.8344612},
url = {https://ieeexplore.ieee.org/document/8344612/},
isbn = {978-1-5386-3437-0},
abstract = {SystemC TLM based virtual prototypes have become the main tool in industry and research for concurrent hardware and software development, as well as hardware design space exploration. However, there exists a lack of accurate, free, changeable and realistic SystemC models of modern CPUs. Therefore, many researchers use the cycle accurate open source system simulator gem5, which has been developed in parallel to the SystemC standard. In this paper we present a coupling of gem5 with SystemC that offers full interoperability between both simulation frameworks, and therefore enables a huge set of possibilities for system level design space exploration. Furthermore, we show that the coupling itself only induces a relatively small overhead to the total execution time of the simulation.},
}

Downloads

1707_Menard_SAMOS [PDF]

Related Paths
Orchestration Path

Permalink

https://cfaed.tu-dresden.de/publications?pubId=1475

Andrés Goens, Robert Khasanov, Marcus Hähnel, Till Smejkal, Hermann Härtig, Jeronimo Castrillon, "TETRiS: a Multi-Application Run-Time System for Predictable Execution of Static Mappings", Proceedings of the 20th International Workshop on Software and Compilers for Embedded Systems (SCOPES'17), ACM, pp. 11–20, New York, NY, USA, Jun 2017.

Bibtex

@InProceedings{goens_scopes17,
author = {Andr\'{e}s Goens and Robert Khasanov and Marcus H{\"a}hnel and Till Smejkal and Hermann H{\"a}rtig and Jeronimo Castrillon},
title = {TETRiS: a Multi-Application Run-Time System for Predictable Execution of Static Mappings},
booktitle = {Proceedings of the 20th International Workshop on Software and Compilers for Embedded Systems (SCOPES'17)},
year = {2017},
month = jun,
series = {SCOPES '17},
isbn = {978-1-4503-5039-6},
location = {Sankt Goar, Germany},
pages = {11--20},
numpages = {10},
url = {http://doi.acm.org/10.1145/3078659.3078663},
doi = {10.1145/3078659.3078663},
acmid = {3078663},
publisher = {ACM},
address = {New York, NY, USA}
}

Downloads

1706_Goens_SCOPES [PDF]

Related Paths
HAEC, Orchestration Path

Permalink

https://cfaed.tu-dresden.de/publications?pubId=1451

Gerald Hempel, Andrés Goens, Josefine Asmus, Jeronimo Castrillon, Ivo F. Sbalzarini, "Robust Mapping of Process Networks to Many-Core Systems Using Bio-Inspired Design Centering", Proceedings of the 20th International Workshop on Software and Compilers for Embedded Systems (SCOPES '17), ACM, pp. 21–30, New York, NY, USA, Jun 2017.

Bibtex

@InProceedings{hempel_scopes17,
author = {Gerald Hempel and Andr\'{e}s Goens and Josefine Asmus and Jeronimo Castrillon and Ivo F. Sbalzarini},
title = {Robust Mapping of Process Networks to Many-Core Systems Using Bio-Inspired Design Centering},
booktitle = {Proceedings of the 20th International Workshop on Software and Compilers for Embedded Systems (SCOPES '17)},
year = {2017},
series = {SCOPES '17},
pages = {21--30},
address = {New York, NY, USA},
month = jun,
publisher = {ACM},
acmid = {3078667},
doi = {10.1145/3078659.3078667},
isbn = {978-1-4503-5039-6},
location = {Sankt Goar, Germany},
numpages = {10},
url = {http://doi.acm.org/10.1145/3078659.3078667}
}

Downloads

1706_Hempel_SCOPES [PDF]

Related Paths
Biological Systems Path, Orchestration Path

Permalink

https://cfaed.tu-dresden.de/publications?pubId=1452

Johanna Sepúlveda, Vania Marangozova-Martin, Jeronimo Castrillon, "Architecture, Languages, Compilation and Hardware support for Emerging ManYcore systems (ALCHEMY): Preface", Elsevier, Jun 2017.

Bibtex

@Article{sepulveda_alchemy17_preface,
author = {Sep{\'u}lveda, Johanna and Marangozova-Martin, Vania and Castrillon, Jeronimo},
title = {Architecture, Languages, Compilation and Hardware support for Emerging ManYcore systems (ALCHEMY): Preface},
year = {2017},
month = jun,
doi = {10.1016/j.procs.2017.05.276},
url = {http://www.sciencedirect.com/science/article/pii/S1877050917309286},
publisher = {Elsevier}
}

Downloads

No Downloads available for this publication

Permalink

https://cfaed.tu-dresden.de/publications?pubId=1479

Norman A. Rink, Jeronimo Castrillon, "Extending a Compiler Backend for Complete Memory Error Detection", In Proceedings: Lecture Notes in Informatics: Automotive - Safety & Security 2017 (Peter Dencker and Herbert Klenk and Hubert Kelle and Erhard Plödereder), pp. 61–74, May 2017. (Best paper award)

Abstract
Technological advances drive hardware to ever smaller feature sizes, causing devices to become more vulnerable to faults. Applications can be protected against errors resulting from faults by adding error detection and recovery measures in software. This is popularly achieved by applying automatic program transformations. However, transformations applied to intermediate program representations are fundamentally incapable of protecting against vulnerabilities that are introduced during compilation. In particular, the compiler backend may introduce additional memory accesses. This report presents an extended compiler backend that protects these accesses against faults in the memory system. It is demonstrated that this enables the detection of all single bit flips in memory. On a subset of SPEC CINT2006 the runtime overhead caused by the extended backend amounts to 1.50x for the 32-bit processor architecture i386, and 1.13x for the 64-bit architecture x86-64.

Bibtex

@InProceedings{rink_automotive17,
author = {Norman A. Rink and Jeronimo Castrillon},
title = {Extending a Compiler Backend for Complete Memory Error Detection},
booktitle = {Lecture Notes in Informatics: Automotive - Safety \& Security 2017},
editor = {Peter Dencker and Herbert Klenk and Hubert Kelle and Erhard Pl{\"o}dereder},
year = {2017},
pages = {61--74},
month = may,
abstract = {Technological advances drive hardware to ever smaller feature sizes, causing devices to become more vulnerable to faults. Applications can be protected against errors resulting from faults by adding error detection and recovery measures in software. This is popularly achieved by applying automatic program transformations. However, transformations applied to intermediate program representations are fundamentally incapable of protecting against vulnerabilities that are introduced during compilation. In particular, the compiler backend may introduce additional memory accesses. This report presents an extended compiler backend that protects these accesses against faults in the memory system. It is demonstrated that this enables the detection of all single bit flips in memory. On a subset of SPEC CINT2006 the runtime overhead caused by the extended backend amounts to 1.50x for the 32-bit processor architecture i386, and 1.13x for the 64-bit architecture x86-64.},
isbn = {978-3-88579-663-3},
issn = {1617-5468},
url = {https://dl.gi.de/bitstream/handle/20.500.12116/147/paper04.pdf?sequence=1&isAllowed=y},
comment={Best paper award}
}

Downloads

1705_rink_automotive [PDF]

Related Paths
Orchestration Path, Resilience Path

Permalink

https://cfaed.tu-dresden.de/publications?pubId=1322

Markus Haehnel, Frehiwot Melak Arega, Waltenegus Dargie, Robert Khasanov, Jeronimo Castrillon, "Application Interference Analysis: Towards Energy-efficient Workload Management on Heterogeneous Micro-Server Architectures", Proceedings of the 7th International Workshop on Big Data in Cloud Performance (DCPerf'17), IEEE Conference on Computer Communications Workshops (INFOCOM WKSHPS), pp. 432–437, May 2017.

Bibtex

@InProceedings{khasanov_dcperf17,
author = {Markus Haehnel and Frehiwot Melak Arega and Waltenegus Dargie and Robert Khasanov and Jeronimo Castrillon},
title = {Application Interference Analysis: Towards Energy-efficient Workload Management on Heterogeneous Micro-Server Architectures},
booktitle = {Proceedings of the 7th International Workshop on Big Data in Cloud Performance (DCPerf'17), IEEE Conference on Computer Communications Workshops (INFOCOM WKSHPS)},
year = {2017},
month = may,
pages = {432--437},
doi = {10.1109/INFCOMW.2017.8116415},
url = {http://ieeexplore.ieee.org/document/8116415/},
location = {Atlanta, USA}
}

Downloads

1705_Khasanov_DCPerf [PDF]

Related Paths
HAEC

Permalink

https://cfaed.tu-dresden.de/publications?pubId=1385

Norman A. Rink, Jeronimo Castrillon, "Trading Fault Tolerance for Performance in AN Encoding", Proceedings of the ACM International Conference on Computing Frontiers (CF'17), ACM, pp. 183–190, New York, NY, USA, May 2017.

Bibtex

@InProceedings{rink_cf17,
author = {Norman A. Rink and Jeronimo Castrillon},
title = {Trading Fault Tolerance for Performance in {AN} Encoding},
booktitle = {Proceedings of the ACM International Conference on Computing Frontiers (CF'17)},
year = {2017},
isbn = {978-1-4503-4487-6},
location = {Siena, Italy},
pages = {183--190},
numpages = {8},
url = {http://doi.acm.org/10.1145/3075564.3075565},
doi = {10.1145/3075564.3075565},
acmid = {3075565},
publisher = {ACM},
address = {New York, NY, USA},
month = may,
}

Downloads

1705_Rink_cf [PDF]

Related Paths
Orchestration Path, Resilience Path

Permalink

https://cfaed.tu-dresden.de/publications?pubId=1421

Lars Schütze, Jeronimo Castrillon, "Analyzing State-of-the-Art Role-based Programming Languages", Proceedings of the First International Conference on the Art, Science and Engineering of Programming (Programming'17), ACM, pp. 9:1–9:6, New York, NY, USA, Apr 2017.

Bibtex

@InProceedings{schuetze_lassy17,
author = {Lars Sch{\"u}tze and Jeronimo Castrillon},
title = {Analyzing State-of-the-Art Role-based Programming Languages},
booktitle = {Proceedings of the First International Conference on the Art, Science and Engineering of Programming (Programming'17)},
series = {Programming '17},
year = {2017},
month = apr,
isbn = {978-1-4503-4836-2},
location = {Brussels, Belgium},
pages = {9:1--9:6},
articleno = {9},
numpages = {6},
url = {http://doi.acm.org/10.1145/3079368.3079386},
doi = {10.1145/3079368.3079386},
acmid = {3079386},
publisher = {ACM},
address = {New York, NY, USA},
}

Downloads

1704_Schuetze_lassy [PDF]

Permalink

https://cfaed.tu-dresden.de/publications?pubId=1388

Fazal Hameed, Jeronimo Castrillon, "Rethinking On-chip DRAM Cache for Simultaneous Performance and Energy Optimization", Proceedings of the 2017 Design, Automation and Test in Europe conference (DATE), EDA Consortium, pp. 362–367, Mar 2017.

Abstract
State-of-the-art DRAM cache employs a small Tag-Cache and its performance is dependent upon two important parameters namely bank-level-parallelism and Tag-Cache hit rate. These parameters depend upon the row buffer organization. Recently, it has been shown that a small row buffer organization delivers better performance via improved bank-level-parallelism than the traditional large row buffer organization along with energy benefits. However, small row buffers do not fully exploit the temporal locality of tag accesses, leading to reduced Tag-Cache hit rates. As a result, the DRAM cache needs to be re-designed for small row buffer organization to achieve additional performance benefits. In this paper, we propose a novel tag-store mechanism that improves the Tag-Cache hit rate by 70% compared to existing DRAM tag-store mechanisms employing small row buffer organization. In addition, we enhance the DRAM cache controller with novel policies that take into account the locality characteristics of cache accesses. We evaluate our novel tag-store mechanism and controller policies in an 8-core system running the SPEC2006 benchmark and compare their performance and energy consumption against recent proposals. Our architecture improves the average performance by 21.2% and 11.4% respectively compared to large and small row buffer organizations via simultaneously improving both parameters. Compared to DRAM cache with large row buffer organization, we report an energy improvement of 62%.

Bibtex

@InProceedings{hameed_date17,
author = {Fazal Hameed and Jeronimo Castrillon},
title = {Rethinking On-chip DRAM Cache for Simultaneous Performance and Energy Optimization},
booktitle = {Proceedings of the 2017 Design, Automation and Test in Europe conference (DATE)},
year = {2017},
series = {DATE '17},
pages = {362--367},
month = mar,
publisher = {EDA Consortium},
abstract = {State-of-the-art DRAM cache employs a small Tag-Cache and its performance is dependent upon two important parameters namely bank-level-parallelism and Tag-Cache hit rate. These parameters depend upon the row buffer organization. Recently, it has been shown that a small row buffer organization delivers better performance via improved bank-level-parallelism than the traditional large row buffer organization along with energy benefits. However, small row buffers do not fully exploit the temporal locality of tag accesses, leading to reduced Tag-Cache hit rates. As a result, the DRAM cache needs to be re-designed for small row buffer organization to achieve additional performance benefits. In this paper, we propose a novel tag-store mechanism that improves the Tag-Cache hit rate by 70\% compared to existing DRAM tag-store mechanisms employing small row buffer organization. In addition, we enhance the DRAM cache controller with novel policies that take into account the locality characteristics of cache accesses. We evaluate our novel tag-store mechanism and controller policies in an 8-core system running the SPEC2006 benchmark and compare their performance and energy consumption against recent proposals. Our architecture improves the average performance by 21.2\% and 11.4\% respectively compared to large and small row buffer organizations via simultaneously improving both parameters. Compared to DRAM cache with large row buffer organization, we report an energy improvement of 62\%.},
isbn = {978-3-9815370-8-6},
doi={10.23919/DATE.2017.7927017},
url = {http://ieeexplore.ieee.org/document/7927017/},
location = {Lausanne, Switzerland}
}

Downloads

1703_Hameed_DATE [PDF]

Related Paths
Orchestration Path

Permalink

https://cfaed.tu-dresden.de/publications?pubId=1254

Norman A. Rink, Jeronimo Castrillon, "flexMEDiC: flexible Memory Error Detection by Combined data encoding and duplication", Proceedings of the 2nd International Workshop on Resiliency in Embedded Electronic Systems (REES), co-located with DATE 2017, pp. 15–22, Mar 2017.

Abstract
Errors in memory are known to be a major cause of system failures. Moreover, it has recently been found that single-error correcting, double-error detecting (SECDED) codes, which are widely used in ECC memory modules, are incapable of handling large fractions of errors that occur in practice. This calls for more powerful error detection measures. However, the higher the number of bit flips that can still be detected as an error, the larger the memory overhead. Cost considerations and the varying needs for reliability of different applications may not always warrant laying down extra hardware to accommodate overheads. Software-implemented error detection offers a flexible alternative. In this work we propose the software-implemented flexMEDiC scheme for detecting errors in the memory system, including main memory, on-chip caches, and load-store queues. It is shown that single and double bit flips are detected by flexMEDiC, and evidence is given that suggests that up to five bit flips within a single data word can still be detected as errors. The average runtime overhead incurred by flexMEDiC is 1.55x.

Bibtex

@InProceedings{rees:2017,
author = {Norman A. Rink and Jeronimo Castrillon},
title = {{flexMEDiC}: flexible {M}emory {E}rror {D}etection by Combined data encoding and duplication},
booktitle = {Proceedings of the 2nd International Workshop on Resiliency in Embedded Electronic Systems (REES), co-located with DATE 2017},
year = {2017},
month = mar,
pages = {15--22},
abstract = {Errors in memory are known to be a major cause of system failures. Moreover, it has recently been found that single-error correcting, double-error detecting (SECDED) codes, which are widely used in ECC memory modules, are incapable of handling large fractions of errors that occur in practice. This calls for more powerful error detection measures. However, the higher the number of bit flips that can still be detected as an error, the larger the memory overhead. Cost considerations and the varying needs for reliability of different applications may not always warrant laying down extra hardware to accommodate overheads. Software-implemented error detection offers a flexible alternative. In this work we propose the software-implemented flexMEDiC scheme for detecting errors in the memory system, including main memory, on-chip caches, and load-store queues. It is shown that single and double bit flips are detected by flexMEDiC, and evidence is given that suggests that up to five bit flips within a single data word can still be detected as errors. The average runtime overhead incurred by flexMEDiC is 1.55x.},
}

Downloads

1703_Rink_REES [PDF]

Related Paths
Orchestration Path

Permalink

https://cfaed.tu-dresden.de/publications?pubId=1321

Jeronimo Castrillon, "Programming for adaptive and energy-efficient computing", In International Conference on High Performance Compilation, Computing and Communications (HP3C-2017) (keynote), Mar 2017.

Programming for adaptive and energy-efficient computing

Reference

Jeronimo Castrillon, "Programming for adaptive and energy-efficient computing", In International Conference on High Performance Compilation, Computing and Communications (HP3C-2017) (keynote), Mar 2017.

Bibtex

@Misc{castrillon2017hp3c,
author = {Castrillon, Jeronimo},
title = {Programming for adaptive and energy-efficient computing},
howpublished = {International Conference on High Performance Compilation, Computing and Communications (HP3C-2017) (keynote)},
month = mar,
year = {2017},
location = {Kuala Lumpur, Malaysia}
}

Downloads

170323_castrill_hp3c [PDF]

Permalink

https://cfaed.tu-dresden.de/publications?pubId=1473

Andrés Goens, Jeronimo Castrillon, "Optimizing for Data-Parallelism in Kahn Process Networks", In Proceeding: ACM SRC at International Symposium on Code Generation and Optimization (CGO), Feb 2017.

Optimizing for Data-Parallelism in Kahn Process Networks

Reference

Andrés Goens, Jeronimo Castrillon, "Optimizing for Data-Parallelism in Kahn Process Networks", In Proceeding: ACM SRC at International Symposium on Code Generation and Optimization (CGO), Feb 2017.

Bibtex

@inproceedings{goens17cgo,
author = {Andr\'{e}s Goens and Jeronimo Castrillon},
title = {Optimizing for Data-Parallelism in Kahn Process Networks},
year = {2017},
month = feb,
booktitle = {ACM SRC at International Symposium on Code Generation and Optimization (CGO)},
location = {Austin, TX, USA},
}

Downloads

1701_Goens_SRCCGO [PDF]

Related Paths
Orchestration Path

Permalink

https://cfaed.tu-dresden.de/publications?pubId=1544

Jeronimo Castrillon, "On Mapping to Multi/Manycores", In 10th International Workshop on Programmability and Architectures for Heterogeneous Multicores (MULTIPROG-2017), held in conjunction with the 12th International Conference on High-Performance and Embedded Architectures and Compilers (HiPEAC) (invited talk), Jan 2017.

On Mapping to Multi/Manycores

Reference

Jeronimo Castrillon, "On Mapping to Multi/Manycores", In 10th International Workshop on Programmability and Architectures for Heterogeneous Multicores (MULTIPROG-2017), held in conjunction with the 12th International Conference on High-Performance and Embedded Architectures and Compilers (HiPEAC) (invited talk), Jan 2017.

Bibtex

@Misc{castrillon2017multiprog,
author = {Castrillon, Jeronimo},
title = {On Mapping to Multi/Manycores},
howpublished = {10th International Workshop on Programmability and Architectures for Heterogeneous Multicores (MULTIPROG-2017), held in conjunction with the 12th International Conference on High-Performance and Embedded Architectures and Compilers (HiPEAC) (invited talk)},
month = jan,
year = {2017},
location = {Stockholm, Sweden}
}

Downloads

No Downloads available for this publication

Related Paths
Orchestration Path

Permalink

https://cfaed.tu-dresden.de/publications?pubId=1319

Jeronimo Castrillon, "Flexible and Scalable Dataflow Programming for Manycores", In Tutorial for heterogeneous multicore design automation: current and future, held in conjunction with the 12th International Conference on High-Performance and Embedded Architectures and Compilers (HiPEAC) (invited talk), Jan 2017.

Flexible and Scalable Dataflow Programming for Manycores

Reference

Jeronimo Castrillon, "Flexible and Scalable Dataflow Programming for Manycores", In Tutorial for heterogeneous multicore design automation: current and future, held in conjunction with the 12th International Conference on High-Performance and Embedded Architectures and Compilers (HiPEAC) (invited talk), Jan 2017.

Bibtex

@Misc{castrillon2017hipeactut,
author = {Castrillon, Jeronimo},
title = {Flexible and Scalable Dataflow Programming for Manycores},
howpublished = {Tutorial for heterogeneous multicore design automation: current and future, held in conjunction with the 12th International Conference on High-Performance and Embedded Architectures and Compilers (HiPEAC) (invited talk)},
month = jan,
year = {2017},
location = {Stockholm, Sweden}
}

Downloads

No Downloads available for this publication

Related Paths
Orchestration Path

Permalink

https://cfaed.tu-dresden.de/publications?pubId=1320


2017