2017
- Fazal Hameed, Christian Menard, Jeronimo Castrillon, "Efficient STT-RAM Last-Level-Cache Architecture to replace DRAM Cache", Proceedings of the International Symposium on Memory Systems (MemSys'17), ACM, pp. 141–151, New York, NY, USA, Oct 2017. [doi] [Bibtex & Downloads]
Bibtex
@InProceedings{hameed_memsys17,
author = {Fazal Hameed and Christian Menard and Jeronimo Castrillon},
title = {Efficient STT-RAM Last-Level-Cache Architecture to replace DRAM Cache},
booktitle = {Proceedings of the International Symposium on Memory Systems (MemSys'17)},
series = {MEMSYS '17},
year = {2017},
month = oct,
isbn = {978-1-4503-5335-9},
location = {Alexandria, Virginia},
pages = {141--151},
numpages = {11},
url = {http://doi.acm.org/10.1145/3132402.3132414},
doi = {10.1145/3132402.3132414},
acmid = {3132414},
publisher = {ACM},
address = {New York, NY, USA},
}
Downloads
1710_Hameed_Memsys [PDF]
- Adilla Susungi, Norman A. Rink, Jeronimo Castrillon, Immo Huismann, Albert Cohen, Claude Tadonki, Jörg Stiller, Jochen Fröhlich, "Towards Compositional and Generative Tensor Optimizations", Proceedings of 16th ACM SIGPLAN International Conference on Generative Programming: Concepts and Experiences (GPCE'17), ACM, pp. 169–175, New York, NY, USA, Oct 2017. [doi] [Bibtex & Downloads]
Bibtex
@InProceedings{rink_gpce17,
author = {Adilla Susungi and Norman A. Rink and Jeronimo Castrillon and Immo Huismann and Albert Cohen and Claude Tadonki and J{\"o}rg Stiller and Jochen Fr{\"o}hlich},
title = {Towards Compositional and Generative Tensor Optimizations},
booktitle = {Proceedings of 16th ACM SIGPLAN International Conference on Generative Programming: Concepts and Experiences (GPCE'17)},
series = {GPCE 2017},
year = {2017},
pages = {169--175},
month = oct,
isbn = {978-1-4503-5524-7},
location = {Vancouver, BC, Canada},
numpages = {7},
url = {http://doi.acm.org/10.1145/3136040.3136050},
doi = {10.1145/3136040.3136050},
acmid = {3136050},
publisher = {ACM},
address = {New York, NY, USA},
}
Downloads
1710_Rink_GPCE [PDF]
- Sebastian Ertel, Justus Adam, Jeronimo Castrillon, "POSTER: Towards Fine-grained Dataflow Parallelism in Big Data Systems", Proceedings of the 30th International Workshop on Languages and Compilers for Parallel Computing (LCPC 2017) (Lawrence Rauchwerger), Springer, Cham, pp. 281–282, Oct 2017. [doi] [Bibtex & Downloads]
Bibtex
@InProceedings{ertel_lcpc17,
author = {Sebastian Ertel and Justus Adam and Jeronimo Castrillon},
title = {POSTER: Towards Fine-grained Dataflow Parallelism in Big Data Systems},
booktitle = {Proceedings of the 30th International Workshop on Languages and Compilers for Parallel Computing (LCPC 2017)},
year = {2017},
editor = {Lawrence Rauchwerger},
publisher = {Springer, Cham},
location = {Texas A{\&}M University, College Station, Texas},
month = oct,
isbn = {978-3-030-35224-0},
pages = {281--282},
doi = {10.1007/978-3-030-35225-7},
url = {https://link.springer.com/book/10.1007%2F978-3-030-35225-7},
}
Downloads
1710_Ertel_LCPC [PDF]
- Sven Karol, Tobias Nett, Pietro Incardona, Nesrine Khouzami, Jeronimo Castrillon, Ivo F. Sbalzarini, "A Language and Development Environment for Parallel Particle Methods", Proceedings of the 5th International Conference on Particle-based Methods. Fundamentals and Applications PARTICLES 2017 (P. Wriggers and M. Bischoff and E. Oñate and D.R.J. Owen and T. Zohdi), Sep 2017. [Bibtex & Downloads]
Bibtex
@InProceedings{karol_particles17,
author = {Sven Karol and Tobias Nett and Pietro Incardona and Nesrine Khouzami and Jeronimo Castrillon and Ivo F. Sbalzarini},
title = {A Language and Development Environment for Parallel Particle Methods},
booktitle = {Proceedings of the 5th International Conference on Particle-based Methods. Fundamentals and Applications PARTICLES 2017},
year = {2017},
editor = {P. Wriggers and M. Bischoff and E. O{\~n}ate and D.R.J. Owen and T. Zohdi},
url = {https://www.semanticscholar.org/paper/A-Language-and-Development-Environment-for-Paralle-Karol-Nett/2b79bd3836aeb8e2fb2a2b5d9949f9efb1bdfab7?tab=abstract},
month = sep,
}
Downloads
1709_Karol_particles [PDF]
Related Paths
Biological Systems Path, Orchestration Path
- Jeronimo Castrillon, Tei-Wei Kuo, Heike E. Riel, Matthias Lieber, "Wildly Heterogeneous Post-CMOS Technologies Meet Software (Dagstuhl Seminar 17061)", In Dagstuhl Reports (Jerónimo Castrillón-Mazo and Tei-Wei Kuo and Heike E. Riel and Matthias Lieber), Schloss Dagstuhl–Leibniz-Zentrum fuer Informatik, vol. 7, no. 2, pp. 1–22, Dagstuhl, Germany, Aug 2017. [doi] [Bibtex & Downloads]
Bibtex
@Article{castrillnmazo_et_al:DR:2017:7349,
author = {Jeronimo Castrillon and Tei-Wei Kuo and Heike E. Riel and Matthias Lieber},
title = {Wildly Heterogeneous Post-CMOS Technologies Meet Software (Dagstuhl Seminar 17061)},
journal = {Dagstuhl Reports},
year = {2017},
volume = {7},
number = {2},
month = aug,
pages = {1--22},
address = {Dagstuhl, Germany},
annote = {Keywords: 3D integration, compilers, emerging post-CMOS circuit materials and technologies, hardware/software co-design, heterogeneous hardware, nanoelectronics},
doi = {10.4230/DagRep.7.2.1},
editor = {Jer{\'o}nimo Castrill{\'o}n-Mazo and Tei-Wei Kuo and Heike E. Riel and Matthias Lieber},
issn = {2192-5283},
publisher = {Schloss Dagstuhl--Leibniz-Zentrum fuer Informatik},
url = {http://drops.dagstuhl.de/opus/volltexte/2017/7349},
urn = {urn:nbn:de:0030-drops-73499}
}
Downloads
No Downloads available for this publication
- Andrés Goens, Sergio Siccha, Jeronimo Castrillon, "Symmetry in Software Synthesis", In ACM Transactions on Architecture and Code Optimization (TACO), ACM, vol. 14, no. 2, pp. 20:1–20:26, New York, NY, USA, Jul 2017. [doi] [Bibtex & Downloads]
Abstract
With the surge of multi- and manycores, much research has focused on algorithms for mapping and scheduling on these complex platforms. Large classes of these algorithms face scalability problems. This is why diverse methods are commonly used for reducing the search space. While most such approaches leverage the inherent symmetry of architectures and applications, they do it in a problem-specific and intuitive way. However, intuitive approaches become impractical with growing hardware complexity, like Network-on-Chip interconnect or heterogeneous cores. In this paper, we present a formal framework that can determine the inherent symmetry of architectures and applications algorithmically and leverage these for problems in software synthesis. Our approach is based on the mathematical theory of groups and a generalization called inverse semigroups. We evaluate our approach in two state-of-the-art mapping frameworks. Even for the platforms with a handful of cores of today and moderate-size benchmarks, our approach consistently yields reductions of the overall execution time of algorithms, accelerating them by a factor up to 10 in our experiments, or improving the quality of the results.
Bibtex
@article{goens_taco17symmetry,
author = {Goens, Andr{\'e}s and Siccha, Sergio and Castrillon, Jeronimo},
title = {Symmetry in Software Synthesis},
journal = {ACM Transactions on Architecture and Code Optimization (TACO)},
issue_date = {July 2017},
volume = {14},
number = {2},
month = jul,
year = {2017},
issn = {1544-3566},
pages = {20:1--20:26},
articleno = {20},
numpages = {26},
url = {http://doi.acm.org/10.1145/3095747},
doi = {10.1145/3095747},
acmid = {3095747},
publisher = {ACM},
address = {New York, NY, USA},
keywords = {Scalability, automation, clusters, design-space exploration, group theory, heterogeneous, inverse-semigroups, mapping, metaheuristics, network-on-chip, symmetry},
eprint = "arXiv:1704.06623",
abstract = {With the surge of multi- and manycores, much research has focused on algorithms for mapping and scheduling on these complex platforms. Large classes of these algorithms face scalability problems. This is why diverse methods are commonly used for reducing the search space. While most such approaches leverage the inherent symmetry of architectures and applications, they do it in a problem-specific and intuitive way. However, intuitive approaches become impractical with growing hardware complexity, like Network-on-Chip interconnect or heterogeneous cores. In this paper, we present a formal framework that can determine the inherent symmetry of architectures and applications algorithmically and leverage these for problems in software synthesis. Our approach is based on the mathematical theory of groups and a generalization called inverse semigroups. We evaluate our approach in two state-of-the-art mapping frameworks. Even for the platforms with a handful of cores of today and moderate-size benchmarks, our approach consistently yields reductions of the overall execution time of algorithms, accelerating them by a factor up to 10 in our experiments, or improving the quality of the results.}
}
Downloads
1704_Goens_TACO-arxiv [PDF]
- Christian Menard, Matthias Jung, Jeronimo Castrillon, Norbert Wehn, "System Simulation with gem5 and SystemC: The Keystone for Full Interoperability", Proceedings of the IEEE International Conference on Embedded Computer Systems Architectures Modeling and Simulation (SAMOS), pp. 62–69, Jul 2017. [doi] [Bibtex & Downloads]
Abstract
SystemC TLM based virtual prototypes have become the main tool in industry and research for concurrent hardware and software development, as well as hardware design space exploration. However, there exists a lack of accurate, free, changeable and realistic SystemC models of modern CPUs. Therefore, many researchers use the cycle accurate open source system simulator gem5, which has been developed in parallel to the SystemC standard. In this paper we present a coupling of gem5 with SystemC that offers full interoperability between both simulation frameworks, and therefore enables a huge set of possibilities for system level design space exploration. Furthermore, we show that the coupling itself only induces a relatively small overhead to the total execution time of the simulation.
Bibtex
@InProceedings{menard_samos17,
author = {Christian Menard and Matthias Jung and Jeronimo Castrillon and Norbert Wehn},
title = {System Simulation with gem5 and SystemC: The Keystone for Full Interoperability},
booktitle = {Proceedings of the IEEE International Conference on Embedded Computer Systems Architectures Modeling and Simulation (SAMOS)},
year = {2017},
month = jul,
location = {Pythagorion, Greece},
pages = {62--69},
organization = {IEEE},
doi = {10.1109/SAMOS.2017.8344612},
url = {https://ieeexplore.ieee.org/document/8344612/},
isbn = {978-1-5386-3437-0},
abstract = {SystemC TLM based virtual prototypes have become the main tool in industry and research for concurrent hardware and software development, as well as hardware design space exploration. However, there exists a lack of accurate, free, changeable and realistic SystemC models of modern CPUs. Therefore, many researchers use the cycle accurate open source system simulator gem5, which has been developed in parallel to the SystemC standard. In this paper we present a coupling of gem5 with SystemC that offers full interoperability between both simulation frameworks, and therefore enables a huge set of possibilities for system level design space exploration. Furthermore, we show that the coupling itself only induces a relatively small overhead to the total execution time of the simulation.},
}
Downloads
1707_Menard_SAMOS [PDF]
- Andrés Goens, Robert Khasanov, Marcus Hähnel, Till Smejkal, Hermann Härtig, Jeronimo Castrillon, "TETRiS: a Multi-Application Run-Time System for Predictable Execution of Static Mappings", Proceedings of the 20th International Workshop on Software and Compilers for Embedded Systems (SCOPES'17), ACM, pp. 11–20, New York, NY, USA, Jun 2017. [doi] [Bibtex & Downloads]
Bibtex
@InProceedings{goens_scopes17,
author = {Andr{\'e}s Goens and Robert Khasanov and Marcus H{\"a}hnel and Till Smejkal and Hermann H{\"a}rtig and Jeronimo Castrillon},
title = {TETRiS: a Multi-Application Run-Time System for Predictable Execution of Static Mappings},
booktitle = {Proceedings of the 20th International Workshop on Software and Compilers for Embedded Systems (SCOPES'17)},
year = {2017},
month = jun,
series = {SCOPES '17},
isbn = {978-1-4503-5039-6},
location = {Sankt Goar, Germany},
pages = {11--20},
numpages = {10},
url = {http://doi.acm.org/10.1145/3078659.3078663},
doi = {10.1145/3078659.3078663},
acmid = {3078663},
publisher = {ACM},
address = {New York, NY, USA}
}
Downloads
1706_Goens_SCOPES [PDF]
- Gerald Hempel, Andrés Goens, Josefine Asmus, Jeronimo Castrillon, Ivo F. Sbalzarini, "Robust Mapping of Process Networks to Many-Core Systems Using Bio-Inspired Design Centering", Proceedings of the 20th International Workshop on Software and Compilers for Embedded Systems (SCOPES '17), ACM, pp. 21–30, New York, NY, USA, Jun 2017. [doi] [Bibtex & Downloads]
Bibtex
@InProceedings{hempel_scopes17,
author = {Gerald Hempel and Andr{\'e}s Goens and Josefine Asmus and Jeronimo Castrillon and Ivo F. Sbalzarini},
title = {Robust Mapping of Process Networks to Many-Core Systems Using Bio-Inspired Design Centering},
booktitle = {Proceedings of the 20th International Workshop on Software and Compilers for Embedded Systems (SCOPES '17)},
year = {2017},
series = {SCOPES '17},
pages = {21--30},
address = {New York, NY, USA},
month = jun,
publisher = {ACM},
acmid = {3078667},
doi = {10.1145/3078659.3078667},
isbn = {978-1-4503-5039-6},
location = {Sankt Goar, Germany},
numpages = {10},
url = {http://doi.acm.org/10.1145/3078659.3078667}
}
Downloads
1706_Hempel_SCOPES [PDF]
Related Paths
Biological Systems Path, Orchestration Path
- Johanna Sepúlveda, Vania Marangozova-Martin, Jeronimo Castrillon, "Architecture, Languages, Compilation and Hardware support for Emerging ManYcore systems (ALCHEMY): Preface", Elsevier, Jun 2017. [doi] [Bibtex & Downloads]
Bibtex
@Article{sepulveda_alchemy17_preface,
author = {Sep{\'u}lveda, Johanna and Marangozova-Martin, Vania and Castrillon, Jeronimo},
title = {Architecture, Languages, Compilation and Hardware support for Emerging ManYcore systems (ALCHEMY): Preface},
year = {2017},
month = jun,
doi = {10.1016/j.procs.2017.05.276},
url = {http://www.sciencedirect.com/science/article/pii/S1877050917309286},
publisher = {Elsevier}
}
Downloads
No Downloads available for this publication
- Norman A. Rink, Jeronimo Castrillon, "Extending a Compiler Backend for Complete Memory Error Detection", In Proceeding: Lecture Notes in Informatics: Automotive - Safety & Security 2017 (Peter Dencker and Herbert Klenk and Hubert Kelle and Erhard Plödereder), pp. 61–74, May 2017. (Best paper award) [Bibtex & Downloads]
Abstract
Technological advances drive hardware to ever smaller feature sizes, causing devices to become more vulnerable to faults. Applications can be protected against errors resulting from faults by adding error detection and recovery measures in software. This is popularly achieved by applying automatic program transformations. However, transformations applied to intermediate program representations are fundamentally incapable of protecting against vulnerabilities that are introduced during compilation. In particular, the compiler backend may introduce additional memory accesses. This report presents an extended compiler backend that protects these accesses against faults in the memory system. It is demonstrated that this enables the detection of all single bit flips in memory. On a subset of SPEC CINT2006 the runtime overhead caused by the extended backend amounts to 1.50x for the 32-bit processor architecture i386, and 1.13x for the 64-bit architecture x86_64.
Bibtex
@InProceedings{rink_automotive17,
author = {Norman A. Rink and Jeronimo Castrillon},
title = {Extending a Compiler Backend for Complete Memory Error Detection},
booktitle = {Lecture Notes in Informatics: Automotive - Safety \& Security 2017},
editor = {Peter Dencker and Herbert Klenk and Hubert Kelle and Erhard Pl{\"o}dereder},
year = {2017},
pages = {61--74},
month = may,
abstract = {Technological advances drive hardware to ever smaller feature sizes, causing devices to become more vulnerable to faults. Applications can be protected against errors resulting from faults by adding error detection and recovery measures in software. This is popularly achieved by applying automatic program transformations. However, transformations applied to intermediate program representations are fundamentally incapable of protecting against vulnerabilities that are introduced during compilation. In particular, the compiler backend may introduce additional memory accesses. This report presents an extended compiler backend that protects these accesses against faults in the memory system. It is demonstrated that this enables the detection of all single bit flips in memory. On a subset of SPEC CINT2006 the runtime overhead caused by the extended backend amounts to 1.50x for the 32-bit processor architecture i386, and 1.13x for the 64-bit architecture x86\_64.},
isbn = {978-3-88579-663-3},
issn = {1617-5468},
url = {https://dl.gi.de/bitstream/handle/20.500.12116/147/paper04.pdf?sequence=1&isAllowed=y},
comment={Best paper award}
}
Downloads
1705_rink_automotive [PDF]
Related Paths
Orchestration Path, Resilience Path
- Markus Haehnel, Frehiwot Melak Arega, Waltenegus Dargie, Robert Khasanov, Jeronimo Castrillon, "Application Interference Analysis: Towards Energy-efficient Workload Management on Heterogeneous Micro-Server Architectures", Proceedings of the 7th International Workshop on Big Data in Cloud Performance (DCPerf'17), IEEE Conference on Computer Communications Workshops (INFOCOM WKSHPS), pp. 432–437, May 2017. [doi] [Bibtex & Downloads]
Bibtex
@InProceedings{khasanov_dcperf17,
author = {Markus Haehnel and Frehiwot Melak Arega and Waltenegus Dargie and Robert Khasanov and Jeronimo Castrillon},
title = {Application Interference Analysis: Towards Energy-efficient Workload Management on Heterogeneous Micro-Server Architectures},
booktitle = {Proceedings of the 7th International Workshop on Big Data in Cloud Performance (DCPerf'17), IEEE Conference on Computer Communications Workshops (INFOCOM WKSHPS)},
year = {2017},
month = may,
pages={432--437},
doi={10.1109/INFCOMW.2017.8116415},
url = {http://ieeexplore.ieee.org/document/8116415/},
location = {Atlanta, USA}
}
Downloads
1705_Khasanov_DCPerf [PDF]
- Norman A. Rink, Jeronimo Castrillon, "Trading Fault Tolerance for Performance in AN Encoding", Proceedings of the ACM International Conference on Computing Frontiers (CF'17), ACM, pp. 183–190, New York, NY, USA, May 2017. [doi] [Bibtex & Downloads]
Bibtex
@InProceedings{rink_cf17,
author = {Norman A. Rink and Jeronimo Castrillon},
title = {Trading Fault Tolerance for Performance in {AN} Encoding},
booktitle = {Proceedings of the ACM International Conference on Computing Frontiers (CF'17)},
year = {2017},
isbn = {978-1-4503-4487-6},
location = {Siena, Italy},
pages = {183--190},
numpages = {8},
url = {http://doi.acm.org/10.1145/3075564.3075565},
doi = {10.1145/3075564.3075565},
acmid = {3075565},
publisher = {ACM},
address = {New York, NY, USA},
month = may,
}
Downloads
1705_Rink_cf [PDF]
Related Paths
Orchestration Path, Resilience Path
- Lars Schütze, Jeronimo Castrillon, "Analyzing State-of-the-Art Role-based Programming Languages", Proceedings of the First International Conference on the Art, Science and Engineering of Programming (Programming'17), ACM, pp. 9:1–9:6, New York, NY, USA, Apr 2017. [doi] [Bibtex & Downloads]
Bibtex
@InProceedings{schuetze_lassy17,
author = {Lars Sch{\"u}tze and Jeronimo Castrillon},
title = {Analyzing State-of-the-Art Role-based Programming Languages},
booktitle = {Proceedings of the First International Conference on the Art, Science and Engineering of Programming (Programming'17)},
series = {Programming '17},
year = {2017},
month = apr,
isbn = {978-1-4503-4836-2},
location = {Brussels, Belgium},
pages = {9:1--9:6},
articleno = {9},
numpages = {6},
url = {http://doi.acm.org/10.1145/3079368.3079386},
doi = {10.1145/3079368.3079386},
acmid = {3079386},
publisher = {ACM},
address = {New York, NY, USA},
}
Downloads
1704_Schuetze_lassy [PDF]
- Fazal Hameed, Jeronimo Castrillon, "Rethinking On-chip DRAM Cache for Simultaneous Performance and Energy Optimization", Proceedings of the 2017 Design, Automation and Test in Europe conference (DATE), EDA Consortium, pp. 362–367, Mar 2017. [doi] [Bibtex & Downloads]
Abstract
State-of-the-art DRAM cache employs a small Tag-Cache and its performance is dependent upon two important parameters namely bank-level-parallelism and Tag-Cache hit rate. These parameters depend upon the row buffer organization. Recently, it has been shown that a small row buffer organization delivers better performance via improved bank-level-parallelism than the traditional large row buffer organization along with energy benefits. However, small row buffers do not fully exploit the temporal locality of tag accesses, leading to reduced Tag-Cache hit rates. As a result, the DRAM cache needs to be re-designed for small row buffer organization to achieve additional performance benefits. In this paper, we propose a novel tag-store mechanism that improves the Tag-Cache hit rate by 70% compared to existing DRAM tag-store mechanisms employing small row buffer organization. In addition, we enhance the DRAM cache controller with novel policies that take into account the locality characteristics of cache accesses. We evaluate our novel tag-store mechanism and controller policies in an 8-core system running the SPEC2006 benchmark and compare their performance and energy consumption against recent proposals. Our architecture improves the average performance by 21.2% and 11.4% respectively compared to large and small row buffer organizations via simultaneously improving both parameters. Compared to DRAM cache with large row buffer organization, we report an energy improvement of 62%.
Bibtex
@InProceedings{hameed_date17,
author = {Fazal Hameed and Jeronimo Castrillon},
title = {Rethinking On-chip DRAM Cache for Simultaneous Performance and Energy Optimization},
booktitle = {Proceedings of the 2017 Design, Automation and Test in Europe conference (DATE)},
year = {2017},
series = {DATE '17},
pages = {362--367},
month = mar,
publisher = {EDA Consortium},
abstract = {State-of-the-art DRAM cache employs a small Tag-Cache and its performance is dependent upon two important parameters namely bank-level-parallelism and Tag-Cache hit rate. These parameters depend upon the row buffer organization. Recently, it has been shown that a small row buffer organization delivers better performance via improved bank-level-parallelism than the traditional large row buffer organization along with energy benefits. However, small row buffers do not fully exploit the temporal locality of tag accesses, leading to reduced Tag-Cache hit rates. As a result, the DRAM cache needs to be re-designed for small row buffer organization to achieve additional performance benefits. In this paper, we propose a novel tag-store mechanism that improves the Tag-Cache hit rate by 70\% compared to existing DRAM tag-store mechanisms employing small row buffer organization. In addition, we enhance the DRAM cache controller with novel policies that take into account the locality characteristics of cache accesses. We evaluate our novel tag-store mechanism and controller policies in an 8-core system running the SPEC2006 benchmark and compare their performance and energy consumption against recent proposals. Our architecture improves the average performance by 21.2\% and 11.4\% respectively compared to large and small row buffer organizations via simultaneously improving both parameters. Compared to DRAM cache with large row buffer organization, we report an energy improvement of 62\%.},
isbn = {978-3-9815370-8-6},
doi={10.23919/DATE.2017.7927017},
url = {http://ieeexplore.ieee.org/document/7927017/},
location = {Lausanne, Switzerland}
}
Downloads
1703_Hameed_DATE [PDF]
- Norman A. Rink, Jeronimo Castrillon, "flexMEDiC: flexible Memory Error Detection by Combined data encoding and duplication", Proceedings of the 2nd International Workshop on Resiliency in Embedded Electronic Systems (REES), co-located with DATE 2017, pp. 15–22, Mar 2017. [Bibtex & Downloads]
Abstract
Errors in memory are known to be a major cause of system failures. Moreover, it has recently been found that single-error correcting, double-error detecting (SECDED) codes, which are widely used in ECC memory modules, are incapable of handling large fractions of errors that occur in practice. This calls for more powerful error detection measures. However, the higher the number of bit flips that can still be detected as an error, the larger the memory overhead. Cost considerations and the varying needs for reliability of different applications may not always warrant laying down extra hardware to accommodate overheads. Software-implemented error detection offers a flexible alternative. In this work we propose the software-implemented flexMEDiC scheme for detecting errors in the memory system, including main memory, on-chip caches, and load-store queues. It is shown that single and double bit flips are detected by flexMEDiC, and evidence is given that suggests that up to five bit flips within a single data word can still be detected as errors. The average runtime overhead incurred by flexMEDiC is 1.55x.
Bibtex
@InProceedings{rees:2017,
author = {Norman A. Rink and Jeronimo Castrillon},
title = {{flexMEDiC}: flexible {M}emory {E}rror {D}etection by Combined data encoding and duplication},
booktitle = {Proceedings of the 2nd International Workshop on Resiliency in Embedded Electronic Systems (REES), co-located with DATE 2017},
year = {2017},
month = mar,
pages = {15--22},
abstract = {Errors in memory are known to be a major cause of system failures. Moreover, it has recently been found that single-error correcting, double-error detecting (SECDED) codes, which are widely used in ECC memory modules, are incapable of handling large fractions of errors that occur in practice. This calls for more powerful error detection measures. However, the higher the number of bit flips that can still be detected as an error, the larger the memory overhead. Cost considerations and the varying needs for reliability of different applications may not always warrant laying down extra hardware to accommodate overheads. Software-implemented error detection offers a flexible alternative. In this work we propose the software-implemented flexMEDiC scheme for detecting errors in the memory system, including main memory, on-chip caches, and load-store queues. It is shown that single and double bit flips are detected by flexMEDiC, and evidence is given that suggests that up to five bit flips within a single data word can still be detected as errors. The average runtime overhead incurred by flexMEDiC is 1.55x.},
}
Downloads
1703_Rink_REES [PDF]
- Jeronimo Castrillon, "Programming for adaptive and energy-efficient computing", In International Conference on High Performance Compilation, Computing and Communications (HP3C-2017) (keynote), Mar 2017. [Bibtex & Downloads]
Programming for adaptive and energy-efficient computing
Reference
Jeronimo Castrillon, "Programming for adaptive and energy-efficient computing", In International Conference on High Performance Compilation, Computing and Communications (HP3C-2017) (keynote), Mar 2017.
Bibtex
@Misc{castrillon2017hp3c,
author = {Castrillon, Jeronimo},
title = {Programming for adaptive and energy-efficient computing},
howpublished = {International Conference on High Performance Compilation, Computing and Communications (HP3C-2017) (keynote)},
month = mar,
year = {2017},
location = {Kuala Lumpur, Malaysia}
}
Downloads
170323_castrill_hp3c [PDF]
- Andrés Goens, Jeronimo Castrillon, "Optimizing for Data-Parallelism in Kahn Process Networks", In Proceeding: ACM SRC at International Symposium on Code Generation and Optimization (CGO), Feb 2017. [Bibtex & Downloads]
Optimizing for Data-Parallelism in Kahn Process Networks
Reference
Andrés Goens, Jeronimo Castrillon, "Optimizing for Data-Parallelism in Kahn Process Networks", In Proceeding: ACM SRC at International Symposium on Code Generation and Optimization (CGO), Feb 2017.
Bibtex
@inproceedings{goens17cgo,
author = {Andr\'{e}s Goens and Jeronimo Castrillon},
title = {Optimizing for Data-Parallelism in Kahn Process Networks},
year = {2017},
month = feb,
booktitle = {ACM SRC at International Symposium on Code Generation and Optimization (CGO)},
location = {Austin, TX, USA},
}
Downloads
1701_Goens_SRCCGO [PDF]
- Jeronimo Castrillon, "On Mapping to Multi/Manycores", In 10th International Workshop on Programmability and Architectures for Heterogeneous Multicores (MULTIPROG-2017), held in conjunction with the 12th International Conference on High-Performance and Embedded Architectures and Compilers (HiPEAC) (invited talk), Jan 2017. [Bibtex & Downloads]
On Mapping to Multi/Manycores
Reference
Jeronimo Castrillon, "On Mapping to Multi/Manycores", In 10th International Workshop on Programmability and Architectures for Heterogeneous Multicores (MULTIPROG-2017), held in conjunction with the 12th International Conference on High-Performance and Embedded Architectures and Compilers (HiPEAC) (invited talk), Jan 2017.
Bibtex
@Misc{castrillon2017multiprog,
author = {Castrillon, Jeronimo},
title = {On Mapping to Multi/Manycores},
howpublished = {10th International Workshop on Programmability and Architectures for Heterogeneous Multicores (MULTIPROG-2017), held in conjunction with the 12th International Conference on High-Performance and Embedded Architectures and Compilers (HiPEAC) (invited talk)},
month = jan,
year = {2017},
location = {Stockholm, Sweden}
}
Downloads
No Downloads available for this publication
- Jeronimo Castrillon, "Flexible and Scalable Dataflow Programming for Manycores", In Tutorial for heterogeneous multicore design automation: current and future, held in conjunction with the 12th International Conference on High-Performance and Embedded Architectures and Compilers (HiPEAC) (invited talk), Jan 2017. [Bibtex & Downloads]
Flexible and Scalable Dataflow Programming for Manycores
Reference
Jeronimo Castrillon, "Flexible and Scalable Dataflow Programming for Manycores", In Tutorial for heterogeneous multicore design automation: current and future, held in conjunction with the 12th International Conference on High-Performance and Embedded Architectures and Compilers (HiPEAC) (invited talk), Jan 2017.
Bibtex
@Misc{castrillon2017hipeactut,
author = {Castrillon, Jeronimo},
title = {Flexible and Scalable Dataflow Programming for Manycores},
howpublished = {Tutorial for heterogeneous multicore design automation: current and future, held in conjunction with the 12th International Conference on High-Performance and Embedded Architectures and Compilers (HiPEAC) (invited talk)},
month = jan,
year = {2017},
location = {Stockholm, Sweden}
}
Downloads
No Downloads available for this publication