Anderson Faustino da Silva
Email: anderson.faustino_da_silva@tu-dresden.de
Phone: +49 (0)351 463 43726
Visitor's Address: Helmholtzstrasse 18, 3rd floor, BAR III60, 01069 Dresden
Anderson Faustino da Silva is a professor at the State University of Maringá, specializing in code generation and machine learning compilers. He holds a Bachelor's degree in Computer Science from the Western Paraná State University (1999) and earned both a Master's and PhD in Systems and Computing Engineering from COPPE/UFRJ (2003, 2006). Since May 2024, he has been a guest researcher at the Chair for Compiler Construction, focusing on machine learning compilers.
2026
- Anderson Faustino da Silva, Sérgio Queiroz de Medeiros, Marcelo Borges Nogueira, Jeronimo Castrillon, Fernando Magno Quintão Pereira, "On the Precision of Dynamic Program Fingerprints based on Performance Counters" (to appear), Proceedings of the 24th IEEE/ACM International Symposium on Code Generation and Optimization (CGO 2026), Association for Computing Machinery, New York, NY, USA, Feb 2026.
Bibtex
@InProceedings{dasilva_cgo26,
author = {Anderson Faustino da Silva and Sérgio Queiroz de Medeiros and Marcelo Borges Nogueira and Jeronimo Castrillon and Fernando Magno Quintão Pereira},
booktitle = {Proceedings of the 24th IEEE/ACM International Symposium on Code Generation and Optimization (CGO 2026)},
title = {On the Precision of Dynamic Program Fingerprints based on Performance Counters},
location = {Sydney, Australia},
publisher = {Association for Computing Machinery},
series = {CGO '26},
address = {New York, NY, USA},
month = feb,
year = {2026},
note = {to appear},
}

Downloads
No Downloads available for this publication
2025
- Anderson Faustino da Silva, Hamid Farzaneh, Joao Paulo Cardoso De Lima, Asif Ali Khan, Jeronimo Castrillon, "LearnCNM2Predict: Transfer Learning-based Performance Model for CNM Systems", Proceedings of the 25th International Conference on Embedded Computer Systems: Architectures, Modeling, and Simulation (SAMOS), Springer-Verlag, Berlin, Heidelberg, Jul 2025. (Best paper award candidate)
Abstract
Compute-near-memory (CNM) architectures have emerged as a promising solution to address the von Neumann bottleneck by relocating computation closer to memory and utilizing dedicated logic near memory arrays or banks. Despite their early stage of development, these architectures have demonstrated significant performance improvements over traditional CPU and GPU systems in various application domains. CNM architectures tend to excel in memory-bound workloads that exhibit high levels of data-level parallelism. However, identifying which kernels can take advantage of CNM execution poses a considerable challenge for software developers. This paper introduces a transfer learning approach for predicting performance on CNM systems. Our method harnesses knowledge from previously analyzed applications to enhance prediction accuracy for new, unseen applications, thereby reducing the necessity for extensive training data for each application. We have developed a feature extraction framework that captures CNM-specific computation and memory access patterns, which are crucial for determining performance. Experimental results demonstrate that our transfer learning model achieves high prediction accuracy across diverse application domains, showcasing robust generalization even in scenarios with limited training data.
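A minimal sketch of the transfer idea described above, assuming hypothetical numeric features and a plain linear model (the paper's feature extraction framework and predictor are more elaborate): pre-train on a data-rich source domain, then fine-tune on a handful of target samples by regularizing toward the pre-trained weights rather than toward zero.

```python
import numpy as np

rng = np.random.default_rng(0)

def fit_ridge(X, y, w0=None, reg=1.0):
    """Ridge regression; when w0 is given, shrink toward the pre-trained
    weights instead of toward zero (a simple form of transfer)."""
    n, d = X.shape
    if w0 is None:
        w0 = np.zeros(d)
    A = X.T @ X + reg * np.eye(d)
    b = X.T @ y + reg * w0
    return np.linalg.solve(A, b)

def mse(w, X, y):
    return float(np.mean((X @ w - y) ** 2))

# Source domain: plenty of labeled (features, runtime) pairs.
X_src = rng.normal(size=(200, 4))
w_true = np.array([2.0, -1.0, 0.5, 3.0])
y_src = X_src @ w_true + rng.normal(scale=0.1, size=200)

# Target domain: only a few samples, slightly shifted behavior.
X_tgt = rng.normal(size=(8, 4))
y_tgt = X_tgt @ (w_true + 0.3) + rng.normal(scale=0.1, size=8)

w_pre = fit_ridge(X_src, y_src)                           # pre-train on source
w_scratch = fit_ridge(X_tgt, y_tgt, reg=5.0)              # few-shot, no transfer
w_transfer = fit_ridge(X_tgt, y_tgt, w0=w_pre, reg=5.0)   # fine-tune from source

X_test = rng.normal(size=(100, 4))
y_test = X_test @ (w_true + 0.3)
print("scratch :", mse(w_scratch, X_test, y_test))
print("transfer:", mse(w_transfer, X_test, y_test))
```

With only eight target samples, shrinking toward the pre-trained weights typically beats fitting from scratch, which is the intuition behind reusing knowledge from previously analyzed applications.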
Bibtex
@InProceedings{dasilva_samos25,
author = {Anderson Faustino da Silva and Hamid Farzaneh and Joao Paulo Cardoso De Lima and Asif Ali Khan and Jeronimo Castrillon},
booktitle = {Proceedings of the 25th International Conference on Embedded Computer Systems: Architectures, Modeling, and Simulation (SAMOS)},
date = {2025-07},
title = {{LearnCNM2Predict}: Transfer Learning-based Performance Model for CNM Systems},
location = {Samos, Greece},
organization = {IEEE},
publisher = {Springer-Verlag},
address = {Berlin, Heidelberg},
month = jul,
numpages = {17},
year = {2025},
abstract = {Compute-near-memory (CNM) architectures have emerged as a promising solution to address the von Neumann bottleneck by relocating computation closer to memory and utilizing dedicated logic near memory arrays or banks. Despite their early stage of development, these architectures have demonstrated significant performance improvements over traditional CPU and GPU systems in various application domains. CNM architectures tend to excel in memory-bound workloads that exhibit high levels of data-level parallelism. However, identifying which kernels can take advantage of CNM execution poses a considerable challenge for software developers. This paper introduces a transfer learning approach for predicting performance on CNM systems. Our method harnesses knowledge from previously analyzed applications to enhance prediction accuracy for new, unseen applications, thereby reducing the necessity for extensive training data for each application. We have developed a feature extraction framework that captures CNM-specific computation and memory access patterns, which are crucial for determining performance. Experimental results demonstrate that our transfer learning model achieves high prediction accuracy across diverse application domains, showcasing robust generalization even in scenarios with limited training data.},
}

Downloads
2507_daSilva_SAMOS [PDF]
- Anderson Faustino da Silva, Jeronimo Castrillon, Fernando Magno Quintão Pereira, "A Comparative Study on the Accuracy and the Speed of Static and Dynamic Program Classifiers", Proceedings of the 34th ACM SIGPLAN International Conference on Compiler Construction (CC 2025), Association for Computing Machinery, pp. 13–24, New York, NY, USA, Mar 2025. [doi]
Abstract
Classifying programs based on their tasks is essential in fields such as plagiarism detection, malware analysis, and software auditing. Traditionally, two classification approaches exist: static classifiers analyze program syntax, while dynamic classifiers observe their execution. Although dynamic analysis is regarded as more precise, it is often considered impractical due to high overhead, leading the research community to largely dismiss it. In this paper, we revisit this perception by comparing static and dynamic analyses using the same classification representation: opcode histograms. We show that dynamic histograms—generated from instructions actually executed—are only marginally (4-5%) more accurate than static histograms in non-adversarial settings. However, if an adversary is allowed to obfuscate programs, the accuracy of the dynamic classifier is twice higher than the static one, due to its ability to avoid observing dead-code. Obtaining dynamic histograms with a state-of-the-art Valgrind-based tool incurs an 85x slowdown; however, once we account for the time to produce the representations for static analysis of executables, the overall slowdown reduces to 4x: a result significantly lower than previously reported in the literature.
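As an illustration of the opcode-histogram representation that both classifiers share, here is a toy sketch. The instruction listings are hypothetical; real histograms would be derived from static disassembly of the executable or from a Valgrind-style trace of the instructions actually executed.

```python
from collections import Counter

def opcode_histogram(instructions):
    """Normalized frequency of each opcode mnemonic."""
    counts = Counter(op for op, *_ in instructions)
    total = sum(counts.values())
    return {op: n / total for op, n in counts.items()}

# Toy "static" listing vs. "dynamic" trace of the same program: the
# mul/div branch is dead code, so it appears in the static listing but
# never in the executed trace.
static_listing = [("mov",), ("add",), ("cmp",), ("jne",),
                  ("mul",), ("div",), ("mov",), ("ret",)]
dynamic_trace  = [("mov",), ("add",), ("cmp",), ("jne",),
                  ("mov",), ("ret",)]

h_static = opcode_histogram(static_listing)
h_dynamic = opcode_histogram(dynamic_trace)

print("mul frequency, static :", h_static.get("mul", 0.0))   # nonzero
print("mul frequency, dynamic:", h_dynamic.get("mul", 0.0))  # 0.0
```

The dynamic histogram's blindness to dead code is what makes it harder to fool with obfuscation, as the paper's adversarial experiments show.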
Bibtex
@InProceedings{dasilva_cc25,
author = {Anderson Faustino da Silva and Jeronimo Castrillon and Fernando Magno Quint\~{a}o Pereira},
booktitle = {Proceedings of the 34th ACM SIGPLAN International Conference on Compiler Construction (CC 2025)},
title = {A Comparative Study on the Accuracy and the Speed of Static and Dynamic Program Classifiers},
doi = {10.1145/3708493.3712680},
isbn = {9798400714078},
location = {Las Vegas, NV, USA},
pages = {13--24},
publisher = {Association for Computing Machinery},
series = {CC 2025},
url = {https://doi.org/10.1145/3708493.3712680},
abstract = {Classifying programs based on their tasks is essential in fields such as plagiarism detection, malware analysis, and software auditing. Traditionally, two classification approaches exist: static classifiers analyze program syntax, while dynamic classifiers observe their execution. Although dynamic analysis is regarded as more precise, it is often considered impractical due to high overhead, leading the research community to largely dismiss it. In this paper, we revisit this perception by comparing static and dynamic analyses using the same classification representation: opcode histograms. We show that dynamic histograms---generated from instructions actually executed---are only marginally (4-5\%) more accurate than static histograms in non-adversarial settings. However, if an adversary is allowed to obfuscate programs, the accuracy of the dynamic classifier is twice higher than the static one, due to its ability to avoid observing dead-code. Obtaining dynamic histograms with a state-of-the-art Valgrind-based tool incurs an 85x slowdown; however, once we account for the time to produce the representations for static analysis of executables, the overall slowdown reduces to 4x: a result significantly lower than previously reported in the literature.},
address = {New York, NY, USA},
month = mar,
numpages = {11},
year = {2025},
}

Downloads
2503_daSilva_CC [PDF]
- Alexander Brauckmann, Anderson Faustino da Silva, Gabriel Synnaeve, Michael F. P. O'Boyle, Jeronimo Castrillon, Hugh Leather, "DFA-Net: A Compiler-Specific Neural Architecture for Robust Generalization in Data Flow Analyses", Proceedings of the 34th ACM SIGPLAN International Conference on Compiler Construction (CC 2025), Association for Computing Machinery, pp. 92–103, New York, NY, USA, Mar 2025. [doi]
Abstract
Data flow analysis is fundamental to modern program optimization and verification, serving as a critical foundation for compiler transformations. As machine learning increasingly drives compiler tasks, the need for models that can implicitly understand and correctly reason about data flow properties becomes crucial for maintaining soundness. State-of-the-art machine learning methods, especially graph neural networks (GNNs), face challenges in generalizing beyond training scenarios due to their limited ability to perform large propagations. We present DFA-Net, a neural network architecture tailored for compilers that systematically generalizes. It emulates the reasoning process of compilers, facilitating the generalization of data flow analyses from simple to complex programs. The architecture decomposes data flow analyses into specialized neural networks for initialization, transfer, and meet operations, explicitly incorporating compiler-specific knowledge into the model design. We evaluate DFA-Net on a data flow analysis benchmark from related work and demonstrate that our compiler-specific neural architecture can learn and systematically generalize on this task. DFA-Net demonstrates superior performance over traditional GNNs in data flow analysis, achieving F1 scores of 0.761 versus 0.009 for data dependencies and 0.989 versus 0.196 for dominators at high complexity levels, while maintaining perfect scores for liveness and reachability analyses where GNNs struggle significantly.
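For context, DFA-Net's three learned components mirror the classical iterative data flow schema of initialization, transfer, and meet. A toy sketch of that classical schema, shown for reachability on a hypothetical control flow graph (the paper replaces the three operations with specialized neural networks):

```python
def dataflow(cfg, init, transfer, meet):
    """Generic forward worklist solver; cfg maps node -> successor list."""
    preds = {n: [] for n in cfg}
    for n, succs in cfg.items():
        for s in succs:
            preds[s].append(n)
    out = {n: init(n) for n in cfg}
    work = list(cfg)
    while work:
        n = work.pop()
        # Meet over predecessors, then apply the node's transfer function.
        in_n = meet([out[p] for p in preds[n]]) if preds[n] else init(n)
        new = transfer(n, in_n)
        if new != out[n]:
            out[n] = new
            work.extend(cfg[n])  # re-examine successors until fixpoint
    return out

# Reachability from the entry node: init marks only the entry reachable,
# transfer propagates the flag, meet is a logical OR over predecessors.
cfg = {"entry": ["a"], "a": ["b", "c"], "b": ["d"], "c": ["d"],
       "d": [], "dead": ["d"]}  # "dead" has no path from entry
reach = dataflow(
    cfg,
    init=lambda n: n == "entry",
    transfer=lambda n, x: x or n == "entry",
    meet=lambda xs: any(xs),
)
print(reach)  # "dead" stays unreachable; all other nodes become reachable
```

Decomposing the solver this way is exactly what lets the neural version swap in a learned init, transfer, and meet per analysis while keeping the propagation structure fixed.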
Bibtex
@InProceedings{brauckmann_cc25,
author = {Alexander Brauckmann and Anderson Faustino da Silva and Gabriel Synnaeve and Michael F. P. O'Boyle and Jeronimo Castrillon and Hugh Leather},
booktitle = {Proceedings of the 34th ACM SIGPLAN International Conference on Compiler Construction (CC 2025)},
title = {{DFA-Net}: A Compiler-Specific Neural Architecture for Robust Generalization in Data Flow Analyses},
doi = {10.1145/3708493.3712687},
isbn = {9798400714078},
location = {Las Vegas, NV, USA},
pages = {92--103},
publisher = {Association for Computing Machinery},
series = {CC 2025},
url = {https://doi.org/10.1145/3708493.3712687},
abstract = {Data flow analysis is fundamental to modern program optimization and verification, serving as a critical foundation for compiler transformations. As machine learning increasingly drives compiler tasks, the need for models that can implicitly understand and correctly reason about data flow properties becomes crucial for maintaining soundness. State-of-the-art machine learning methods, especially graph neural networks (GNNs), face challenges in generalizing beyond training scenarios due to their limited ability to perform large propagations. We present DFA-Net, a neural network architecture tailored for compilers that systematically generalizes. It emulates the reasoning process of compilers, facilitating the generalization of data flow analyses from simple to complex programs. The architecture decomposes data flow analyses into specialized neural networks for initialization, transfer, and meet operations, explicitly incorporating compiler-specific knowledge into the model design. We evaluate DFA-Net on a data flow analysis benchmark from related work and demonstrate that our compiler-specific neural architecture can learn and systematically generalize on this task. DFA-Net demonstrates superior performance over traditional GNNs in data flow analysis, achieving F1 scores of 0.761 versus 0.009 for data dependencies and 0.989 versus 0.196 for dominators at high complexity levels, while maintaining perfect scores for liveness and reachability analyses where GNNs struggle significantly.},
address = {New York, NY, USA},
month = mar,
numpages = {11},
year = {2025},
}

Downloads
2503_Brauckmann_CC [PDF]



