The dataflow monotone framework [2] is the classical mathematical model used to define and compute dataflow analyses (e.g., reaching definitions, live variables, constant propagation). It was introduced by Gary Kildall in his 1973 work on global program optimization [1] and remains the foundation of most compiler analyses today. Kildall observed that many compiler analyses can be formulated as systems of equations over program points. In this formulation, program properties are modeled as elements of a lattice, which enables the ordering of information (in terms of precision) and its combination via join or meet operations. The resulting systems of equations are solved by iterative fixed-point computation until convergence. Rather than designing a specialized algorithm for each analysis, Kildall introduced a general framework: by instantiating a lattice and a set of transfer functions, one obtains a concrete analysis. Modern systems, such as MLIR and abstract interpretation frameworks, are still deeply rooted in Kildall's ideas. References: [1] Gary Kildall: A unified approach to global program optimization. POPL 1973. https://lnkd.in/d3r7rr82 (Available in the ACM Digital Library) [2] Anders Møller and Michael I. Schwartzbach: Lecture Notes on Static Analysis. Aarhus University, 2025.
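Kildall's recipe fits in a few lines of code. Below is a minimal Python sketch (all names and the example CFG are illustrative) of the worklist algorithm, instantiated for reaching definitions: the lattice is the powerset of definitions, the join is set union, and each transfer function computes out = (in − kill) ∪ gen.

```python
from typing import Callable, Dict, List, Set

def kildall(
    nodes: List[str],
    preds: Dict[str, List[str]],
    transfer: Dict[str, Callable[[Set[str]], Set[str]]],
) -> Dict[str, Set[str]]:
    """Iterate to a fixed point over the powerset lattice (join = set union)."""
    out: Dict[str, Set[str]] = {n: set() for n in nodes}  # start at bottom
    worklist = list(nodes)
    while worklist:
        n = worklist.pop()
        new_out = transfer[n](set().union(*(out[p] for p in preds[n])))
        if new_out != out[n]:
            out[n] = new_out
            # Re-examine every successor of n, since its input just grew.
            worklist.extend(s for s in nodes if n in preds[s])
    return out

# Reaching definitions for:  b1: x=1 ; b2: y=2 ; b3 (loops back to b2): x=3
gen  = {"b1": {"x@b1"}, "b2": {"y@b2"}, "b3": {"x@b3"}}
kill = {"b1": {"x@b3"}, "b2": set(), "b3": {"x@b1"}}
transfer = {n: (lambda s, n=n: (s - kill[n]) | gen[n]) for n in gen}
preds = {"b1": [], "b2": ["b1", "b3"], "b3": ["b2"]}
result = kildall(["b1", "b2", "b3"], preds, transfer)
# Both definitions of x (from b1 and b3) reach the entry of b2.
```

Swapping in a different lattice and different transfer functions yields liveness, constant propagation, and so on; only the instantiation changes, which is exactly Kildall's point.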
Compilers Lab
Research
Belo Horizonte, Minas Gerais 8,059 followers
Research in several areas related to the design and implementation of static analyses and code optimizations.
About us
The Compilers Lab is part of the Department of Computer Science of the Federal University of Minas Gerais. Members of the Compilers Lab do research in several areas related to the design and implementation of static analyses and code optimizations. The main research lines are listed below. - Software Security - Automatic Parallelization - Static Analyses The lab has enjoyed several partnerships with companies, including Intel (2013-2015), NVIDIA (2011-2013), LG Electronics (2015-2017), Maxtrack (2015-2016), Cyral (2019-today) and Prodemge (2016-2017), for instance. Alumni work today at places such as Microsoft, ARM, Google, Apple, Cyral and ShiftLeft. We are always looking for enthusiastic students, and more partnerships with other companies.
- Website
- http://lac.dcc.ufmg.br/
- Industry
- Research
- Company size
- 11-50 employees
- Headquarters
- Belo Horizonte, Minas Gerais
- Type
- Government Agency
- Founded
- 2015
- Specialties
- Compilers, Machine Code Optimization, Algorithm Parallelization, and Compilation Theory
Locations
-
Primary
Rua Reitor Píres Albuquerque, ICEx
Belo Horizonte, Minas Gerais 31270-901, BR
Updates
-
Zhen Yan (Tsinghua University) and his colleagues have released VeriEQ [1]. VeriEQ addresses the problem of Behavioral Deviation Bugs (BDBs) in Verilog simulators and synthesizers. These are tools that translate or simulate hardware descriptions. If these tools are buggy (e.g., due to optimization or implementation flaws), they can silently produce incorrect results. To find bugs, VeriEQ uses metamorphic testing: it generates multiple Verilog programs that are semantically equivalent (they compute the same function), feeds them to the same tool, and checks whether the outputs match. A mismatch indicates a bug. To make this practical, VeriEQ introduces three innovations. First, it analyzes 54 historical BDBs to create a template-driven generator that biases seed code toward bug-revealing patterns (shallow depth, x-initialized inputs, case statements, shifts, bit-width mismatches). Second, it defines semantics-aware equivalence rules that respect Verilog's bit-width and signedness inference (e.g., (a+0)=a holds only when the constant's width and signedness match the context). Third, it uses inlined deviation checking: all equivalent modules are instantiated in a single testbench with shared stimuli, comparing outputs in parallel for a 138–4161% speedup. The final result is impressive: on four open-source tools, VeriEQ found 33 previously unknown bugs (29 BDBs) in 48 hours. VeriEQ is particularly interesting to our group because we have been working on different approaches to find bugs in EDA tools, as part of a partnership we have with Cadence. As a result of this cooperation, João Victor Amorim Vieira and Luiza de Melo Gomes have released ChiGen [2], a tool that generates random Verilog designs. Like VeriEQ, ChiGen has also found dozens of bugs in actual EDA tools. ChiGen is trained on a large collection of Verilog benchmarks called ChiBench [3]. That's a collection of 50 thousand designs mined from GitHub repositories with permissive licenses.
ChiBench can be freely used by anyone trying to analyze, understand or optimize Verilog designs. References: [1] Zhen Yan, Yuanliang Chen, Fuchen Ma, Zehong Yu, Dalong Shi and Hu Jiang. VeriEQ: Finding Verilog Simulators and Synthesizers Bugs with Equivalence Circuit Transformation. OOPSLA 2026. 127:1-127:29 [2] João Victor Amorim Vieira, Luiza de Melo Gomes, Rafael Sumitani, Raissa Maciel, Augusto Mafra, Mirlaine Crepalde, Fernando Pereira: Bottom-Up Generation of Verilog Designs for Testing EDA Tools. CoRR abs/2504.06295 (2025) - https://lnkd.in/d2kS2Upg [3] Rafael Sumitani, João Victor Amorim, Augusto Mafra, Mirlaine Crepalde, Fernando Magno Quintão Pereira: ChiBench: a Benchmark Suite for Testing Electronic Design Automation Tools. CoRR abs/2406.06550 (2024) - https://lnkd.in/dsGE_wB4 Code: [ChiGen] https://lnkd.in/eARXe2F5
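To make the metamorphic-testing idea concrete, here is a toy Python sketch (not VeriEQ's actual code): the "tool under test" is a deliberately buggy constant folder standing in for a Verilog simulator, two equivalent inputs go through it, and any output mismatch flags a bug.

```python
def buggy_fold(expr: str) -> str:
    # Hypothetical tool under test: it soundly folds "a + 0" to "a", but
    # also (incorrectly) folds "a << 0" to "0" -- the kind of flaw that
    # metamorphic testing is designed to expose.
    expr = expr.replace("+ 0", "").strip()
    return "0" if expr.endswith("<< 0") else expr

def evaluate(expr: str, a: int) -> int:
    # Stand-in for simulating the design on one input stimulus.
    return eval(expr, {"a": a})

def check(original: str, variant: str, stimuli) -> bool:
    """Metamorphic oracle: equivalent inputs must yield equal outputs."""
    return all(
        evaluate(buggy_fold(original), a) == evaluate(buggy_fold(variant), a)
        for a in stimuli
    )

print(check("a + 0", "a", range(8)))        # the sound rewrite passes
print(check("a << 0", "a", range(1, 8)))    # the unsound fold is caught
```

Real Verilog tools are checked the same way, except that "equivalent inputs" come from semantics-aware rewrite rules and "equal outputs" means matching simulation traces.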
-
Leandro Lacerda, a PhD student in Computer Science at Departamento de Ciência da Computação - UFMG, has recently contributed a new feature to the LLVM Project that enables elapsed-time measurement between GPU events in the LLVM Offload stack. This contribution adds support for elapsed-time queries both externally, through the LLVM Offload API, and internally, through the plugin interface and backend implementations. The new functionality makes it possible to measure the execution time of kernel launches and other queue-submitted work in a reliable and consistent way across GPU backends. The main implementation challenge was in the AMDGPU backend. Before this patch, AMDGPU events supported synchronization and completion queries, but they were not associated with a real timed marker that could later be queried for elapsed time. Leandro’s work addressed this limitation by making event recording enqueue a real barrier packet on the HSA queue and retain its completion signal, allowing the runtime to query hardware dispatch timestamps. The patch also added the corresponding support for CUDA. This contribution is particularly relevant to Leandro’s PhD research project, which investigates GPU math functions and requires robust benchmarking infrastructure to support reliable and reproducible performance evaluation across libraries and hardware platforms. It also improves the LLVM runtime layer used by both the Offload API and OpenMP offloading. Read more about the contribution here: https://lnkd.in/d3JFrSY6
-
Local register allocation is the problem of assigning machine registers to variables within a single basic block of a control-flow graph (a straight-line sequence of instructions with one entry and one exit). The allocator decides which values stay in registers and which must be spilled to memory, aiming to minimize loads and stores. This problem is important because, when registers are assumed to be identical and indivisible, local register allocation can be solved efficiently (in polynomial time). This makes it attractive for performance-sensitive compilation pipelines. Because it is fast and effective, variants of local register allocation are widely used in just-in-time (JIT) compilers, where compilation speed is critical. They also appear in trace compilers, where programs are optimized along linear execution traces. Since these traces resemble straight-line paths (or trees of paths), local allocation techniques map naturally to them, providing good code quality at low compilation cost. An implementation of local register allocation in LLVM: https://lnkd.in/dnVab6uC A survey on register allocation: https://lnkd.in/d4e6p6Xf #compiler #programming #llvm
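The polynomial-time flavor of the problem is easy to see in code. The sketch below (illustrative, not taken from LLVM) counts the loads a straight-line block needs when spill victims are chosen by the classic furthest-next-use heuristic; instructions are modeled simply as the variables they read, in order.

```python
def allocate(uses, num_regs):
    """Count the loads issued when every use must sit in a register."""
    in_regs = set()
    loads = 0
    for i, var in enumerate(uses):
        if var in in_regs:
            continue                    # already resident: no memory traffic
        loads += 1                      # bring the value into a register
        if len(in_regs) == num_regs:    # all registers busy: evict one
            def next_use(v):
                rest = uses[i + 1:]
                return rest.index(v) if v in rest else len(rest)
            victim = max(in_regs, key=next_use)  # furthest next use
            in_regs.remove(victim)
        in_regs.add(var)
    return loads

# Three values cycling through two registers:
print(allocate(["a", "b", "a", "c", "a", "b"], num_regs=2))  # prints 4
```

With identical, indivisible registers this greedy choice is optimal for straight-line code, which is why the local problem is tractable while the global one is not.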
-
Value-guided code specialization is a static optimization technique where the compiler analyzes a specific dataset to inform its code generation strategy. By exploiting static knowledge of the data's structure, the compiler produces highly efficient, specialized routines that outperform general-purpose implementations by eliminating unnecessary runtime checks and branching. Examples of this approach include: * The SEPE hash function generator [1]: By identifying patterns in the key domain at specialization time (e.g., specific formats like SSN numbers), SEPE hardcodes the processing of only the relevant bytes, bypassing redundant operations. * The QIGen non-uniform quantizer [2]: Because the bit-width of each weight group is known at compile time, QIGen generates the exact unpacking logic required for each section. This removes the need for the processor to branch on bit-width variations during execution. In both instances, specialization provides a performance boost by stripping away the overhead required to handle edge cases or data formats that never appear within the targeted domain. References [1] Renato Barreto Hoffmann Filho, Leonardo Gibrowski Faé, Dalvan Griebler, Xinliang (David) Li, Fernando Pereira: Automatic Synthesis of Specialized Hash Functions. CGO 2025: 317-330 Paper: https://lnkd.in/d7y3Qshn Code: https://lnkd.in/dn6f58PG [2] Tommaso Pegolotti, Dan Alistarh, Markus Püschel: QIGen: A Kernel Generator for Inference on Nonuniformly Quantized Large Language Models. CGO 2026: 589-602 Paper: https://lnkd.in/dHnMUYSv Code: https://lnkd.in/d97hXGQU #research #academia #compiler #programming
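A toy Python sketch in the spirit of SEPE (not its actual generated code; the format and constants are illustrative): if every key is known at specialization time to match the SSN format "DDD-DD-DDDD", the generated hash can hardcode the nine digit positions and skip the dashes and all format checks.

```python
DIGIT_POSITIONS = (0, 1, 2, 4, 5, 7, 8, 9, 10)  # the dashes sit at 3 and 6

def specialized_hash(key: str) -> int:
    # Specialized routine: only the entropy-carrying bytes are visited.
    h = 0
    for pos in DIGIT_POSITIONS:      # a real generator would unroll this loop
        h = (h * 31 + ord(key[pos])) & 0xFFFFFFFF
    return h

def generic_hash(key: str) -> int:
    # General-purpose routine: must touch every byte, dashes included.
    h = 0
    for ch in key:
        h = (h * 31 + ord(ch)) & 0xFFFFFFFF
    return h

# Distinct SSNs still land in distinct buckets; the dashes carried no entropy.
print(specialized_hash("123-45-6789") != specialized_hash("123-45-6780"))
```

The specialized version does less work per key precisely because the "edge cases" a generic hash must tolerate can never occur in the targeted domain.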
-
The paper WAX: Optimizing Data Center Applications With Stale Profile introduces a new way to make large computer programs run faster in data centers, even when the profile data used to optimize them is slightly out of date. These datacenter programs tend to change very often, as they are going through constant development. Consequently, profile data becomes stale real fast: what worked for the binary of last week might no longer work for the binary of today. One can solve this issue by projecting profile info from one program onto another. However, mapping a profile from a program onto another version of that same program is a difficult problem. Recently, Elisa Fröhlich has proposed a solution based on similarity search to map basic blocks of two binaries [1]. The WAX approach [2] goes beyond that: in this work, Tawhid Bhuiyan and his colleagues from Columbia University and Microsoft show that it is possible to leverage source code knowledge (available through debug information) to make the mapping much more precise. References [1] Automatic Propagation of Profile Information through the Optimization Pipeline, by Elisa Fröhlich, Angelica Moreira, Ph.D. and Fernando Pereira Paper: https://lnkd.in/ewMM_k99 Code: https://lnkd.in/dPf3i6dF [2] Wax: Optimizing Data Center Applications With Stale Profile, by Tawhid Bhuiyan, Sumya H., Angelica Moreira, Ph.D. and Tanvir Ahmed Khan Paper: https://lnkd.in/dDumAMUd Code: https://lnkd.in/dyYnNS3h
-
Many programming languages have gradually evolved from a purely object-oriented, imperative style toward a more data-oriented, functional style. This is evident in the evolution of both C++ and Java, which bears striking similarities. Both languages began with a focus on simple objects and classes; later, they incorporated parametric polymorphism (through templates in C++ and generics in Java); subsequently, they added lambda expressions to support functional idioms. Currently, both are moving toward formal pattern matching. While native pattern matching is not yet part of the C++ standard, it is the driving force behind the "match expression" proposed in P2688 [1]. References: [1] https://lnkd.in/gkdXcZGh
-
Church Booleans encode the Boolean values true and false, e.g.: True = \a.\b.a False = \a.\b.b Using this convention, we can define all Boolean operations (AND, OR, NOT, etc.). We can even write programs with them. However, in practice, this style works more smoothly in dynamically typed languages. In statically typed languages, the type system can get in the way. For instance, try defining XOR in SML/NJ and observe what happens. More generally, Church encoding is a technique introduced by Alonzo Church for representing data (such as booleans, numbers, and pairs) purely using functions in the λ-calculus: https://lnkd.in/d4fSduus #programming #mathematics #academia #logic
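In Python, whose dynamic typing sidesteps the issue mentioned above, the encoding (XOR included) is a direct transcription:

```python
TRUE  = lambda a: lambda b: a   # \a.\b.a
FALSE = lambda a: lambda b: b   # \a.\b.b

NOT = lambda p: p(FALSE)(TRUE)
AND = lambda p: lambda q: p(q)(p)
OR  = lambda p: lambda q: p(p)(q)
XOR = lambda p: lambda q: p(NOT(q))(q)

# Convert a Church Boolean back to a native one by applying it to True/False.
to_bool = lambda p: p(True)(False)

print(to_bool(XOR(TRUE)(FALSE)))             # True
print(to_bool(XOR(TRUE)(TRUE)))              # False
print(to_bool(AND(TRUE)(OR(FALSE)(TRUE))))   # True
```

In a Hindley-Milner language, XOR applies its first argument to two Booleans, forcing the self-referential type that makes the definition awkward without rank-2 polymorphism.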
-
Thompson's "Trusting Trust" attack is a circular exploit where a malicious compiler recognizes when it is compiling a specific program, such as a login utility [1,3], and injects a backdoor into the binary. To remain invisible, the compiler is also programmed to recognize when it is compiling its own source code; in this case, it injects the malicious "backdoor-inserting" logic into the new version of the compiler binary. Because this secondary payload exists only in the executable and not in the source code, developers will not find anything wrong if they audit the source code, yet every version of the compiler they build from that "clean" source will continue to compromise any software it processes. The Gödel-based certification process [2, 4] proposed by Guilherme Silva works by mapping the structure of a program's source code and its compiled binary into a unique integer (a Gödel number) using a series of injective prime-power products. Each programming construct, such as an assignment or a loop, and its position in the abstract syntax tree are assigned specific prime factors, ensuring that any modification, deletion, or reordering of instructions results in a different final integer. The Gödel-based certification prevents the "trusting trust" attack because it establishes a mathematical contract that is independent of the compiler's internal logic. Hence, if a malicious compiler attempts to inject a backdoor into the binary, the resulting Gödel number derived from that binary will not match the number derived from the original source code. Because the certification rules are simple enough to be verified by a small, auditable "checker" or even by hand for critical sections, the attacker cannot hide unauthorized code changes within the translation process without breaking the structural identity of the program. References: [1] Russ Cox: Running the “Reflections on Trusting Trust” Compiler - Revisiting Ken Thompson’s sourceless backdoor. 
ACM Queue Volume 23, Issue 6, 2026 - This paper shows how to run Thompson's Hack on a real-world setting. https://lnkd.in/dHXa6zqX [2] Guilherme Silva, Fernando Pereira: Certified Compilation based on Gödel Numbers. CoRR abs/2508.12054 (2025) https://lnkd.in/dyRVtGcs Code: [3] An earlier description of Thompson's Hack, also by Russ Cox: https://lnkd.in/eZ9KR9At [4] The Charon Compiler implements the Gödel-based certification approach: https://lnkd.in/eNJ4bTk7
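A toy Python sketch of the prime-power encoding (illustrative only, not the Charon implementation; the opcode table is hypothetical): each position in the instruction sequence gets a successive prime, each construct a fixed exponent, and unique factorization guarantees that any edit to the sequence changes the number.

```python
def primes():
    """Yield 2, 3, 5, ... by trial division (fine for short programs)."""
    n = 2
    while True:
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            yield n
        n += 1

# Hypothetical opcode table: each construct maps to a fixed exponent.
OPCODES = {"assign": 2, "add": 3, "load": 4, "store": 5, "branch": 6}

def godel_number(instructions):
    n = 1
    for p, op in zip(primes(), instructions):  # i-th prime ^ opcode exponent
        n *= p ** OPCODES[op]
    return n

source   = ["load", "add", "store"]
tampered = ["load", "add", "branch", "store"]  # an injected jump

print(godel_number(source))                            # 2**4 * 3**3 * 5**5
print(godel_number(source) == godel_number(tampered))  # False
```

Because factorization into primes is unique, the injected "branch" (and the shifted position of "store") cannot produce the same integer, no matter what else the compiler does.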
-
Global Value Numbering (GVN) is a classic compiler optimization that detects when different expressions compute the same value. The idea is to assign a "value number" to each expression such that two expressions get the same number if they are guaranteed to produce the same result. Using these numbers, the compiler can eliminate redundant computations (a form of common subexpression elimination) and sometimes simplify code further. Want to know how it works? LLVM has a production-quality GVN pass. Here are some pointers: * Header: https://lnkd.in/dtW2pki8 * Implementation: https://lnkd.in/ddy__7n3 In LLVM, Global Value Numbering (GVN) assigns a value number to each SSA value based on the computation it represents, so that different instructions that compute the same result receive the same number. It builds canonical representations of expressions (opcode + operands, normalized when possible), uses hashing to detect previously seen equivalent expressions, and then reuses existing values instead of recomputing them. Notice that value numbers are not restricted to expressions within a single basic block. Whole program slices can receive value numbers. Take a look at the paper below to see how that's done: Rafael Alvarenga, Daniel Augusto Costa de Sá, Rodrigo Rocha, Fernando Pereira: Idempotent Slices with Applications to Code-Size Reduction. https://lnkd.in/eyZ9f5Ea
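A minimal local value-numbering sketch in Python (the intra-block core that LLVM's GVN generalizes across the whole CFG; the instruction representation here is illustrative): expressions are canonicalized as (opcode, operand numbers), commutative operations are sorted, and a hash table detects repeats.

```python
def value_number_block(instructions):
    """instructions: list of (dest, opcode, operands). Returns the set of
    destinations whose computation duplicates an earlier one."""
    table = {}      # canonical (opcode, operand numbers) -> value number
    number_of = {}  # name -> value number
    counter = iter(range(1_000_000))

    def num(name):  # value number of a leaf or previously seen name
        if name not in number_of:
            number_of[name] = next(counter)
        return number_of[name]

    redundant = set()
    for dest, op, args in instructions:
        nums = tuple(num(a) for a in args)
        if op in ("add", "mul"):      # commutative: canonicalize the order
            nums = tuple(sorted(nums))
        key = (op, nums)
        if key in table:
            redundant.add(dest)       # the same value was already computed
            number_of[dest] = table[key]
        else:
            number_of[dest] = table[key] = next(counter)
    return redundant

block = [
    ("t1", "add", ("a", "b")),
    ("t2", "add", ("b", "a")),   # commutes with t1: same value number
    ("t3", "mul", ("t1", "c")),
    ("t4", "mul", ("t2", "c")),  # t2 shares t1's number, so t4 matches t3
]
print(sorted(value_number_block(block)))  # ['t2', 't4']
```

Note how numbering propagates: once t2 receives t1's number, t4 hashes to the same key as t3 even though they name different operands, which is the payoff of numbering values rather than syntax.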