Optimization Passes
- In LLVM’s middle-end, dozens of optimization passes are sequentially applied to the IR to improve code quality.
- Each pass takes IR as input and performs either analysis (extracting information) or transformation (actually modifying code).
- Passes are generally categorized into:
- Analysis Passes – provide information about the program.
- Transform Passes – rewrite or optimize the IR.
- LLVM’s Pass Manager orchestrates passes efficiently, providing required analysis results, caching them, and reusing information across passes.
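The caching behaviour described above can be sketched with a toy pass manager. This is only an illustration, not LLVM's API: `ToyPassManager`, `use_counts`, `report_pass`, and `drop_unused_pass` are hypothetical names, and the IR is assumed to be a plain list of `(dest, op, args)` tuples. The sketch throws away every cached analysis whenever a transform reports a change. LLVM's real pass manager is finer-grained and tracks which analyses each pass preserves.

```python
class ToyPassManager:
    """Hypothetical manager: runs passes in order and caches analysis results."""

    def __init__(self):
        self.analyses = {}   # analysis name -> function that computes it
        self.cache = {}      # analysis name -> cached result
        self.computed = []   # log of (re)computations, for illustration only

    def register_analysis(self, name, fn):
        self.analyses[name] = fn

    def get_analysis(self, name, ir):
        # Reuse the cached result unless a transform has invalidated it.
        if name not in self.cache:
            self.cache[name] = self.analyses[name](ir)
            self.computed.append(name)
        return self.cache[name]

    def run(self, ir, passes):
        for run_pass in passes:
            changed = run_pass(ir, self)
            if changed:
                # Conservative assumption: any change invalidates every analysis.
                self.cache.clear()
        return ir


def use_counts(ir):
    """Analysis: count how often each named value is used as an operand."""
    counts = {}
    for _dest, _op, args in ir:
        for a in args:
            if isinstance(a, str):
                counts[a] = counts.get(a, 0) + 1
    return counts


def report_pass(ir, pm):
    """Analysis consumer: reads use counts and changes nothing."""
    pm.get_analysis("use_counts", ir)
    return False


def drop_unused_pass(ir, pm):
    """Transform: remove instructions whose result is never used (keeps 'ret')."""
    counts = pm.get_analysis("use_counts", ir)
    kept = [ins for ins in ir if ins[1] == "ret" or counts.get(ins[0], 0) > 0]
    changed = len(kept) != len(ir)
    ir[:] = kept
    return changed


def demo_pass_manager():
    ir = [
        ("a", "const", [1]),
        ("b", "const", [2]),          # never used
        ("c", "add", ["a", "a"]),
        (None, "ret", ["c"]),
    ]
    pm = ToyPassManager()
    pm.register_analysis("use_counts", use_counts)
    pm.run(ir, [report_pass, report_pass, drop_unused_pass, report_pass])
    return ir, pm.computed
```

In the demo, the first three passes share one computation of `use_counts`. The analysis is recomputed only after `drop_unused_pass` modifies the IR.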
Compiler Optimization Levels
- Users can select predefined pass pipelines using optimization levels:
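- Common levels are -O1, -O2, -O3, and -Os (optimize for size); -O0 disables most optimizations and -Oz reduces code size further.
- -O2 enables a moderate set of optimizations and is a common choice for release builds.
- -O3 adds more aggressive optimizations on top of -O2.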
- Higher levels increase compile time but usually improve runtime performance.
- These levels are tuned by LLVM developers to provide good trade-offs in most cases.
- Researchers continue to explore custom pass sequences for domain-specific performance improvements.
Key Optimization Passes
- mem2reg (Memory to Register Promotion)
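- Promotes stack variables (alloca) whose only uses are direct loads and stores into SSA registers, so their values no longer go through memory.
- Eliminates the promoted alloca together with its load/store instructions and introduces φ-nodes where control flow merges.
- Typically runs early, since most optimizations reason about SSA values far more effectively than about memory accesses.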
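As a rough illustration of the promotion step, the sketch below handles only a single basic block, so no φ-nodes are needed. The real pass works across the whole CFG using dominance information. The `(dest, op, args)` tuple format and the function name `promote_single_block` are assumptions made for this example, and every slot is assumed to be used only by direct loads and stores.

```python
def promote_single_block(block):
    """Toy mem2reg for one basic block: forward the last stored value to loads."""
    current = {}    # slot name -> value most recently stored into it
    replaced = {}   # removed load result -> value that replaces it
    out = []
    for dest, op, args in block:
        # Rewrite operands that referred to loads we have already removed.
        args = [replaced.get(a, a) for a in args]
        if op == "alloca":
            current[dest] = None            # slot exists but holds no value yet
        elif op == "store" and args[1] in current:
            value, slot = args
            current[slot] = value           # remember the value, drop the store
        elif op == "load" and args[0] in current:
            replaced[dest] = current[args[0]]   # drop the load, reuse the value
        else:
            out.append((dest, op, args))
    return out


before = [
    ("p", "alloca", []),
    (None, "store", [5, "p"]),
    ("t1", "load", ["p"]),
    ("t2", "add", ["t1", 1]),
    (None, "store", ["t2", "p"]),
    ("t3", "load", ["p"]),
    (None, "ret", ["t3"]),
]
after = promote_single_block(before)
```

Here `after` contains only the `add` (now operating on the constant 5 directly) and the `ret`. The alloca, both stores, and both loads are gone.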
- Instruction Combination (InstCombine)
- A peephole optimization pass that simplifies sequences of instructions.
- Performs algebraic simplifications such as merging arithmetic operations and removing redundancies.
- Does not alter the CFG, focusing instead on local instruction-level optimizations.
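A few of the algebraic rewrites mentioned above can be sketched as a toy peephole pass. The rule set here is a tiny, hand-picked sample and the `(dest, op, lhs, rhs)` format is an assumption for this example. LLVM's InstCombine contains far more patterns and also has to respect integer width and overflow flags, which this sketch ignores.

```python
def inst_combine(block):
    """Toy peephole pass over one block of (dest, op, lhs, rhs) instructions."""
    defs = {}   # dest -> (op, lhs, rhs) of the instruction that defines it
    out = []
    for dest, op, lhs, rhs in block:
        # Merge (x + c1) + c2 into x + (c1 + c2).
        if op == "add" and isinstance(rhs, int) and lhs in defs:
            d_op, d_lhs, d_rhs = defs[lhs]
            if d_op == "add" and isinstance(d_rhs, int):
                lhs, rhs = d_lhs, d_rhs + rhs
        if op == "add" and rhs == 0:        # x + 0  ->  x
            op, rhs = "copy", None
        elif op == "mul" and rhs == 1:      # x * 1  ->  x
            op, rhs = "copy", None
        elif op == "mul" and rhs == 2:      # x * 2  ->  x << 1
            op, rhs = "shl", 1
        elif op == "sub" and lhs == rhs:    # x - x  ->  0
            op, lhs, rhs = "const", 0, None
        defs[dest] = (op, lhs, rhs)
        out.append((dest, op, lhs, rhs))
    return out


example = [
    ("t1", "add", "x", 1),
    ("t2", "add", "t1", 2),
    ("t3", "mul", "y", 2),
    ("t4", "sub", "z", "z"),
    ("t5", "add", "w", 0),
]
combined = inst_combine(example)
```

The number of instructions and the block structure are unchanged; only individual instructions are rewritten. After the merge, `t1` may have no remaining users, which is the kind of leftover a later dead code elimination run removes.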
- Dead Code Elimination (DCE)
- Removes code that cannot affect the program’s observable behavior. In LLVM this work is split across separate passes, including:
- Dead Instruction Elimination – instructions whose results are never used.
- Dead Store Elimination – memory writes that are never read.
- Often runs multiple times to clean up after other optimizations.
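Dead instruction elimination on a single straight-line block can be sketched as one backward sweep. The sketch assumes a `(dest, op, args)` instruction format, SSA-style single definitions, and a hypothetical `SIDE_EFFECTS` set naming the operations that must always be kept.

```python
SIDE_EFFECTS = {"store", "ret", "call"}   # assumed: these are never removed


def dead_code_elimination(block):
    """Keep an instruction only if it has side effects or its result is live."""
    live = set()
    kept = []
    # Walk backwards so every use is seen before the definition it refers to.
    for dest, op, args in reversed(block):
        if op in SIDE_EFFECTS or dest in live:
            kept.append((dest, op, args))
            live.update(a for a in args if isinstance(a, str))
    kept.reverse()
    return kept


block = [
    ("a", "const", [4]),
    ("b", "add", ["a", 1]),     # only used by c
    ("c", "mul", ["b", 2]),     # never used
    ("d", "add", ["a", "a"]),
    (None, "ret", ["d"]),
]
cleaned = dead_code_elimination(block)
```

Both `c` and `b` are removed: `b` becomes dead only because its sole user `c` is dead, which the backward order handles in a single sweep.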
- Global Value Numbering (GVN)
- Eliminates redundant computations by recognizing when expressions compute the same value.
- Can remove redundant loads from memory by reusing previously loaded values.
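The core idea can be sketched with local value numbering inside one basic block. This is a simplification, since GVN proper works across blocks. The instruction format, the `PURE_OPS` and `COMMUTATIVE` sets, and the rule that any store invalidates all remembered loads are assumptions for this example. A real implementation would consult alias analysis instead.

```python
PURE_OPS = {"add", "sub", "mul", "load"}   # results depend only on operands/memory
COMMUTATIVE = {"add", "mul"}


def local_value_numbering(block):
    """Remove instructions that recompute a value already available in the block."""
    available = {}   # (op, operands) -> name that already holds this value
    replace = {}     # redundant name -> earlier equivalent name
    out = []
    for dest, op, args in block:
        args = tuple(replace.get(a, a) for a in args)
        if op in COMMUTATIVE:
            # Canonical operand order so that a + b and b + a get the same key.
            args = tuple(sorted(args, key=str))
        if op == "store":
            # Conservative: a store may change any location, so forget all loads.
            available = {k: v for k, v in available.items() if k[0] != "load"}
        key = (op, args)
        if op in PURE_OPS and key in available:
            replace[dest] = available[key]   # redundant: reuse the earlier value
            continue
        if op in PURE_OPS:
            available[key] = dest
        out.append((dest, op, args))
    return out


block = [
    ("t1", "load", ("p",)),
    ("t2", "add", ("t1", "x")),
    ("t3", "load", ("p",)),          # same as t1
    ("t4", "add", ("x", "t3")),      # same as t2 after replacing t3
    (None, "store", ("t4", "p")),
    ("t5", "load", ("p",)),          # not redundant: memory changed
    (None, "ret", ("t2", "t5")),
]
numbered = local_value_numbering(block)
```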
- Inlining (Function Inlining)
- Replaces function calls with the function’s body to eliminate call overhead.
- Applied more aggressively at -O3, especially for small or frequently called functions.
- May increase code size but enables further optimizations (e.g., constant propagation, DCE).
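A minimal sketch of the substitution step, assuming straight-line callees that end in a single `ret`. The `functions` table, the `.i<N>` renaming scheme, and the helper name `inline_calls` are all hypothetical. The cost model that decides whether to inline is not shown.

```python
def inline_calls(caller, functions):
    """Replace calls to known functions with a renamed copy of the callee body."""
    out = []
    counter = 0
    for dest, op, args in caller:
        if op != "call" or args[0] not in functions:
            out.append((dest, op, args))
            continue
        params, body = functions[args[0]]
        actuals = args[1:]
        counter += 1
        # Map formal parameters to the actual arguments at this call site.
        rename = dict(zip(params, actuals))
        for b_dest, b_op, b_args in body:
            new_args = [rename.get(a, a) for a in b_args]
            if b_op == "ret":
                # The returned value becomes the result of the former call.
                out.append((dest, "copy", new_args))
            else:
                # Give each inlined value a fresh name to avoid clashes.
                new_dest = f"{b_dest}.i{counter}"
                rename[b_dest] = new_dest
                out.append((new_dest, b_op, new_args))
    return out


functions = {
    "square_plus": (
        ["a", "b"],
        [("m", "mul", ["a", "a"]), ("s", "add", ["m", "b"]), (None, "ret", ["s"])],
    )
}
caller = [("r", "call", ["square_plus", 3, "y"]), (None, "ret", ["r"])]
inlined = inline_calls(caller, functions)
```

After inlining, the multiplication has two constant operands (`3 * 3`), so a later constant-folding step can evaluate it at compile time. That was not possible while the computation was hidden behind the call.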
- Loop Invariant Code Motion (LICM)
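- Hoists loop-invariant computations (those that produce the same value on every iteration) out of the loop, typically into the loop preheader.
- Saves repeated work by computing once before the loop.
- Also promotes memory locations that are only loaded and stored inside the loop to registers for the duration of the loop (scalar promotion).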
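The hoisting step can be sketched as follows, assuming the loop body is a flat list of `(dest, op, args)` instructions and that the operations in the hypothetical `PURE_OPS` set can neither trap nor touch memory. Checking those conditions is what makes hoisting safe in a real compiler.

```python
PURE_OPS = {"add", "sub", "mul"}   # assumed side-effect free and non-trapping


def hoist_invariants(preheader, body):
    """Move instructions whose operands are all defined outside the loop."""
    defined_in_loop = {dest for dest, _op, _args in body if dest is not None}
    hoisted = []
    remaining = list(body)
    changed = True
    while changed:               # repeat: hoisting one value may free another
        changed = False
        for ins in list(remaining):
            dest, op, args = ins
            invariant = all(
                not isinstance(a, str) or a not in defined_in_loop for a in args
            )
            if op in PURE_OPS and invariant:
                remaining.remove(ins)
                hoisted.append(ins)
                defined_in_loop.discard(dest)
                changed = True
    return preheader + hoisted, remaining


loop_body = [
    ("i", "phi", [0, "i.next"]),     # induction variable
    ("k", "mul", ["a", "b"]),        # a and b are defined before the loop
    ("k2", "add", ["k", 1]),         # invariant once k is hoisted
    ("t", "add", ["i", "k2"]),       # depends on i, must stay
    ("i.next", "add", ["i", 1]),
]
new_preheader, new_body = hoist_invariants([], loop_body)
```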
- Loop Optimizations (Unroll / Unswitch / Vectorize)
- Unroll: replicates the loop body so that each iteration of the new loop does the work of several original iterations, reducing loop overhead.
- Unswitch: pulls loop-invariant conditions outside the loop, splitting into specialized loops.
- Vectorize: converts loop operations into SIMD instructions for parallel execution.
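The shape of unrolling and unswitching can be shown at source level. The compiler performs these rewrites on IR, so the Python functions below only illustrate the before and after structure. All function names are made up for this example, and the unroll factor of 4 is arbitrary.

```python
def total_rolled(xs):
    total = 0
    for x in xs:
        total += x
    return total


def total_unrolled_by_4(xs):
    """Same result as total_rolled, with four elements handled per iteration."""
    total = 0
    i = 0
    n = len(xs)
    while i + 4 <= n:            # main loop: one bounds check per four elements
        total += xs[i]
        total += xs[i + 1]
        total += xs[i + 2]
        total += xs[i + 3]
        i += 4
    while i < n:                 # remainder loop for the leftover elements
        total += xs[i]
        i += 1
    return total


def scale_rolled(xs, factor, clamp):
    out = []
    for x in xs:
        y = x * factor
        if clamp:                # loop-invariant condition, tested every iteration
            y = min(y, 100)
        out.append(y)
    return out


def scale_unswitched(xs, factor, clamp):
    """Same result as scale_rolled, with the invariant test moved out of the loop."""
    out = []
    if clamp:                    # tested once; each branch has a specialized loop
        for x in xs:
            out.append(min(x * factor, 100))
    else:
        for x in xs:
            out.append(x * factor)
    return out
```

Both rewrites trade code size for less work per iteration. The unrolled body is also the kind of straight-line code a vectorizer can map onto SIMD instructions.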
- SimplifyCFG
- Simplifies the control flow graph (CFG).
- Removes empty blocks, collapses branches with constant conditions, merges redundant paths.
- Helps reduce branching overhead and exposes further optimization opportunities.
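Three of the simplifications above can be sketched on a toy CFG: folding constant branches, skipping empty blocks, and dropping unreachable blocks. The representation is an assumption for this example: a dict mapping block names to `(instructions, terminator)`. The sketch ignores φ-nodes, which a real implementation must update whenever edges change.

```python
def simplify_cfg(cfg, entry):
    """cfg: name -> (instructions, terminator); terminator is one of
    ("br", target), ("condbr", cond, then_target, else_target), ("ret",)."""
    cfg = dict(cfg)

    # 1. Fold conditional branches whose condition is a known constant.
    for name, (insts, term) in list(cfg.items()):
        if term[0] == "condbr" and isinstance(term[1], bool):
            cfg[name] = (insts, ("br", term[2] if term[1] else term[3]))

    # 2. Retarget branches so they skip empty blocks that only jump onward.
    def final_target(target):
        seen = set()
        while (target not in seen and cfg[target][0] == []
               and cfg[target][1][0] == "br"):
            seen.add(target)
            target = cfg[target][1][1]
        return target

    for name, (insts, term) in list(cfg.items()):
        if term[0] == "br":
            cfg[name] = (insts, ("br", final_target(term[1])))
        elif term[0] == "condbr":
            cfg[name] = (insts, ("condbr", term[1],
                                 final_target(term[2]), final_target(term[3])))

    # 3. Drop blocks that are no longer reachable from the entry block.
    reachable, work = set(), [entry]
    while work:
        block = work.pop()
        if block in reachable:
            continue
        reachable.add(block)
        term = cfg[block][1]
        if term[0] == "br":
            work.append(term[1])
        elif term[0] == "condbr":
            work.extend(term[2:])
    return {name: value for name, value in cfg.items() if name in reachable}


cfg = {
    "entry": (["x = 1"], ("condbr", True, "a", "b")),
    "a": ([], ("br", "c")),               # empty forwarding block
    "b": (["y = 2"], ("br", "c")),        # unreachable once the branch is folded
    "c": (["z = 3"], ("ret",)),
}
simplified = simplify_cfg(cfg, "entry")
```

The result keeps only `entry` and `c`, connected by an unconditional branch. A fuller implementation could go one step further and merge the two blocks, since `c` would then have a single predecessor.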