Optimization Passes

  • In LLVM’s middle-end, dozens of optimization passes are sequentially applied to the IR to improve code quality.
  • Each pass takes IR as input and performs either analysis (extracting information) or transformation (actually modifying code).
  • Passes are generally categorized into:
    • Analysis Passes – provide information about the program.
    • Transform Passes – rewrite or optimize the IR.
  • LLVM’s Pass Manager orchestrates passes efficiently, supplying each pass with the analysis results it needs, caching those results, and reusing or invalidating them as the IR changes (a minimal pass skeleton is sketched below).
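
A minimal sketch of what a pass looks like under the new pass manager, assuming a recent LLVM; the pass name CountStoresPass and the store-counting logic are made up for illustration, and exact headers can vary between versions:

```cpp
#include "llvm/Analysis/LoopInfo.h"
#include "llvm/IR/Function.h"
#include "llvm/IR/Instructions.h"
#include "llvm/IR/PassManager.h"

using namespace llvm;

// Hypothetical function pass: counts store instructions that sit inside loops.
struct CountStoresPass : PassInfoMixin<CountStoresPass> {
  PreservedAnalyses run(Function &F, FunctionAnalysisManager &FAM) {
    // Analysis results are requested from the manager, which computes
    // them on demand and caches them for reuse by later passes.
    LoopInfo &LI = FAM.getResult<LoopAnalysis>(F);
    unsigned StoresInLoops = 0;
    for (BasicBlock &BB : F)
      for (Instruction &I : BB)
        if (isa<StoreInst>(I) && LI.getLoopFor(&BB))
          ++StoresInLoops;
    // Nothing was modified, so every cached analysis remains valid.
    return PreservedAnalyses::all();
  }
};
```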

Compiler Optimization Levels

  • Users can select predefined pass pipelines using optimization levels: -O1, -O2, -O3, -Os (a sketch of building such a pipeline programmatically follows this list).
    • -O1 applies a light set of optimizations with modest compile-time cost.
    • -O2 applies a balanced set of optimizations and is the usual default for release builds.
    • -O3 applies aggressive optimizations, such as heavier inlining and vectorization.
    • -Os is similar to -O2 but biases decisions toward smaller code size.
  • Higher levels increase compile time but usually improve runtime performance.
  • These levels are tuned by LLVM developers to provide good trade-offs in most cases.
  • Researchers continue to explore custom pass sequences for domain-specific performance improvements.
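
A sketch of constructing the predefined -O2 pipeline programmatically, assuming a recent LLVM with the new pass manager; the helper function name buildO2Pipeline is hypothetical, and exact headers/enum spellings differ between versions:

```cpp
#include "llvm/IR/Module.h"
#include "llvm/Passes/PassBuilder.h"

using namespace llvm;

ModulePassManager buildO2Pipeline(ModuleAnalysisManager &MAM,
                                  CGSCCAnalysisManager &CGAM,
                                  FunctionAnalysisManager &FAM,
                                  LoopAnalysisManager &LAM) {
  PassBuilder PB;
  // Register all analyses with their managers and wire the proxies together.
  PB.registerModuleAnalyses(MAM);
  PB.registerCGSCCAnalyses(CGAM);
  PB.registerFunctionAnalyses(FAM);
  PB.registerLoopAnalyses(LAM);
  PB.crossRegisterProxies(LAM, FAM, CGAM, MAM);
  // The same predefined pipeline clang/opt use at -O2.
  return PB.buildPerModuleDefaultPipeline(OptimizationLevel::O2);
}
// Usage (given a Module M): buildO2Pipeline(MAM, CGAM, FAM, LAM).run(M, MAM);
```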

Key Optimization Passes

  • mem2reg (Memory to Register Promotion)
    • Promotes stack slots created by alloca into SSA virtual registers, converting the function into SSA form.
    • Eliminates the corresponding alloca, load, and store instructions and inserts φ-nodes at control-flow merge points as needed.
    • Typically runs early, since most optimizations expect SSA form.
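    • Illustrative sketch (hypothetical C++ function; comments describe what unoptimized IR plus mem2reg would typically do):
      ```cpp
      int choose(int a, bool neg) {
        int x;          // a stack slot (alloca) in unoptimized IR
        if (neg)
          x = -a;       // store
        else
          x = a;        // store
        return x;       // load; after promotion this becomes a phi of -a and a
      }
      ```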
  • Instruction Combination (InstCombine)
    • A peephole optimization pass that simplifies sequences of instructions.
    • Performs algebraic simplifications such as merging arithmetic operations and removing redundancies.
    • Does not alter the CFG, focusing instead on local, instruction-level rewrites.
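    • Illustrative sketch (hypothetical function; the folds shown are typical InstCombine-style simplifications):
      ```cpp
      int fold(int x, int y) {
        int a = x + 0;      // x + 0 -> x
        int b = a * 8;      // multiply by a power of two -> shift: x << 3
        int c = y - y;      // y - y -> 0
        return b + c;       // b + 0 -> b; the whole function reduces to x << 3
      }
      ```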
  • Dead Code Elimination (DCE)
    • Removes unused code, including:
      • Dead Instruction Elimination – instructions whose results are never used.
      • Dead Store Elimination – memory writes that are never read.
    • Often runs multiple times to clean up after other optimizations.
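    • Illustrative sketch (hypothetical function showing both kinds of dead code):
      ```cpp
      int dce_example(int x) {
        int unused = x * 41;   // dead instruction: the result is never used
        int y = 5;             // dead store: overwritten before it is ever read
        y = x + 1;
        return y;              // only the computation of x + 1 survives
      }
      ```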
  • Global Value Numbering (GVN)
    • Eliminates redundant computations by recognizing when expressions compute the same value.
    • Can remove redundant loads from memory by reusing previously loaded values.
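    • Illustrative sketch (hypothetical function; with no intervening store, both loads of *p and both additions compute the same values):
      ```cpp
      int gvn_example(const int *p, int a, int b) {
        int t1 = (a + b) * *p;
        int t2 = (a + b) * *p;   // redundant: same value as t1, can be reused
        return t1 + t2;          // effectively t1 + t1 after elimination
      }
      ```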
  • Inlining (Function Inlining)
    • Replaces function calls with the function’s body to eliminate call overhead.
    • Aggressively applied at -O3 for small or frequently called functions.
    • May increase code size but enables further optimizations (e.g., constant propagation, DCE).
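    • Illustrative sketch (hypothetical functions; after inlining, the call disappears and the callee body is substituted at the call site):
      ```cpp
      static int square(int x) { return x * x; }
      int caller(int n) {
        return square(n) + 1;    // becomes n * n + 1 once square() is inlined
      }
      ```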
  • Loop Invariant Code Motion (LICM)
    • Moves computations that are constant across loop iterations outside the loop.
    • Saves repeated work by computing once before the loop.
    • Also performs scalar promotion: loads and stores of a loop-invariant memory location can be replaced with a register inside the loop, with the value written back after the loop.
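    • Illustrative sketch (hypothetical function; a * b does not change across iterations, so it can be hoisted into the loop preheader and computed once):
      ```cpp
      void licm_example(int *out, int n, int a, int b) {
        for (int i = 0; i < n; ++i) {
          int k = a * b;     // loop-invariant: hoisted above the loop by LICM
          out[i] = k + i;
        }
      }
      ```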
  • Loop Optimizations (Unroll / Unswitch / Vectorize)
    • Unroll: replicates the loop body several times to reduce per-iteration branch and counter overhead.
    • Unswitch: pulls loop-invariant conditions outside the loop, splitting into specialized loops.
    • Vectorize: converts loop operations into SIMD instructions for parallel execution.
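    • Illustrative sketch (hypothetical function; the loop-invariant test on scale is a candidate for unswitching, and each resulting specialized loop is then a clean candidate for unrolling and SIMD vectorization):
      ```cpp
      void loops_example(float *out, const float *in, int n, bool scale) {
        for (int i = 0; i < n; ++i) {
          if (scale)                // invariant condition: unswitch splits the loop
            out[i] = in[i] * 2.0f;
          else
            out[i] = in[i];
        }
      }
      ```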
  • SimplifyCFG
    • Simplifies the control flow graph (CFG).
    • Removes empty or unreachable blocks, folds branches with constant conditions, and merges redundant paths.
    • Helps reduce branching overhead and exposes further optimization opportunities.
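    • Illustrative sketch (hypothetical function; once the condition is known to be constant, the branch folds, the dead block is deleted, and the remaining blocks merge into straight-line code):
      ```cpp
      int simplifycfg_example(int x) {
        const bool kFlag = true;
        if (kFlag)
          x = x + 1;
        else
          x = x - 1;         // unreachable once kFlag folds; block removed
        return x;
      }
      ```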
