Optimization Passes

  • In LLVM’s middle-end, dozens of optimization passes are sequentially applied to the IR to improve code quality.
  • Each pass takes IR as input and performs either analysis (extracting information) or transformation (actually modifying code).
  • Passes are generally categorized into:
    • Analysis Passes – provide information about the program.
    • Transform Passes – rewrite or optimize the IR.
  • LLVM’s Pass Manager orchestrates passes efficiently, supplying each pass with the analysis results it needs, caching those results, and reusing or invalidating them as the IR changes (a minimal pass skeleton is sketched below).
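
A minimal sketch of what a pass looks like under the new pass manager, assuming a recent LLVM; the pass name CountStoresPass and the store-counting logic are made up for illustration, and exact headers can vary between versions:

```cpp
#include "llvm/Analysis/LoopInfo.h"
#include "llvm/IR/Function.h"
#include "llvm/IR/Instructions.h"
#include "llvm/IR/PassManager.h"

using namespace llvm;

// Hypothetical function pass: counts store instructions that sit inside loops.
struct CountStoresPass : PassInfoMixin<CountStoresPass> {
  PreservedAnalyses run(Function &F, FunctionAnalysisManager &FAM) {
    // Analysis results are requested from the manager, which computes
    // them on demand and caches them for reuse by later passes.
    LoopInfo &LI = FAM.getResult<LoopAnalysis>(F);
    unsigned StoresInLoops = 0;
    for (BasicBlock &BB : F)
      for (Instruction &I : BB)
        if (isa<StoreInst>(I) && LI.getLoopFor(&BB))
          ++StoresInLoops;
    // Nothing was modified, so every cached analysis remains valid.
    return PreservedAnalyses::all();
  }
};
```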

Compiler Optimization Levels

  • Users can select predefined pass pipelines using optimization levels: -O1, -O2, -O3, -Os (a sketch of building such a pipeline programmatically follows this list).
    • -O1 applies a light set of optimizations with modest compile-time cost.
    • -O2 applies a balanced set of optimizations and is the usual default for release builds.
    • -O3 applies aggressive optimizations, such as heavier inlining and vectorization.
    • -Os is similar to -O2 but biases decisions toward smaller code size.
  • Higher levels increase compile time but usually improve runtime performance.
  • These levels are tuned by LLVM developers to provide good trade-offs in most cases.
  • Researchers continue to explore custom pass sequences for domain-specific performance improvements.
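
A sketch of constructing the predefined -O2 pipeline programmatically, assuming a recent LLVM with the new pass manager; the helper function name buildO2Pipeline is hypothetical, and exact headers/enum spellings differ between versions:

```cpp
#include "llvm/IR/Module.h"
#include "llvm/Passes/PassBuilder.h"

using namespace llvm;

ModulePassManager buildO2Pipeline(ModuleAnalysisManager &MAM,
                                  CGSCCAnalysisManager &CGAM,
                                  FunctionAnalysisManager &FAM,
                                  LoopAnalysisManager &LAM) {
  PassBuilder PB;
  // Register all analyses with their managers and wire the proxies together.
  PB.registerModuleAnalyses(MAM);
  PB.registerCGSCCAnalyses(CGAM);
  PB.registerFunctionAnalyses(FAM);
  PB.registerLoopAnalyses(LAM);
  PB.crossRegisterProxies(LAM, FAM, CGAM, MAM);
  // The same predefined pipeline clang/opt use at -O2.
  return PB.buildPerModuleDefaultPipeline(OptimizationLevel::O2);
}
// Usage (given a Module M): buildO2Pipeline(MAM, CGAM, FAM, LAM).run(M, MAM);
```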

Key Optimization Passes

  • mem2reg (Memory to Register Promotion)
    • Promotes stack slots created by alloca into SSA virtual registers, converting the function into SSA form.
    • Eliminates the corresponding alloca, load, and store instructions and inserts φ-nodes at control-flow merge points as needed.
    • Typically runs early, since most optimizations expect SSA form.
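    • Illustrative sketch (hypothetical C++ function; comments describe what unoptimized IR plus mem2reg would typically do):
      ```cpp
      int choose(int a, bool neg) {
        int x;          // a stack slot (alloca) in unoptimized IR
        if (neg)
          x = -a;       // store
        else
          x = a;        // store
        return x;       // load; after promotion this becomes a phi of -a and a
      }
      ```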
  • Instruction Combination (InstCombine)
    • A peephole optimization pass that simplifies sequences of instructions.
    • Performs algebraic simplifications such as merging arithmetic operations and removing redundancies.
    • Does not alter the CFG, focusing instead on local, instruction-level rewrites.
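    • Illustrative sketch (hypothetical function; the folds shown are typical InstCombine-style simplifications):
      ```cpp
      int fold(int x, int y) {
        int a = x + 0;      // x + 0 -> x
        int b = a * 8;      // multiply by a power of two -> shift: x << 3
        int c = y - y;      // y - y -> 0
        return b + c;       // b + 0 -> b; the whole function reduces to x << 3
      }
      ```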
  • Dead Code Elimination (DCE)
    • Removes unused code, including:
      • Dead Instruction Elimination – instructions whose results are never used.
      • Dead Store Elimination – memory writes that are never read.
    • Often runs multiple times to clean up after other optimizations.
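    • Illustrative sketch (hypothetical function showing both kinds of dead code):
      ```cpp
      int dce_example(int x) {
        int unused = x * 41;   // dead instruction: the result is never used
        int y = 5;             // dead store: overwritten before it is ever read
        y = x + 1;
        return y;              // only the computation of x + 1 survives
      }
      ```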
  • Global Value Numbering (GVN)
    • Eliminates redundant computations by recognizing when expressions compute the same value.
    • Can remove redundant loads from memory by reusing previously loaded values.
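    • Illustrative sketch (hypothetical function; with no intervening store, both loads of *p and both additions compute the same values):
      ```cpp
      int gvn_example(const int *p, int a, int b) {
        int t1 = (a + b) * *p;
        int t2 = (a + b) * *p;   // redundant: same value as t1, can be reused
        return t1 + t2;          // effectively t1 + t1 after elimination
      }
      ```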
  • Inlining (Function Inlining)
    • Replaces function calls with the function’s body to eliminate call overhead.
    • Aggressively applied at -O3 for small or frequently called functions.
    • May increase code size but enables further optimizations (e.g., constant propagation, DCE).
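    • Illustrative sketch (hypothetical functions; after inlining, the call disappears and the callee body is substituted at the call site):
      ```cpp
      static int square(int x) { return x * x; }
      int caller(int n) {
        return square(n) + 1;    // becomes n * n + 1 once square() is inlined
      }
      ```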
  • Loop Invariant Code Motion (LICM)
    • Moves computations that are constant across loop iterations outside the loop.
    • Saves repeated work by computing once before the loop.
    • Also performs scalar promotion: loads and stores of a loop-invariant memory location can be replaced with a register inside the loop, with the value written back after the loop.
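    • Illustrative sketch (hypothetical function; a * b does not change across iterations, so it can be hoisted into the loop preheader and computed once):
      ```cpp
      void licm_example(int *out, int n, int a, int b) {
        for (int i = 0; i < n; ++i) {
          int k = a * b;     // loop-invariant: hoisted above the loop by LICM
          out[i] = k + i;
        }
      }
      ```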
  • Loop Optimizations (Unroll / Unswitch / Vectorize)
    • Unroll: replicates the loop body several times to reduce per-iteration branch and counter overhead.
    • Unswitch: pulls loop-invariant conditions outside the loop, splitting into specialized loops.
    • Vectorize: converts loop operations into SIMD instructions for parallel execution.
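    • Illustrative sketch (hypothetical function; the loop-invariant test on scale is a candidate for unswitching, and each resulting specialized loop is then a clean candidate for unrolling and SIMD vectorization):
      ```cpp
      void loops_example(float *out, const float *in, int n, bool scale) {
        for (int i = 0; i < n; ++i) {
          if (scale)                // invariant condition: unswitch splits the loop
            out[i] = in[i] * 2.0f;
          else
            out[i] = in[i];
        }
      }
      ```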
  • SimplifyCFG
    • Simplifies the control flow graph (CFG).
    • Removes empty or unreachable blocks, folds branches with constant conditions, and merges redundant paths.
    • Helps reduce branching overhead and exposes further optimization opportunities.
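    • Illustrative sketch (hypothetical function; once the condition is known to be constant, the branch folds, the dead block is deleted, and the remaining blocks merge into straight-line code):
      ```cpp
      int simplifycfg_example(int x) {
        const bool kFlag = true;
        if (kFlag)
          x = x + 1;
        else
          x = x - 1;         // unreachable once kFlag folds; block removed
        return x;
      }
      ```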
