Debug & Metadata
Debug Information
- Debug information refers to metadata that records the correspondence between generated machine code and the original source code.
- When compiled with the -g flag, the compiler produces extra information such as variable names, source line numbers, and scope information, which is stored in a standard debugging format like DWARF.
- At the LLVM IR level, this debug info appears as metadata embedded in the IR.
- Metadata provides additional information for compilers and debuggers, while not affecting program execution.
- In LLVM IR, metadata is denoted using the ! (exclamation mark) syntax, either attached to instructions or defined as metadata nodes.
IR Metadata
- Metadata in LLVM IR is optional supplementary information that has no effect on program behavior.
- The guiding principle: compiler optimizations and code generation must not change due to debug info.
- Optimizations should produce the same code whether or not debug metadata is present; debug info only serves as a mapping for debugging tools.
Example 1
- With -g, Clang emits IR where instructions are annotated with !dbg !N entries, and !DIxxx nodes at the bottom describe file names, variables, etc.
store i32 %val, ptr %ptr, align 4, !dbg !15
!15 = !DILocation(line: 42, column: 5, scope: !8)
- This indicates that the store corresponds to line 42, column 5 in the original source code.
- Any IR instruction that carries a !dbg attachment can thus be traced back to its source origin.
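The lookup a debugging tool performs on this example can be sketched in a few lines of Python. This is a toy, text-level illustration only: real tools work on the in-memory module or on the emitted DWARF, and the regular expressions and the helper name source_locations are assumptions made for this sketch, not part of LLVM.

```python
import re

# Toy IR text: the store and node !15 are taken from the example above.
# The !8 scope node is left out because only line/column are resolved here.
IR_TEXT = """\
store i32 %val, ptr %ptr, align 4, !dbg !15
!15 = !DILocation(line: 42, column: 5, scope: !8)
"""

# "!15 = !DILocation(line: 42, column: 5, ..." -> node id, line, column
LOCATION_RE = re.compile(r"^!(\d+) = !DILocation\(line: (\d+), column: (\d+)")
# "..., !dbg !15" attached to an instruction -> node id
ATTACHMENT_RE = re.compile(r"!dbg !(\d+)")


def source_locations(ir_text):
    """Return (instruction, line, column) for each instruction with !dbg."""
    lines = ir_text.splitlines()

    # First pass: collect every !DILocation node by its numeric id.
    locations = {}
    for text in lines:
        match = LOCATION_RE.match(text)
        if match:
            locations[match.group(1)] = (int(match.group(2)), int(match.group(3)))

    # Second pass: follow each instruction's !dbg reference to its node.
    resolved = []
    for text in lines:
        if text.startswith("!"):
            continue  # a metadata node definition, not an instruction
        match = ATTACHMENT_RE.search(text)
        if match and match.group(1) in locations:
            line, column = locations[match.group(1)]
            instruction = text.split(", !dbg")[0].strip()
            resolved.append((instruction, line, column))
    return resolved


for instruction, line, column in source_locations(IR_TEXT):
    print(f"{instruction}  ->  line {line}, column {column}")
```

The two-pass shape mirrors how the IR is laid out: attachments on instructions only name a node number, and the node itself is defined separately, usually at the end of the file.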
Example 2
- Local variable debug info can be represented via debug intrinsics such as llvm.dbg.declare or llvm.dbg.value:
call void @llvm.dbg.declare(metadata ptr %x.addr, metadata !11, metadata !DIExpression()), !dbg !15
- And a corresponding metadata node:
!11 = !DILocalVariable(name: "x", ... )
- This allows debuggers (e.g., GDB, LLDB) to display the original source variable names and types.
- Newer LLVM releases (19 and later, by default) store this information as debug records, written #dbg_declare and #dbg_value in textual IR, rather than as explicit intrinsic calls.
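As a rough illustration of how a tool connects an IR stack slot to a source variable name, the Python sketch below links the llvm.dbg.declare call to its !DILocalVariable node. It handles only the intrinsic form shown in this example, treats the IR as plain text, and the helper name variable_names is hypothetical. The elided fields of node !11 are kept elided, exactly as in the example.

```python
import re

# Toy IR text copied from the example; the "..." in node !11 is kept as-is.
IR_TEXT = """\
call void @llvm.dbg.declare(metadata ptr %x.addr, metadata !11, metadata !DIExpression()), !dbg !15
!11 = !DILocalVariable(name: "x", ... )
"""

# First operand: the IR value holding the variable; second: the variable node.
DECLARE_RE = re.compile(
    r"@llvm\.dbg\.declare\(metadata ptr (%[\w.]+), metadata !(\d+)"
)
# '!11 = !DILocalVariable(name: "x", ...' -> node id, source-level name
VARIABLE_RE = re.compile(r'^!(\d+) = !DILocalVariable\(name: "([^"]+)"')


def variable_names(ir_text):
    """Map each declared IR value (e.g. %x.addr) to its source variable name."""
    lines = ir_text.splitlines()

    # Collect the source names recorded in !DILocalVariable nodes.
    names = {}
    for text in lines:
        match = VARIABLE_RE.match(text)
        if match:
            names[match.group(1)] = match.group(2)

    # Link every llvm.dbg.declare call to the node it references.
    slots = {}
    for text in lines:
        match = DECLARE_RE.search(text)
        if match and match.group(2) in names:
            slots[match.group(1)] = names[match.group(2)]
    return slots


for slot, name in variable_names(IR_TEXT).items():
    print(f"{slot} holds source variable {name}")
```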
Other Uses of Metadata
- Beyond debug info, metadata provides optimization hints:
  - !range → specifies possible value ranges (e.g., 0 or 1).
  - !tbaa (Type-Based Alias Analysis) → informs aliasing rules for loads/stores, enabling memory access reordering.
  - !llvm.loop → loop hints (e.g., unrolling, vectorization).
  - !prof → branch probability profiling data.
- Metadata is optional but, when present, enables better optimizations and more informative debugging.
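To make the attachment syntax concrete, the following Python sketch lists which metadata kinds hang off an instruction and decodes a simple !range node. The IR lines are hand-written illustrations, not compiler output, and the helper names are hypothetical. The decoding assumes a single, non-wrapping pair of bounds, which the LLVM Language Reference defines as a half-open interval [lo, hi).

```python
import re

# Hand-written illustrative lines, one per metadata kind listed above.
IR_LINES = [
    "%b = load i8, ptr %flag, align 1, !range !3",
    "%v = load i32, ptr %p, align 4, !tbaa !4",
    "br i1 %cond, label %then, label %else, !prof !7",
    "br label %loop, !llvm.loop !9",
    "!3 = !{i8 0, i8 2}",
]

# "!kind !N" pairs, e.g. "!range !3" or "!llvm.loop !9"
ATTACHMENT_RE = re.compile(r"!([A-Za-z_][\w.]*) !(\d+)")
# A node holding exactly one pair of integer bounds, e.g. "!3 = !{i8 0, i8 2}"
RANGE_NODE_RE = re.compile(r"^!(\d+) = !\{i\d+ (-?\d+), i\d+ (-?\d+)\}$")


def attachment_kinds(line):
    """Return {kind: node id} for every metadata attachment on one line."""
    return {kind: int(node) for kind, node in ATTACHMENT_RE.findall(line)}


def decode_range(node_line):
    """List the values allowed by a single-pair, non-wrapping !range node."""
    match = RANGE_NODE_RE.match(node_line)
    if match is None:
        return None
    low, high = int(match.group(2)), int(match.group(3))
    return list(range(low, high))  # bounds are half-open: [low, high)


for line in IR_LINES:
    kinds = attachment_kinds(line)
    if kinds:
        print(f"{line.split(',')[0]}: {sorted(kinds)}")

print("values allowed by !3:", decode_range(IR_LINES[-1]))
```

The node !{i8 0, i8 2} therefore expresses the "0 or 1" case mentioned in the list: the upper bound 2 is excluded.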
Stripping Metadata
- To remove metadata from LLVM IR:
  - opt -strip-debug → removes debug metadata only.
  - opt -strip → removes symbol names (such as names of virtual registers and internal symbols) together with debug info.
- This can simplify IR for human inspection, though at the cost of losing source-level correspondence.
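What stripping debug info means for the textual IR can be approximated with a small Python sketch. It is a toy stand-in written for illustration, not how opt is implemented: opt operates on the parsed module, while this hypothetical strip_debug_text helper only drops !dbg attachments, debug intrinsic calls, and !DI... node definitions from plain text.

```python
import re

# Toy IR text assembled from the two examples in this section.
IR_TEXT = """\
store i32 %val, ptr %ptr, align 4, !dbg !15
call void @llvm.dbg.declare(metadata ptr %x.addr, metadata !11, metadata !DIExpression()), !dbg !15
!11 = !DILocalVariable(name: "x", ... )
!15 = !DILocation(line: 42, column: 5, scope: !8)
"""

# Definitions such as "!15 = !DILocation(...)" or "!4 = distinct !DISubprogram(...)"
DEBUG_NODE_RE = re.compile(r"^!\d+ = (distinct )?!DI\w+\(")
# A trailing ", !dbg !N" attachment on an instruction
DEBUG_ATTACHMENT_RE = re.compile(r", !dbg !\d+")


def strip_debug_text(ir_text):
    """Return the IR text without debug nodes, intrinsics, or !dbg attachments."""
    kept = []
    for line in ir_text.splitlines():
        if DEBUG_NODE_RE.match(line):
            continue  # debug metadata node definition
        if "@llvm.dbg." in line:
            continue  # debug intrinsic call; it has no runtime effect
        kept.append(DEBUG_ATTACHMENT_RE.sub("", line))
    return "\n".join(kept)


print(strip_debug_text(IR_TEXT))
```

Only the store survives, unchanged apart from its missing attachment, which is the sense in which debug metadata does not affect what the program does.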
In Summary…
- Metadata in LLVM IR provides optional information for debugging and optimization without affecting program execution.
- Debug info is a primary example, mapping IR instructions back to source code.
- Other forms of metadata serve as hints to optimizers.
- Understanding metadata helps in interpreting why certain annotations appear in IR and how they influence tools and optimization pipelines.