A ground-up curriculum for systems programmers with C and x86-64 Assembly foundations. From IR basics to production compiler passes — no shortcuts, no handwaving.
- `clang` flags: `-emit-llvm` (output IR), `-S` (text form), `-O0`/`-O2`/`-O3` (optimization levels), `-c` (object file). Practice: compile a C file to IR, to assembly, and to object code using only `clang` flags.
- `opt` flags: `-passes="mem2reg,dce"`, `--print-changed`, `-stats`. This is the tool you'll use most in Phase 4 when writing your own passes.
- `llc` flags: `-march` (target arch), `-filetype=asm` vs `-filetype=obj`. Try targeting a different arch than your host to feel what "hardware independence" actually means.
- C-to-IR type mapping: `int` -> `i32`, `long` -> `i64`, `double` -> `double`, `struct` -> `{type1, type2, ...}`, `int*` -> `i32*` (or opaque `ptr` in modern IR). Typed pointers carry type information, which is crucial for type-safe optimizations.
- Memory: `alloca` reserves stack memory; `load` reads from memory and `store` writes to it. In SSA form, mutable variables live in memory to sidestep the "single assignment" rule, so early pipelines run `mem2reg` to promote them to registers. Understanding when and why memory is used is key.
- Function calls: the `call` instruction. Understand how arguments are passed, how return values work, and why calling conventions are invisible in the IR (the target-specific ABI handles them later).
- Exercise: take `int arr[10]; arr[5] = 42;` and generate the IR. Understand `getelementptr` (GEP), LLVM's way of computing addresses without loading or storing. This is a critical concept for optimizing array-heavy code.
- Source tree: `llvm/lib/IR` (IR definitions), `llvm/lib/Transforms` (optimization passes), `llvm/lib/Target` (backends), `clang/lib` (the C/C++ frontend). Don't try to understand everything; learn to navigate the source tree to find what you need.
- Building: `cmake -G Ninja`, `ninja`, `ninja llvm-tools`. You won't need to modify the build system now, but knowing how to build LLVM will save hours of debugging later.
- Debugging passes: `-debug` prints verbose pass execution info. Build LLVM with assertions enabled (`CMAKE_BUILD_TYPE=Debug`) and use `-debug -debug-only=pass-name` to spy on what a specific pass does.
- Optimization levels: `-O0` disables most passes, `-O2` runs mid-level optimizations, `-O3` adds aggressive ones. They're just different pass pipelines.
Reading what each level enables (in the `PassBuilder` code) teaches you what tools the optimizer has.

- IR types: `i1`, `i8`, `i32`, `i64` (integers), `float`, `double`, `ptr` (opaque pointer, modern LLVM), `[N x T]` (arrays), `{T1, T2}` (structs), `T (T1, T2)*` (function pointers, legacy typed-pointer syntax).
- Core instructions: `alloca` (stack), `load`/`store`, `add`/`sub`/`mul`/`sdiv`, `icmp`/`fcmp`, `br` (conditional and unconditional), `call`/`ret`, `getelementptr` (GEP, pointer arithmetic), `bitcast`/`trunc`/`zext`/`sext`.
- AST codegen: give every AST node a `codegen()` method that returns an `llvm::Value*`. This interface is the bridge between frontend and backend.
- Core classes: `LLVMContext` owns all IR objects; `Module` is a compilation unit (a `.ll` file in C++ form); `IRBuilder` is your "cursor": it tracks the current insertion point and has methods for every instruction. You'll use these in every compiler you ever write.
- IRBuilder methods: `CreateAdd`/`CreateSub`/`CreateMul`, `CreateFCmpOLT` (float compare), `CreateBr`/`CreateCondBr`, `CreateCall`, `CreateRet`, `CreateAlloca`, `CreateStore`/`CreateLoad`. Practice until you can translate IR syntax to C++ API calls without looking them up.
- Functions and blocks: `Function::Create()` with a `FunctionType`; `BasicBlock::Create()` plus setting the IRBuilder insertion point with `SetInsertPoint()`; handle function arguments with `func->arg_begin()`. This pattern repeats in every backend codegen.
- JIT layers: `IRCompileLayer`, `RTDyldObjectLinkingLayer`. For Kaleidoscope, you add functions to the JIT incrementally as the user types them. This mirrors how REPLs work.
- External symbols (e.g. `sin`, `printf`): learn how symbol lookup works (JIT -> process symbols -> stdlib). Implement `extern` declarations in your language so Kaleidoscope can call C functions.
- Pass kinds: `FunctionPass`, `ModulePass`, `LoopPass`; pass registration with `PassPluginLibraryInfo` for out-of-tree passes.
- Pass anatomy: a `run(Function &F, FunctionAnalysisManager &AM)` method that returns `PreservedAnalyses`, telling the pass manager which analyses are still valid after your transformation. Returning `PreservedAnalyses::all()` vs `none()` has performance implications.
- Iterating IR: `for (BasicBlock &BB : F)`, `for (Instruction &I : BB)`. Learn to use `dyn_cast<BranchInst>(&I)` for instruction type checking.
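The pass-anatomy pieces fit together as a single plugin file. A minimal sketch for the new pass manager (assumes LLVM 14+ headers and an out-of-tree plugin build; the pass name `CountBranchesPass` and pipeline string `count-branches` are made up for illustration):

```cpp
// Sketch of a new-pass-manager function pass plugin. Build as a shared
// library and load with: opt --load-pass-plugin=./libCountBranches.so
//                            -passes="count-branches" input.ll
#include "llvm/IR/Function.h"
#include "llvm/IR/Instructions.h"
#include "llvm/IR/PassManager.h"
#include "llvm/Passes/PassBuilder.h"
#include "llvm/Passes/PassPlugin.h"
#include "llvm/Support/raw_ostream.h"

using namespace llvm;

struct CountBranchesPass : PassInfoMixin<CountBranchesPass> {
  PreservedAnalyses run(Function &F, FunctionAnalysisManager &AM) {
    unsigned NumBranches = 0;
    for (BasicBlock &BB : F)                  // the iteration boilerplate
      for (Instruction &I : BB)
        if (auto *BI = dyn_cast<BranchInst>(&I))
          if (BI->isConditional())
            ++NumBranches;
    errs() << F.getName() << ": " << NumBranches << " conditional branches\n";
    return PreservedAnalyses::all();          // IR untouched, keep all analyses
  }
};

// Out-of-tree registration hook that opt looks up in the plugin.
extern "C" LLVM_ATTRIBUTE_WEAK PassPluginLibraryInfo llvmGetPassPluginInfo() {
  return {LLVM_PLUGIN_API_VERSION, "CountBranches", "0.1",
          [](PassBuilder &PB) {
            PB.registerPipelineParsingCallback(
                [](StringRef Name, FunctionPassManager &FPM,
                   ArrayRef<PassBuilder::PipelineElement>) {
                  if (Name == "count-branches") {
                    FPM.addPass(CountBranchesPass());
                    return true;
                  }
                  return false;
                });
          }};
}
```

This is an analysis-only pass, so `PreservedAnalyses::all()` is correct; the moment you mutate IR, you must return something weaker or the pass manager will reuse stale analyses.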
This is boilerplate you'll type hundreds of times.

- Requesting analyses: `DominatorTree &DT = AM.getResult<DominatorTreeAnalysis>(F);`.
- Alias analysis: `AliasAnalysis` returns `MustAlias`, `MayAlias`, or `NoAlias`. Fundamental for auto-vectorization and memory-level parallelism.
- Peephole patterns: `x * 2` -> `x << 1`, `x + 0` -> `x`, `x / 2` -> `x >> 1` (unsigned only). Study the existing `instcombine` source: it's ~50k lines of patterns and teaches you how to think about transformations.
- Dead code elimination: if `I.use_empty()` holds and the instruction has no side effects, erase it.
- Exercise: write a pass that finds multiplications by powers of two (`mul i32 %x, 4`) and replaces them with shifts (`shl i32 %x, 2`). Extend it to handle division. Test with `opt --load-pass-plugin=./libMyPass.so -passes="strength-reduce"`.
- Inlining: the `InlineFunction()` utility in the LLVM API.
- Containers: `SmallVector<T, N>` keeps N elements on the stack before heap-allocating, avoiding allocations for small collections (most compiler data); `ArrayRef` is a non-owning reference to any array-like container. Use these everywhere in compiler code; never raw `std::vector` for IR-level work.
- Strings: `StringRef` is a non-owning, zero-copy reference to a string; `Twine` is a lazy string-concatenation tree. Never allocate intermediate strings in hot paths; profilers hate `std::string` in compilers.
- Def-use chains: iterate uses with `for (Use &U : V.uses())`; replace all uses with `V.replaceAllUsesWith(NewV)`. This is the foundation of every transformation: finding what uses what.
- Reading real code: `llvm/lib/Transforms/` contains all the production passes; `llvm/lib/Target/X86/` is the best-commented backend. Read real code, not tutorials.

Curated references organized by learning phase, from foundation concepts to advanced optimization and backend development.
clang mycode.c | Compile to executable |
clang -S -emit-llvm mycode.c -o mycode.ll | Generate LLVM IR (text) |
clang -c -emit-llvm mycode.c -o mycode.bc | Generate bitcode (binary) |
clang -O2 mycode.c | Optimized compilation |
clang -Xclang -emit-codegen-only mycode.c | Stop after codegen |
clang -print-search-dirs | Show include/library paths |
opt -O2 input.ll -o output.ll | Apply O2 optimizations |
opt -mem2reg input.ll | Promote allocas to registers |
opt -passes='dce,simplifycfg' input.ll | Run specific passes (LLVM 14+) |
opt -help | grep pass | List available passes |
opt -passes=dot-cfg input.ll | Emit control flow graph (.dot files)
opt -stats input.ll | Print optimization statistics |
llc input.ll -o output.s | Generate assembly |
llc -march=x86-64 input.ll | Target x86-64 |
llc -march=arm input.ll | Target ARM |
llc -O2 input.ll | Optimize during codegen |
llc -view-dag-combine1-dags | Visualize DAG |
llc -print-machineinstrs | Print Machine IR |
i1, i8, i16, i32, i64 | Integer types (bits) |
float, double | Floating point (32, 64 bit) |
ptr | Untyped pointer (LLVM 15+) |
type* | Typed pointer (legacy) |
[N x type] | Array of N elements |
{ type1, type2 } | Struct/Record |
%var = alloca i32 | Allocate on stack |
store i32 %val, ptr %var | Write to memory |
%val = load i32, ptr %var | Read from memory |
%res = add i32 %a, %b | Integer add |
%res = getelementptr [10 x i32], ptr %arr, i64 %idx | Calculate address |
call i32 @func(i32 %arg) | Function call |
llvm-dis mycode.bc -o mycode.ll | Disassemble bitcode |
llvm-as mycode.ll -o mycode.bc | Assemble to bitcode |
llvm-config --cxxflags --libs | Linking flags |
llvm-objdump -d output.o | Disassemble object file
llc input.ll -print-after-all | Print IR after each pass |
opt -time-passes input.ll | Measure pass runtimes |
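The IR reference above combines into complete functions. A hand-written sketch that sums an array, exercising `phi`, `getelementptr`, `icmp`, and both branch forms (function and value names are illustrative; check it with `llvm-as` or `opt -passes=verify`):

```llvm
; sum.ll — returns arr[0] + ... + arr[n-1]; uses modern opaque ptr syntax
define i32 @sum(ptr %arr, i32 %n) {
entry:
  %enter = icmp sgt i32 %n, 0
  br i1 %enter, label %loop, label %exit

loop:                                               ; SSA loop-carried values via phi
  %i   = phi i32 [ 0, %entry ], [ %i.next,   %loop ]
  %acc = phi i32 [ 0, %entry ], [ %acc.next, %loop ]
  %idx  = sext i32 %i to i64
  %slot = getelementptr i32, ptr %arr, i64 %idx     ; address math only, no memory access
  %val  = load i32, ptr %slot
  %acc.next = add i32 %acc, %val
  %i.next   = add i32 %i, 1
  %again = icmp slt i32 %i.next, %n
  br i1 %again, label %loop, label %exit

exit:
  %result = phi i32 [ 0, %entry ], [ %acc.next, %loop ]
  ret i32 %result
}
```

Feed it to `llc` and to `opt -O2` and compare: the vectorizer will turn this scalar loop into something dramatically different.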
sudo apt-get install -y cmake ninja-build clang lld
git clone https://github.com/llvm/llvm-project.git
cd llvm-project/llvm && mkdir ../build && cd ../build
cmake -G Ninja -DCMAKE_BUILD_TYPE=Release -DLLVM_ENABLE_PROJECTS="clang" ../llvm
ninja -j$(nproc) && sudo ninja install
clang --version
brew install cmake ninja llvm
OR build from source:
git clone https://github.com/llvm/llvm-project.git
cd llvm-project && mkdir build && cd build
cmake -G Ninja -DCMAKE_BUILD_TYPE=Release -DLLVM_ENABLE_PROJECTS="clang" ../llvm
ninja -j$(sysctl -n hw.ncpu) && sudo ninja install
Install Visual Studio 2019+ with C++ tools
git clone https://github.com/llvm/llvm-project.git
cd llvm-project && mkdir build && cd build
cmake -G "Visual Studio 17 2022" -A x64 -DCMAKE_BUILD_TYPE=Release -DLLVM_ENABLE_PROJECTS="clang" ../llvm
cmake --build . --config Release
echo 'int main() { return 42; }' > test.c
clang -S -emit-llvm test.c -o test.ll
opt -O2 test.ll -o test.opt.ll
llc test.opt.ll -o test.s
clang test.s -o test && ./test
git clone https://github.com/llvm/llvm-project.git && cd llvm-project && mkdir build && cd build
cmake -G Ninja -DCMAKE_BUILD_TYPE=Release ../llvm && ninja && ninja install

- Add `-DCMAKE_CXX_FLAGS_RELEASE="-O3"` for faster builds.
- Use `llvm-config --cxxflags --ldflags --libs core` to get linking flags, or use CMake's `find_package(LLVM REQUIRED CONFIG)`.
- Run `opt -O2 input.ll -o output.ll` for the O2 pipeline, or `opt -passes='mem2reg,dce' input.ll` (LLVM 14+) for specific passes.
- Run `clang -S -emit-llvm mycode.c -o mycode.ll` (human-readable) or `clang -c -emit-llvm mycode.c -o mycode.bc` (bitcode).
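The `find_package(LLVM REQUIRED CONFIG)` route can look like this minimal `CMakeLists.txt` for an out-of-tree pass plugin (a sketch; `MyPass` and the source file name are placeholders):

```cmake
# Minimal out-of-tree pass plugin build. Assumes an installed LLVM whose
# LLVMConfig.cmake is discoverable (set LLVM_DIR if it isn't).
cmake_minimum_required(VERSION 3.20)
project(MyPass)

find_package(LLVM REQUIRED CONFIG)
message(STATUS "Found LLVM ${LLVM_PACKAGE_VERSION} in ${LLVM_DIR}")

include_directories(${LLVM_INCLUDE_DIRS})
separate_arguments(LLVM_DEFINITIONS_LIST NATIVE_COMMAND ${LLVM_DEFINITIONS})
add_definitions(${LLVM_DEFINITIONS_LIST})

# MODULE: the plugin resolves LLVM symbols from the opt binary at load
# time, so we deliberately do not link LLVM libraries here.
add_library(MyPass MODULE MyPass.cpp)

# LLVM is typically built without RTTI; match it or you'll get link errors.
set_target_properties(MyPass PROPERTIES COMPILE_FLAGS "-fno-rtti")
```

Then `opt --load-pass-plugin=./libMyPass.so -passes="strength-reduce" input.ll` runs your pass, matching the test command given in the pass-writing section.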