v0.5.0 - The compile pipeline

Released May 5, 2026.

When you run python script.py, the interpreter does a lot of work before the first bytecode executes. It parses the source into an AST. It validates the AST (rejecting things the grammar permits but the language forbids, like True = 1). It runs the symtable resolver, which decides for every name whether it's local, free, cell, global, or a class attribute. It runs codegen to turn the resolved AST into a flat instruction sequence. It runs the flowgraph optimizer, which folds constants, threads jumps, and strips unreachable blocks. Finally, it assembles the optimized sequence into a code object with a real bytecode buffer, a line table, an exception table, and a localsplus layout.

Every one of those stages is its own world. CPython's Python/compile.c historically rolled them all together; the 3.12 cleanup split them into separate files (Python/codegen.c, Python/flowgraph.c, Python/assemble.c, Python/instruction_sequence.c). The new structure makes the pipeline legible: each stage takes a well-typed input and produces a well-typed output.

v0.5.0 ports that pipeline. After this release, a Go caller can hand compile.Compile a parsed AST module and get back a real Code object. The Code object carries a real bytecode buffer, a real const table (with int-int constant folding applied), a real line table in PEP 626 format, a real exception table in PEP 657 format, and a real co_localsplus flattening of locals plus cells plus free vars.

The interpreter to run that Code object lands in v0.6. The parser to feed compile.Compile from source lands across v0.5.5 and v0.6. For v0.5 the gate uses hand-built AST modules and pins the disassembly text against a checked-in golden corpus.

Highlights

Three themes anchor this release.

A real, optimizing compile pipeline

The pipeline is five stages.

// 1. The AST has already been built (or hand-constructed for tests).
mod := buildAST(`a = 1 + 2`)

// 2. Validate. Rejects programs that parse but are invalid.
if err := ast.Validate(mod); err != nil { /* SyntaxError */ }

// 3. Resolve symbols. Every name gets a scope.
st, _ := symtable.Build(mod, filename, futureFlags)

// 4. Codegen. AST + symtable becomes an instruction sequence.
seq, _ := compile.Codegen(mod, st, optimize)

// 5. Flowgraph optimization, then assemble. Sequence becomes Code.
cfg, _ := compile.FromSequence(seq)
cfg.Optimize()
code, _ := compile.Assemble(cfg, filename)

// `a = 1 + 2` now lives as `LOAD_CONST 3 / STORE_NAME a` in
// code.bytecode. The int-int folder collapsed the addition at
// compile time, the same way CPython does.

Each step is a 1:1 port of the matching CPython source file. The pipeline is byte-shape compatible with CPython's: same co_flags encoding, same line table format, same exception table format, same co_localsplus layout, same const-dedup rules including the float bit-pattern keying that makes NaN-safe dedup work.

Disassembly text as the gate

When you don't have an interpreter yet (we don't, until v0.6), how do you verify your compile pipeline is correct? The answer CPython uses for its own development is dis.dis: render the Code object as a one-line-per-instruction listing and read the listing.

We ported dis.dis and built our gate on top of it.

0 RESUME 0
2 LOAD_CONST 3
4 STORE_NAME a
6 LOAD_CONST None
8 RETURN_VALUE

Ten checked-in .golden files pin the disassembly text for ten representative modules (empty_module, simple_assign, binary_add, load_after_store, if_pass, while_pass, def_add_one, async_def_pass, class_pass, type_alias). If codegen drifts by one byte, the golden test fails. If the flowgraph optimizer stops folding 1 + 2, the binary_add.golden file changes and the test fails. Updating goldens is opt-in (go test ./v05test/ -update -run TestGolden) and CI never passes -update.

This is the same approach CPython uses for its own test-suite-as-spec: Lib/test/test_dis.py pins disassembly text for many of the same shapes we cover. Our .golden files are the same idea, captured as separate files for diff legibility.

Per-pattern Match support, end to end

PEP 634 (structural pattern matching) added the match statement in Python 3.10. The codegen for it is its own subsystem because patterns aren't expressions: they have their own grammar (MatchValue, MatchSingleton, MatchSequence, MatchMapping, MatchClass, MatchStar, MatchAs, MatchOr) and their own emit strategy (set up a guard label, walk the pattern, jump to the next case on mismatch).

We ported the full pattern panel.

match point:
case (0, 0):
return "origin"
case (x, 0):
return f"x-axis at {x}"
case (0, y):
return f"y-axis at {y}"
case _:
return "elsewhere"

The codegen for this walks every pattern kind:

  • (0, 0) is a MatchSequence of two MatchValue subpatterns.
  • (x, 0) is a MatchSequence with one MatchAs and one MatchValue.
  • _ is a MatchAs with no name.

Each pattern emits the matching instruction sequence (with labels, jumps, and capture stores) that CPython would emit. The result rounds through flowgraph and assemble like any other AST.

What's new

The full breakdown by package.

ast/

Everything AST-shaped. Ports the AST node definitions plus the three AST processors CPython ships (validate, preprocess, unparse).

  • asdl.go plus the generated nodes_gen.go from Parser/Python.asdl. The ASDL grammar is the source of truth for what nodes exist, what unions they belong to, and what fields they carry. We pulled Python.asdl from the CPython 3.14 tree and ran our Go-targeted generator against it, the same way CPython runs Parser/asdl_c.py against the same file to produce Python/Python-ast.c.
  • validate.go plus validate_panel.go from Python/ast.c. The AST validator runs after parsing and before the rest of the pipeline. It checks position sanity (line and column numbers monotonic), rejects forbidden identifiers (None, True, False), enforces the Constant kind whitelist (only ints, floats, complex, bytes, str, None, True, False, Ellipsis can be a Constant value), enforces comprehension-shape rules (at least one generator clause), enforces expr_context consistency for assignment targets, and validates the eight match-pattern kinds.
  • preprocess.go from Python/ast_preprocess.c. The AST preprocessor runs after validation and before symtable. It does the small rewrites CPython performs at compile time: PEP 765 finally-block control-flow checks (warning when return / break / continue escapes a finally), string % tuple printf-format fold (rewriting "%s" % (x,) into the equivalent JoinedStr at compile time), Name("__debug__") substitution with the optimize-level Constant, MatchValue / MatchMapping numeric folds, -OO docstring removal, PEP 563 annotation skip.
  • unparse.go from Python/ast_unparse.c. The reverse direction: turn an AST back into source. Used by ast.unparse, by the repr of certain AST nodes, and by our golden tests when we want to print an AST round-trip alongside the disassembly. Handles operator-precedence parenthesization, the 1e309 rendering of infinity (because inf is not a literal), and f-string / t-string round-trip.

future/

Port of Python/future.c. Detects from __future__ import statements at the top of a module and sets the matching feature flags. The flags then route into codegen and influence how certain forms compile.

The flags we cover: Annotations (PEP 563 stringized annotations), BarryAsBdfl (the PEP 401 April Fools flag that swaps != for <>, which is real and shipped), division, absolute_import, print_function, unicode_literals, nested_scopes, with_statement, generators. The earlier flags are all on by default in 3.14, but the syntax from __future__ import nested_scopes is still legal and we still parse it.

SyntaxError strings for misplaced __future__ imports (after a non-future statement, after a docstring with code between) are preserved verbatim from CPython.

symtable/

Full port of Python/symtable.c. The symtable is where every name in a program gets its scope decided.

  • Build(mod, filename, futureFlags) (*Symtable, error) walks the AST, opens one Entry per scope (module, function, class, lambda, comprehension), and registers every binding (x = 1, def f, class C, function arg, import x, except as e, with as e, for x in, walrus assignment) plus every reference. After Build, every name in the program has a flag set on its enclosing entry.
  • analyze.go runs the post-build resolution pass. Free variables bubble up to the nearest enclosing scope that binds them. Cells get pulled down where needed. Comprehensions inline into their enclosing function when safe (PEP 709, the 3.12 inlining work). Class scopes are special-cased: a class body's locals don't participate in the closure chain for nested functions defined inside it.
  • mangle.go implements name mangling for class private attributes. __name becomes _ClassName__name inside a class body, and the mangle has to run early enough that the symtable records the mangled form, not the source form.
  • The errors panel covers every CPython diagnostic the symtable raises: assignment to free variable, duplicate global / nonlocal declarations, walrus inside class body, named-expr inside iterable in a comprehension, async/await placement in non-async-def scope, and a dozen others. Each error has its text preserved byte for byte from the C source.

compile/

The compiler proper. Five files, each a 1:1 port of a CPython counterpart.

compile/instrseq.go

Port of Python/instruction_sequence.c. The shared instruction sequence representation that codegen produces and flowgraph consumes.

  • Sequence. A list of Instr plus a label table.
  • Instr. Opcode, oparg, location (line + column ranges).
  • JumpTargetLabel. A label that resolves to an Instr offset during flowgraph processing.
  • Addop, Insert, AddNested, ApplyLabelMap as the construction primitives codegen calls.

compile/codegen.go and the visitor panel

Port of Python/codegen.c. The visitor that walks the AST and emits an instruction sequence per scope. The driver is a Compiler struct holding a stack of Units, one per scope; each Unit accumulates an instrseq.Sequence. When the visitor descends into a nested scope, it pushes a Unit; when it exits, the Unit becomes a nested Code object referenced from the outer scope's const table.

We ship the visitors for:

  • Module / Interactive / Expression as the three top-level module shapes.
  • Statements. Pass, ExprStmt, Return, Assign, AugAssign, AnnAssign, Delete, Raise, Assert, Import, ImportFrom.
  • Control flow. If, While, For, AsyncFor, Break, Continue.
  • Definitions. FunctionDef, AsyncFunctionDef, ClassDef, Lambda.
  • Context managers. With, AsyncWith.
  • Exceptions. Try, TryStar (PEP 654 exception groups).
  • Pattern matching. Match plus the eight pattern kinds (Value, Singleton, As, Sequence, Mapping, Class, Or, Star).
  • Comprehensions. ListComp, SetComp, DictComp, GeneratorExp.
  • TypeAlias via CALL_INTRINSIC_1 INTRINSIC_TYPEALIAS (PEP 695).
  • Expressions. BoolOp, BinOp, UnaryOp, Compare, Call, Constant, Name, Attribute, Subscript, Tuple, List, Set, Dict, Starred, Slice, JoinedStr, FormattedValue, NamedExpr, Yield, YieldFrom, Await, IfExp.
  • Assignment targets. Name, Attribute, Subscript, Tuple / List unpack, Starred (UNPACK_SEQUENCE / UNPACK_EX).

Each visitor is a direct translation of the matching function in Python/codegen.c. We kept the function names aligned (a visitor named codegen_visit_stmt_If in C becomes codegen.visitStmtIf in Go) so cross-referencing the port against the source is a name lookup, not a structural search.

compile/flowgraph.go

Port of Python/flowgraph.c. The CFG-driven optimizer.

The flow is:

  1. FromSequence. Convert an instrseq.Sequence to a flowgraph.CFG of BasicBlocks, splitting at labels and after terminator opcodes.
  2. Optimization passes, run to fixed point:
    • Int-int BINARY_OP constant folding (1 + 2 -> 3 at compile time).
    • Jump threading (jump-to-jump shortened to direct jump).
    • Conditional-jump propagation (jump-if-true with a known true target eliminates the conditional).
    • Unreachable-block elimination via DFS reachability, with exception handler labels pinned as roots so a handler reachable only through a raise still survives.
    • Dead-code elimination after unconditional terminators.
    • Redundant-NOP compaction.
  3. ToSequence. Convert the optimized CFG back to a flat Sequence for the assembler.
  4. Stackdepth analysis. A forward linear scan over the sequence using a hand-written effect table. Each opcode contributes a signed stack delta; the running maximum of the depth is co_stacksize.

A few optimizer passes from flowgraph.c are not in this drop: swaptimize (the SWAP-based instruction reordering), super-instruction fusion (LOAD_FAST_LOAD_FAST and friends), the LOAD_FAST ref-stack mechanism, cold-block hoisting, the CFG-based stack-depth analyzer, and full pseudo-op lowering. They're tracked in the 1627 spec and land incrementally; the v0.5 baseline is enough for the v0.6 VM.

compile/assemble.go

Port of Python/assemble.c. The sequence-to-Code converter.

  • EXTENDED_ARG widening. A 32-bit oparg needs up to three EXTENDED_ARG prefixes. The assembler inserts them where needed.
  • PEP 626 line table. The compact byte format that maps bytecode offsets to source lines. We emit the short form, the one-line form, the long form, the no-location form, and the no-column varint form, matching CPython's _PyCode_LineNumberFromArray exactly.
  • PEP 657 exception table. The 6-bit varint encoding of (start, end, target, depth, lasti) entries that the unwinder walks to find a handler.
  • Type-keyed const dedup. Two int(3) constants share one slot. Two float('nan') constants share one slot, keyed by bit pattern (because nan != nan, naive equality dedup would produce two slots). Strings are dedup'd by content, bytes by content, tuples recursively.
  • co_qualname walk over the unit stack. A nested function's co_qualname is the dotted path through enclosing scopes (Outer.method.<locals>.inner).
  • Full co_flags assembly. CoOptimized, CoNewLocals, CoVarargs, CoVarkeywords, CoNested, CoGenerator, CoNoFree, CoCoroutine, CoMethod. The flags are set during codegen and carried through to the Code object.
  • Flat 3.11+ co_localsplus layout. Locals plus cells plus free vars in one flat array, with co_localspluskinds carrying the per-slot kind. This is the layout the v0.6 VM expects.

compile/compiler.go

The top-level Compile(mod, filename, optimize) (*Code, error) driver. Walks the AST through symtable, codegen, flowgraph, assemble, and returns the Code. Caller-visible entry point.

compile/dis.go

Port of Lib/dis.py. Renders a Code object as a one-line-per-instruction listing. Reconstructs the 32-bit oparg view by reading EXTENDED_ARG prefixes, then emits a single line per logical instruction. Recurses into nested Code objects attached via co_consts so a function's disassembly includes its inner functions.

The output format is the same one CPython's dis.dis uses, which means the golden corpus we use for testing happens to also be a useful debugging tool: disas(code) produces text a CPython developer would recognize.

marshal/

Skeleton port of Python/marshal.c. Version-5 wire format with encoder/decoder dispatch and Dump / Load round-trips on the constant types used by co_consts (int, float, complex, str, bytes, tuple, None, True, False, Ellipsis).

The code-object arm (TYPE_CODE, REF dedup for the const dedup that survives marshaling, byte parity for the line table and exception table) lands in v0.8 alongside the import system, when we need to load a .pyc file from disk. For v0.5 the skeleton is enough to round-trip the const table in tests.

tokenize/

Skeleton wrapper around Python/Python-tokenize.c. The token type table is generated from Grammar/Tokens, Include/internal/pycore_token.h, and Lib/token.py via tools/tokens_go, so the numeric values pin exactly to CPython.

The Iter / Token surface and the lexer state machine arrive in v0.5.5 / v0.6. The v0.5 skeleton exists so other packages can import the token-name constants without a circular dependency on the lexer.

Why we built it this way

The pipeline shape is not negotiable: CPython's compile.c split into codegen / flowgraph / assemble in 3.12, and our port follows the same split. A few decisions inside that constraint are worth calling out.

Why a golden corpus for the gate

Disassembly text is a tighter contract than "the bytecode buffer is equal". A bytecode buffer can be equal byte-for-byte to a captured reference and still ship with the wrong line table, the wrong exception table, the wrong const dedup. The disassembly text exposes all of it: opcode names, oparg values, line numbers on the left margin, const indices that resolve to formatted values.

Pinning ten goldens, one per characteristic shape, means we catch not just the obvious "you changed the codegen" failures but the subtle "you changed the const dedup" or "you broke EXTENDED_ARG widening" failures. Each .golden file is a tiny diff against itself; if it changes, the change is intentional and the test update is explicit.

Why the int-int folder runs to fixed point

1 + 2 + 3 folds to 6 only if the folder runs twice. The first pass folds 1 + 2 to 3, leaving 3 + 3 in the instruction stream. The second pass folds that to 6. CPython runs the folder in a loop until nothing changes, and we do too. The fixed-point convergence is worth a few milliseconds at compile time to avoid emitting redundant BINARY_OPs the VM would then dispatch on at runtime.

Why we ship symtable as one package, not split per pass

symtable.c is one file in CPython and we kept it as one package. The Build pass and the analyze pass share a lot of state (the entry stack, the symbol flags), and splitting them would have meant exposing that state across a package boundary. Single package, two files: symtable/symtable.go for the public surface, symtable/analyze.go for the resolution pass, plus symtable/mangle.go for the name-mangling helper.

Why we ship marshal as a skeleton

marshal.c is the entry point for loading .pyc files, which we don't do until v0.8. Shipping the full code-object arm now would have meant defining the byte format for the line table and the exception table before the assembler that produces them was working. We shipped enough marshal to round-trip the constant types (because tests want to serialize const tables) and deferred the code-object arm until the import system needs it.

Where it lives

The new packages:

  • ast/ for AST nodes, validation, preprocess, and unparse.
  • future/ for the __future__ import handler.
  • symtable/ for the symbol resolver.
  • compile/ for codegen, flowgraph, assemble, the compiler driver, and the disassembler.
  • marshal/ for the wire format (skeleton).
  • tokenize/ for the token-name constants (skeleton).
  • v05test/ for the gate panel and the golden corpus.

The CPython sources we ported from:

  • Parser/Python.asdl for the AST grammar.
  • Python/ast.c for AST validation.
  • Python/ast_preprocess.c for the preprocessor.
  • Python/ast_unparse.c for the AST-to-source converter.
  • Python/future.c for the __future__ flags.
  • Python/symtable.c for the symtable.
  • Python/instruction_sequence.c for the sequence layer.
  • Python/codegen.c for the AST-to-sequence visitor.
  • Python/flowgraph.c for the CFG optimizer.
  • Python/assemble.c for the sequence-to-Code converter.
  • Lib/dis.py for the disassembler.
  • Python/marshal.c for the wire format.
  • Python/Python-tokenize.c for the token-name table.

Compatibility

  • Go: 1.26 or newer.
  • CPython behavioral target: 3.14.0+.

The gate test panel (v05test/gate_test.go) pins these structural invariants:

  • Empty module compiles to RESUME 0 / LOAD_CONST None / RETURN_VALUE.
  • x = 1 round-trips through symtable, codegen, flowgraph, assemble, and disassemble.
  • a = 1 + 2 reaches the BINARY_OP path and the int-int folder collapses it to int(3) in co_consts.
  • x = 1; x round-trips a STORE_NAME / LOAD_NAME pair.
  • if x: pass and while x: pass exercise POP_JUMP_IF_FALSE.
  • def f(x): return x + 1 produces a nested code object with MAKE_FUNCTION and LOAD_FAST.
  • async def f(): pass sets CoCoroutine on the inner code.

The golden test (v05test/golden_test.go, spec 1629) pins the disassembly text against ten checked-in .golden files. Refresh via go test ./v05test/ -update -run TestGolden; CI never passes -update.

Two gate tests are wired but skipped pending follow-up work: TestGateTryExcept and TestGateComprehension need the CFG-based stack-depth analyzer (handler-entry seeding and comprehension back-edge handling); they unblock when the analyzer lands.

Out of scope

Deferred to later releases.

  • CPython byte-equal marshal parity for Code objects. Lands in v0.8 with the import system. Until then disassembly text is the gate, which is a stronger contract anyway.
  • CFG-based stack-depth analyzer. Current pass is a forward linear scan. The CFG version handles handler entries and comprehension back-edges; it lands as a follow-up. Tracked in the 1627 spec.
  • Swaptimize, super-instructions, LOAD_FAST ref-stack, cold-block hoisting, full pseudo-op lowering. The remaining flowgraph optimizations. Tracked in the 1627 spec. Land as the VM matures.
  • PEG parser. The full parser is its own subsystem. v0.5 exercises the AST-to-Code path on hand-built modules; v0.5.5 layers the lexer plus parser scaffolding; v0.6 wires the generated parser table.

What's next

v0.5.5 (next release) layers the lexer on top of what landed here. The lexer turns Python source into a token stream the pegen parser will consume. The string-literal escape decoder and the f-string / t-string scanner land alongside. The SyntaxError text panel goes in so a parse failure renders byte-for-byte identical to CPython's.

v0.6 turns on the VM. Every Code object you assemble in v0.5 becomes executable bytecode. The runtime gets a frame stack, an evaluator loop, real LOAD_FAST / STORE_FAST / BINARY_OP dispatch, and the call protocol. The exception machinery from v0.3 gets connected: when bytecode raises, the unwinder walks the exception table the assembler emitted today.

From v0.6 forward, every line of Python source rides the pipeline that landed today. The shapes don't change; the inputs get bigger.