v0.5.0 - The compile pipeline
Released May 5, 2026.
When you run python script.py, the interpreter does a lot of
work before the first bytecode executes. It parses the source
into an AST. It validates the AST (rejecting things the grammar
permits but the language forbids, like True = 1). It runs the
symtable resolver, which decides for every name whether it's
local, free, cell, global, or a class attribute. It runs codegen
to turn the resolved AST into a flat instruction sequence. It
runs the flowgraph optimizer, which folds constants, threads
jumps, and strips unreachable blocks. Finally, it assembles the
optimized sequence into a code object with a real bytecode
buffer, a line table, an exception table, and a localsplus
layout.
Every one of those stages is its own world. CPython's
Python/compile.c historically rolled them all together; the
3.12 cleanup split them into separate files (Python/codegen.c,
Python/flowgraph.c, Python/assemble.c,
Python/instruction_sequence.c). The new structure makes the
pipeline legible: each stage takes a well-typed input and
produces a well-typed output.
v0.5.0 ports that pipeline. After this release, a Go caller
can hand compile.Compile a parsed AST module and get back a
real Code object. The Code object carries a real bytecode
buffer, a real const table (with int-int constant folding
applied), a real line table in PEP 626 format, a real exception
table in CPython's 3.11+ unwinder format, and a real co_localsplus
flattening of locals plus cells plus free vars.
The interpreter to run that Code object lands in v0.6. The
parser that feeds compile.Compile from source lands across v0.5.5
and v0.6. For v0.5 the gate uses hand-built AST modules and pins the
disassembly text against a checked-in golden corpus.
Highlights
Three themes anchor this release.
A real, optimizing compile pipeline
The pipeline is five stages.
// 1. The AST has already been built (or hand-constructed for tests).
mod := buildAST(`a = 1 + 2`)
// 2. Validate. Rejects parseable but invalid programs.
if err := ast.Validate(mod); err != nil { /* SyntaxError */ }
// 3. Resolve symbols. Every name gets a scope.
st, _ := symtable.Build(mod, filename, futureFlags)
// 4. Codegen. AST + symtable becomes an instruction sequence.
seq, _ := compile.Codegen(mod, st, optimize)
// 5. Flowgraph optimization, then assemble. Sequence becomes Code.
cfg, _ := compile.FromSequence(seq)
cfg.Optimize()
code, _ := compile.Assemble(cfg, filename)
// `a = 1 + 2` now lives as `LOAD_CONST 3 / STORE_NAME a` in
// code.bytecode. The int-int folder collapsed the addition at
// compile time, the same way CPython does.
Each step is a 1:1 port of the matching CPython source file. The
pipeline is byte-shape compatible with CPython's: same co_flags
encoding, same line table format, same exception table format,
same co_localsplus layout, same const-dedup rules including the
float bit-pattern keying that makes NaN-safe dedup work.
Disassembly text as the gate
When you don't have an interpreter yet (we don't, until v0.6),
how do you verify your compile pipeline is correct? The answer
CPython uses for its own development is dis.dis: render the
Code object as a one-line-per-instruction listing and read the
listing.
We ported dis.dis and built our gate on top of it.
0 RESUME 0
2 LOAD_CONST 3
4 STORE_NAME a
6 LOAD_CONST None
8 RETURN_VALUE
Ten checked-in .golden files pin the disassembly text for ten
representative modules (empty_module, simple_assign,
binary_add, load_after_store, if_pass, while_pass,
def_add_one, async_def_pass, class_pass, type_alias). If
codegen drifts by one byte, the golden test fails. If the
flowgraph optimizer stops folding 1 + 2, the
binary_add.golden file changes and the test fails. Updating
goldens is opt-in (go test ./v05test/ -update -run TestGolden)
and CI never passes -update.
This is the same approach CPython uses for its own
test-suite-as-spec: Lib/test/test_dis.py pins disassembly text
for many of the same shapes we cover. Our .golden files are
the same idea, captured as separate files for diff legibility.
Per-pattern Match support, end to end
PEP 634 (structural pattern matching) added the match statement
in Python 3.10. The codegen for it is its own subsystem because
patterns aren't expressions: they have their own grammar
(MatchValue, MatchSingleton, MatchSequence, MatchMapping,
MatchClass, MatchStar, MatchAs, MatchOr) and their own emit
strategy (set up a guard label, walk the pattern, jump to the
next case on mismatch).
We ported the full pattern panel.
match point:
case (0, 0):
return "origin"
case (x, 0):
return f"x-axis at {x}"
case (0, y):
return f"y-axis at {y}"
case _:
return "elsewhere"
The codegen for this walks every pattern kind:
- (0, 0) is a MatchSequence of two MatchValue subpatterns.
- (x, 0) is a MatchSequence with one MatchAs and one MatchValue.
- _ is a MatchAs with no name.
Each pattern emits the matching instruction sequence (with labels, jumps, and capture stores) that CPython would emit. The result rounds through flowgraph and assemble like any other AST.
What's new
The full breakdown by package.
ast/
Everything AST-shaped. Ports the AST node definitions plus the three AST processors CPython ships (validate, preprocess, unparse).
- asdl.go plus the generated nodes_gen.go from Parser/Python.asdl. The ASDL grammar is the source of truth for what nodes exist, what unions they belong to, and what fields they carry. We pulled Python.asdl from the CPython 3.14 tree and ran our Go-targeted generator against it, the same way CPython runs Parser/asdl_c.py against the same file to produce Python/Python-ast.c.
- validate.go plus validate_panel.go from Python/ast.c. The AST validator runs after parsing and before the rest of the pipeline. It checks position sanity (line and column numbers monotonic), rejects forbidden identifiers (None, True, False), enforces the Constant kind whitelist (only ints, floats, complex, bytes, str, None, True, False, Ellipsis can be a Constant value), enforces comprehension-shape rules (at least one generator clause), enforces expr_context consistency for assignment targets, and validates the eight match-pattern kinds.
- preprocess.go from Python/ast_preprocess.c. The AST preprocessor runs after validation and before symtable. It does the small rewrites CPython performs at compile time: PEP 765 finally-block control-flow checks (warning when return/break/continue escapes a finally), the string % tuple printf-format fold (replacing "%s" % (x,) with the equivalent BinOp at compile time), Name("__debug__") substitution with the optimize-level Constant, MatchValue / MatchMapping numeric folds, -OO docstring removal, and the PEP 563 annotation skip.
- unparse.go from Python/ast_unparse.c. The reverse direction: turn an AST back into source. Used by ast.unparse, by the repr of certain AST nodes, and by our golden tests when we want to print an AST round-trip alongside the disassembly. Handles operator-precedence parenthesization, the 1e309 rendering of infinity (because inf is not a literal), and f-string / t-string round-trips.
future/
Port of Python/future.c. Detects from __future__ import
statements at the top of a module and sets the matching feature
flags. The flags then route into codegen and influence how
certain forms compile.
The flags we cover: Annotations (PEP 563 stringized
annotations), BarryAsBdfl (PEP 401's April Fools != rewrite, which
is real and shipped), division, absolute_import,
print_function, unicode_literals, nested_scopes,
with_statement, generators. The earlier flags are all on by
default in 3.14, but from __future__ import nested_scopes is
still legal syntax and we still parse it.
SyntaxError strings for misplaced __future__ imports
(after a non-future statement, after a docstring with code
between) are preserved verbatim from CPython.
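The placement rule itself is small: a __future__ import is legal only at the top of the module, after an optional docstring and other future imports. A toy sketch of the check (Stmt and checkFuturePlacement are illustrative names, not the port's types; the error string is CPython's actual message):

```go
package main

import "fmt"

// Stmt is a toy statement tag for illustrating the placement rule.
type Stmt struct {
	IsDocstring bool
	IsFuture    bool
}

func checkFuturePlacement(body []Stmt) error {
	i := 0
	if i < len(body) && body[i].IsDocstring {
		i++ // a leading docstring is allowed
	}
	for ; i < len(body) && body[i].IsFuture; i++ {
	} // consume the leading future-import block
	for ; i < len(body); i++ {
		if body[i].IsFuture {
			return fmt.Errorf("from __future__ imports must occur at the beginning of the file")
		}
	}
	return nil
}

func main() {
	ok := []Stmt{{IsDocstring: true}, {IsFuture: true}, {}}
	bad := []Stmt{{}, {IsFuture: true}}
	fmt.Println(checkFuturePlacement(ok) == nil)  // true
	fmt.Println(checkFuturePlacement(bad) != nil) // true
}
```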
symtable/
Full port of Python/symtable.c. The symtable is where every
name in a program gets its scope decided.
- Build(mod, filename, futureFlags) (*Symtable, error) walks the AST, opens one Entry per scope (module, function, class, lambda, comprehension), and registers every binding (x = 1, def f, class C, function arg, import x, except as e, with as e, for x in, walrus assignment) plus every reference. After Build, every name in the program has a flag set on its enclosing entry.
- analyze.go runs the post-build resolution pass. Free variables bubble up to the nearest enclosing scope that binds them. Cells get pulled down where needed. Comprehensions inline into their enclosing function when safe (PEP 709, the 3.12 inlining work). Class scopes are special-cased: a class body's locals don't participate in the closure chain for nested functions defined inside it.
- mangle.go implements name mangling for class private attributes. __name becomes _ClassName__name inside a class body, and the mangle has to run early enough that the symtable records the mangled form, not the source form.
- The errors panel covers every CPython diagnostic the symtable raises: assignment to free variable, duplicate global / nonlocal declarations, walrus inside class body, named-expr inside iterable in a comprehension, async/await placement in non-async-def scope, and a dozen others. Each error has its text preserved byte for byte from the C source.
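The mangling rule is mechanical enough to sketch. This mirrors CPython's documented behavior (dunders and names without a leading __ are untouched; leading underscores are stripped from the class name); the Go function is illustrative, not the port's mangle.go:

```go
package main

import (
	"fmt"
	"strings"
)

// mangle applies CPython's private-name mangling: __name inside
// class C becomes _C__name. Names ending in two underscores and
// names not starting with two underscores are left alone.
func mangle(className, name string) string {
	if !strings.HasPrefix(name, "__") || strings.HasSuffix(name, "__") {
		return name
	}
	if strings.Contains(name, ".") { // dotted import paths are never mangled
		return name
	}
	stripped := strings.TrimLeft(className, "_")
	if stripped == "" { // a class named entirely of underscores: no mangling
		return name
	}
	return "_" + stripped + name
}

func main() {
	fmt.Println(mangle("Widget", "__secret"))  // _Widget__secret
	fmt.Println(mangle("_Widget", "__secret")) // _Widget__secret
	fmt.Println(mangle("Widget", "__init__"))  // __init__ (dunder, untouched)
}
```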
compile/
The compiler proper. Five files, each a 1:1 port of a CPython counterpart.
compile/instrseq.go
Port of Python/instruction_sequence.c. The shared instruction
sequence representation that codegen produces and flowgraph
consumes.
- Sequence. A list of Instr plus a label table.
- Instr. Opcode, oparg, location (line + column ranges).
- JumpTargetLabel. A label that resolves to an Instr offset during flowgraph processing.
- Addop, Insert, AddNested, ApplyLabelMap as the construction primitives codegen calls.
compile/codegen.go and the visitor panel
Port of Python/codegen.c. The visitor that walks the AST and
emits an instruction sequence per scope. The driver is a
Compiler struct holding a stack of Units, one per scope; each
Unit accumulates an instrseq.Sequence. When the visitor
descends into a nested scope, it pushes a Unit; when it exits,
the Unit becomes a nested Code object referenced from the outer
scope's const table.
We ship the visitors for:
- Module / Interactive / Expression as the three top-level module shapes.
- Statements. Pass, ExprStmt, Return, Assign, AugAssign, AnnAssign, Delete, Raise, Assert, Import, ImportFrom.
- Control flow. If, While, For, AsyncFor, Break, Continue.
- Definitions. FunctionDef, AsyncFunctionDef, ClassDef, Lambda.
- Context managers. With, AsyncWith.
- Exceptions. Try, TryStar (PEP 654 exception groups).
- Pattern matching. Match plus the eight pattern kinds (Value, Singleton, As, Sequence, Mapping, Class, Or, Star).
- Comprehensions. ListComp, SetComp, DictComp, GeneratorExp.
- TypeAlias via CALL_INTRINSIC_1 INTRINSIC_TYPEALIAS (PEP 695).
- Expressions. BoolOp, BinOp, UnaryOp, Compare, Call, Constant, Name, Attribute, Subscript, Tuple, List, Set, Dict, Starred, Slice, JoinedStr, FormattedValue, NamedExpr, Yield, YieldFrom, Await, IfExp.
- Assignment targets. Name, Attribute, Subscript, Tuple / List unpack, Starred (UNPACK_SEQUENCE / UNPACK_EX).
Each visitor is a direct translation of the matching function in
Python/codegen.c. We kept the function names aligned (a
visitor named codegen_visit_stmt_If in C becomes
codegen.visitStmtIf in Go) so cross-referencing the port
against the source is a name lookup, not a structural search.
compile/flowgraph.go
Port of Python/flowgraph.c. The CFG-driven optimizer.
The flow is:
- FromSequence. Convert an instrseq.Sequence to a flowgraph.CFG of BasicBlocks, splitting at labels and after terminator opcodes.
- Optimization passes, run to fixed point:
  - Int-int BINARY_OP constant folding (1 + 2 -> 3 at compile time).
  - Jump threading (jump-to-jump shortened to direct jump).
  - Conditional-jump propagation (jump-if-true with a known true target eliminates the conditional).
  - Unreachable-block elimination via DFS reachability, with exception handler labels pinned as roots so a handler reachable only through a raise still survives.
  - Dead-code elimination after unconditional terminators.
  - Redundant-NOP compaction.
- ToSequence. Convert the optimized CFG back to a flat Sequence for the assembler.
- Stackdepth analysis. A forward linear scan over the sequence using a hand-written effect table. Each opcode contributes a per-side delta; the running max is co_stacksize.
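The forward scan is a few lines. A toy version with a hand-written effect table (the opcode names are real; the deltas and the scan itself are illustrative, not the port's table):

```go
package main

import "fmt"

// Net stack delta per opcode: pushes minus pops.
var effect = map[string]int{
	"RESUME":       0,
	"LOAD_CONST":   +1,
	"LOAD_NAME":    +1,
	"BINARY_OP":    -1, // pops two operands, pushes one result
	"STORE_NAME":   -1,
	"RETURN_VALUE": -1,
}

// stackDepth tracks the running depth; the maximum is co_stacksize.
func stackDepth(ops []string) int {
	depth, max := 0, 0
	for _, op := range ops {
		depth += effect[op]
		if depth > max {
			max = depth
		}
	}
	return max
}

func main() {
	// a = 1 + 2 without folding: two pushes meet at BINARY_OP.
	fmt.Println(stackDepth([]string{
		"RESUME", "LOAD_CONST", "LOAD_CONST", "BINARY_OP", "STORE_NAME",
		"LOAD_CONST", "RETURN_VALUE",
	})) // 2
}
```

A linear scan like this is exactly what breaks down on exception handlers and comprehension back-edges, which is why the CFG-based analyzer is tracked as a follow-up.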
A few optimizer passes from flowgraph.c are not in this drop:
swaptimize (the SWAP-based instruction reordering), super-instruction
fusion (LOAD_FAST_LOAD_FAST and friends), the LOAD_FAST ref-stack
mechanism, cold-block hoisting, the CFG-based stack-depth analyzer,
and full pseudo-op lowering. They're tracked in the 1627 spec and
land incrementally; the v0.5 baseline is enough for the v0.6 VM.
compile/assemble.go
Port of Python/assemble.c. The sequence-to-Code converter.
- EXTENDED_ARG widening. A 32-bit oparg needs up to three EXTENDED_ARG prefixes. The assembler inserts them where needed.
- PEP 626 line table. The compact byte format that maps bytecode offsets to source lines. We emit the short form, the one-line form, the long form, the no-location form, and the no-column varint form, matching CPython's _PyCode_LineNumberFromArray exactly.
- Exception table. The 6-bit varint encoding of (start, end, target, depth, lasti) entries, introduced with 3.11's zero-cost exceptions, that the unwinder walks to find a handler.
- Type-keyed const dedup. Two int(3) constants share one slot. Two float('nan') constants share one slot, keyed by bit pattern (because nan != nan, naive equality dedup would produce two slots). Strings are dedup'd by content, bytes by content, tuples recursively.
- co_qualname walk over the unit stack. A nested function's co_qualname is the dotted path through enclosing scopes (Outer.method.<locals>.inner).
- Full co_flags assembly. CoOptimized, CoNewLocals, CoVarargs, CoVarkeywords, CoNested, CoGenerator, CoNoFree, CoCoroutine, CoMethod. The flags are set during codegen and carried through to the Code object.
- Flat 3.11+ co_localsplus layout. Locals plus cells plus free vars in one flat array, with co_localspluskinds carrying the per-slot kind. This is the layout the v0.6 VM expects.
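EXTENDED_ARG widening itself is a small loop: emit one prefix per nonzero high byte, from most significant down. A sketch (the EXTENDED_ARG byte value here is a placeholder, not CPython's real opcode number):

```go
package main

import "fmt"

const opExtendedArg = 144 // placeholder byte standing in for EXTENDED_ARG

// widen emits up to three EXTENDED_ARG prefixes so a 32-bit oparg
// fits the one-byte arg slot of the base instruction. Once a nonzero
// high byte is emitted, lower zero bytes must be emitted too.
func widen(opcode byte, oparg uint32) []byte {
	var out []byte
	for shift := 24; shift > 0; shift -= 8 {
		if b := byte(oparg >> shift); b != 0 || len(out) > 0 {
			out = append(out, opExtendedArg, b)
		}
	}
	return append(out, opcode, byte(oparg))
}

func main() {
	fmt.Println(widen(100, 5))   // [100 5]: small arg, no prefix
	fmt.Println(widen(100, 300)) // [144 1 100 44]: 300 = 1<<8 | 44
}
```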
compile/compiler.go
The top-level Compile(mod, filename, optimize) (*Code, error)
driver. Walks the AST through symtable, codegen, flowgraph,
assemble, and returns the Code. Caller-visible entry point.
compile/dis.go
Port of Lib/dis.py. Renders a Code object as a
one-line-per-instruction listing. Reconstructs the 32-bit oparg
view by reading EXTENDED_ARG prefixes, then emits a single line
per logical instruction. Recurses into nested Code objects
attached via co_consts so a function's disassembly includes
its inner functions.
The output format is the same one CPython's dis.dis uses, which
means the golden corpus we use for testing happens to also be a
useful debugging tool: disas(code) produces text a CPython
developer would recognize.
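The oparg reconstruction dis performs is the inverse of the assembler's widening: each EXTENDED_ARG prefix contributes 8 high bits to the next real instruction's arg. A sketch (placeholder opcode value, illustrative decoder):

```go
package main

import "fmt"

const opExtendedArg = 144 // placeholder byte standing in for EXTENDED_ARG

// decode folds EXTENDED_ARG prefixes into the logical (opcode, oparg)
// pairs a disassembler would print, one pair per logical instruction.
func decode(code []byte) [][2]uint32 {
	var out [][2]uint32
	var ext uint32
	for i := 0; i+1 < len(code); i += 2 {
		op, arg := uint32(code[i]), uint32(code[i+1])
		if op == opExtendedArg {
			ext = ext<<8 | arg // accumulate high bits for the next opcode
			continue
		}
		out = append(out, [2]uint32{op, ext<<8 | arg})
		ext = 0
	}
	return out
}

func main() {
	fmt.Println(decode([]byte{144, 1, 100, 44})) // [[100 300]]
}
```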
marshal/
Skeleton port of Python/marshal.c. Version-5 wire format with
encoder/decoder dispatch and Dump / Load round-trips on the
constant types used by co_consts (int, float, complex, str,
bytes, tuple, None, True, False, Ellipsis).
The code-object arm (TYPE_CODE, REF dedup for the const dedup
that survives marshaling, byte parity for the line table and
exception table) lands in v0.8 alongside the import system, when
we need to load a .pyc file from disk. For v0.5 the skeleton is
enough to round-trip the const table in tests.
tokenize/
Skeleton wrapper around Python/Python-tokenize.c. The token
type table is generated from Grammar/Tokens,
Include/internal/pycore_token.h, and Lib/token.py via
tools/tokens_go, so the numeric values pin exactly to CPython.
The Iter / Token surface and the lexer state machine arrive
in v0.5.5 / v0.6. The v0.5 skeleton exists so other packages can
import the token-name constants without a circular dependency on
the lexer.
Why we built it this way
The pipeline shape is not negotiable: CPython's compile.c split
into codegen / flowgraph / assemble in 3.12, and our port follows
the same split. A few decisions inside that constraint are worth
calling out.
Why a golden corpus for the gate
Disassembly text is a tighter contract than "the bytecode buffer is equal". A bytecode buffer can be equal byte-for-byte to a captured reference and still ship with the wrong line table, the wrong exception table, the wrong const dedup. The disassembly text exposes all of it: opcode names, oparg values, line numbers on the left margin, const indices that resolve to formatted values.
Pinning ten goldens, one per characteristic shape, means we catch not just the obvious "you changed the codegen" failures but the subtle "you changed the const dedup" or "you broke EXTENDED_ARG widening" failures. Each .golden file is a tiny diff against itself; if it changes, the change is intentional and the test update is explicit.
Why the int-int folder runs to fixed point
1 + 2 + 3 folds to 6 only if the folder runs twice. The first
pass folds 1 + 2 to 3, leaving 3 + 3 on the AST. The second
pass folds that to 6. CPython runs the folder in a loop until
nothing changes, and we do too. The fixed-point convergence is
worth a few milliseconds at compile time to avoid emitting redundant
BINARY_OPs the VM would then dispatch on at runtime.
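The fixed-point loop is easy to see on a toy sequence (illustrative Instr type and folder, not the port's flowgraph code):

```go
package main

import "fmt"

// Instr is a toy instruction: LOAD_CONST carries a value; BINARY_OP adds.
type Instr struct {
	Op  string
	Val int
}

// foldOnce rewrites one LOAD_CONST / LOAD_CONST / BINARY_OP window
// per pass, reporting whether anything changed.
func foldOnce(seq []Instr) ([]Instr, bool) {
	for i := 0; i+2 < len(seq); i++ {
		if seq[i].Op == "LOAD_CONST" && seq[i+1].Op == "LOAD_CONST" && seq[i+2].Op == "BINARY_OP" {
			folded := Instr{"LOAD_CONST", seq[i].Val + seq[i+1].Val}
			out := append(append([]Instr{}, seq[:i]...), folded)
			return append(out, seq[i+3:]...), true
		}
	}
	return seq, false
}

func main() {
	// 1 + 2 + 3: pass one folds 1+2, pass two folds 3+3.
	seq := []Instr{
		{"LOAD_CONST", 1}, {"LOAD_CONST", 2}, {"BINARY_OP", 0},
		{"LOAD_CONST", 3}, {"BINARY_OP", 0},
	}
	passes := 0
	for changed := true; changed; {
		seq, changed = foldOnce(seq)
		if changed {
			passes++
		}
	}
	fmt.Println(passes, seq) // 2 [{LOAD_CONST 6}]
}
```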
Why we ship symtable as one package, not split per pass
symtable.c is one file in CPython and we kept it as one
package. The Build pass and the analyze pass share a lot of
state (the entry stack, the symbol flags), and splitting them
would have meant exposing that state across a package boundary.
Single package, two files: symtable/symtable.go for the public
surface, symtable/analyze.go for the resolution pass, plus
symtable/mangle.go for the name-mangling helper.
Why we ship marshal as a skeleton
marshal.c is the entry point for loading .pyc files, which
we don't do until v0.8. Shipping the full code-object arm now
would have meant defining the byte format for the line table and
the exception table before the assembler that produces them was
working. We shipped enough marshal to round-trip the constant
types (because tests want to serialize const tables) and deferred
the code-object arm until the import system needs it.
Where it lives
The new packages:
- ast/ for AST nodes, validation, preprocess, and unparse.
- future/ for the __future__ import handler.
- symtable/ for the symbol resolver.
- compile/ for codegen, flowgraph, assemble, the compiler driver, and the disassembler.
- marshal/ for the wire format (skeleton).
- tokenize/ for the token-name constants (skeleton).
- v05test/ for the gate panel and the golden corpus.
The CPython sources we ported from:
- Parser/Python.asdl for the AST grammar.
- Python/ast.c for AST validation.
- Python/ast_preprocess.c for the preprocessor.
- Python/ast_unparse.c for the AST-to-source converter.
- Python/future.c for the __future__ flags.
- Python/symtable.c for the symtable.
- Python/instruction_sequence.c for the sequence layer.
- Python/codegen.c for the AST-to-sequence visitor.
- Python/flowgraph.c for the CFG optimizer.
- Python/assemble.c for the sequence-to-Code converter.
- Lib/dis.py for the disassembler.
- Python/marshal.c for the wire format.
- Python/Python-tokenize.c for the token-name table.
Compatibility
- Go: 1.26 or newer.
- CPython behavioral target: 3.14.0+.
The gate test panel (v05test/gate_test.go) pins these
structural invariants:
- Empty module compiles to RESUME 0 / LOAD_CONST None / RETURN_VALUE.
- x = 1 round-trips through symtable, codegen, flowgraph, assemble, and disassemble.
- a = 1 + 2 reaches the BINARY_OP path and the int-int folder collapses it to int(3) in co_consts.
- x = 1; x round-trips a STORE_NAME / LOAD_NAME pair.
- if x: pass and while x: pass exercise POP_JUMP_IF_FALSE.
- def f(x): return x + 1 produces a nested code object with MAKE_FUNCTION and LOAD_FAST.
- async def f(): pass sets CoCoroutine on the inner code.
The golden test (v05test/golden_test.go, spec 1629) pins the
disassembly text against ten checked-in .golden files. Refresh
via go test ./v05test/ -update -run TestGolden; CI never passes
-update.
Two gate tests are wired but skipped pending follow-up work:
TestGateTryExcept and TestGateComprehension need the
CFG-based stack-depth analyzer (handler-entry seeding and
comprehension back-edge handling); they unblock when the
analyzer lands.
Out of scope
Deferred to later releases.
- CPython byte-equal marshal parity for Code objects. Lands in v0.8 with the import system. Until then disassembly text is the gate, which is a stronger contract anyway.
- CFG-based stack-depth analyzer. Current pass is a forward linear scan. The CFG version handles handler entries and comprehension back-edges; it lands as a follow-up. Tracked in the 1627 spec.
- Swaptimize, super-instructions, LOAD_FAST ref-stack, cold-block hoisting, full pseudo-op lowering. The remaining flowgraph optimizations. Tracked in the 1627 spec. Land as the VM matures.
- PEG parser. The full parser is its own subsystem. v0.5 exercises the AST-to-Code path on hand-built modules; v0.5.5 layers the lexer plus parser scaffolding; v0.6 wires the generated parser table.
What's next
v0.5.5 (next release) layers the lexer on top of what landed here. The lexer turns Python source into a token stream the pegen parser will consume. The string-literal escape decoder and the f-string / t-string scanner land alongside. The SyntaxError text panel goes in so a parse failure renders byte-for-byte identical to CPython's.
v0.6 turns on the VM. Every Code object you assemble in v0.5
becomes executable bytecode. The runtime gets a frame stack, an
evaluator loop, real LOAD_FAST / STORE_FAST / BINARY_OP
dispatch, and the call protocol. The exception machinery from
v0.3 gets connected: when bytecode raises, the unwinder walks the
exception table the assembler emitted today.
From v0.6 forward, every line of Python source rides the pipeline that landed today. The shapes don't change; the inputs get bigger.