1603. gopy roadmap
The port advances along a critical path: runtime state, then object model, then compiler, then VM, then import, then runtime polish. Each phase has a boot test and a release tag. Anything past v0.5 can interleave; the order before v0.5 is load-bearing.
Each phase lists:
- In scope: which cpython/Python/* files are ported in this phase.
- Out of scope: explicitly deferred to a later phase.
- Gate: the executable test that must pass before the next phase starts.
v0.0. Project scaffolding
Goal: An empty Go module that builds. No Python yet.
In scope:
- cmd/gopy/main.go. Prints version, exits 0.
- build/{version,platform,compiler,copyright}.go. Static strings (the four trivial getX.c files).
- go.mod, go.sum, basic CI (build + go vet).
Gate: go build ./... && go test ./... && gopy --version prints
gopy 0.0.0 (3.14.0+) [go1.22 darwin/arm64] or similar.
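A minimal sketch of how the banner in the gate could be assembled. The constant names are hypothetical (the real strings would come from the build/ package); only fmt and runtime from the Go standard library are used.

```go
package main

import (
	"fmt"
	"runtime"
)

// Hypothetical constants; in the port these would live under build/.
const (
	gopyVersion   = "0.0.0"
	pythonVersion = "3.14.0+"
)

// banner formats the version line in the shape the v0.0 gate expects:
// gopy <version> (<python version>) [<go version> <os>/<arch>]
func banner() string {
	return fmt.Sprintf("gopy %s (%s) [%s %s/%s]",
		gopyVersion, pythonVersion,
		runtime.Version(), runtime.GOOS, runtime.GOARCH)
}

func main() {
	fmt.Println(banner())
}
```

The bracketed part depends on the toolchain and host, which is why the gate says "or similar".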
v0.1. Memory, arena, primitive sync
Goal: The compiler-side allocator and basic synchronization primitives that everything else needs.
In scope (Python/ files):
- pyarena.c to arena/arena.go.
- lock.c, parking_lot.c, critical_section.c to pysync/.
- thread.c to pythread/thread.go. Just thread create/join wrappers on top of Go's runtime.
- bootstrap_hash.c to hash/secret.go. Just the seed init; hash funcs come in v0.4.
Out of scope: GC, brc, qsbr (these need PyObject).
Gate: arena alloc/free unit tests pass, mutex stress test passes.
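To illustrate what an arena alloc/free unit test would exercise, here is a sketch of a block-based bump allocator whose allocations are all released together. The Arena type and its methods are assumptions made for illustration, not the API of arena/arena.go.

```go
package main

import "fmt"

// Arena is a hypothetical bump allocator: allocations are carved out of
// fixed-size blocks and released all at once by Reset.
type Arena struct {
	blockSize int
	blocks    [][]byte
	used      int // bytes consumed in the newest block
}

func NewArena(blockSize int) *Arena { return &Arena{blockSize: blockSize} }

// Alloc returns n zeroed bytes that stay valid until Reset is called.
func (a *Arena) Alloc(n int) []byte {
	if len(a.blocks) == 0 || a.used+n > len(a.blocks[len(a.blocks)-1]) {
		size := a.blockSize
		if n > size {
			size = n // oversized requests get a dedicated block
		}
		a.blocks = append(a.blocks, make([]byte, size))
		a.used = 0
	}
	cur := a.blocks[len(a.blocks)-1]
	// The three-index slice caps capacity so appends cannot spill into
	// a neighbouring allocation.
	buf := cur[a.used : a.used+n : a.used+n]
	a.used += n
	return buf
}

// Reset drops every block, freeing all allocations together.
func (a *Arena) Reset() { a.blocks, a.used = nil, 0 }

func main() {
	a := NewArena(64)
	x := a.Alloc(40)
	y := a.Alloc(40) // does not fit in the first block, so a second is added
	fmt.Println(len(x), len(y), len(a.blocks)) // 40 40 2
	a.Reset()
	fmt.Println(len(a.blocks)) // 0
}
```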
v0.2. Object model foundation (handover from cpython/Objects/)
Note: The 1600-series spec covers only cpython/Python/. This v0.2
phase is the integration boundary with the Objects/ port (a separate
spec series). We list the dependency for completeness.
Required from object spec:
- Object interface, Type, Header, VarHeader.
- Concrete types: int, float, bool, None, bytes, str, list, tuple, dict, set, frozenset, slice, range.
- Tuple/list/dict basic ops, hashing of int/str/tuple/frozenset.
- Type.Slot(...) dispatch and the tp_* protocol.
Gate: Construct a dict, hash a tuple, iterate a list, all from a
test-only gopy/objtest/ package, since we have no parser yet.
v0.3. Errors, traceback, refcount-only GC
Goal: We can raise and catch exceptions in Go-only fixtures.
In scope:
- errors.c to errors/.
- traceback.c to traceback/.
- suggestions.c to errors/suggest.go.
- gc.c (refcount path only; no cycle collector yet) to gc/.
- brc.c partial: just the field layout, ops are no-ops in the GIL build.
- pystate.c skeleton (Runtime, Interpreter, Thread structs; no init flow).
Out of scope: Cycle collector, free-threading, qsbr, finalize-on-resurrect.
Gate: errors.SetString(state.PyExc_ValueError, "boom") then
errors.Occurred(ts) != nil, and traceback.Format(ts.Exception())
produces a non-empty string.
v0.4. Strings, hashing, ctype, int/float parsing
Goal: Number/string conversion and hashing, the bedrock of dict and the parser.
In scope:
- pyhash.c to hash/{fnv,siphash,hash}.go.
- pyctype.c to pystrconv/ctype.go.
- pystrcmp.c to pystrconv/cmp.go.
- mystrtoul.c to pystrconv/strtoul.go.
- pystrtod.c plus dtoa.c to pystrconv/{strtod,dtoa}.go.
  - Decision point: cgo-wrap David Gay's dtoa, or pure-Go reimplement? See 1660 §"dtoa decision". Recommend pure-Go using the reference algorithm to keep the project cgo-free.
- pystrhex.c to pystrconv/hex.go.
- mysnprintf.c: drop, use fmt.
- pymath.c, pyfpe.c to pymath/{pymath,fpe}.go.
- formatter_unicode.c to format/format.go.
Gate: hash.Buffer([]byte("hello")) matches CPython's
hash(b"hello") under PYTHONHASHSEED=0. pystrconv.ParseFloat("3.14")
returns the same uint64 bit pattern as CPython.
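The bit-pattern half of the gate can be sketched with the Go standard library standing in for pystrconv.ParseFloat. This is an assumption for illustration only; the helper name floatBits is hypothetical. On the CPython side the same value can be read with struct.pack(">d", float("3.14")).hex().

```go
package main

import (
	"fmt"
	"math"
	"strconv"
)

// floatBits parses s as a float64 and returns its IEEE 754 bit pattern.
// strconv.ParseFloat is used here as a stand-in for the ported parser.
func floatBits(s string) (uint64, error) {
	f, err := strconv.ParseFloat(s, 64)
	if err != nil {
		return 0, err
	}
	return math.Float64bits(f), nil
}

func main() {
	bits, err := floatBits("3.14")
	if err != nil {
		panic(err)
	}
	fmt.Printf("0x%016x\n", bits) // 0x40091eb851eb851f
}
```

Comparing bit patterns rather than printed decimals avoids masking a one-ulp rounding difference behind repr formatting.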
v0.5. Compiler pipeline (parser-side handover)
Note: The Python parser (PEG) port lives in the 1640-1645 sub-block and lands in v0.5.5 (next phase). v0.5 itself exercises the ast-to-Code path on hand-built modules; once v0.5.5 lands the disassembly goldens get re-pinned against parsed source.
In scope (the rest of the compiler is in this spec):
- asdl.c plus Python-ast.c to ast/{asdl,nodes_gen}.go.
- ast.c to ast/validate.go.
- ast_preprocess.c to ast/preprocess.go.
- ast_unparse.c to ast/unparse.go.
- future.c to future/future.go.
- symtable.c to symtable/.
- instruction_sequence.c to compile/instrseq.go.
- codegen.c to compile/codegen.go.
- flowgraph.c to compile/flowgraph.go.
- assemble.c to compile/assemble.go.
- compile.c to compile/compiler.go.
Gate (structural): compile.Compile(module("a = 1 + 2")) produces
a *Code whose disassembly contains LOAD_CONST and STORE_NAME and
whose const pool holds the folded int(3) after the int-int BINARY_OP
pass runs.
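As a rough illustration of what the structural gate checks, the sketch below folds LOAD_CONST, LOAD_CONST, BINARY_OP(+) into a single LOAD_CONST over a flat instruction list. The Instr type and opcode names are simplified stand-ins, not the compile package's types, and the sketch ignores jump targets, which a real pass running on the CFG has to respect.

```go
package main

import "fmt"

type Op int

const (
	LoadConst Op = iota
	BinaryAdd
	StoreName
)

// Instr is a hypothetical, simplified instruction.
type Instr struct {
	Op  Op
	Arg int // const-pool index for LoadConst, name index for StoreName
}

// foldIntAdd rewrites LoadConst a; LoadConst b; BinaryAdd into one
// LoadConst whose operand is a freshly appended const-pool entry a+b.
func foldIntAdd(code []Instr, consts []int) ([]Instr, []int) {
	out := make([]Instr, 0, len(code))
	for _, in := range code {
		n := len(out)
		if in.Op == BinaryAdd && n >= 2 &&
			out[n-1].Op == LoadConst && out[n-2].Op == LoadConst {
			sum := consts[out[n-2].Arg] + consts[out[n-1].Arg]
			consts = append(consts, sum)
			out = append(out[:n-2], Instr{LoadConst, len(consts) - 1})
			continue
		}
		out = append(out, in)
	}
	return out, consts
}

func main() {
	// a = 1 + 2
	code := []Instr{{LoadConst, 0}, {LoadConst, 1}, {BinaryAdd, 0}, {StoreName, 0}}
	code, consts := foldIntAdd(code, []int{1, 2})
	fmt.Println(len(code), consts[code[0].Arg]) // 2 3
}
```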
Gate (disassembly parity): the v05test package pins compile
output via two layers. Structural assertions live in
gate_test.go (TestGateEmptyModule, TestGateSimpleAssign,
TestGateBinaryAdd, TestGateLoadAfterStore, TestGateIfWhile,
TestGateDef, TestGateAsyncFunction). Disassembly-text goldens
live in golden_test.go against testdata/golden/*.golden for the
ten-fixture panel in spec 1629 (empty_module, simple_assign,
binary_add, load_after_store, if_pass, while_pass,
def_add_one, async_def_pass, class_pass, type_alias).
Two structural cases (TestGateTryExcept, TestGateComprehension)
are wired but t.Skip'd pending the CFG-based stack-depth analyser
(handler entry seeding / comprehension back-edge); they flip green
once that lands.
Gate (full byte-equal marshal parity): deferred to v0.8 alongside the import system. v0.5 has the marshal package skeleton plus a roundtrip test, but the code-object marshal arm (TYPE_LONG, ref-dedup, line/exception-table byte parity) lands with import.
Optimisation panel status: int-int BINARY_OP constant folding,
jump threading, conditional-jump propagation, unreachable-block
elimination, post-terminator dead-code elimination, and
redundant-NOP compaction landed for v0.5 alongside the CFG-driven
pass driver. Swaptimize, super-instructions, LOAD_FAST ref-stack,
cold-block hoist, CFG-based stackdepth, and full pseudo-op lowering
are deferred and tracked separately.
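Of the passes listed above, unreachable-block elimination is the simplest to picture: walk the CFG from the entry block and drop whatever was never visited. The Block type below is a hypothetical simplification, not the flowgraph package's representation.

```go
package main

import "fmt"

// Block is a hypothetical basic block: a label plus successor indices.
type Block struct {
	Label string
	Succs []int
}

// pruneUnreachable returns the labels of blocks reachable from entry,
// in their original order.
func pruneUnreachable(blocks []Block, entry int) []string {
	seen := make([]bool, len(blocks))
	work := []int{entry}
	for len(work) > 0 {
		b := work[len(work)-1]
		work = work[:len(work)-1]
		if seen[b] {
			continue
		}
		seen[b] = true
		work = append(work, blocks[b].Succs...)
	}
	var kept []string
	for i, b := range blocks {
		if seen[i] {
			kept = append(kept, b.Label)
		}
	}
	return kept
}

func main() {
	blocks := []Block{
		{"entry", []int{1}},
		{"loop", []int{1, 3}},
		{"dead", []int{3}}, // nothing jumps here
		{"exit", nil},
	}
	fmt.Println(pruneUnreachable(blocks, 0)) // [entry loop exit]
}
```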
Other v0.5 landings: full ast.Validate panel
(forbidden-name, comprehension shape, expr_context, Starred placement,
match-pattern shape, PEP 695 type-param constraints), TypeAlias
codegen via INTRINSIC_TYPEALIAS, PEP 626 line-table writer, PEP 657
exception-table writer, co_qualname walk, type-keyed const dedup.
v0.5.5. Parser handover
Goal: Real source text reaches compile.Compile. The disassembly
goldens shipped in v0.5 against hand-built AST modules get re-pinned
against parsed source.
In scope (Parser/ files; spec block 1640-1645):
- Parser/lexer/{lexer,state,buffer}.c to parser/lexer/. Spec 1641.
- Parser/tokenizer/{utf8,string,file,readline}_tokenizer.c plus helpers.c to parser/lexer/driver_*.go. Spec 1641.
- Parser/pegen.c, Parser/pegen_errors.c, Parser/peg_api.c, Parser/action_helpers.c, Parser/token.c to parser/pegen/ and parser/errors/. Specs 1642 and 1643.
- Parser/parser.c regenerated from Grammar/python.gram via a Go-target fork of Tools/peg_generator/. Lives at tools/parser_gen/. Output checked in to parser/pegen/parser_gen.go. Spec 1642.
- Parser/string_parser.c to parser/string/. Spec 1644.
Out of scope: Parser/myreadline.c (interactive readline; lands
in v0.9 alongside the REPL). Soft keyword work beyond what 3.14
already needs.
Gate: partest/gate_test.go parses each v0.5 golden fixture
from source and round-trips through compile.Compile, producing
disassembly text that matches the v0.5 golden file byte-for-byte.
The partest/errors_panel_test.go corpus pins SyntaxError text
byte-for-byte to CPython.
v0.6. Bytecode interpreter (Tier-1 only)
Goal: Execute the compiled bytecode. No specialization, no Tier-2.
In scope:
- bytecodes.c (DSL) plus Go-emitting code generator to vm/opcodes_gen.go.
- ceval.c to vm/eval.go.
- ceval_macros.h to vm/dispatch.go.
- ceval_gil.c to vm/gil.go.
- frame.c to vm/frame.go.
- stackrefs.c to vm/stackref.go.
Out of scope: specialize.c, optimizer.c, jit.c,
instrumentation.c. Stub out the entry hooks (e.g.
_Py_call_instrumentation becomes a no-op).
Gate: gopy -c "print(1+2)" prints 3. The dis builtin shows
unspecialized bytecode. All exception types raise correctly through the
frame chain.
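The shape of an unspecialized Tier-1 loop can be sketched as a switch over opcodes acting on a value stack. Everything here (the opcode set, the PRINT pseudo-op, integer-only values) is a toy assumption and far smaller than what vm/eval.go has to cover.

```go
package main

import "fmt"

type Opcode byte

const (
	LOAD_CONST Opcode = iota
	BINARY_ADD
	PRINT // toy stand-in for calling the print builtin
	RETURN
)

type Instr struct {
	Op  Opcode
	Arg int
}

// run executes code against an integer value stack and returns every
// value PRINT emitted, in order.
func run(code []Instr, consts []int) []int {
	var stack, printed []int
	for pc := 0; pc < len(code); pc++ {
		in := code[pc]
		switch in.Op {
		case LOAD_CONST:
			stack = append(stack, consts[in.Arg])
		case BINARY_ADD:
			n := len(stack)
			sum := stack[n-2] + stack[n-1]
			stack = append(stack[:n-2], sum)
		case PRINT:
			n := len(stack)
			printed = append(printed, stack[n-1])
			stack = stack[:n-1]
		case RETURN:
			return printed
		}
	}
	return printed
}

func main() {
	// print(1+2)
	code := []Instr{{LOAD_CONST, 0}, {LOAD_CONST, 1}, {BINARY_ADD, 0}, {PRINT, 0}, {RETURN, 0}}
	for _, v := range run(code, []int{1, 2}) {
		fmt.Println(v) // 3
	}
}
```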
v0.7. Init/Finalize and minimum sys/builtins
Goal: Real Py_Initialize / Py_Finalize lifecycle. gopy -c works
without manual setup boilerplate.
In scope:
- pystate.c complete to state/.
- pylifecycle.c to lifecycle/.
- preconfig.c, initconfig.c, interpconfig.c to initconfig/.
- pathconfig.c to pathconfig/.
- pythonrun.c to pythonrun/.
- bltinmodule.c minimal subset (print, len, range, iter, abs, type, isinstance, repr, str, int, float, list, tuple, dict, set, getattr, setattr, hasattr, callable, id, hash, sorted, reversed, enumerate, zip, map, filter, sum, min, max, any, all, divmod, pow, chr, ord, bin, oct, hex, ascii, format, vars, dir) to builtin/.
- sysmodule.c minimal subset (path, modules, argv, version, version_info, flags, implementation, stdin/stdout/stderr placeholders, exit, getrefcount, setrecursionlimit, getrecursionlimit) to sysmod/.
- _warnings.c to warnings/.
- getargs.c to getargs/.
- modsupport.c to modsupport/.
- structmember.c to structmember/.
Gate: gopy -c "import sys; print(sys.version_info)" works
end-to-end through full Initialize, Run, Finalize.
v0.8. Import, marshal, codecs
Goal: import foo from a .py file, with __pycache__ round-trip.
In scope:
- marshal.c to marshal/.
- import.c to imp/import.go.
- frozen.c (table only; frozen importlib bootstrap is a separate task) to imp/frozen.go.
- codecs.c to codecs/.
- The frozen importlib._bootstrap blob, regenerated with our freeze_modules Go tool that mirrors CPython's. Produces a .go file equivalent to the .h file CPython's tool generates.
Out of scope: importdl.c (native .so loading). gopy does not load C
extensions.
Gate: import json; json.dumps({"a": 1}) works. The .pyc generated
by gopy can be loaded by CPython 3.14.
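For CPython to load a gopy-written .pyc, the file has to start with the 16-byte header described in PEP 552. The sketch below covers only the timestamp-based variant (flags equal to zero). The type and function names are hypothetical, and the magic bytes are a placeholder: the real value is version-specific and would have to be taken from CPython 3.14 itself.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// PycHeader mirrors the timestamp-based .pyc header from PEP 552.
type PycHeader struct {
	Magic      [4]byte // version-specific; the last two bytes are "\r\n"
	Flags      uint32  // 0 selects mtime/size validation
	MTime      uint32  // source modification time, in seconds
	SourceSize uint32  // source size in bytes
}

// Encode lays the header out as four little-endian 32-bit fields.
func (h PycHeader) Encode() []byte {
	buf := make([]byte, 16)
	copy(buf[0:4], h.Magic[:])
	binary.LittleEndian.PutUint32(buf[4:8], h.Flags)
	binary.LittleEndian.PutUint32(buf[8:12], h.MTime)
	binary.LittleEndian.PutUint32(buf[12:16], h.SourceSize)
	return buf
}

// DecodePycHeader reads the first 16 bytes of a .pyc file.
func DecodePycHeader(b []byte) (PycHeader, error) {
	if len(b) < 16 {
		return PycHeader{}, errors.New("pyc header: need 16 bytes")
	}
	var h PycHeader
	copy(h.Magic[:], b[0:4])
	h.Flags = binary.LittleEndian.Uint32(b[4:8])
	h.MTime = binary.LittleEndian.Uint32(b[8:12])
	h.SourceSize = binary.LittleEndian.Uint32(b[12:16])
	return h, nil
}

func main() {
	// Placeholder magic, NOT the CPython 3.14 value.
	in := PycHeader{Magic: [4]byte{0, 0, '\r', '\n'}, MTime: 1700000000, SourceSize: 42}
	out, err := DecodePycHeader(in.Encode())
	fmt.Println(out == in, err) // true <nil>
}
```

The marshalled code object follows the header directly.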
v0.9. Contextvars, hamt, time, tokenize (shipped 2026-05-06)
Goal: Stdlib time, contextvars, tokenize modules work.
Shipped:
- hamt.c to hamt/.
- context.c plus _contextvars.c to contextvar/.
- pytime.c to pytime/ (Time_, Monotonic, PerfCounter, rounding modes, per-platform info files).
- Python-tokenize.c to tokenize/ (real lexer state machine).
- getopt.c to getopt/ (cmd/gopy parses through this now).
- hashtable.c to hashtable/.
- vm tail: generators on goroutines, MATCH_*, WITH_EXCEPT_START, set builders, IMPORT_STAR, async-stub opcodes.
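"Generators on goroutines" can be pictured as a body running in its own goroutine, handing values to the consumer over a channel and parking until it is resumed. The Generator type below is a hypothetical sketch of that handshake, not the vm package's implementation; it does not handle send, throw, or close, and an abandoned generator leaks its goroutine.

```go
package main

import "fmt"

// Generator runs body in a goroutine. Each yield hands one value to the
// consumer and then blocks until Next is called again.
type Generator struct {
	resume chan struct{}
	values chan int
	done   bool
}

func NewGenerator(body func(yield func(int))) *Generator {
	g := &Generator{resume: make(chan struct{}), values: make(chan int)}
	go func() {
		<-g.resume // do not start until the first Next
		body(func(v int) {
			g.values <- v
			<-g.resume
		})
		close(g.values) // body returned: the generator is exhausted
	}()
	return g
}

// Next resumes the body and returns the next yielded value.
// ok is false once the body has returned.
func (g *Generator) Next() (v int, ok bool) {
	if g.done {
		return 0, false
	}
	g.resume <- struct{}{}
	v, ok = <-g.values
	if !ok {
		g.done = true
	}
	return v, ok
}

func main() {
	g := NewGenerator(func(yield func(int)) {
		for i := 1; i <= 3; i++ {
			yield(i)
		}
	})
	for v, ok := g.Next(); ok; v, ok = g.Next() {
		fmt.Println(v) // 1, then 2, then 3
	}
}
```

Because both channels are unbuffered, only one side runs at a time, which preserves the strict alternation a Python generator has with its caller.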
Deferred to v0.10+: frozen importlib code-object embedding,
sub-interpreter contention path, __match_args__ MRO walk, full async
iterator surface.
Gate: python -m asyncio smoke test runs (asyncio uses contextvars).
v0.10. Cycle GC, weakrefs, finalizers, parser drop
Goal: Reference cycles are reclaimed. gc.collect() works. Parser
parses the full Lib/test/test_grammar.py and matches CPython 3.14
ast.dump byte-for-byte across the seeded fixture set.
In scope (cycle GC, the original v0.10 theme):
- gc.c complete to gc/collector.go (generations, gc_collect_main, reachability walk, weakref clearing pass).
- gc_gil.c to gc/gil.go (collector-vs-mutator interlock).
- object_stack.c to gc/objstack.go.
- weakrefobject.c to objects/weakref.go (PyWeakref, callback queue, _PyWeakref_ClearWeakRefsExceptCallbacks).
- Finalizer (tp_finalize) queue and resurrection check.
- gc built-in module surface.
Also shipped under v0.10.x: parser end-to-end (v0.10.2 tag).
Lib/test/test_grammar.py parses cleanly, the Lib corpus reaches
ok=720 / sentinel=0 / fail=0, and parser/parity_test.go pins
ast.dump byte-equality against python3 3.14 across 50+ fixtures.
Spec: 1613_gopy_gc.md (cycle GC). Parser polish lives in
changelog/v0.10.*.md.
Gate: test_gc from CPython passes; cycle of two objects with
mutual references is reclaimed; finalizer fires once and only once.
Parser parity gate green on Linux / macOS / Windows.
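The "cycle of two objects" case in the gate comes down to the reachability walk: subtract references that originate inside the tracked set from each refcount, treat anything still positive as externally held, and mark everything reachable from those roots as live. The sketch below shows that idea on a toy Obj type; it is an assumption-laden simplification that leaves out generations, finalizers, and weakrefs.

```go
package main

import "fmt"

// Obj is a hypothetical tracked object.
type Obj struct {
	Name     string
	RefCount int    // total references, internal and external
	Refs     []*Obj // objects this one refers to
}

// unreachable returns the tracked objects kept alive only by cycles.
func unreachable(tracked []*Obj) []*Obj {
	// Step 1: copy refcounts, then subtract references from tracked objects.
	gcRefs := make(map[*Obj]int, len(tracked))
	for _, o := range tracked {
		gcRefs[o] = o.RefCount
	}
	for _, o := range tracked {
		for _, r := range o.Refs {
			if _, ok := gcRefs[r]; ok {
				gcRefs[r]--
			}
		}
	}
	// Step 2: anything with a positive count is referenced from outside;
	// everything reachable from it is live.
	live := make(map[*Obj]bool)
	var work []*Obj
	for _, o := range tracked {
		if gcRefs[o] > 0 {
			work = append(work, o)
		}
	}
	for len(work) > 0 {
		o := work[len(work)-1]
		work = work[:len(work)-1]
		if live[o] {
			continue
		}
		live[o] = true
		for _, r := range o.Refs {
			if _, ok := gcRefs[r]; ok && !live[r] {
				work = append(work, r)
			}
		}
	}
	var garbage []*Obj
	for _, o := range tracked {
		if !live[o] {
			garbage = append(garbage, o)
		}
	}
	return garbage
}

func main() {
	a := &Obj{Name: "a", RefCount: 1}
	b := &Obj{Name: "b", RefCount: 1}
	a.Refs, b.Refs = []*Obj{b}, []*Obj{a} // isolated cycle
	c := &Obj{Name: "c", RefCount: 2}     // one external ref plus one from d
	d := &Obj{Name: "d", RefCount: 1}
	c.Refs, d.Refs = []*Obj{d}, []*Obj{c}
	for _, o := range unreachable([]*Obj{a, b, c, d}) {
		fmt.Println(o.Name) // a, then b
	}
}
```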
v0.11. Specialize, monitor (shipped 2026-05-07)
Goal: Adaptive specialization (PEP 659) and monitoring (PEP 669).
The dispatch loop rewrites adaptive opcodes to specialized variants
on warmup, the sys.monitoring runtime fires events with PEP 669
tool-id semantics, and sys.settrace / sys.setprofile work on top
of that runtime as a thin adapter layer.
Shipped:
- specialize.c to specialize/. Spec 1694. Backoff counter, inline cache layouts, _PyCode_Quicken, _PyOpcode_Caches and _PyOpcode_Deopt tables, per-family entry points for LOAD_ATTR, STORE_ATTR, LOAD_GLOBAL, LOAD_SUPER_ATTR, BINARY_OP, COMPARE_OP, CONTAINS_OP, TO_BOOL, STORE_SUBSCR, UNPACK_SEQUENCE, FOR_ITER, SEND, CALL, CALL_KW.
- instrumentation.c to monitor/. Spec 1695. Per-interp state, 19 fire-event entry points, the _Py_Instrument shadow walk, per-code tool-slot lifecycle, line instrumentation driven by the PEP 626 line table, the shared callback runner that honours Disable/Missing. The sys.monitoring builtin module surfaces use_tool_id, register_callback, set_events, set_local_events, restart_events, plus the constants.
- legacy_tracing.c to vm/legacy_tracing.go. Spec 1696. Bridges PEP 669 events back to the Py_tracefunc shape sys.settrace / sys.setprofile expect, registering as PEP 669 tools 6 and 7. vm/sys_trace_builtins.go exposes the four Python-visible builtins on top.
- DSL generator extension: opcode table regenerated to pick up the specialized variants (LOAD_ATTR_INSTANCE_VALUE, BINARY_OP_ADD_INT, ...) and the INSTRUMENTED_* mirror set.
Gate (achieved): vmtest/v011_gate_test.go drives the
specializer (TO_BOOL rewrite + deopt round-trip), the PEP 669
fan-out (callback fires with the per-event arg trio), the
sys.settrace bridge (PyTrace_RETURN with the right value), and
the sys.monitoring builtin surface (use_tool_id plus
register_callback reflected on InterpState) end to end through
the public entry points.
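The "TO_BOOL rewrite + deopt round-trip" can be illustrated with a single instruction site that specializes after a short warm-up and falls back when its guard fails. The Site type, the fixed warm-up threshold, and the truthy helper are illustrative assumptions; the port's specializer uses a backoff counter and inline caches rather than this fixed count.

```go
package main

import "fmt"

type Opcode int

const (
	TO_BOOL      Opcode = iota // generic, adaptive form
	TO_BOOL_BOOL               // specialized: operand is known to be a bool
)

const warmup = 2 // hypothetical threshold

// Site is one TO_BOOL instruction plus its adaptive counter.
type Site struct {
	Op      Opcode
	Counter int
}

func NewSite() *Site { return &Site{Op: TO_BOOL, Counter: warmup} }

// Exec evaluates the truthiness of v, specializing or deoptimizing the
// site as a side effect.
func (s *Site) Exec(v any) bool {
	if s.Op == TO_BOOL_BOOL {
		if b, ok := v.(bool); ok {
			return b // guard hit: fast path
		}
		s.Op, s.Counter = TO_BOOL, warmup // guard miss: deopt
	}
	if _, ok := v.(bool); ok {
		s.Counter--
		if s.Counter <= 0 {
			s.Op = TO_BOOL_BOOL // warm enough: rewrite in place
		}
	}
	return truthy(v)
}

// truthy is a toy version of Python truth testing.
func truthy(v any) bool {
	switch x := v.(type) {
	case nil:
		return false
	case bool:
		return x
	case int:
		return x != 0
	case string:
		return x != ""
	}
	return true
}

func main() {
	s := NewSite()
	s.Exec(true)
	s.Exec(true)
	fmt.Println(s.Op == TO_BOOL_BOOL) // true: specialized after warm-up
	s.Exec(0)
	fmt.Println(s.Op == TO_BOOL) // true: deoptimized on a non-bool operand
}
```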
Deferred to v0.12: A dis.dis byte-equal panel against CPython
3.14 over a fixture set is wired but not yet pinned; once the
Tier-2 optimizer in v0.12 starts running these will be the first
rows of the dis parity gate.
v0.12. Tier-2 optimizer (interpreter-only, no JIT)
Goal: Trace projection, abstract interpretation, and Tier-2
micro-op interpreter. v0.11 left the specializer-rewritten bytecode
in the Tier-1 dispatch loop; v0.12 adds the second tier underneath
it. When a hot loop's JUMP_BACKWARD triggers the side-table
(co_executors), the runtime projects a linear trace of micro-ops
out of the specialized bytecode, runs the trace through the
abstract interpreter to fold constants and eliminate redundant
guards, and then dispatches the optimized trace through the uop
interpreter the next time control reaches that bytecode offset.
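A sketch of the hand-off described above, under heavy simplification: a per-offset counter warms up on each backward jump, a trace is projected once the counter crosses a threshold, and later arrivals at that offset dispatch to the installed executor. The Code and Executor types, the threshold, and the project callback are hypothetical; the abstract-interpretation pass is omitted.

```go
package main

import "fmt"

// Executor holds a projected trace of micro-op names.
type Executor struct {
	Trace []string
}

// Code carries the per-offset warm-up counters and the executor side table.
type Code struct {
	Counters  map[int]int
	Executors map[int]*Executor
}

const hotThreshold = 3 // hypothetical

func NewCode() *Code {
	return &Code{Counters: map[int]int{}, Executors: map[int]*Executor{}}
}

// JumpBackward is called when control reaches a backward jump at offset.
// It reports which tier handles this iteration: 2 if an executor is
// already installed, otherwise 1 (possibly installing one for next time).
func (c *Code) JumpBackward(offset int, project func(offset int) []string) int {
	if c.Executors[offset] != nil {
		return 2
	}
	c.Counters[offset]++
	if c.Counters[offset] >= hotThreshold {
		c.Executors[offset] = &Executor{Trace: project(offset)}
	}
	return 1
}

func main() {
	code := NewCode()
	project := func(int) []string { return []string{"_GUARD", "_ADD", "_JUMP_TO_TOP"} }
	var tiers []int
	for i := 0; i < 5; i++ {
		tiers = append(tiers, code.JumpBackward(10, project))
	}
	fmt.Println(tiers) // [1 1 1 2 2]
}
```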
In scope:
- optimizer.c (1755 lines) to optimizer/. Trace projection, executor-object lifecycle, side-table on Code.Executors, _PyOptimizer_Optimize entry point, bloom filter for executor invalidation. Spec 1697.
- optimizer_bytecodes.c (1107 lines) plus optimizer_cases.c.h (generated) plus pycore_uop_ids.h, pycore_uop_metadata.h to optimizer/uops.go, optimizer/uops_cases_gen.go, generator at tools/uops_gen/. The uop ID table, per-uop metadata (operand count, stack delta, refcount effect), and the case bodies the uop interpreter dispatches on. Spec 1698.
- optimizer_analysis.c (656 lines) plus optimizer_symbols.c (880 lines) to optimizer/analysis.go and optimizer/symbols.go. The abstract interpretation pass that runs over a freshly projected trace and the JitOptSymbol lattice (top, bottom, type, type+value, const) it operates on. Spec 1699.
- vm/dispatch.go. JUMP_BACKWARD warm-up counter and ENTER_EXECUTOR arm; the dispatch loop hands off to a Tier-2 executor when one exists at the current offset.
- objects/code.go. co_executors side table and the _PyExecutorArray shape it stores.
Out of scope: jit.c. The JIT stub continues to return "no
executor"; gopy's Tier-2 stays interpreter-only.
Sub-specs:
- 1697_gopy_optimizer_overview.md. Tier-2 architecture, executor lifecycle, trace projection from optimizer.c, the bloom filter for invalidation.
- 1698_gopy_optimizer_uops.md. uop ID table, uop metadata, uop interpreter, the DSL generator that emits uops_cases_gen.go.
- 1699_gopy_optimizer_analysis.md. Abstract interp pass, the JitOptSymbol lattice, guard elimination.
Gate: dis.dis parity panel is pinned: CPython 3.14 and gopy
produce byte-equal disassembly across a seeded fixture set covering
specialized + Tier-2 variants. A long-running tight loop runs
through ENTER_EXECUTOR and the uop interpreter measurably faster
than v0.11 on the same machine. We do not need to match CPython's
absolute Tier-2 speed; we need the same shape.
v0.13. Sub-interpreters and cross-interpreter data
In scope:
- crossinterp.c to crossinterp/.
- interpconfig.c polishing (per-interp PyConfig).
Gate: Basic usage of the PEP 734 interpreters module (in the stdlib) works.
v0.14. Free-threaded build
Goal: An optional -tags pygil_disabled build path. Not the default.
In scope:
- gc_free_threading.c to gc/freethreading.go.
- brc.c complete (real biased refcount) to gc/brc.go.
- qsbr.c to gc/qsbr.go.
- uniqueid.c, index_pool.c to gc/{uniqueid,indexpool}.go.
- Free-threaded variants of object slots (mostly in Objects/ spec).
Gate: Same test suite as v0.13 passes under -tags pygil_disabled.
v0.15. Tracemalloc, audit, faulthandler, perf
In scope:
- tracemalloc.c to tracemalloc/.
- audit hooks (pycore_audit.h) to audit/.
- faulthandler module.
- remote_debugging.c (best-effort) to remotedebug/.
Gate: tracemalloc.start() then tracemalloc.get_traced_memory()
returns sane values.
v1.0. CPython test-suite parity
Goal: A defined subset (about 80%) of cpython/Lib/test/ passes.
This phase is iterative bug-fixing. No new files; just polish plus test investment.
Gate: python -m test --pgo analogue. A curated subset of CPython
tests all pass on gopy.
Out-of-roadmap (post-v1.0, not in this spec series)
- JIT (jit.c): Copy-and-patch JIT requires LLVM stencil generation. Defer indefinitely. gopy will use Go's compiler-level optimizations plus the Tier-2 trace interpreter for now.
- C extension loading (importdl.c, dynload_*.c): Out of scope. The Go port reimplements common C extensions in Go on demand.
- Emscripten / wasm support: drop unless explicitly funded.
- perf trampoline: drop.
Summary table
| Tag | Theme | Critical files |
|---|---|---|
| v0.0 | Scaffold | get*.c |
| v0.1 | Arena, sync | pyarena, lock, parking_lot, critical_section, thread |
| v0.2 | Object model (handover) | (cpython/Objects/) |
| v0.3 | Errors, refcount GC | errors, traceback, suggestions, gc (rc-only) |
| v0.4 | Strings, numbers, hash | pyhash, pyctype, pystr*, mystr*, dtoa, formatter |
| v0.5 | Compiler pipeline | ast, asdl, future, symtable, codegen, flowgraph, assemble, compile, instruction_sequence |
| v0.6 | VM Tier-1 | ceval, ceval_gil, ceval_macros, bytecodes, frame, stackrefs |
| v0.7 | Init, lifecycle, sys, builtins | pystate, pylifecycle, *config, pathconfig, pythonrun, bltinmodule, sysmodule, _warnings, getargs, modsupport, structmember |
| v0.8 | Import, marshal, codecs | import, marshal, frozen, codecs |
| v0.9 | Contextvars, time, tokenize | hamt, context, _contextvars, pytime, Python-tokenize, getopt, hashtable, intrinsics |
| v0.10 | Cycle GC | gc (cycle path), gc_gil, object_stack |
| v0.11 | Specialize, monitor | specialize, instrumentation, legacy_tracing |
| v0.12 | Tier-2 optimizer | optimizer*, executor_cases.c.h |
| v0.13 | Sub-interpreters | crossinterp, interpconfig |
| v0.14 | Free-threaded build | gc_free_threading, brc (full), qsbr, uniqueid, index_pool |
| v0.15 | Profiling, debugging | tracemalloc, audit, faulthandler, remote_debugging |
| v1.0 | CPython test parity | (no new files) |