1600. gopy overview
Goal
gopy is a fresh re-implementation of CPython's interpreter core in Go. The
target is 100% behavioural compatibility with the upstream CPython 3.14-era
sources at $HOME/github/python/cpython. That means same data structures,
same models, same code logic, same wire formats, same error messages. The
only change is naming and surface API style, which adopts Go-idiomatic
conventions modelled on the Go standard library.
This is not a clean-room reimagining. It is a line-by-line port. When behaviour deviates from CPython, the bug is in the port, not in CPython.
The CPython source-of-truth folder is cpython/Python/ (about 138k lines of
C across 91 .c files plus ~30 .h files). The Go target is tamnd/gopy,
currently at v0.9.0; v0.10 is in flight on feat/v0.10.0-gc.
Non-goals
- No new features. No improved API. No "better" GC.
- No Python 2 support.
- No alternative implementations (we are not PyPy / Cinder / GraalPython).
- No partial Python. The goal is to run unmodified CPython 3.14 stdlib.
- C extension compatibility (`PyObject*` ABI) is out of scope. We will not
  load `.so` modules. C extensions are reimplemented in Go on demand.
Sources of truth
| Concern | Source |
|---|---|
| Runtime semantics | cpython/Python/*.c, cpython/Include/internal/*.h |
| Object semantics | cpython/Objects/*.c |
| Parser / lexer | cpython/Parser/* |
| Stdlib | cpython/Lib/* |
| Tests | cpython/Lib/test/* |
| Spec authority for ambiguity | the C source, not the docs |
This 1600-series covers cpython/Python/, cpython/Objects/, and
cpython/Parser/. The Objects port lives in the 1670-1689 sub-block
(formerly numbered 1700-series, renumbered to keep one folder).
The Parser port lives in the 1640-1645 sub-block. Stdlib ports
are tracked in their own spec series.
High-level architecture
                 ┌────────────────────────────────────────────────┐
                 │                 gopy/cmd/gopy                  │
                 │          (entry point, like python.c)          │
                 └────────────────────┬───────────────────────────┘
                                      │
                 ┌────────────────────▼───────────────────────────┐
                 │                 gopy/lifecycle                 │
                 │       Initialize / Finalize / NewInterp        │
                 └─┬──────────────┬──────────────┬────────────────┘
                   │              │              │
         ┌─────────▼──┐    ┌──────▼──────┐  ┌────▼────────┐
         │ initconfig │    │ imp         │  │ pythonrun   │
         │ preconfig  │    │ importlib   │  │ REPL/eval   │
         │ pathconfig │    │ marshal     │  │ pyc rd/wr   │
         └────────────┘    └──────┬──────┘  └────┬────────┘
                                  │              │
                 ┌────────────────▼──────────────▼─────────────────┐
                 │                   gopy/state                    │
                 │  Runtime · Interpreter · Thread · CrossInterp   │
                 └─┬──────────────┬─────────────────┬──────────────┘
                   │              │                 │
        ┌──────────▼─┐    ┌───────▼──────┐   ┌──────▼─────────┐
        │ gopy/      │    │ gopy/        │   │ gopy/          │
        │ vm         │    │ compile      │   │ gc             │
        │ (ceval,    │    │ ast/sym/     │   │ cycle coll,    │
        │  frame,    │    │ codegen/     │   │ refcount,      │
        │  uops)     │    │ flowgraph/   │   │ arena, brc,    │
        │            │    │ assemble     │   │ qsbr, weakref  │
        └──┬─────────┘    └──┬───────────┘   └────────────────┘
           │                 │
    ┌──────▼────────┐  ┌─────▼──────────┐
    │ gopy/         │  │ gopy/          │
    │ specialize    │  │ tokenize       │
    │ optimizer     │  │ parser (sep.)  │
    │ jit (deferred)│  └────────────────┘
    │ monitor       │
    └───────────────┘
Cross-cutting: gopy/pysync, gopy/hash, gopy/pytime,
gopy/format, gopy/pystrconv, gopy/codecs,
gopy/errors, gopy/traceback, gopy/warnings,
gopy/contextvar, gopy/hamt, gopy/hashtable,
gopy/intrinsics, gopy/structmember, gopy/getargs,
gopy/modsupport, gopy/builtin, gopy/sysmod,
gopy/tracemalloc, gopy/audit, gopy/monitor
Spec files in this series
Implemented (spec written and code shipped)
Meta / infrastructure
| # | File | Focus | Shipped |
|---|---|---|---|
| 1600 | 1600_gopy_overview.md | This file | meta |
| 1601 | 1601_gopy_naming.md | Naming conventions: C to Go translation rules | meta |
| 1602 | 1602_gopy_filemap.md | C source to Go package mapping (every file) | meta |
| 1603 | 1603_gopy_roadmap.md | Phased milestone plan v0.1 to v1.0 | meta |
| 1630 | 1630_gopy_vm_overview.md | VM block overview (Tier-1 interpreter) | meta |
| 1640 | 1640_gopy_parser_overview.md | Parser block overview | meta |
v0.1: arena and sync
| # | File | Focus | Shipped |
|---|---|---|---|
| 1604 | 1604_gopy_arena.md | pyarena.c port | v0.1 |
| 1605 | 1605_gopy_pythread.md | thread.c cross-platform port | v0.1 |
| 1606 | 1606_gopy_pysync.md | lock.c, parking_lot.c, critical_section.c | v0.1 |
| 1607 | 1607_gopy_hashsecret.md | bootstrap_hash.c seed init | v0.1 |
v0.3: errors and traceback
| # | File | Focus | Shipped |
|---|---|---|---|
| 1611 | 1611_gopy_errors.md | errors.c plus the BaseException gating subset | v0.3 |
v0.4: strings, numbers, hash
| # | File | Focus | Shipped |
|---|---|---|---|
| 1660 | 1660_gopy_strings_numbers.md | pyctype, pystrcmp, mystrtoul, pystrtod, dtoa, pystrhex, pymath, pyfpe, formatter_unicode | v0.4 |
| 1661 | 1661_gopy_hash.md | pyhash.c (SipHash-1-3, FNV-1a) | v0.4 |
v0.5 / v0.5.5: compiler and parser
| # | File | Focus | Shipped |
|---|---|---|---|
| 1620 | 1620_gopy_compile_pipeline.md | ast, asdl, future, symtable, codegen, flowgraph, assemble, compile, instruction_sequence, ast_preprocess, ast_unparse | v0.5 |
| 1625 | 1625_gopy_compile_testing.md | Per-checkbox test plan for 1620 and 1665 | v0.5 |
| 1626 | 1626_gopy_codegen.md | codegen.c port detail | v0.5 |
| 1627 | 1627_gopy_flowgraph.md | flowgraph.c port detail (CFG, passes; stackdepth + super-instr deferred) | v0.5 |
| 1628 | 1628_gopy_assemble.md | assemble.c port detail | v0.5 |
| 1629 | 1629_gopy_compile_goldens.md | Disassembly golden corpus for v05test | v0.5 |
| 1641 | 1641_gopy_lexer_tokenizer.md | Parser/lexer/, Parser/tokenizer/ | v0.5.5 |
| 1642 | 1642_gopy_pegen.md | pegen.c, parser.c, generated PEG runtime | v0.5.5 |
| 1643 | 1643_gopy_parser_errors.md | pegen_errors.c, action_helpers.c, peg_api.c, token.c | v0.5.5 |
| 1644 | 1644_gopy_string_parser.md | string_parser.c (f-string, t-string, bytes) | v0.5.5 |
v0.6: VM Tier-1
| # | File | Focus | Shipped |
|---|---|---|---|
| 1621 | 1621_gopy_bytecodes_dsl.md | bytecodes.c DSL parser + Go-emitting generator | v0.6 |
| 1635 | 1635_gopy_intrinsics.md | intrinsics.c (CALL_INTRINSIC_1 / 2 dispatch) | v0.6 |
| 1636 | 1636_gopy_eval_loop.md | ceval.c, ceval_macros.h, opcode dispatch loop | v0.6 |
| 1637 | 1637_gopy_frame.md | frame.c, frame layout, locals, generator state | v0.6 |
| 1638 | 1638_gopy_stackref.md | stackrefs.c, tagged stack values | v0.6 |
| 1639 | 1639_gopy_eval_gil.md | ceval_gil.c, GIL, eval breaker, signal bridge | v0.6 |
v0.7: lifecycle, sys, builtins, warnings
| # | File | Focus | Shipped |
|---|---|---|---|
| 1622 | 1622_gopy_lifecycle.md | pylifecycle, preconfig, initconfig, pathconfig | v0.7 |
| 1624 | 1624_gopy_pythonrun.md | RunString / RunFile / REPL | v0.7 |
| 1651 | 1651_gopy_modules.md | builtins, sys, _warnings subsets | v0.7 |
v0.8: marshal, import, codecs; Module and set objects
| # | File | Focus | Shipped |
|---|---|---|---|
| 1681 | 1681_gopy_set.md | setobject.c (set, frozenset) | v0.8 |
| 1686 | 1686_gopy_exceptions.md | exceptions.c: ImportError / ModuleNotFoundError hierarchy | v0.8 |
| 1688 | 1688_gopy_module_misc.md | moduleobject.c (Module type, name / doc / file / loader / spec) | v0.8 |
| 1690 | 1690_gopy_marshal.md | marshal.c: TYPE_LONG, FLAG_REF, TYPE_CODE, TYPE_SET, TYPE_DICT, TYPE_COMPLEX, .pyc header (PEP 552) | v0.8 |
| 1691 | 1691_gopy_import.md | import.c, frozen.c: sys.modules cache, inittab, frozen table, ExecCodeModule, source/.pyc loaders, ImportModuleLevel, IMPORT_NAME/FROM | v0.8 |
| 1692 | 1692_gopy_codecs.md | codecs.c: registry, error handlers, built-in utf-8 / ascii / latin-1 codecs | v0.8 |
Written, partial scaffold (spec written, some code shipped, full panel pending)
| # | File | Focus | Phase |
|---|---|---|---|
| 1665 | 1665_gopy_tokenize.md | Python-tokenize.c public iterator surface | v0.5 / v0.9 |
| 1670 | 1670_gopy_objects_overview.md | Objects block overview (1670-1689) | meta |
| 1671 | 1671_gopy_object_protocol.md | Object interface, Header, VarHeader, refcount | v0.2 |
| 1672 | 1672_gopy_type.md | Type, slots, MRO, lookup | v0.2 |
| 1683 | 1683_gopy_abstract.md | abstract.c subset (`PyObject_*`, `PyNumber_*`) | v0.2+ |
Written, pending implementation
v0.9: contextvars, time, remaining VM bytecodes, runtime helpers (shipped)
Tag v0.9.0 published 2026-05-06. Tracker rows kept here for the
file-by-file map; full release notes live in changelog/v0.9.0.md.
| # | File | Focus | Status | Phase |
|---|---|---|---|---|
| 1634 | 1634_gopy_monitor.md | sys.monitoring + sys.settrace / setprofile | W | v0.9+ |
| 1645 | 1645_gopy_myreadline.md | myreadline.c, interactive readline editing | W | v0.9+ |
| 1662 | 1662_gopy_hamt.md | hamt.c, HAMT backing store for contextvars | S | v0.9 |
| 1663 | 1663_gopy_context.md | context.c, _contextvars.c, PEP 567 contextvars | S | v0.9 |
| 1664 | 1664_gopy_time.md | pytime.c, monotonic clock, conversions, deadline math | S | v0.9 |
| 1668 | 1668_gopy_runtime_helpers.md | getopt.c CLI option parser plus hashtable.c generic table | S | v0.9 |
| 1693 | 1693_gopy_vm_remaining.md | `IMPORT_*`, RETURN_GENERATOR / YIELD / SEND, `MATCH_*`, WITH_EXCEPT_START, BUILD_SET / SET_ADD | S | v0.9 |
v0.10: cycle GC, weakrefs, finalizers (in flight)
Branch feat/v0.10.0-gc. Spec status legend (also used in the v0.9 table
above): W = spec written, no code; C = code shipped, tests pending;
S = code + tests shipped.
| # | File | Focus | Status | Phase |
|---|---|---|---|---|
| 1613 | 1613_gopy_gc.md | gc.c full collector (generations, weakref clearing, finalizer queue) plus gc_gil.c, object_stack.c | W | v0.10 |
| 1666 | 1666_gopy_tracemalloc.md | allocation tracing | W | v0.10 |
| 1689 | 1689_gopy_obj_misc.md | weakrefobject.c rows pulled forward to feed cycle clearing | W | v0.10 |
v0.11+: specialization, optimizer, debug
| # | File | Focus | Phase |
|---|---|---|---|
| 1631 | 1631_gopy_specialize.md | PEP 659 adaptive specialization | v0.11 |
| 1632 | 1632_gopy_optimizer.md | Tier-2 trace projector + abstract interp | v0.12 |
| 1633 | 1633_gopy_jit.md | Copy-and-patch JIT (deferred) | post-v1.0 |
| 1667 | 1667_gopy_remote_debug.md | remote debugging hooks | v0.13 |
Objects block: pending (code lands incrementally v0.2-v0.9)
| # | File | Focus | Phase |
|---|---|---|---|
| 1673 | 1673_gopy_long.md | longobject.c (PyLong, small-int cache) | v0.2 |
| 1674 | 1674_gopy_float_complex.md | floatobject.c (v0.2), complexobject.c (v0.6) | v0.2 / v0.6 |
| 1675 | 1675_gopy_bool_none.md | boolobject.c, None, NotImplemented, Ellipsis | v0.2 |
| 1676 | 1676_gopy_bytes.md | bytesobject.c, bytearrayobject.c, bytes_methods.c | v0.4 |
| 1677 | 1677_gopy_unicode.md | unicodeobject.c, unicodectype.c | v0.4 |
| 1678 | 1678_gopy_tuple.md | tupleobject.c, empty-tuple singleton | v0.2 |
| 1679 | 1679_gopy_list.md | listobject.c, list_resize curve, Timsort | v0.2 |
| 1680 | 1680_gopy_dict.md | dictobject.c, odictobject.c | v0.2 |
| 1682 | 1682_gopy_slice_range.md | sliceobject.c, rangeobject.c | v0.2 |
| 1684 | 1684_gopy_call.md | call.c, vectorcall | v0.6 |
| 1685 | 1685_gopy_descr_method.md | descrobject.c, methodobject.c, classobject.c, funcobject.c | v0.4 / v0.6 |
| 1687 | 1687_gopy_code_frame_gen.md | codeobject.c, frameobject.c, genobject.c, cellobject.c | v0.5.5 / v0.6 |
| 1689 | 1689_gopy_obj_misc.md | weakref, memoryview, typevar, union, GenericAlias, Interpolation, Template, obmalloc | v0.9+ |
Reserved (spec not yet written)
| # | File (planned) | Focus | Phase |
|---|---|---|---|
| 1612 | 1612_gopy_traceback.md | traceback.c data and formatting | v0.3 (retro) |
| 1614 | 1614_gopy_brc.md | brc.c biased refcount field layout | v0.3+ |
| 1615 | 1615_gopy_state.md | pystate.c Runtime / Interpreter / Thread | v0.3+ |
| 1698 | 1698_gopy_quirks.md | Cross-cutting quirks the porter must preserve | meta |
| 1699 | 1699_gopy_glossary.md | Glossary: C term to Go term mapping | meta |
Compatibility floors (what "100% compatible" means in practice)
The port is graded on the following observable surfaces. Each must match CPython byte-for-byte, except where noted:
- Bytecode: same opcode numbers, same oparg encoding, same EXTENDED_ARG
  widening, same exception table format, same line-number table
  (`co_linetable`) format, same cache layout. `dis.dis(f)` output identical.
- Marshal: `marshal.dumps(obj)` produces identical bytes for the same object
  graph. `.pyc` files produced by gopy are loadable by CPython and vice
  versa, including version-magic-number compatibility.
- Hash: SipHash-1-3 with the same key-derivation from the seed, producing
  identical `hash(x)` for str/bytes/numeric. (PYTHONHASHSEED=0 gives
  deterministic match.)
- Eval semantics: every observable behaviour of `eval` + `exec` matches:
  exception types, exception messages (string-equal), traceback frame
  order, `__cause__` / `__context__` chains, generator state, async
  iteration order.
- Built-in module attributes: `sys.flags`, `sys.implementation.cache_tag`
  (gopy uses its own cache tag, see Quirks), `sys.version_info`, and
  `sys.path` semantics.
- Import: `importlib._bootstrap` runs to completion. `import foo` finds
  modules by the same rules. `__pycache__` layout is identical.
- Repr / format: `repr(obj)` and `format(obj, spec)` produce identical
  strings for builtins. Float repr uses shortest-roundtrip dtoa.
- Error messages: exception constructors produce identical `str(exc)` for
  identical inputs. (This is a high bar but a non-negotiable test target.)
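To make the `.pyc` part of the Marshal floor concrete, here is a minimal sketch of reading the 16-byte header that PEP 552 defines: a 2-byte little-endian magic number followed by `\r\n`, a 4-byte flags word, and then either mtime plus source size or an 8-byte source hash. The type and function names (`PycHeader`, `ParsePycHeader`) are hypothetical and are not taken from the gopy code base; the magic value in `main` is a placeholder, not the number of any real release.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// PycHeader mirrors the 16-byte .pyc header from PEP 552.
// Hypothetical type for illustration; not gopy's actual API.
type PycHeader struct {
	Magic      uint16  // version magic: first two bytes, little-endian
	Flags      uint32  // bit 0: hash-based pyc, bit 1: check_source
	MTime      uint32  // source mtime (timestamp-based pycs only)
	SourceSize uint32  // source size (timestamp-based pycs only)
	SourceHash [8]byte // source hash (hash-based pycs only)
}

// HashBased reports whether the header describes a hash-based pyc.
func (h PycHeader) HashBased() bool { return h.Flags&0x1 != 0 }

// ParsePycHeader decodes the first 16 bytes of a .pyc file.
func ParsePycHeader(b []byte) (PycHeader, error) {
	var h PycHeader
	if len(b) < 16 {
		return h, errors.New("pyc header: need at least 16 bytes")
	}
	// Bytes 2 and 3 of the magic are always "\r\n".
	if b[2] != '\r' || b[3] != '\n' {
		return h, errors.New("pyc header: magic does not end in CR LF")
	}
	h.Magic = binary.LittleEndian.Uint16(b[0:2])
	h.Flags = binary.LittleEndian.Uint32(b[4:8])
	if h.HashBased() {
		copy(h.SourceHash[:], b[8:16])
	} else {
		h.MTime = binary.LittleEndian.Uint32(b[8:12])
		h.SourceSize = binary.LittleEndian.Uint32(b[12:16])
	}
	return h, nil
}

func main() {
	// Placeholder bytes: magic 3000, flags 0 (timestamp-based), size 42.
	raw := []byte{0xb8, 0x0b, '\r', '\n', 0, 0, 0, 0, 0x10, 0x20, 0x30, 0x40, 0x2a, 0, 0, 0}
	h, err := ParsePycHeader(raw)
	if err != nil {
		panic(err)
	}
	fmt.Printf("magic=%d hashBased=%v mtime=%d size=%d\n",
		h.Magic, h.HashBased(), h.MTime, h.SourceSize)
}
```

A cross-runtime check along these lines would compare the header fields first and the marshalled code object after it, so that a header mismatch is reported separately from a bytecode mismatch.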
Items where we intentionally diverge (recorded in 1698_gopy_quirks.md):
- `sys.implementation.name` is `"gopy"`, not `"cpython"`.
- `sys.implementation.cache_tag` is `"gopy-3140"` so `.pyc` files do not
  collide.
- `gc.is_finalized` and friends behave per CPython, but the underlying
  mechanism uses Go's GC plus an emulated refcount/cycle layer (see 1613).
- C extension loading (`importlib.machinery.ExtensionFileLoader`) is
  disabled by default; only Go-native extension modules load.
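The cache-tag divergence is what keeps the two runtimes from overwriting each other's bytecode: CPython names cached files `__pycache__/<stem>.<cache_tag>.pyc`, so a different tag yields a different file name in the same directory. The sketch below shows that derivation for the common case only. `CacheFromSource` is a hypothetical helper, not gopy's API; it ignores optimization suffixes, `PYTHONPYCACHEPREFIX`, and sources without an extension, and it assumes slash-separated paths.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// CacheFromSource returns the __pycache__ location for a source file under
// the given cache tag. Illustrative only: common case, slash-separated paths.
func CacheFromSource(source, cacheTag string) string {
	dir, file := path.Split(source)
	stem := strings.TrimSuffix(file, path.Ext(file))
	return path.Join(dir, "__pycache__", stem+"."+cacheTag+".pyc")
}

func main() {
	// Same source, two runtimes, two distinct cache files.
	fmt.Println(CacheFromSource("pkg/foo.py", "cpython-314"))
	fmt.Println(CacheFromSource("pkg/foo.py", "gopy-3140"))
}
```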
Test strategy
- The CPython test suite (`Lib/test/`) is the reference oracle.
- Phase 0 ships a "smoke" subset: `test_grammar`, `test_builtin`,
  `test_dis`, `test_marshal`, `test_compile`, `test_dict`, `test_list`,
  `test_int`, `test_str`, `test_exceptions`. Once these pass, broaden.
- Bytecode-level tests: dis(f) round-trip equivalence between `gopy` and
  reference CPython, executed in CI.
- Hash-stability tests with `PYTHONHASHSEED=0`.
- A `compat/` subdirectory at the gopy root holds CPython-cross tests that
  run the same Python source under both runtimes and diff outputs.
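The cross-runtime tests described above could be driven by something like the following sketch: run one script under each interpreter with `PYTHONHASHSEED=0`, capture combined output, and report the first line where the two differ. Everything here is an assumption for illustration (the `RunSource` and `FirstDiff` names, the command-line shape, treating a non-zero exit status as comparable output); the real `compat/` harness may look quite different.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"os/exec"
	"strings"
)

// RunSource runs script under the given interpreter binary and returns its
// combined stdout and stderr. A non-zero exit status is not an error here:
// an uncaught exception is output that both runtimes must agree on.
func RunSource(interpreter, script string) (string, error) {
	cmd := exec.Command(interpreter, script)
	cmd.Env = append(os.Environ(), "PYTHONHASHSEED=0")
	out, err := cmd.CombinedOutput()
	var exitErr *exec.ExitError
	if err != nil && !errors.As(err, &exitErr) {
		return "", err // the binary could not be started at all
	}
	return string(out), nil
}

// FirstDiff returns the 1-based number of the first line on which want and
// got differ, or 0 if the two outputs are identical.
func FirstDiff(want, got string) int {
	w := strings.Split(want, "\n")
	g := strings.Split(got, "\n")
	for i := 0; i < len(w) || i < len(g); i++ {
		if i >= len(w) || i >= len(g) || w[i] != g[i] {
			return i + 1
		}
	}
	return 0
}

func main() {
	if len(os.Args) != 4 {
		fmt.Println("usage: compat <cpython-binary> <gopy-binary> <script.py>")
		return
	}
	want, err := RunSource(os.Args[1], os.Args[3])
	if err != nil {
		fmt.Println("reference run failed:", err)
		return
	}
	got, err := RunSource(os.Args[2], os.Args[3])
	if err != nil {
		fmt.Println("gopy run failed:", err)
		return
	}
	if line := FirstDiff(want, got); line != 0 {
		fmt.Printf("outputs diverge at line %d\n", line)
		return
	}
	fmt.Println("outputs match")
}
```

Comparing combined output keeps tracebacks in the diff, which lines up with the string-equal exception message requirement in the compatibility floors.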