1600. gopy overview

Goal

gopy is a fresh re-implementation of CPython's interpreter core in Go. The target is 100% behavioural compatibility with the upstream CPython 3.14-era sources at $HOME/github/python/cpython. That means same data structures, same models, same code logic, same wire formats, same error messages. The only change is naming and surface API style, which adopts Go-idiomatic conventions modelled on the Go standard library.

This is not a clean-room reimagining. It is a line-by-line port. When behaviour deviates from CPython, the bug is in the port, not in CPython.

The CPython source-of-truth folder is cpython/Python/ (about 138k lines of C across 91 .c files plus ~30 .h files). The Go target is tamnd/gopy, currently at v0.9.0; v0.10 is in flight on feat/v0.10.0-gc.

Non-goals

  • No new features. No improved API. No "better" GC.
  • No Python 2 support.
  • No alternative implementations (we are not PyPy / Cinder / GraalPython).
  • No partial Python. The goal is to run unmodified CPython 3.14 stdlib.
  • C extension compatibility (PyObject* ABI) is out of scope. We will not load .so modules. C extensions are reimplemented in Go on demand.

Sources of truth

| Concern | Source |
| --- | --- |
| Runtime semantics | cpython/Python/*.c, cpython/Include/internal/*.h |
| Object semantics | cpython/Objects/*.c |
| Parser / lexer | cpython/Parser/* |
| Stdlib | cpython/Lib/* |
| Tests | cpython/Lib/test/* |
| Spec authority for ambiguity | the C source, not the docs |

This 1600-series covers cpython/Python/, cpython/Objects/, and cpython/Parser/. The Objects port lives in the 1670-1689 sub-block (formerly numbered 1700-series, renumbered to keep one folder). The Parser port lives in the 1640-1645 sub-block. Stdlib ports are tracked in their own spec series.

High-level architecture

             ┌────────────────────────────────────────────────┐
             │                 gopy/cmd/gopy                  │
             │          (entry point, like python.c)          │
             └────────────────────┬───────────────────────────┘
                                  │
             ┌────────────────────▼───────────────────────────┐
             │                 gopy/lifecycle                 │
             │       Initialize / Finalize / NewInterp        │
             └─┬──────────────┬──────────────┬────────────────┘
               │              │              │
     ┌─────────▼──┐    ┌──────▼──────┐  ┌────▼────────┐
     │ initconfig │    │ imp         │  │ pythonrun   │
     │ preconfig  │    │ importlib   │  │ REPL/eval   │
     │ pathconfig │    │ marshal     │  │ pyc rd/wr   │
     └────────────┘    └──────┬──────┘  └────┬────────┘
                              │              │
             ┌────────────────▼──────────────▼─────────────────┐
             │                   gopy/state                    │
             │  Runtime · Interpreter · Thread · CrossInterp   │
             └─┬──────────────┬─────────────────┬──────────────┘
               │              │                 │
    ┌──────────▼─┐    ┌───────▼──────┐   ┌──────▼─────────┐
    │ gopy/      │    │ gopy/        │   │ gopy/          │
    │ vm         │    │ compile      │   │ gc             │
    │ (ceval,    │    │ ast/sym/     │   │ cycle coll,    │
    │ frame,     │    │ codegen/     │   │ refcount,      │
    │ uops)      │    │ flowgraph/   │   │ arena, brc,    │
    │            │    │ assemble     │   │ qsbr, weakref  │
    └──┬─────────┘    └──┬───────────┘   └────────────────┘
       │                 │
┌──────▼────────┐  ┌─────▼──────────┐
│ gopy/         │  │ gopy/          │
│ specialize    │  │ tokenize       │
│ optimizer     │  │ parser (sep.)  │
│ jit (deferred)│  └────────────────┘
│ monitor       │
└───────────────┘

Cross-cutting: gopy/pysync, gopy/hash, gopy/pytime,
gopy/format, gopy/pystrconv, gopy/codecs,
gopy/errors, gopy/traceback, gopy/warnings,
gopy/contextvar, gopy/hamt, gopy/hashtable,
gopy/intrinsics, gopy/structmember, gopy/getargs,
gopy/modsupport, gopy/builtin, gopy/sysmod,
gopy/tracemalloc, gopy/audit, gopy/monitor

Spec files in this series

Implemented (spec written and code shipped)

Meta / infrastructure

| # | File | Focus | Shipped |
| --- | --- | --- | --- |
| 1600 | 1600_gopy_overview.md | This file | meta |
| 1601 | 1601_gopy_naming.md | Naming conventions: C to Go translation rules | meta |
| 1602 | 1602_gopy_filemap.md | C source to Go package mapping (every file) | meta |
| 1603 | 1603_gopy_roadmap.md | Phased milestone plan v0.1 to v1.0 | meta |
| 1630 | 1630_gopy_vm_overview.md | VM block overview (Tier-1 interpreter) | meta |
| 1640 | 1640_gopy_parser_overview.md | Parser block overview | meta |

v0.1: arena and sync

| # | File | Focus | Shipped |
| --- | --- | --- | --- |
| 1604 | 1604_gopy_arena.md | pyarena.c port | v0.1 |
| 1605 | 1605_gopy_pythread.md | thread.c cross-platform port | v0.1 |
| 1606 | 1606_gopy_pysync.md | lock.c, parking_lot.c, critical_section.c | v0.1 |
| 1607 | 1607_gopy_hashsecret.md | bootstrap_hash.c seed init | v0.1 |

v0.3: errors and traceback

| # | File | Focus | Shipped |
| --- | --- | --- | --- |
| 1611 | 1611_gopy_errors.md | errors.c plus the BaseException gating subset | v0.3 |

v0.4: strings, numbers, hash

| # | File | Focus | Shipped |
| --- | --- | --- | --- |
| 1660 | 1660_gopy_strings_numbers.md | pyctype, pystrcmp, mystrtoul, pystrtod, dtoa, pystrhex, pymath, pyfpe, formatter_unicode | v0.4 |
| 1661 | 1661_gopy_hash.md | pyhash.c (SipHash-1-3, FNV-1a) | v0.4 |

v0.5 / v0.5.5: compiler and parser

| # | File | Focus | Shipped |
| --- | --- | --- | --- |
| 1620 | 1620_gopy_compile_pipeline.md | ast, asdl, future, symtable, codegen, flowgraph, assemble, compile, instruction_sequence, ast_preprocess, ast_unparse | v0.5 |
| 1625 | 1625_gopy_compile_testing.md | Per-checkbox test plan for 1620 and 1665 | v0.5 |
| 1626 | 1626_gopy_codegen.md | codegen.c port detail | v0.5 |
| 1627 | 1627_gopy_flowgraph.md | flowgraph.c port detail (CFG, passes; stackdepth + super-instr deferred) | v0.5 |
| 1628 | 1628_gopy_assemble.md | assemble.c port detail | v0.5 |
| 1629 | 1629_gopy_compile_goldens.md | Disassembly golden corpus for v05test | v0.5 |
| 1641 | 1641_gopy_lexer_tokenizer.md | Parser/lexer/, Parser/tokenizer/ | v0.5.5 |
| 1642 | 1642_gopy_pegen.md | pegen.c, parser.c, generated PEG runtime | v0.5.5 |
| 1643 | 1643_gopy_parser_errors.md | pegen_errors.c, action_helpers.c, peg_api.c, token.c | v0.5.5 |
| 1644 | 1644_gopy_string_parser.md | string_parser.c (f-string, t-string, bytes) | v0.5.5 |

v0.6: VM Tier-1

| # | File | Focus | Shipped |
| --- | --- | --- | --- |
| 1621 | 1621_gopy_bytecodes_dsl.md | bytecodes.c DSL parser + Go-emitting generator | v0.6 |
| 1635 | 1635_gopy_intrinsics.md | intrinsics.c (CALL_INTRINSIC_1 / 2 dispatch) | v0.6 |
| 1636 | 1636_gopy_eval_loop.md | ceval.c, ceval_macros.h, opcode dispatch loop | v0.6 |
| 1637 | 1637_gopy_frame.md | frame.c, frame layout, locals, generator state | v0.6 |
| 1638 | 1638_gopy_stackref.md | stackrefs.c, tagged stack values | v0.6 |
| 1639 | 1639_gopy_eval_gil.md | ceval_gil.c, GIL, eval breaker, signal bridge | v0.6 |

v0.7: lifecycle, sys, builtins, warnings

| # | File | Focus | Shipped |
| --- | --- | --- | --- |
| 1622 | 1622_gopy_lifecycle.md | pylifecycle, preconfig, initconfig, pathconfig | v0.7 |
| 1624 | 1624_gopy_pythonrun.md | RunString / RunFile / REPL | v0.7 |
| 1651 | 1651_gopy_modules.md | builtins, sys, _warnings subsets | v0.7 |

v0.8: marshal, import, codecs; Module and set objects

| # | File | Focus | Shipped |
| --- | --- | --- | --- |
| 1681 | 1681_gopy_set.md | setobject.c (set, frozenset) | v0.8 |
| 1686 | 1686_gopy_exceptions.md | exceptions.c: ImportError / ModuleNotFoundError hierarchy | v0.8 |
| 1688 | 1688_gopy_module_misc.md | moduleobject.c (Module type, name / doc / file / loader / spec) | v0.8 |
| 1690 | 1690_gopy_marshal.md | marshal.c: TYPE_LONG, FLAG_REF, TYPE_CODE, TYPE_SET, TYPE_DICT, TYPE_COMPLEX, .pyc header (PEP 552) | v0.8 |
| 1691 | 1691_gopy_import.md | import.c, frozen.c: sys.modules cache, inittab, frozen table, ExecCodeModule, source/.pyc loaders, ImportModuleLevel, IMPORT_NAME/FROM | v0.8 |
| 1692 | 1692_gopy_codecs.md | codecs.c: registry, error handlers, built-in utf-8 / ascii / latin-1 codecs | v0.8 |

Written, partial scaffold (spec written, some code shipped, full panel pending)

| # | File | Focus | Phase |
| --- | --- | --- | --- |
| 1665 | 1665_gopy_tokenize.md | Python-tokenize.c public iterator surface | v0.5 / v0.9 |
| 1670 | 1670_gopy_objects_overview.md | Objects block overview (1670-1689) | meta |
| 1671 | 1671_gopy_object_protocol.md | Object interface, Header, VarHeader, refcount | v0.2 |
| 1672 | 1672_gopy_type.md | Type, slots, MRO, lookup | v0.2 |
| 1683 | 1683_gopy_abstract.md | abstract.c subset (PyObject_, PyNumber_) | v0.2+ |

Written, pending implementation

v0.9: contextvars, time, remaining VM bytecodes, runtime helpers (shipped)

Tag v0.9.0 published 2026-05-06. Tracker rows kept here for the file-by-file map; full release notes live in changelog/v0.9.0.md.

| # | File | Focus | Status | Phase |
| --- | --- | --- | --- | --- |
| 1634 | 1634_gopy_monitor.md | sys.monitoring + sys.settrace / setprofile | W | v0.9+ |
| 1645 | 1645_gopy_myreadline.md | myreadline.c, interactive readline editing | W | v0.9+ |
| 1662 | 1662_gopy_hamt.md | hamt.c, HAMT backing store for contextvars | S | v0.9 |
| 1663 | 1663_gopy_context.md | context.c, _contextvars.c, PEP 567 contextvars | S | v0.9 |
| 1664 | 1664_gopy_time.md | pytime.c, monotonic clock, conversions, deadline math | S | v0.9 |
| 1668 | 1668_gopy_runtime_helpers.md | getopt.c CLI option parser plus hashtable.c generic table | S | v0.9 |
| 1693 | 1693_gopy_vm_remaining.md | IMPORT_, RETURN_GENERATOR / YIELD / SEND, MATCH_, WITH_EXCEPT_START, BUILD_SET / SET_ADD | S | v0.9 |

v0.10: cycle GC, weakrefs, finalizers (in flight)

Branch feat/v0.10.0-gc. Spec status legend: W = spec written, no code. C = code shipped, tests pending. S = code + tests shipped.

| # | File | Focus | Status | Phase |
| --- | --- | --- | --- | --- |
| 1613 | 1613_gopy_gc.md | gc.c full collector (generations, weakref clearing, finalizer queue) plus gc_gil.c, object_stack.c | W | v0.10 |
| 1666 | 1666_gopy_tracemalloc.md | allocation tracing | W | v0.10 |
| 1689 | 1689_gopy_obj_misc.md | weakrefobject.c rows pulled forward to feed cycle clearing | W | v0.10 |

v0.11+: specialization, optimizer, debug

| # | File | Focus | Phase |
| --- | --- | --- | --- |
| 1631 | 1631_gopy_specialize.md | PEP 659 adaptive specialization | v0.11 |
| 1632 | 1632_gopy_optimizer.md | Tier-2 trace projector + abstract interp | v0.12 |
| 1633 | 1633_gopy_jit.md | Copy-and-patch JIT (deferred) | post-v1.0 |
| 1667 | 1667_gopy_remote_debug.md | remote debugging hooks | v0.13 |

Objects block: pending (code lands incrementally v0.2-v0.9)

| # | File | Focus | Phase |
| --- | --- | --- | --- |
| 1673 | 1673_gopy_long.md | longobject.c (PyLong, small-int cache) | v0.2 |
| 1674 | 1674_gopy_float_complex.md | floatobject.c (v0.2), complexobject.c (v0.6) | v0.2 / v0.6 |
| 1675 | 1675_gopy_bool_none.md | boolobject.c, None, NotImplemented, Ellipsis | v0.2 |
| 1676 | 1676_gopy_bytes.md | bytesobject.c, bytearrayobject.c, bytes_methods.c | v0.4 |
| 1677 | 1677_gopy_unicode.md | unicodeobject.c, unicodectype.c | v0.4 |
| 1678 | 1678_gopy_tuple.md | tupleobject.c, empty-tuple singleton | v0.2 |
| 1679 | 1679_gopy_list.md | listobject.c, list_resize curve, Timsort | v0.2 |
| 1680 | 1680_gopy_dict.md | dictobject.c, odictobject.c | v0.2 |
| 1682 | 1682_gopy_slice_range.md | sliceobject.c, rangeobject.c | v0.2 |
| 1684 | 1684_gopy_call.md | call.c, vectorcall | v0.6 |
| 1685 | 1685_gopy_descr_method.md | descrobject.c, methodobject.c, classobject.c, funcobject.c | v0.4 / v0.6 |
| 1687 | 1687_gopy_code_frame_gen.md | codeobject.c, frameobject.c, genobject.c, cellobject.c | v0.5.5 / v0.6 |
| 1689 | 1689_gopy_obj_misc.md | weakref, memoryview, typevar, union, GenericAlias, Interpolation, Template, obmalloc | v0.9+ |

Reserved (spec not yet written)

| # | File (planned) | Focus | Phase |
| --- | --- | --- | --- |
| 1612 | 1612_gopy_traceback.md | traceback.c data and formatting | v0.3 (retro) |
| 1614 | 1614_gopy_brc.md | brc.c biased refcount field layout | v0.3+ |
| 1615 | 1615_gopy_state.md | pystate.c Runtime / Interpreter / Thread | v0.3+ |
| 1698 | 1698_gopy_quirks.md | Cross-cutting quirks the porter must preserve | meta |
| 1699 | 1699_gopy_glossary.md | Glossary: C term to Go term mapping | meta |

Compatibility floors (what "100% compatible" means in practice)

The port is graded on the following observable surfaces. Each must match CPython byte-for-byte, except where noted:

  1. Bytecode: same opcode numbers, same oparg encoding, same EXTENDED_ARG widening, same exception table format, same line-number table (co_linetable) format, same cache layout. dis.dis(f) output identical.
  2. Marshal: marshal.dumps(obj) produces identical bytes for the same object graph. .pyc files produced by gopy are loadable by CPython and vice versa, including version-magic-number compatibility.
  3. Hash: str and bytes use SipHash-1-3 with the same key derivation from the seed; numeric types use the same seed-independent reduction modulo sys.hash_info.modulus. hash(x) is identical for str/bytes/numeric. (PYTHONHASHSEED=0 gives a deterministic match.)
  4. Eval semantics: every observable behaviour of eval + exec matches: exception types, exception messages (string-equal), traceback frame order, __cause__/__context__ chains, generator state, async iteration order.
  5. Built-in module attributes: sys.flags, sys.implementation.cache_tag (gopy uses its own cache tag, see Quirks), sys.version_info, and sys.path semantics.
  6. Import: importlib._bootstrap runs to completion. import foo finds modules by the same rules. __pycache__ layout is identical, apart from the cache tag embedded in .pyc file names (see the intentional divergences below).
  7. Repr / format: repr(obj) and format(obj, spec) produce identical strings for builtins. Float repr uses shortest-roundtrip dtoa.
  8. Error messages: exception constructors produce identical str(exc) for identical inputs. (This is a high bar but a non-negotiable test target.)
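
A few of these floors can be spot-checked from Python itself. The sketch below is illustrative only: the names probe and diff_probes are hypothetical and not part of gopy, and the chosen samples are assumptions about what makes a useful smoke check. It collects a handful of observable values (floors 1, 2, 3, 7 and 8) into a dictionary that can be printed as one JSON line under each runtime and compared.

```python
import dis
import json
import marshal
import sys


def probe():
    """Collect a few observable values that two runtimes should agree on."""

    def sample(a, b):
        return a + b

    try:
        int("x")
    except ValueError as exc:
        message = str(exc)

    return {
        # Floor 1: opcode names emitted for a trivial function.
        "opnames": [ins.opname for ins in dis.get_instructions(sample)],
        # Floor 2: marshal bytes for a small object graph, hex-encoded.
        "marshal": marshal.dumps((1, "a", 2.5)).hex(),
        # Floor 3: integer hashes reduce modulo sys.hash_info.modulus.
        "hash_modulus": hash(sys.hash_info.modulus),
        # Floor 7: shortest round-trip float repr.
        "float_repr": repr(0.1 + 0.2),
        # Floor 8: exception message text.
        "error": message,
    }


def diff_probes(reference, candidate):
    """Return the sorted keys whose values differ between two probe results."""
    keys = set(reference) | set(candidate)
    return sorted(k for k in keys if reference.get(k) != candidate.get(k))


if __name__ == "__main__":
    # Print one JSON line so outputs from two interpreters can be compared.
    print(json.dumps(probe(), sort_keys=True))
```

Running the script under reference CPython and under gopy, then passing the two decoded JSON objects to diff_probes, should yield an empty list if the sampled floors hold.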

Items where we intentionally diverge (recorded in 1698_gopy_quirks.md):

  • sys.implementation.name is "gopy", not "cpython".
  • sys.implementation.cache_tag is "gopy-3140" so .pyc files do not collide.
  • gc.is_finalized and friends behave per CPython, but the underlying mechanism uses Go's GC plus an emulated refcount/cycle layer (see 1613).
  • C extension loading (importlib.machinery.ExtensionFileLoader) is disabled by default; only Go-native extension modules load.
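
The first two divergences are visible through standard introspection. A minimal sketch, assuming only the stdlib calls sys.implementation and importlib.util.cache_from_source; the helper names bytecode_cache_path and describe_runtime are hypothetical. It shows how the cache tag ends up in the .pyc file name, which is why a distinct tag keeps the two runtimes' caches apart.

```python
import importlib.util
import sys


def bytecode_cache_path(source_path):
    """Return the cache file path the running interpreter uses for source_path."""
    # optimization="" requests the plain file name without an opt- suffix.
    return importlib.util.cache_from_source(source_path, optimization="")


def describe_runtime():
    """Report the implementation name, cache tag, and a sample .pyc path."""
    impl = sys.implementation
    return {
        "name": impl.name,
        "cache_tag": impl.cache_tag,
        "pyc": bytecode_cache_path("foo.py"),
    }


if __name__ == "__main__":
    for key, value in describe_runtime().items():
        print(f"{key}: {value}")
```

Per the list above, gopy is expected to report the name "gopy" and a .pyc path carrying the "gopy-3140" tag, while reference CPython reports its own name and tag.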

Test strategy

  • The CPython test suite (Lib/test/) is the reference oracle.
  • Phase 0 ships a "smoke" subset: test_grammar, test_builtin, test_dis, test_marshal, test_compile, test_dict, test_list, test_int, test_str, test_exceptions. Once these pass, broaden.
  • Bytecode-level tests: dis(f) round-trip equivalence between gopy and reference CPython, executed in CI.
  • Hash-stability tests with PYTHONHASHSEED=0.
  • A compat/ subdirectory at the gopy root holds CPython-cross tests that run the same Python source under both runtimes and diff outputs.
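
A cross-runtime check of the kind described for compat/ could look like the following sketch. It is not the project's harness: run_snippet and cross_check are hypothetical names, and it assumes the gopy binary accepts -c the way the python executable does. Each run sets PYTHONHASHSEED=0 so that hash-dependent output is reproducible.

```python
import os
import subprocess
import sys


def run_snippet(interpreter, source):
    """Run source under one interpreter with hash randomization disabled."""
    env = dict(os.environ, PYTHONHASHSEED="0")
    proc = subprocess.run(
        [interpreter, "-c", source],  # assumes a python-style -c flag
        env=env,
        capture_output=True,
        text=True,
        timeout=60,
    )
    return proc.returncode, proc.stdout, proc.stderr


def cross_check(reference, candidate, source):
    """Return (field, reference value, candidate value) for every mismatch."""
    ref = run_snippet(reference, source)
    cand = run_snippet(candidate, source)
    fields = ("returncode", "stdout", "stderr")
    return [(f, r, c) for f, r, c in zip(fields, ref, cand) if r != c]


if __name__ == "__main__":
    # With only one interpreter at hand, compare it against itself.
    mismatches = cross_check(sys.executable, sys.executable, "print(hash('abc'))")
    print(mismatches)
```

In real use the two interpreter arguments would be the reference CPython binary and the gopy binary; an empty result means exit status, stdout and stderr all matched.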