
v0.10.0 - The cycle collector

Released May 6, 2026.

Reference counting is the easy part of CPython's memory model. You add a Py_INCREF when you take a reference, you add a Py_DECREF when you let one go, and when the count hits zero you free the object. Every CPython tutorial gets to that point in a paragraph.

Then someone writes a = []; a.append(a) and the easy part stops working. The list holds a reference to itself, the refcount never hits zero, and the memory never comes back. Real Python programs hit this kind of cycle constantly. Closures capture frames that capture closures. Exception tracebacks capture frames that capture exceptions. Logging handlers register themselves on loggers that keep handlers alive. Without a cycle collector, a long-running Python process slowly turns into a memory leak.

That's why CPython carries the generational cycle collector in Python/gc.c. It's the second half of the memory model, the half that catches what refcounting can't. v0.3 of gopy shipped only the first half. We had a tracked map, a gc.collect() entry point, and a comment in the source that said "the rest lands in v0.10".

v0.10.0 makes good on that comment. Every function on the Python/gc.c gc_collect_main path now has a Go counterpart with a 1:1 citation. update_refs snapshots refcounts onto each gcHead, subtract_refs drains intra-cycle references through the tp_traverse slot, move_unreachable splits survivors from cycle junk, handle_weakrefs clears weakrefs and queues their callbacks, finalize_garbage runs PEP 442 finalizers, and survivors get promoted into the next generation. The gc builtin module exposes the full public surface CPython publishes.

Highlights

Three pieces of work define this release.

The full collector algorithm

The collector is not one function. It's a choreographed sequence of eight passes over the candidate set, each touching the same list in a different way, each leaving the list in the shape the next pass expects. Getting any one pass wrong means objects either leak (false negative) or get freed while live code still holds them (false positive). The CPython implementation has been polished for two decades; we ported it the way it stands rather than rolling our own.

import gc

class Node:
    def __init__(self):
        self.peer = None

a = Node()
b = Node()
a.peer = b
b.peer = a
del a, b

# Without the collector, the two nodes outlive their last name
# binding and leak. With it, gc.collect() finds the cycle through
# tp_traverse, breaks it, and reclaims both objects.
gc.collect()

The driver gc_collect_main in gc/collector.go runs the same phases in the same order CPython does:

  1. Merge the younger generations into the candidate list. When the user calls gc.collect(2), generations 0 and 1 drain into the gen-2 list so the collector treats the survivors of each younger pass as candidates of the next.
  2. update_refs. Walk every tracked object and copy its refcount into the gcHead.refs shadow field. We will not mutate the real refcount; we will only ever decrement the shadow.
  3. subtract_refs. Walk every tracked object's tp_traverse, and for each reference target that lives inside the candidate set, decrement the target's shadow refcount. After this pass, any object whose shadow is still positive must be reachable from outside the candidate set (a real live reference points at it).
  4. move_unreachable. Partition the candidate list. Objects with positive shadows (still externally referenced) plus the transitive closure of objects they reach through tp_traverse are the survivors. Everything else is unreachable.
  5. handle_weakrefs. Walk the unreachable list, clear every weakref that points into it, and queue the weakref's callback to fire after the collector releases its lock.
  6. finalize_garbage. Run each unreachable object's Finalizer exactly once. Mark _PyGC_PREV_MASK_FINALIZED so a resurrection can't re-fire it.
  7. delete_garbage. Drop the unreachable objects from the tracked map. The Go runtime reclaims them once the last reference goes out of scope.
  8. Promote survivors. Move the survivors into the next generation's list. Fire the queued weakref callbacks.
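
The counting trick in passes 2-4 fits in a few lines. Here is a toy Python model of it (an illustration of the invariant, not gopy's code: objects are plain dicts, "refcount" stands in for the real refcount, and "children" stands in for the edges tp_traverse would report):

```python
# Toy model of update_refs / subtract_refs / move_unreachable.
# Objects are dicts: "refcount" plays the real refcount, "children"
# plays the edges tp_traverse reports.

def find_unreachable(candidates):
    inset = {id(o) for o in candidates}
    # update_refs: snapshot each refcount into a shadow field.
    for o in candidates:
        o["shadow"] = o["refcount"]
    # subtract_refs: cancel every reference that originates inside
    # the candidate set. Only the shadow is ever touched.
    for o in candidates:
        for child in o["children"]:
            if id(child) in inset:
                child["shadow"] -= 1
    # move_unreachable: a positive shadow means an external reference
    # exists; flood outward from those survivors through the edges.
    reachable = set()
    stack = [o for o in candidates if o["shadow"] > 0]
    while stack:
        o = stack.pop()
        if id(o) in reachable:
            continue
        reachable.add(id(o))
        stack.extend(c for c in o["children"] if id(c) in inset)
    return [o for o in candidates if id(o) not in reachable]

# a <-> b is a pure cycle; "live" has a reference from outside.
a = {"name": "a", "refcount": 1, "children": []}
b = {"name": "b", "refcount": 1, "children": [a]}
a["children"].append(b)
live = {"name": "live", "refcount": 2, "children": []}
dead = find_unreachable([a, b, live])
print(sorted(o["name"] for o in dead))  # ['a', 'b']
```

Any object whose shadow survives subtraction has a reference from outside the candidate set; everything it can reach is rescued, and the remainder is exactly the cycle garbage.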

Each pass is a separate file under gc/ so the diff against Python/gc.c is legible function-by-function. We deliberately kept the file decomposition matching the CPython source layout, not Go convention, because the next time CPython 3.14.x ships a gc patch we want to apply it with git apply rather than rewrite.

tp_traverse on every container

The collector is useless without tp_traverse. The slot tells the collector "here are the references this object holds". If a list forgets to walk its elements, the collector can't find the cycle. If a dict forgets to walk both keys and values, the collector misses half the edges.

We ported tp_traverse for every container type that lived in the tree at v0.10.0: tuple, list, dict, set, frozenset. Each walks its elements through the user-supplied visitproc the way CPython does. Future container ports plug into the same slot.

// objects/list.go - the tp_traverse implementation.
func listTraverse(self Object, visit Visitproc, arg interface{}) int {
    l := self.(*List)
    for _, elem := range l.items {
        if rc := visit(elem, arg); rc != 0 {
            return rc
        }
    }
    return 0
}

The slot mirrors Include/cpython/object.h tp_traverse. Cooperation with the collector is opt-in: only types that publish TpTraverse get walked. That's the same contract CPython uses, and it's what lets the collector skip atomic objects (ints, strings, bytes) that can never be part of a cycle.
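
That contract is observable from Python. The snippet below runs against CPython's own gc (the behaviour gopy mirrors), showing both the atomic skip and the tuple untracking the collector performs:

```python
import gc

# Atomic objects can never participate in a cycle, so they are
# never tracked and the collector never visits them.
print(gc.is_tracked(42))       # False
print(gc.is_tracked("hello"))  # False

# Containers opt in through tp_traverse and are tracked from birth.
print(gc.is_tracked([]))       # True

# A tuple of atomics starts out tracked, but the collector's
# untrack_tuples pass evicts it at the first collection that sees it.
t = tuple([1, 2, 3])
print(gc.is_tracked(t))        # True
gc.collect()
print(gc.is_tracked(t))        # False
```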

gc module public surface

The Python-level gc module is what every user-facing memory tool reaches for. objgraph, pympler, tracemalloc, every "why is my Python process eating RAM" diagnostic in the ecosystem calls into one of these entry points. We ported the lot:

import gc

gc.collect() # Run a full collection. Returns number of unreachable objects found.
gc.enable() / gc.disable() # Toggle auto-collection.
gc.isenabled() # Read the toggle.
gc.get_threshold() # (700, 10, 10) by default.
gc.set_threshold(800, 12, 12)
gc.get_count() # Per-generation allocation deltas.
gc.is_tracked(obj) # Is this object on the tracked map?
gc.get_objects(generation=None) # All tracked objects, optionally per-gen.
gc.get_referrers(*objs) # Who points at these objects?
gc.get_referents(*objs) # What do these objects point at?
gc.freeze() # Move tracked objects to the permanent gen.
gc.unfreeze() # Drain the permanent gen back.
gc.get_freeze_count() # Size of the permanent gen.
gc.garbage # Unreclaimable objects (with DEBUG_SAVEALL).
gc.callbacks # Hooks fired around each collection.

These names are not casual. They're the contract real production code keys off, and we matched them entry-for-entry against Modules/gcmodule.c. Each entry point is a thin Go shim around the collector's internal Go API; the dispatch matches what the C module does so attribute lookup, docstrings, and signatures all match.
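
A typical diagnostic chains a couple of these entry points together. The sketch below runs against CPython's gc (the same surface gopy ports), with a hypothetical registry list playing the forgotten reference:

```python
import gc

leak = {"payload": "big"}
registry = [leak]   # the forgotten reference keeping `leak` alive

# get_referrers walks the tracked map backwards: who points at leak?
# (Filter by identity: module globals also hold a reference.)
holders = [r for r in gc.get_referrers(leak) if r is registry]

# get_referents walks tp_traverse forwards: what does leak point at?
forward = gc.get_referents(leak)

print(len(holders))        # 1
print("big" in forward)    # True
```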

What's new

The full feature breakdown, grouped by package.

gc/

The new package carries the collector itself. Each file under gc/ ports a region of Python/gc.c (or Modules/gcmodule.c for the module surface). The file split matches the CPython source layout deliberately.

  • state.go. gcState mirrors _gc_runtime_state from Include/internal/pycore_interp_structs.h. Three generations with the CPython default thresholds (700 / 10 / 10), the permanent generation for gc.freeze, the tracked map, the finalizer registry, and the per-referent weakref index. Ports Python/gc.c _PyGC_InitState. We kept the state struct flat (no hidden indirection) so the dump tooling reads memory the way CPython's debugger extensions expect.
  • list.go. gcHead is the doubly-linked list head every tracked object carries. Fields: prev, next, obj, refs (the shadow refcount for the collector), and a flags bitfield carrying gcFinalized, gcCollecting, and gcUnreachable. Operations: gc_list_init, gc_list_append, gc_list_remove, gc_list_merge, gc_list_size. Ports Python/gc.c gc_list_*.
  • objstack.go. The chunked work queue the reachable visit walk pushes into. We kept CPython's stack of fixed-size blocks instead of reaching for a Go slice because the worst-case depth on pathological graphs can spike, and the chunked layout amortises growth without copying. The chunk size is identical to CPython's. Ports Python/gc.c _PyObjectStack.
  • gil.go. Entry and exit guards that route the VM into the collector at safe points. The free-threaded build of CPython uses a different file (Python/gc_free_threading.c); we ship only the GIL-enabled layout for v0.10. Ports Python/gc_gil.c.
  • refs.go. update_refs copies the live refcount onto each gcHead.refs; subtract_refs walks tp_traverse and decrements the targets' shadow refcounts to expose intra-cycle references. Ports Python/gc.c update_refs / subtract_refs. These two functions are the heart of the algorithm. Get them right and the collector is correct. Get them wrong and you either leak (refs counted twice) or free live data (refs counted zero).
  • reachable.go. move_unreachable partitions the candidate set using the shadow refcounts. visit_reachable rolls back objects pulled in by surviving traversals. clear_unreachable_mask and untrack_tuples close out the pass. Ports Python/gc.c move_unreachable / visit_reachable / untrack_tuples. The tuple untrack is a small but important optimisation: a tuple whose elements are all atomic can never be part of a cycle, so the collector evicts it from the tracked map.
  • weakref.go. RegisterWeakref records each weakref against its referent. handle_weakrefs clears the referent and queues (weakref, callback) pairs. Callbacks fire after the collector lock has been released so they can allocate, raise, or trigger another collection without deadlocking. Ports Python/gc.c handle_weakrefs.
  • finalize.go. finalize_garbage walks the unreachable list and invokes each registered Finalizer exactly once, setting the _PyGC_PREV_MASK_FINALIZED flag so a second pass cannot re-fire it. reclaim_unreachable drops the tracked-map entries and unlinks the list; the Go runtime reclaims memory once the last reference goes out of scope. Ports Python/gc.c finalize_garbage / delete_garbage.
  • collector.go. The gc_collect_main driver. Drains generations 0..gen into a young list, runs the eight-phase algorithm, fires weakref callbacks outside the lock, promotes survivors. Ports Python/gc.c gc_collect_main.
  • inspect.go. GetObjects(gen), GetReferrers, GetReferents, Freeze, Unfreeze, GetFreezeCount, Garbage. The tp_traverse-driven helpers walk the tracked map directly so they match what gc.get_referrers and gc.get_referents return in CPython. Ports Modules/gcmodule.c gc_get_objects_impl / gc_get_referrers / gc_get_referents / gc_freeze / gc_unfreeze / gc_get_freeze_count_impl.
  • module.go. The gc built-in module surface: collect, enable, disable, isenabled, get_threshold, set_threshold, get_count, is_tracked, get_objects, get_referrers, get_referents, freeze, unfreeze, get_freeze_count, plus the garbage and callbacks list attributes. Ports Modules/gcmodule.c.
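
The gc_list_* operations in list.go are classic circular doubly-linked list splicing with the generation head as sentinel. A toy Python rendering of the same operations (field names chosen to echo gcHead; a sketch, not the ported code):

```python
class GCHead:
    """Circular doubly-linked node; an empty list is a sentinel
    whose prev/next point at itself."""
    def __init__(self, obj=None):
        self.obj = obj
        self.prev = self
        self.next = self

def gc_list_append(head, node):
    # Splice node in just before the sentinel, i.e. at the tail.
    node.prev = head.prev
    node.next = head
    head.prev.next = node
    head.prev = node

def gc_list_remove(node):
    node.prev.next = node.next
    node.next.prev = node.prev
    node.prev = node.next = node

def gc_list_merge(dst, src):
    # Move every node of src onto the tail of dst; src ends up empty.
    if src.next is not src:
        tail = dst.prev
        tail.next = src.next
        src.next.prev = tail
        src.prev.next = dst
        dst.prev = src.prev
        src.prev = src.next = src

def gc_list_size(head):
    n, cur = 0, head.next
    while cur is not head:
        n, cur = n + 1, cur.next
    return n

# Usage: build a generation, move a node, then drain one list
# into another the way promotion does.
gen0, gen1 = GCHead(), GCHead()
n1, n2 = GCHead("obj1"), GCHead("obj2")
gc_list_append(gen0, n1)
gc_list_append(gen0, n2)
gc_list_remove(n1)
gc_list_append(gen1, n1)
gc_list_merge(gen1, gen0)
print(gc_list_size(gen0), gc_list_size(gen1))  # 0 2
```

The merge is O(1) regardless of list length, which is why generation promotion and the gen-drain at the start of a collection are cheap.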

objects/

Two changes in the object layer make the collector possible.

  • type.go. The TpTraverse slot lands on Type. Each container type implements it: tuple, list, dict, and set walk their elements through the user-supplied visitproc. The collector uses this slot to discover intra-cycle edges. Ports Include/cpython/object.h tp_traverse. Without this slot the collector would be flying blind: every object would look like a leaf and the cycle-detection algorithm couldn't find anything.
  • weakref.go. Weakref is now a real Python object with Referent, Callback, Clear. WeakrefType carries the Call slot (returns the referent or None) plus Repr and Hash. The collector calls into Clear once the referent goes unreachable. Ports Objects/weakrefobject.c PyWeakref_NewRef plus the v0.10 subset of weakref.ref (proxy and CallableProxy ride along on the same scaffolding in v0.10.1).
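
The clearing sequence is observable behaviour. The snippet below shows it against CPython (the semantics gopy's port follows): when a referent dies inside a cycle, the weakref is cleared first and its callback fires afterwards.

```python
import gc
import weakref

class Node:
    pass

fired = []

a = Node()
a.cycle = a                       # only the collector can free this
ref = weakref.ref(a, lambda r: fired.append("cleared"))

del a
gc.collect()                      # cycle found; weakref cleared first

print(ref())    # None -- the referent is gone
print(fired)    # ['cleared'] -- callback ran after clearing
```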

Why we built it this way

Three design calls deserve a callout.

Why a 1:1 port instead of a Go-flavoured cycle collector

Go has a tracing garbage collector. The obvious shortcut would have been to lean on it: tag every Python object as a Go pointer, let the Go runtime reclaim cycles, and skip the entire eight-phase algorithm. We considered it and rejected it for three reasons.

The semantics don't match. CPython runs finalizers in a specific order (PEP 442), suppresses re-firing via _PyGC_PREV_MASK_FINALIZED, and clears weakrefs before finalizers run. Go's runtime does none of those things. Real Python programs key off the ordering. A __del__ that closes a file handle firing before a __del__ that removes the file: that's an interaction the program author wrote intentionally and the collector promises to honour.

The timing doesn't match. CPython collects on allocation hooks (every 700 allocations by default at gen 0). Go collects on heap growth. A long-running Python service that holds millions of small objects in steady state would never trigger Go's collector, because the heap isn't growing, but CPython would collect constantly because the allocation count keeps moving.
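
That allocation-count trigger is easy to observe against CPython's gc (which gopy's auto-trigger will mirror once it lands in v0.11): the gen-0 counter in gc.get_count() moves with every container allocation even when the heap is in steady state.

```python
import gc

gc.disable()                  # keep auto-collection from firing mid-test
gc.collect()                  # a collection resets the per-gen counters
before = gc.get_count()[0]

junk = [[] for _ in range(50)]  # 50+ tracked container allocations

after = gc.get_count()[0]
gc.enable()
print(after > before)         # True: the trigger counts allocations,
                              # not heap growth
```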

The introspection doesn't match. gc.get_referrers(obj) is a contract real production code uses. objgraph builds its visualisations off it. The Go runtime gives us no equivalent: we can't ask "which Go pointers point at this Go pointer?" without reimplementing exactly the bookkeeping we'd save by leaning on Go in the first place.

The shortcut would have saved a week and broken every program that uses Python's memory model the way Python's memory model documents it. The 1:1 port costs more lines and preserves the semantics. The choice was easy.

Why we kept the chunked work queue

Both Go's slice and the CPython chunked stack handle "push and pop" with similar throughput in the common case. The pathological case is where the chunked stack pulls ahead: a deeply recursive object graph (think a -> b -> c -> ... -> a with 100,000 hops) bursts the work queue to 100,000 entries during one traversal. With a Go slice, that's a reallocation and a memcpy every time the backing array fills. With the chunked stack, it's a dozen or so fixed-size chunk allocations, no copies, and the chunks come from a free list. The chunked layout is what makes the worst case manageable.
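
A minimal sketch of the chunked layout (illustrative Python, not gopy's Go; the chunk size is shrunk to 8 here for demonstration, where gopy keeps CPython's real size):

```python
class ChunkedStack:
    """Push/pop over fixed-size chunks; retired chunks are recycled
    through a free list instead of being reallocated."""
    CHUNK = 8  # illustrative only

    def __init__(self):
        self.chunks = [[]]
        self.free = []

    def push(self, item):
        top = self.chunks[-1]
        if len(top) == self.CHUNK:
            # Reuse a retired chunk when one is available: no copy,
            # no resize of existing data, unlike slice growth.
            top = self.free.pop() if self.free else []
            self.chunks.append(top)
        top.append(item)

    def pop(self):
        top = self.chunks[-1]
        item = top.pop()
        if not top and len(self.chunks) > 1:
            self.free.append(self.chunks.pop())
        return item

    def __len__(self):
        return sum(len(c) for c in self.chunks)

# Usage: burst past several chunk boundaries, then drain in LIFO order.
s = ChunkedStack()
for i in range(20):
    s.push(i)
out = [s.pop() for _ in range(20)]
print(out[:3], len(s))   # [19, 18, 17] 0
```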

Why finalizers go through Finalizer not __del__

PEP 442 unifies finalizers under tp_finalize. The legacy __del__ path (the one that resurrects objects into gc.garbage) is what PEP 442 fixed. gopy ships only the PEP 442 path; legacy resurrection-via-__del__ is out of scope. A program that writes a __del__ method gets a finalizer that fires once, in the order the collector visits the unreachable list, and then the object goes. That matches what CPython 3.14 does for new code.
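
The once-only guarantee is easy to demonstrate (CPython 3.4+ semantics, which gopy adopts): even when the finalizer resurrects the object, a later collection frees it without re-firing __del__.

```python
import gc

log = []
graveyard = []

class Phoenix:
    def __del__(self):
        log.append("finalized")
        graveyard.append(self)    # resurrect into a live list

p = Phoenix()
p.cycle = p                        # only the collector can free this
del p
gc.collect()
print(log)          # ['finalized'] -- ran once, object resurrected

graveyard.clear()   # drop the resurrected reference
gc.collect()
print(log)          # still ['finalized'] -- never re-fired
```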

Where it lives

  • gc/collector.go is the main driver. Read here first to follow the algorithm at the top level.
  • gc/refs.go is the heart of the cycle-detection trick. The function pair update_refs / subtract_refs is what makes the collector work; everything else is bookkeeping around them.
  • gc/module.go is the user-facing surface.
  • objects/type.go carries the TpTraverse slot. Every container port that lands after v0.10.0 plugs into it.

Compatibility

A few user-visible changes are worth flagging.

  • gc.collect() now actually collects cycles. Programs that worked around the v0.3 stub (manual breaking of cycles, periodic process restarts) can drop the workaround.
  • gc.get_objects() returns every tracked object, not just the ones in a particular generation. The optional generation argument restricts the result the same way CPython 3.14 does.
  • __del__ runs through PEP 442 only. A program that depended on the legacy resurrection semantics (resurrection-via-__del__ pulling an object back into the live graph and skipping collection) will see the object collected. This matches CPython 3.14's behaviour for new-style finalizers.
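
The second bullet is straightforward to check against CPython's gc, which defines the behaviour gopy matches (the generation keyword exists in CPython 3.8+):

```python
import gc

gc.disable()   # keep the object from being promoted mid-check
marker = ["v0.10-sentinel"]

# Present in the full tracked snapshot...
in_all = any(o is marker for o in gc.get_objects())
# ...and, freshly allocated, still sitting in generation 0.
in_gen0 = any(o is marker for o in gc.get_objects(generation=0))
gc.enable()

print(in_all, in_gen0)   # True True
```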

What's next

The remaining cycle-collector work pinned to v0.11:

  • gc.callbacks invocation. The list attribute exists; running the registered callbacks before and after each collection lands with the full callbacks panel in v0.11.
  • The gc_select_generation auto-trigger. Manual gc.collect() is wired today; the allocator-hook auto-trigger arrives once the obmalloc port lands.

Out of scope for v0.10.x:

  • Free-threaded collector. Python/gc_free_threading.c is the no-GIL build of the same algorithm. gopy ships only the GIL-enabled layout. When the no-GIL port lands (likely v0.13 or later), the file pair joins the tree.
  • Legacy __del__ resurrection and gc.garbage. PEP 442 unifies finalizers; the legacy path is intentionally not ported.

The other major thread shipped this same week is the v0.10.1 backlog drop (compile / eval / exec, __build_class__, __slots__, super, io.open, the myreadline dispatch hook, plus a long tail of small backlogged ports). That release builds on the collector landing here: weakref.WeakSet and friends in v0.10.1 use the weakref clearing path the collector ships in v0.10.0.

Acknowledgments

This release closes the long-standing v0.3 stub for the cycle collector. Spec 1611 (gc full port) is the design doc; the algorithm follows Python/gc.c line-for-line, and every Go file under gc/ carries citations into the CPython source so future 3.14.x rebases stay tractable.