Modules/gcmodule.c

cpython 3.14 @ ab2d84fe1023/Modules/gcmodule.c

The gc module exposes CPython's cyclic garbage collector to Python code. CPython's reference-counting engine handles the common case of object deallocation, but reference cycles (object A holds a reference to B, which holds a reference back to A) keep both objects alive forever under pure reference counting. gcmodule.c implements the three-generation cycle collector that detects and breaks these cycles.
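The A-to-B-to-A situation described above can be observed from Python. The following is a minimal sketch; the class name Node and the variable names are illustrative only, and automatic collection is switched off so that the explicit gc.collect() call is the only collection that runs.

```python
import gc
import weakref


class Node:
    """Plain class; instances are GC-tracked and support weak references."""

    def __init__(self):
        self.other = None


gc.disable()                      # keep automatic collections out of the demo
try:
    a, b = Node(), Node()
    a.other, b.other = b, a       # A -> B and B -> A: a reference cycle
    probe = weakref.ref(a)        # lets us observe whether A is still alive

    del a, b                      # drop the only external references
    # Each object is still referenced by the other, so reference counting
    # alone cannot free them.
    alive_after_del = probe() is not None

    gc.collect()                  # the cycle collector finds and frees both
    alive_after_collect = probe() is not None
finally:
    gc.enable()

print(alive_after_del, alive_after_collect)
```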

The collector tracks every GC-managed object in one of three doubly-linked generation lists. A collection walks every object in the generations being collected, subtracts the references that originate inside that candidate set, and then finalises and frees the objects that are left with no external references and are not reachable from any object that still has one.
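The reference-subtraction idea can be sketched as a toy model in pure Python. This is not the C implementation: objects are represented by names, and the function find_unreachable and its three parameters are hypothetical helpers invented for this illustration.

```python
def find_unreachable(candidates, refcount, referents):
    """Toy model of cycle detection by reference subtraction.

    candidates: list of object names in the generations being collected
    refcount:   name -> total reference count (internal + external)
    referents:  name -> list of names that object refers to
    """
    # Step 1: start from a copy of each candidate's reference count.
    gc_refs = {name: refcount[name] for name in candidates}

    # Step 2: subtract every reference that comes from another candidate.
    for name in candidates:
        for target in referents[name]:
            if target in gc_refs:
                gc_refs[target] -= 1

    # Step 3: a positive remainder means something outside the candidate
    # set holds a reference. Everything reachable from such an object
    # must survive as well.
    reachable = set()
    stack = [name for name in candidates if gc_refs[name] > 0]
    while stack:
        name = stack.pop()
        if name in reachable:
            continue
        reachable.add(name)
        stack.extend(t for t in referents[name]
                     if t in gc_refs and t not in reachable)

    return [name for name in candidates if name not in reachable]


# A <-> B is an isolated cycle; C <-> D is a cycle that something
# outside the candidate set still references through C.
garbage = find_unreachable(
    candidates=["A", "B", "C", "D"],
    refcount={"A": 1, "B": 1, "C": 2, "D": 1},
    referents={"A": ["B"], "B": ["A"], "C": ["D"], "D": ["C"]},
)
print(garbage)  # ['A', 'B']
```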

This file also owns the entire gc.* Python-visible API: gc.collect(), gc.enable() / gc.disable() / gc.isenabled(), gc.get_count(), gc.get_threshold() / gc.set_threshold(), gc.get_objects(), gc.get_referrers() / gc.get_referents(), gc.freeze() / gc.unfreeze() / gc.get_freeze_count(), gc.is_finalized(), the gc.garbage list, and the gc.callbacks hook list.

Map

| Lines | Symbol | Role | gopy |
|---|---|---|---|
| 1-200 | GC_NEXT/GC_PREV, GcState, gc_enable, gc_disable, gc_isenabled, gc_collect | Generation list macros, per-interpreter GC state, enable/disable/collect public API. | module/gc/ |
| 200-500 | gc_get_count, gc_get_threshold, gc_set_threshold, gc_get_stats | Collection counters, generation thresholds, per-generation statistics. | module/gc/ |
| 500-800 | gc_get_objects, gc_get_referrers, gc_get_referents | Object enumeration and reference graph traversal via tp_traverse. | module/gc/ |
| 800-1000 | gc_freeze, gc_unfreeze, gc_get_freeze_count | Freeze/unfreeze of the permanent generation that holds frozen objects (CPython 3.7+). | module/gc/ |
| 1000-1200 | gc_is_finalized, gc_callbacks, gc_garbage, gc_DEBUG_*, _gcmodule, PyInit_gc | Finalization query, callback list, uncollectable garbage list, debug flags, module definition. | module/gc/ |

Reading

gc.collect() triggering (lines 1 to 200)

cpython 3.14 @ ab2d84fe1023/Modules/gcmodule.c#L1-200

GC-tracked objects are linked into a doubly-linked list through the PyGC_Head header that precedes every GC-managed object in memory. The macros GC_NEXT and GC_PREV abstract the list pointer arithmetic:

#define GC_NEXT(o) ((PyGC_Head *)(o)->_gc_next)
#define GC_PREV(o) ((PyGC_Head *)(o)->_gc_prev)

The per-interpreter GcState struct holds three generation lists and the threshold counters that control when a collection is triggered:

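/* GcState is a typedef for this struct; the listing is abridged. */
struct _gc_runtime_state {
    PyObject *trash_delete_later;
    int trash_delete_nesting;
    int enabled;                 /* automatic collection on or off */
    int debug;                   /* gc.DEBUG_* flag bits */
    struct gc_generation generations[NUM_GENERATIONS]; /* 0, 1, 2 */
    PyGC_Head *generation0;      /* shortcut to the head of generation 0 */
    struct gc_generation permanent_generation; /* frozen objects */
    struct gc_collection_stats stats[NUM_GENERATIONS];
    int collecting;              /* nonzero while a collection is running */
    Py_ssize_t long_lived_pending;
    Py_ssize_t long_lived_total;
};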

gc_collect drives the collector. When called without arguments it runs a full collection (the generation argument defaults to 2, the oldest generation); when called with an explicit generation argument it collects that generation and all younger ones. A generation outside the range 0 to 2 raises ValueError:

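static PyObject *
gc_collect_impl(PyObject *module, int generation)
{
    PyThreadState *tstate = _PyThreadState_GET();
    GcState *gcstate = &tstate->interp->gc;

    /* Only generations 0 .. NUM_GENERATIONS-1 are valid. */
    if (generation < 0 || generation >= NUM_GENERATIONS) {
        PyErr_SetString(PyExc_ValueError, "invalid generation");
        return NULL;
    }
    if (gcstate->collecting) {
        /* Reentrant call; return 0 immediately. */
        return PyLong_FromSsize_t(0);
    }
    gcstate->collecting = 1;
    /* Collects `generation` and every younger generation. */
    Py_ssize_t n = collect_with_callback(tstate, generation);
    gcstate->collecting = 0;
    return PyLong_FromSsize_t(n);
}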
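From Python, the same entry point behaves roughly as sketched below. The variable names are illustrative; the sketch relies only on the documented behaviour that gc.collect() returns an int and rejects generations outside 0 to 2.

```python
import gc

# No argument: a full collection, equivalent to gc.collect(2).
full = gc.collect()

# Explicit generation: collect generation 0 only (the youngest).
young = gc.collect(0)

# Out-of-range generations are rejected before any collection runs.
rejected = False
try:
    gc.collect(3)
except ValueError:
    rejected = True

print(type(full).__name__, type(young).__name__, rejected)
```

The return value is the number of unreachable objects found, so it is usually 0 right after a previous full collection.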

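collect_with_callback invokes each callable in gc.callbacks twice, as callback(phase, info): once with phase "start" before the collection and once with phase "stop" after it. info is a dict with the keys "generation", "collected", and "uncollectable"; the last two are only meaningful in the "stop" call.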
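A minimal sketch of hooking into gc.callbacks follows. The function name record and the events list are made up for this example; the (phase, info) signature is the one documented for the gc module.

```python
import gc

events = []


def record(phase, info):
    # phase is "start" or "stop"; info is a dict describing the collection.
    # Copy the dict so later mutation by the collector cannot affect us.
    events.append((phase, dict(info)))


gc.callbacks.append(record)
try:
    gc.collect()
finally:
    # Always unregister, otherwise the hook keeps firing on every collection.
    gc.callbacks.remove(record)

for phase, info in events:
    print(phase, sorted(info))
```

Callbacks run inside the collector, so they should be short and avoid heavy allocation.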

The allocation counter in generation 0 is incremented by _PyObject_GC_Alloc each time a GC-tracked object is allocated and decremented when one is freed. When it exceeds gcstate->generations[0].threshold and automatic collection is enabled, CPython schedules a generation-0 collection at the next safe point.
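The counter and threshold can be inspected from Python with gc.get_count(), gc.get_threshold() and gc.set_threshold(). The sketch below is hedged: the exact counter values, and how the second and third thresholds are reported, vary between CPython versions and builds, so it only relies on the first threshold round-tripping.

```python
import gc

old_thresholds = gc.get_threshold()      # a 3-tuple of ints
gc.disable()                             # count allocations without collecting
try:
    gc.collect()                         # resets the young-generation counter
    count_before = gc.get_count()
    keep = [[] for _ in range(100)]      # allocate 100 GC-tracked lists
    count_after = gc.get_count()

    # Raise the generation-0 threshold, read it back, then restore it.
    gc.set_threshold(5000)
    raised = gc.get_threshold()[0]
finally:
    gc.set_threshold(*old_thresholds)
    gc.enable()

# On a default build the first counter is expected to have grown by roughly
# the number of containers allocated; treat the exact numbers as
# implementation details.
print(count_before, count_after, raised)
```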

gc.freeze() and immortal objects (lines 800 to 1000)

cpython 3.14 @ ab2d84fe1023/Modules/gcmodule.c#L800-1000

gc.freeze() moves every object currently in the three normal generations into the permanent_generation list. Objects in the permanent generation are never considered for collection and are never traversed by the collector, making them effectively immortal from the GC's perspective.

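This is intended for programs that fork worker processes after a warm-up phase, such as pre-fork servers: the parent populates its heap, calls gc.freeze(), and then forks. A collection writes to the PyGC_Head of every object it examines, so a collection in a child would otherwise dirty the copy-on-write pages it shares with the parent. Frozen objects are skipped by the collector, so their pages stay shared: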

static PyObject *
gc_freeze_impl(PyObject *module)
{
    PyThreadState *tstate = _PyThreadState_GET();
    GcState *gcstate = &tstate->interp->gc;

    for (int i = 0; i < NUM_GENERATIONS; i++) {
        PyGC_Head *gen_head = GEN_HEAD(gcstate, i);
        PyGC_Head *perm_head = &gcstate->permanent_generation.head;
        /* Splice the entire generation list onto the permanent list. */
        if (!gc_list_is_empty(gen_head)) {
            gc_list_merge(gen_head, perm_head);
        }
        gcstate->generations[i].count = 0;
    }
    Py_RETURN_NONE;
}

gc_list_merge does O(1) linked-list surgery: it links the first node of the source list after the current tail of the destination, points the last source node back at the destination's sentinel head, then reinitialises the source to an empty sentinel.
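The splice can be modelled in a few lines of Python. This is a toy stand-in, not the C code: the Head class and the helpers list_append, list_merge and labels are hypothetical names chosen for this sketch of a circular doubly-linked list with a sentinel head.

```python
class Head:
    """Sentinel or node of a circular doubly-linked list."""

    def __init__(self, label=None):
        self.label = label
        self.next = self      # an empty list points at itself
        self.prev = self


def list_append(node, head):
    """Link node in just before the sentinel, i.e. at the tail."""
    last = head.prev
    last.next = node
    node.prev = last
    node.next = head
    head.prev = node


def list_merge(src, dst):
    """Move every node of src to the end of dst in O(1); src becomes empty."""
    if src.next is src:       # nothing to move
        return
    first, last = src.next, src.prev
    dst_tail = dst.prev
    dst_tail.next = first     # old destination tail -> first source node
    first.prev = dst_tail
    last.next = dst           # last source node -> destination sentinel
    dst.prev = last
    src.next = src.prev = src # reinitialise the source as empty


def labels(head):
    out, node = [], head.next
    while node is not head:
        out.append(node.label)
        node = node.next
    return out


generation = Head()
permanent = Head()
for name in ("a", "b"):
    list_append(Head(name), generation)
list_append(Head("x"), permanent)

list_merge(generation, permanent)
print(labels(permanent), labels(generation))  # ['x', 'a', 'b'] []
```

Only four pointers of existing nodes change regardless of list length, which is why freezing a large heap is cheap.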

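gc.unfreeze() moves all objects back from permanent_generation into generation 2, allowing them to be collected again. gc.get_freeze_count() returns the number of objects in the permanent generation; it obtains the number by walking the permanent list, so its cost grows with the number of frozen objects.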
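A short round trip through the freeze API might look like the sketch below. The class Config and the list warm_objects are placeholders for whatever a real program builds during warm-up; the sketch assumes CPython 3.7 or later.

```python
import gc


class Config:
    """Placeholder for long-lived state built before forking."""


gc.collect()                                  # clear existing garbage first
warm_objects = [Config() for _ in range(1000)]

gc.freeze()                                   # move all tracked objects aside
frozen = gc.get_freeze_count()                # includes interpreter objects too

gc.unfreeze()                                 # back into the oldest generation
thawed = gc.get_freeze_count()

print(frozen >= 1000, thawed)
```

In a pre-fork server the fork would happen between gc.freeze() and any later collection, and gc.unfreeze() would normally not be called at all.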

get_referrers() traversal (lines 500 to 800)

cpython 3.14 @ ab2d84fe1023/Modules/gcmodule.c#L500-800

gc.get_referrers(*objs) returns a list of every GC-tracked object that holds a reference to any of the given objects. It walks all three generation lists and calls each object's tp_traverse slot with a custom visitor:

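typedef struct {
    PyObject *objs;    /* tuple of target objects */
    PyObject *result;  /* list being built */
} TraverseData;

/* visitproc handed to tp_traverse; obj is one referent of the candidate
   that the outer loop is currently traversing. */
static int
referrers_traverse(PyObject *obj, void *arg)
{
    TraverseData *d = (TraverseData *)arg;
    /* Compare by identity. PySequence_Contains would compare by equality
       and could run arbitrary __eq__ code in the middle of a traversal. */
    for (Py_ssize_t i = 0; i < PyTuple_GET_SIZE(d->objs); i++) {
        if (PyTuple_GET_ITEM(d->objs, i) == obj) {
            /* A nonzero return stops tp_traverse early and tells the
               outer loop that the candidate refers to a target. */
            return 1;
        }
    }
    return 0;
}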

The outer loop passes each candidate object to tp_traverse; if the visitor fires for any of the target objects, the candidate is added to the result list. Because tp_traverse only visits objects that the candidate holds direct references to, get_referrers returns direct referrers only, not the full transitive closure.
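The direct-only behaviour is easy to check from Python. In this sketch the names target, holder and outer are illustrative: holder refers to target directly, while outer only reaches it through holder.

```python
import gc

target = object()
holder = [target]              # holds a direct reference to target
outer = {"holder": holder}     # reaches target only through holder

referrers = gc.get_referrers(target)

# Compare by identity; the result may also contain unrelated containers
# such as the namespace dict in which `target` is bound.
is_direct = any(r is holder for r in referrers)
is_indirect = any(r is outer for r in referrers)

print(is_direct, is_indirect)
```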

gc.get_referents(*objs) is the inverse: it calls tp_traverse on each of the given objects and collects everything the visitor is called with, building a list of the objects directly referenced by the arguments. Arguments that are not GC-managed have no tp_traverse to call and contribute nothing to the result.
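A small sketch of gc.get_referents() on a nested list follows; the variable names are illustrative. A list is used because its tp_traverse visits every item, whereas exactly which slots other containers visit is an implementation detail.

```python
import gc

leaf = "leaf"
inner = [leaf]
outer = [inner, 42]

referents = gc.get_referents(outer)

# Direct referents are the items of `outer`, whether GC-tracked or not.
has_inner = any(r is inner for r in referents)
has_int = any(r is outer[1] for r in referents)
# `leaf` is only reachable through `inner`, so it is not reported.
has_leaf = any(r is leaf for r in referents)

# An int is not GC-managed, so there is nothing to traverse.
from_int = gc.get_referents(42)

print(has_inner, has_int, has_leaf, from_int)
```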

gc.get_objects(generation=None) walks whichever generation lists are requested (or all three if generation is None) and returns their contents as a list, excluding the list object itself.
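The generation filter can be tried as sketched below, assuming CPython 3.8 or later (where the generation parameter exists) and a default build; how generations map onto internal lists differs between versions. The name marker is illustrative.

```python
import gc

gc.disable()                       # avoid a collection moving objects mid-demo
try:
    gc.collect()                   # survivors move out of the youngest generation
    marker = []                    # freshly allocated, so it starts out young

    youngest = gc.get_objects(generation=0)
    everything = gc.get_objects()

    in_youngest = any(o is marker for o in youngest)
    in_everything = any(o is marker for o in everything)
    # The returned list never contains itself.
    lists_itself = any(o is everything for o in everything)
finally:
    gc.enable()

print(in_youngest, in_everything, lists_itself)
```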

gopy mirror

module/gc/ (pending). The GC state maps to a Go struct with three gcGeneration slices indexed 0, 1, 2, plus a permanent slice for frozen objects. Each generation holds a doubly-linked list of *GcHead nodes that prefix every tracked object. Collect, Enable, Disable, Freeze, GetReferrers, and GetReferents port the corresponding CPython functions above. The gc.garbage and gc.callbacks module attributes are plain *objects.List values stored in the module's namespace dict.

CPython 3.14 changes

The permanent generation and gc.freeze() / gc.unfreeze() / gc.get_freeze_count() were added in CPython 3.7 (bpo-31558). Before 3.7 there were only the three ordinary generations and no freeze API. The GC state, formerly a process-wide global in Modules/gcmodule.c, moved into PyInterpreterState in 3.9 as part of the per-interpreter GC work. The free-threaded build introduced in 3.13 (configured with --disable-gil) ships its own collector implementation; the interface seen by Python code (gc.collect(), thresholds, callbacks) is unchanged, but the internal collection algorithm differs from the default build.