Skip to main content

hamt.c

Full implementation of CPython's immutable Hash Array Mapped Trie. Every contextvars.Context mutation allocates only the path from root to the changed leaf; all unchanged subtrees are shared with the old version. The result is O(log N) set/delete and O(1) snapshot.

Map

LinesSymbolRole
10–240Block commentDesign narrative: HAMT algorithm, node types, operations
280–340hamt_bitpos, hamt_bitindex, hamt_mask5-bit hash slice helpers
348–640BitmapNode find/assoc/without helpersCopy-on-write path for sparse nodes
643–900hamt_node_bitmap_assocInsert into a BitmapNode with structural sharing
903–1038hamt_node_bitmap_withoutDelete from a BitmapNode; shrink or downgrade
1040–1240hamt_node_bitmap_findLookup in a BitmapNode
1241–1460CollisionNode assoc/find/withoutFlat array for equal-hash keys
1466–1600ArrayNode find/without helpersFull 32-child node operations
1605–1840hamt_node_array_assocInsert into an ArrayNode
1841–1990hamt_node_array_findLookup in an ArrayNode
1994–2280Iteration: hamt_iterator_*Zero-allocation depth-first traversal
2287–2300hamt_findInternal dispatch for top-level lookup
2303–2320_PyHamt_FindPublic lookup converting hamt_find_t to int
2321–2450_PyHamt_Assoc, _PyHamt_WithoutPublic mutating API
2450–2885Type slots, PyTypeObject defs, module initPython type machinery

Reading

5-bit hash slice helpers (lines 280–340)

Each level of the trie consumes 5 bits of the hash, giving at most 32 children. hamt_mask extracts the 5-bit index at the current shift; hamt_bitpos converts it to a single-bit mask; hamt_bitindex counts how many lower bits are set to find the slot index inside the sparse array.

// CPython: Python/hamt.c:280 hamt_mask
#define hamt_mask(hash, shift) (((uint32_t)(hash) >> (shift)) & 0x01f)
#define hamt_bitpos(hash, shift) ((uint32_t)1 << hamt_mask((hash), (shift)))
#define hamt_bitindex(bitmap, bit) \
_Py_popcount32((bitmap) & ((bit) - 1))

_Py_popcount32 maps to a hardware POPCNT instruction on x86 and ARM, so slot lookup is a single instruction after the mask.

BitmapNode assoc (lines 643–900)

// CPython: Python/hamt.c:643 hamt_node_bitmap_assoc
hamt_node_bitmap_assoc(PyHamtNode_Bitmap *self,
uint32_t shift, int32_t hash,
PyObject *key, PyObject *val, int* added_leaf)
{
uint32_t bit = hamt_bitpos(hash, shift);
uint32_t idx = hamt_bitindex(self->b_bitmap, bit);

if (self->b_bitmap & bit) {
/* slot already occupied: either update value or recurse */
uint32_t key_idx = 2 * idx;
PyObject *key_or_null = self->b_array[key_idx];
...
}
/* slot empty: copy node, insert new key/val pair */
...
}

When the bitmap bit for the current 5-bit slice is already set, the function checks whether the existing slot holds a key (direct entry) or a sub-node pointer (key is NULL). If a key matches, a new BitmapNode is allocated with one slot replaced. If it is a sub-node, the call recurses. If the bit is clear, the node is copied with one extra slot inserted. All paths allocate at most one new node per tree level.

BitmapNode without (lines 903–1038)

// CPython: Python/hamt.c:903 hamt_node_bitmap_without
hamt_node_bitmap_without(PyHamtNode_Bitmap *self,
uint32_t shift, int32_t hash,
PyObject *key,
PyHamtNode **new_node)
{
uint32_t bit = hamt_bitpos(hash, shift);
if ((self->b_bitmap & bit) == 0) {
return W_NOT_FOUND;
}
...
}

Removal mirrors insertion. When a BitmapNode is left with only one key/value pair after deletion it may be replaced by a plain leaf, avoiding nodes with a single child. An empty BitmapNode returns W_EMPTY so the caller can free the path entirely.

hamt_find and dispatch (lines 2287–2300)

// CPython: Python/hamt.c:2287 hamt_find
static hamt_find_t
hamt_find(PyHamtObject *o, PyObject *key, PyObject **val)
{
if (o->h_count == 0) {
return F_NOT_FOUND;
}
int32_t key_hash = hamt_hash(key);
if (key_hash == -1) {
return F_ERROR;
}
return hamt_node_find(o->h_root, 0, key_hash, key, val);
}

hamt_node_find is a type-dispatch wrapper that checks Py_TYPE(node) and calls hamt_node_bitmap_find, hamt_node_array_find, or hamt_node_collision_find. The initial shift argument is 0; each recursive call passes shift + 5 until a leaf or collision is reached.

gopy notes

The gopy port of HAMT lives under module/contextlib/ and in objects/ for the node types. The copy-on-write allocation pattern translates directly to Go: each assoc/without call returns a new *BitmapNode or *ArrayNode using make. The hamt_find_t enum (F_ERROR, F_NOT_FOUND, F_FOUND) maps to (found bool, err error).

The hamt_bitindex popcount is available as bits.OnesCount32 from the Go standard library math/bits package, matching the CPython _Py_popcount32 call exactly.

CPython 3.14 changes

The algorithm is unchanged since Python 3.7. In 3.14 the PyHamtObject and PyHamtNode_* struct definitions moved from the local file into Include/internal/pycore_structs.h to satisfy the one-definition-rule across the new free-threaded build. The hamt.c implementation file still owns all function bodies; only the struct declarations relocated.