Skip to main content

Modules/_heapqmodule.c (part 3)

Source:

cpython 3.14 @ ab2d84fe1023/Modules/_heapqmodule.c

This annotation covers heap construction and the sift operations. See modules_heapq2_detail for heappush and heappop.

Map

LinesSymbolRole
1-80heapifyConvert a list into a heap in-place, O(n)
81-180_siftdownRestore heap after inserting at the end
181-280_siftupRestore heap after removing the root
281-400nlargest / nsmallestEfficient k-largest/smallest algorithms

Reading

heapify

// CPython: Modules/_heapqmodule.c:260 heapify
static PyObject *
heapify(PyObject *module, PyObject *heap)
{
Py_ssize_t n = PyList_GET_SIZE(heap);
/* Start from the last parent and sift down each subtree root */
for (Py_ssize_t i = n / 2 - 1; i >= 0; i--) {
if (_siftup((PyListObject *)heap, i) < 0)
return NULL;
}
Py_RETURN_NONE;
}

heapify is O(n) because lower nodes are cheaper to sift. A naive O(n log n) approach would push each element one by one. The bottom-up pass makes each of the n/2 parent nodes a valid heap root, at amortized cost of O(1) per node.

_siftdown

// CPython: Modules/_heapqmodule.c:80 _siftdown
static int
_siftdown(PyListObject *heap, Py_ssize_t startpos, Py_ssize_t pos)
{
PyObject *newitem = PyList_GET_ITEM(heap, pos);
while (pos > startpos) {
Py_ssize_t parentpos = (pos - 1) >> 1;
PyObject *parent = PyList_GET_ITEM(heap, parentpos);
int cmp = PyObject_RichCompareBool(newitem, parent, Py_LT);
if (cmp < 0) return -1;
if (!cmp) break; /* parent <= newitem: heap property restored */
PyList_SET_ITEM(heap, pos, Py_NewRef(parent));
Py_DECREF(PyList_GET_ITEM(heap, pos)); /* old ref now overwritten */
pos = parentpos;
}
PyList_SET_ITEM(heap, pos, Py_NewRef(newitem));
Py_DECREF(newitem);
return 0;
}

_siftdown moves a newly inserted element up towards the root until the heap invariant holds. (pos - 1) >> 1 is the parent index in a 0-based binary heap.

_siftup

// CPython: Modules/_heapqmodule.c:160 _siftup
static int
_siftup(PyListObject *heap, Py_ssize_t pos)
{
Py_ssize_t endpos = PyList_GET_SIZE(heap);
Py_ssize_t startpos = pos;
PyObject *newitem = PyList_GET_ITEM(heap, pos);
/* Bubble up the smaller child until we reach a leaf */
Py_ssize_t childpos = 2 * pos + 1; /* left child */
while (childpos < endpos) {
Py_ssize_t rightpos = childpos + 1;
if (rightpos < endpos) {
int cmp = PyObject_RichCompareBool(
PyList_GET_ITEM(heap, rightpos),
PyList_GET_ITEM(heap, childpos), Py_LT);
if (cmp < 0) return -1;
if (cmp) childpos = rightpos;
}
PyList_SET_ITEM(heap, pos, Py_NewRef(PyList_GET_ITEM(heap, childpos)));
pos = childpos;
childpos = 2 * pos + 1;
}
PyList_SET_ITEM(heap, pos, Py_NewRef(newitem));
return _siftdown(heap, startpos, pos);
}

_siftup fills the gap left by removing the root by pushing the smaller child up, then calls _siftdown to place the displaced last element. This two-phase approach is CPython's strategy for heappop.

nlargest / nsmallest

// CPython: Modules/_heapqmodule.c:340 nlargest (Python-level fallback)
/* For small n vs large iterable:
nlargest(k, it) uses a min-heap of size k:
- Push first k elements.
- For each remaining element: if > heap[0], replace and siftup.
For k close to len(it): sort and slice.
For k == 1: use max().
*/

nlargest(k, iterable) keeps a min-heap of size k. Elements smaller than the current minimum are discarded without a heap operation. This gives O(n log k) complexity. For k >= n // 2, sorting is cheaper.

gopy notes

heapify is module/heapq.Heapify in module/heapq/module.go. _siftdown and _siftup are unexported helpers. nlargest/nsmallest threshold logic is replicated in Go with the same k >= n/2 cutoff.