Modules/itertoolsmodule.c
cpython 3.14 @ ab2d84fe1023/Modules/itertoolsmodule.c
The itertools module. Every public object is a C extension type, not a
Python class. Each type stores only the state needed for __next__ and
exposes __reduce__ / __setstate__ so that a partially-consumed iterator
can be pickled and resumed.
The 18 types split into four groups by function:
- Infinite iterators.
count,cycle,repeatnever stop (unlessrepeatis given atimesargument). - Filtering and slicing.
compress,filterfalse,islice,takewhile,dropwhile. - Combining.
chain,starmap. - Grouping and accumulation.
groupby,accumulate. - Combinatoric.
product,permutations,combinations,combinations_with_replacement,pairwise,batched.
All 18 types follow the same structural template: a struct with
PyObject_HEAD plus state fields; a _new function that validates and
stores arguments; an iternext function implementing __next__; a
__reduce__ that returns (type, constructor_args, state) as a 3-tuple;
and a __setstate__ that restores the mutable state fields.
Map
| Lines | Symbol | Role | gopy |
|---|---|---|---|
| 1-120 | includes, _itertoolsmodule state, helper macros | Per-interpreter state and internal utilities. | module/itertools/module.go:state |
| 120-300 | count, count_new, count_next, count_repr | Infinite arithmetic sequence. Stores start and step as Python objects. | module/itertools/module.go:Count |
| 300-500 | cycle, cycle_new, cycle_next | Infinite repetition of an iterable. Buffers the first pass in a list. | module/itertools/module.go:Cycle |
| 500-600 | repeat, repeat_new, repeat_next | Emit a single object times times (or infinitely). | module/itertools/module.go:Repeat |
| 600-800 | chain, chain_new, chain_from_iterable, chain_next | Chain multiple iterables end-to-end. | module/itertools/module.go:Chain |
| 800-1000 | compress, compress_new, compress_next | Filter data by selectors. | module/itertools/module.go:Compress |
| 1000-1200 | filterfalse, filterfalse_new, filterfalse_next, islice, islice_new, islice_next | Predicate-false filter and step-slicing. | module/itertools/module.go:FilterFalse |
| 1200-1600 | starmap, starmap_new, starmap_next, takewhile, takewhile_new, takewhile_next, dropwhile, dropwhile_new, dropwhile_next | Map with unpacking, and predicate-based take/drop. | module/itertools/module.go:Starmap |
| 1600-2000 | groupby, groupby_new, groupby_next, _grouper, _grouper_next | Group consecutive equal-key elements. Stores tgtkey, currkey, currvalue. | module/itertools/module.go:GroupBy |
| 2000-2400 | accumulate, accumulate_new, accumulate_next | Running total with optional binary function. | module/itertools/module.go:Accumulate |
| 2400-3000 | product, product_new, product_next | Cartesian product of input iterables. Stores pool arrays and index array. | module/itertools/module.go:Product |
| 3000-3500 | permutations, permutations_new, permutations_next | Successive r-length permutations from a pool. | module/itertools/module.go:Permutations |
| 3500-4000 | combinations, combinations_new, combinations_next, cwr, cwr_new, cwr_next | r-length combinations and combinations-with-replacement. | module/itertools/module.go:Combinations |
| 4000-4200 | pairwise, pairwise_new, pairwise_next | Emit overlapping pairs (a, b), (b, c), .... | module/itertools/module.go:Pairwise |
| 4200-4400 | batched, batched_new, batched_next | Collect n items at a time into tuples. | module/itertools/module.go:Batched |
| 4400-4500 | itertoolsmodule, PyInit_itertools | Module definition and type registration. | module/itertools/module.go:Module |
Reading
groupby key tracking (lines 1600 to 2000)
cpython 3.14 @ ab2d84fe1023/Modules/itertoolsmodule.c#L1600-2000
groupby is the most stateful type in the file. It advances the underlying
iterator lazily, sharing elements between the outer groupby iterator and
the inner _grouper subiterator for the current group. The trickiest part
is that advancing the outer groupby to the next group must exhaust (or
discard) the current _grouper.
typedef struct {
PyObject_HEAD
PyObject *it; /* underlying iterator */
PyObject *keyfunc; /* None or callable */
PyObject *tgtkey; /* key for the next group to be yielded */
PyObject *currkey; /* key of the element in currvalue */
PyObject *currvalue; /* most recently fetched element */
const _groupby_module_state *gs;
} groupbyobject;
static PyObject *
groupby_next(groupbyobject *lz, PyObject *Py_UNUSED(ignored))
{
PyObject *newvalue, *newkey, *r, *grouper;
...
/* Advance past the current group */
while (lz->currkey != NULL) {
int rcmp = PyObject_RichCompareBool(lz->tgtkey, lz->currkey, Py_EQ);
if (rcmp <= 0) break; /* new key or error */
/* Still in the old group: consume one more element */
newvalue = (*Py_TYPE(lz->it)->tp_iternext)(lz->it);
if (newvalue == NULL) return NULL;
...
if (lz->keyfunc == Py_None)
newkey = newvalue;
else
newkey = PyObject_CallOneArg(lz->keyfunc, newvalue);
Py_XDECREF(lz->currkey);
lz->currkey = newkey;
Py_XDECREF(lz->currvalue);
lz->currvalue = newvalue;
}
/* Establish the new target key */
Py_XINCREF(lz->currkey);
Py_XDECREF(lz->tgtkey);
lz->tgtkey = lz->currkey;
...
grouper = _grouper_create(lz, lz->tgtkey);
r = PyTuple_Pack(2, lz->tgtkey, grouper);
return r;
}
lz->tgtkey is the key of the group being consumed now. Each call to
groupby_next advances past any remaining elements of the previous _grouper
by comparing currkey against tgtkey in a while loop. The _grouper
subiterator just reads lz->currvalue and advances lz->it until currkey
changes; it holds a borrowed reference to the parent groupbyobject so that
the two share state.
This means the _grouper subiterators are invalidated when the outer groupby
advances: the documented behavior that a previous group's iterator must be
fully consumed (or discarded) before calling next(groupby_iter) again.
product cartesian product (lines 2400 to 3000)
cpython 3.14 @ ab2d84fe1023/Modules/itertoolsmodule.c#L2400-3000
product materializes all input iterables into tuple pools at construction
time. It then maintains an index array of length npools, where
indices[i] is the current position within pools[i].
static PyObject *
product_next(productobject *lz, PyObject *Py_UNUSED(ignored))
{
PyObject *pool, *elem, *oldelem, **pools, **result;
Py_ssize_t npools = PyTuple_GET_SIZE(lz->pools);
Py_ssize_t *indices = lz->indices;
...
/* Build the current result tuple */
if (lz->stopped)
return NULL;
result = lz->result;
if (result == NULL) {
/* First call: construct the initial tuple */
...
} else {
/* Increment the rightmost index that has not overflowed */
Py_ssize_t i = npools - 1;
while (i >= 0) {
pool = PyTuple_GET_ITEM(lz->pools, i);
indices[i]++;
if (indices[i] == PyTuple_GET_SIZE(pool)) {
indices[i] = 0; /* wrap and carry */
i--;
} else {
break;
}
}
if (i < 0) {
lz->stopped = 1;
return NULL;
}
/* Update only the changed suffix of result */
for (; i < npools; i++) {
pool = PyTuple_GET_ITEM(lz->pools, i);
elem = PyTuple_GET_ITEM(pool, indices[i]);
oldelem = result[i];
Py_INCREF(elem);
result[i] = elem;
Py_DECREF(oldelem);
}
}
...
}
The "increment rightmost, carry left" pattern is identical to incrementing a
mixed-radix counter where each digit's radix is the size of its pool. The
optimization of updating only the changed suffix (for (; i < npools; i++))
avoids rebuilding the entire result tuple on each step.
permutations and combinations pool rotation (lines 3000 to 4000)
cpython 3.14 @ ab2d84fe1023/Modules/itertoolsmodule.c#L3000-4000
permutations stores a pool (copy of input), an index array, and a cycles
array. The cycles array implements the factorial number system used to
enumerate permutations without recursion:
static PyObject *
permutations_next(permutationsobject *po, PyObject *Py_UNUSED(ignored))
{
Py_ssize_t i, j, k, r = po->r, n;
Py_ssize_t *indices = po->indices, *cycles = po->cycles;
...
/* Advance cycles from the right */
i = r - 1;
while (i >= 0) {
cycles[i]--;
if (cycles[i] == 0) {
/* Rotate indices[i:] left by one */
PyObject *tmp = pool[indices[i]];
for (j = i; j < n - 1; j++)
indices[j] = indices[j+1];
indices[n-1] = i; /* not pool index; see CPython note */
cycles[i] = n - i;
i--;
} else {
j = cycles[i];
/* Swap indices[i] with indices[n-j] */
k = indices[i]; indices[i] = indices[n-j]; indices[n-j] = k;
/* Emit current r-prefix */
...
return result;
}
}
po->stopped = 1;
return NULL;
}
The cycles[i] counter counts down from n - i to 1. When it hits 0,
the element at position i has been paired with every element to its right,
so the sub-array is rotated to bring the next candidate into position i.
This is Heap's-algorithm-like but for partial permutations of length r.
combinations is simpler: it maintains a single index array and advances
the rightmost index that can still be incremented, then fills the suffix
with consecutive values. combinations_with_replacement differs only in
that each index can repeat the value of its predecessor (the non-strict
monotone constraint instead of the strict monotone constraint).
gopy mirror
module/itertools/module.go (pending). Each type maps to a Go struct
implementing the objects.Iter interface. The state fields translate
directly: Go slices replace PyObject* pool arrays, and int or int64
replaces Py_ssize_t index arrays. __reduce__ / __setstate__ will be
implemented via Go's Pickler interface.
CPython 3.14 changes
pairwise was added in Python 3.10. batched was added in Python 3.12,
with the strict keyword argument added in 3.13 to raise ValueError when
the last batch is shorter than n. accumulate gained the initial
keyword argument in 3.8. chain.from_iterable has been available since
2.6 but was moved to a classmethod slot in 3.3. The __reduce__ /
__setstate__ protocol is stable across 3.x.