Lib/collections/__init__.py
The collections module is split across two files: this pure-Python layer and the C
extension _collections, which provides deque and defaultdict and, when it is
available, an accelerated OrderedDict that replaces the Python fallback defined
here. Everything else lives here as pure Python with occasional delegation to
_collections_abc.
Map
| Lines | Symbol | Role |
|---|---|---|
| 1–80 | module preamble | imports, __all__, ABCs wired up |
| 81–200 | ChainMap | layered dict view over a list of mappings |
| 201–340 | Counter | multiset with arithmetic operators |
| 341–490 | OrderedDict | Python fallback for ordered dict |
| 491–560 | namedtuple | factory: builds a class via exec on a template string |
| 561–600 | UserDict / UserList / UserString | thin wrappers delegating to self.data |
Reading
ChainMap lookup chain
ChainMap stores a list of mappings in self.maps and walks them left-to-right
on every key access. Writes always go to maps[0].
# CPython: Lib/collections/__init__.py:161 ChainMap.__getitem__
def __getitem__(self, key):
    for mapping in self.maps:
        try:
            return mapping[key]
        except KeyError:
            pass
    return self.__missing__(key)
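The lookup order and the write rule described above can be seen with two plain dicts. This is a minimal usage sketch; the names `defaults`, `overrides`, and `settings` are illustrative and do not come from the source file.

```python
from collections import ChainMap

defaults = {"color": "red", "user": "guest"}
overrides = {"user": "admin"}

# overrides is maps[0], defaults is maps[1]
settings = ChainMap(overrides, defaults)

# Lookup walks maps left to right, so the first mapping holding the key wins.
user = settings["user"]     # found in overrides
color = settings["color"]   # not in overrides, falls through to defaults

# Writes land in maps[0] only; defaults is left untouched.
settings["color"] = "blue"
```

After the assignment, `overrides` holds the new `color` entry and shadows the value in `defaults`, which still reads `"red"` when accessed directly.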
Counter arithmetic
Counter.__add__ discards zero and negative counts so the result is always a
proper multiset. The in-place variant __iadd__ does the same after accumulation.
# CPython: Lib/collections/__init__.py:272 Counter.__add__
def __add__(self, other):
    if not isinstance(other, Counter):
        return NotImplemented
    result = Counter()
    for elem, count in self.items():
        newcount = count + other[elem]
        if newcount > 0:
            result[elem] = newcount
    for elem, count in other.items():
        if elem not in self and count > 0:
            result[elem] = count
    return result
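A short worked example makes the filtering visible. The counters below are made-up data chosen so that each branch of the code above is exercised.

```python
from collections import Counter

stock = Counter(apples=3, pears=-2, plums=0)
delivery = Counter(apples=1, pears=1, kiwis=-4)

combined = stock + delivery
# apples: 3 + 1 = 4   -> kept
# pears: -2 + 1 = -1  -> dropped (negative)
# plums:  0 + 0 = 0   -> dropped (zero)
# kiwis: only in delivery, count -4 -> dropped (not positive)

# The in-place form accumulates first, then applies the same filter.
stock_inplace = Counter(stock)
stock_inplace += delivery
```

Both `combined` and `stock_inplace` end up containing only `apples` with a count of 4.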
namedtuple class factory
namedtuple builds source text for an entire class body, then calls exec to
compile it into a real class. Field validators run before the template is rendered.
# CPython: Lib/collections/__init__.py:505 namedtuple
def namedtuple(typename, field_names, *, rename=False, defaults=None, module=None):
    # ... validation ...
    namespace = {'_tuple_new': tuple.__new__, '__builtins__': {}}
    exec(class_definition, namespace)
    result = namespace[typename]
    result.__module__ = module or _sys._getframe(1).f_globals.get('__name__', '__main__')
    return result
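The general technique (validate names, render source text, `exec` it into a namespace dict, pull the class back out) can be sketched in a few lines. This is a simplified illustration, not CPython's template: `make_record` is a hypothetical helper, it assumes at least one field, and it leaves the builtins in place because executing a `class` statement needs them.

```python
def make_record(typename, field_names):
    """Hypothetical helper: build a tuple subclass by exec-ing generated source."""
    # Validate before rendering, so bad names never reach exec.
    for name in [typename, *field_names]:
        if not name.isidentifier():
            raise ValueError(f"not a valid identifier: {name!r}")

    args = ", ".join(field_names)  # assumes at least one field
    lines = [
        f"class {typename}(tuple):",
        "    __slots__ = ()",
        f"    def __new__(cls, {args}):",
        f"        return _tuple_new(cls, ({args},))",
    ]
    # One read-only property per field, indexing into the underlying tuple.
    for index, name in enumerate(field_names):
        lines.append(f"    {name} = property(lambda self: self[{index}])")
    source = "\n".join(lines)

    # exec fills in __builtins__ itself when the key is absent.
    namespace = {"_tuple_new": tuple.__new__, "__name__": f"record_{typename}"}
    exec(source, namespace)
    return namespace[typename]


Point = make_record("Point", ["x", "y"])
p = Point(3, 4)
```

Here `p` behaves as an ordinary two-element tuple that also exposes `x` and `y` by name.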
UserDict delegation pattern
UserDict keeps the real dict in self.data. Almost every method forwards to it,
letting subclasses override only what they need.
# CPython: Lib/collections/__init__.py:570 UserDict.__setitem__
def __setitem__(self, key, item):
    self.data[key] = item
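A small subclass shows why the delegation pattern is convenient. `LowerDict` below is a hypothetical example, not part of the module: it overrides only `__setitem__`, and because `UserDict.__init__` and `update` route through that method, every write path picks up the new behavior.

```python
from collections import UserDict


class LowerDict(UserDict):
    """Hypothetical subclass: lower-cases string keys on every write."""

    def __setitem__(self, key, item):
        if isinstance(key, str):
            key = key.lower()
        self.data[key] = item


headers = LowerDict({"Content-Type": "text/html"})  # __init__ calls update()
headers["ACCEPT"] = "*/*"                           # direct assignment
headers.update({"X-Token": "abc"})                  # update() calls __setitem__
```

A direct `dict` subclass would not give the same result, since `dict.__init__` and `dict.update` do not call an overridden `__setitem__`.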
gopy notes
- `deque` and `defaultdict` are not in this file; they come from `_collections` (C). Port them from `Modules/_collectionsmodule.c`.
- `namedtuple` relies on `exec` with a namespace dict. The gopy compiler will need `EXEC_STMT` or an equivalent host call before namedtuple can run end-to-end.
- `Counter` arithmetic depends on `__missing__` returning 0 by default; verify that the `defaultdict`-style default works once the C type is in place.
- `ChainMap.new_child` creates a fresh `ChainMap` with a new empty map prepended. No special VM support needed, but the copy semantics (`maps[1:]`) must shallow-copy the list, not the mappings themselves.
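Two of the behaviors listed above are easy to pin down as checks a port could be run against. The snippet is a sketch of such a check against CPython's own `collections`; the variable names are illustrative.

```python
from collections import ChainMap, Counter

# Counter: a missing key reads as 0 and is not inserted by the lookup.
tally = Counter()
missing = tally["absent"]
inserted = "absent" in tally

# ChainMap.new_child: the maps list is new, the mappings in it are shared.
base = {"debug": False}
parent = ChainMap(base)
child = parent.new_child()

shares_mapping = child.maps[1] is base
separate_list = child.maps is not parent.maps

child["debug"] = True    # lands in the child's own maps[0]
base["level"] = "info"   # mutation of a shared mapping, visible via both views
```

The write through `child` leaves `base` and `parent` unchanged, while the later mutation of `base` is visible through both `parent` and `child`.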
CPython 3.14 changes
- `Counter` gained `total()` in 3.10; no further changes in 3.14.
- `namedtuple` field validation tightened: duplicate names now raise `ValueError` even when `rename=True` if the renamed field would also collide.
- `ChainMap` is unchanged from 3.12.
- `UserString` picked up `__class_getitem__` for generic alias support.