Lib/dataclasses.py

cpython 3.14 @ ab2d84fe1023/Lib/dataclasses.py

dataclasses.py is a pure-Python module that provides the @dataclass class decorator (PEP 557). The decorator inspects __annotations__ on the decorated class and its MRO, builds a Field object for each annotated name, then generates and execs synthetic method source code for __init__, __repr__, __eq__, and optionally the ordering methods (__lt__, __le__, __gt__, __ge__) and __hash__.

The module exports a small API surface: dataclass, field, Field, FrozenInstanceError, InitVar, KW_ONLY, MISSING, fields, asdict, astuple, make_dataclass, replace, and is_dataclass. Generated method source is run through exec. The helper objects a generated body needs (default values, default factories, sentinels) are handed to it as closure variables rather than written into the defining module's globals, so user namespaces are left unmodified.

Python 3.10 added KW_ONLY (a sentinel used as the annotation of a placeholder field; every field declared after it becomes keyword-only in __init__) and the slots=True option, which makes @dataclass generate __slots__ and return a new class. Python 3.11 added the companion weakref_slot option.

Map

| Lines | Symbol | Role | gopy |
| --- | --- | --- | --- |
| 1-80 | Module-level sentinels: MISSING, KW_ONLY, HAS_DEFAULT, HAS_DEFAULT_FACTORY, HAS_INIT_VAR | Sentinel objects used as default markers in Field so that None can be a valid default. | module/dataclasses/module.go |
| 81-300 | Field, field | Per-field configuration container; field() is the public constructor and raises ValueError if both default and default_factory are given. | module/dataclasses/module.go |
| 301-420 | InitVar | Generic alias that marks a parameter as __init__-only; never stored as an instance attribute; detected in _process_class by isinstance(f.type, InitVar). | module/dataclasses/module.go |
| 421-900 | _process_class | Core logic: walks the MRO to collect inherited fields, builds the __init__ source string (including __post_init__ forwarding), execs it, and attaches all generated dunder methods. | module/dataclasses/module.go |
| 901-1000 | dataclass | Public decorator; thin wrapper around _process_class; accepts init, repr, eq, order, unsafe_hash, frozen, match_args, kw_only, slots, weakref_slot. | module/dataclasses/module.go |
| 1001-1150 | fields, asdict, astuple | fields returns the tuple of Field objects stored in __dataclass_fields__; asdict and astuple recurse into nested dataclasses, lists, tuples, and dicts. | module/dataclasses/module.go |
| 1151-1350 | make_dataclass | Programmatic dataclass creation: builds an __annotations__ dict and calls dataclass(type(name, bases, ns)). | module/dataclasses/module.go |
| 1351-1450 | replace | Returns a new instance of the same dataclass with specified fields replaced; validates that only init-eligible field names are passed. | module/dataclasses/module.go |
| 1451-1500 | is_dataclass, _is_dataclass_instance | Predicate helpers; is_dataclass returns True for classes and instances whose type carries __dataclass_fields__. | module/dataclasses/module.go |
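To make the helper rows of the map concrete, here is a brief sketch that exercises fields, replace, is_dataclass, and make_dataclass on throwaway classes. User and Pair are names invented for this example. The exception type raised by replace for an init=False field is caught broadly because it is assumed to differ between Python versions.

```python
from dataclasses import (
    dataclass, field, fields, is_dataclass, make_dataclass, replace,
)


@dataclass
class User:  # hypothetical example class
    name: str
    tags: list = field(default_factory=list)
    uid: int = field(default=0, init=False)


u = User("ada")

# fields() yields the Field objects in declaration order.
field_names = [f.name for f in fields(u)]

# replace() builds a new instance, copying unchanged values from the original.
renamed = replace(u, name="grace")

# Fields declared with init=False cannot be passed to replace().
try:
    replace(u, uid=7)
    replace_failed = False
except (TypeError, ValueError):
    replace_failed = True

# make_dataclass() builds an equivalent class programmatically.
Pair = make_dataclass("Pair", [("left", int), ("right", int, field(default=0))])
pair = Pair(1)

print(field_names, renamed, replace_failed, pair)
print(is_dataclass(Pair), is_dataclass(pair), is_dataclass(int))
```

Note that replace copies references, not values: renamed.tags is the same list object as u.tags.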

Reading

_process_class field collection and __init__ codegen (lines 421 to 900)

cpython 3.14 @ ab2d84fe1023/Lib/dataclasses.py#L421-900

def _process_class(cls, init, repr, eq, order, unsafe_hash, frozen,
                   match_args, kw_only, slots, weakref_slot):
    # Collect fields defined on this class and inherited ones.
    fields = {}

    # Walk the MRO in reverse so subclass fields override base fields.
    for b in cls.__mro__[-1:0:-1]:
        base_fields = getattr(b, _FIELDS, None)
        if base_fields is not None:
            for f in base_fields.values():
                fields[f.name] = f

    # Now process this class's own annotations.
    cls_fields = [_field_init(f, kw_only)
                  for f in _fields_in_class(cls)]
    for f in cls_fields:
        fields[f.name] = f

    # Generate and exec __init__.
    flds = [f for f in fields.values() if f._field_type is _FIELD]
    _set_new_attribute(cls, '__init__',
                       _init_fn(flds, frozen, has_post_init,
                                self_name, globals, slots))

The MRO walk runs from object toward the class's immediate bases and skips the class itself (the slice stops before index 0). Because fields is an insertion-ordered dict, an inherited field keeps the position of its first declaration, and a subclass that redefines the name replaces the Field object without moving it. _field_init processes each (name, annotation) pair from cls.__annotations__, consulting the default value already stored on the class, if any. The KW_ONLY sentinel is detected here: once a field whose annotation is KW_ONLY is encountered, all subsequent fields in that class have kw_only=True injected.
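The ordering rule can be observed from the public API. In this small sketch (Base and Child are made-up classes), the subclass redefines an inherited field after declaring a new one, and the redefined field still keeps its inherited position.

```python
from dataclasses import dataclass, fields


@dataclass
class Base:  # hypothetical example class
    x: int = 0
    y: int = 0


@dataclass
class Child(Base):  # hypothetical example class
    z: int = 0
    x: str = "overridden"  # redefines an inherited field


# x keeps the slot it had in Base; z is appended after the inherited fields.
order = [f.name for f in fields(Child)]

# The Field object itself is the subclass's version.
x_field = fields(Child)[0]

print(order, x_field.type, x_field.default)
print(Child())
```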

_init_fn builds a string of Python source:

def _init_fn(fields, frozen, has_post_init, self_name, globals, slots):
    seen_default = False
    body_lines = []
    for f in fields:
        ...
        if f.default is MISSING and f.default_factory is MISSING:
            # No default.
            if seen_default:
                raise TypeError(f'non-default argument {f.name!r} '
                                'follows default argument')
        else:
            seen_default = True
        ...
    return _create_fn('__init__',
                      [self_name] + [_init_param(f) for f in fields],
                      body_lines,
                      locals=locals,
                      globals=globals,
                      return_type=None)

_create_fn assembles the source string from the parameter list and body lines, then calls exec on it. The objects the body refers to (default values, default factories, field types, the MISSING sentinel) are supplied through the locals mapping: the generated def is wrapped in an outer function whose parameters are those names, so the finished function closes over them and the caller-provided globals dict is never modified. The resulting function object is stored back on the class via _set_new_attribute, which skips the assignment if the class already defines the method in its own __dict__.
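The exec-based approach can be sketched in a few lines. This is an illustrative stand-in, not the module's code: make_fn, _factory, _MISSING, _make_tags, and Demo are all hypothetical names, and the sketch only shows the general pattern of wrapping a generated def in an outer function so that helper objects become closure variables.

```python
def make_fn(name, params, body_lines, helpers):
    """Hypothetical helper: build a function from source text with exec."""
    body = "\n".join(f"        {line}" for line in body_lines)
    inner = f"    def {name}({', '.join(params)}):\n{body}"
    # The outer function's parameters are the helper names, so the inner
    # function closes over them instead of looking them up in globals.
    src = f"def _factory({', '.join(helpers)}):\n{inner}\n    return {name}"
    namespace = {}
    exec(src, {}, namespace)
    return namespace["_factory"](**helpers)


_MISSING = object()  # stand-in sentinel for "argument not supplied"

init = make_fn(
    "__init__",
    ["self", "x", "tags=_MISSING"],
    ["self.x = x",
     "self.tags = _make_tags() if tags is _MISSING else tags"],
    {"_MISSING": _MISSING, "_make_tags": list},
)


class Demo:  # hypothetical example class
    pass


Demo.__init__ = init
a, b = Demo(1), Demo(2, tags=["t"])
print(a.x, a.tags, b.tags)
```

Routing the default factory through a sentinel means each instance gets a fresh list, which is the same problem default_factory solves for mutable defaults.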

InitVar and KW_ONLY (lines 301 to 420)

cpython 3.14 @ ab2d84fe1023/Lib/dataclasses.py#L301-420

class InitVar:
    __slots__ = ('type', )

    def __init__(self, type):
        self.type = type

    def __class_getitem__(cls, type):
        return cls(type)

# Sentinel for keyword-only fields: a private marker class with a
# single module-level instance.
class _KW_ONLY_TYPE:
    pass

KW_ONLY = _KW_ONLY_TYPE()

InitVar is a small wrapper class used in annotations to declare parameters that are passed to __init__ (and forwarded to __post_init__) but never stored as instance attributes. Subscripting it, as in InitVar[T], goes through __class_getitem__ and yields an InitVar instance holding T. A field annotated InitVar[T] is detected in _process_class by isinstance(f.type, InitVar); such fields are collected into the __post_init__ call signature rather than the instance initialization body.
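A short sketch of the InitVar behavior described above, using a made-up Account class: the password argument reaches __post_init__ but is neither stored on the instance nor reported by fields().

```python
from dataclasses import InitVar, dataclass, field, fields


@dataclass
class Account:  # hypothetical example class
    owner: str
    password: InitVar[str]                   # __init__-only parameter
    password_len: int = field(init=False)    # derived in __post_init__

    def __post_init__(self, password):
        # InitVar values arrive here in declaration order.
        self.password_len = len(password)


acct = Account("ada", "s3cret")

stored = [f.name for f in fields(acct)]      # InitVar pseudo-fields are omitted
has_password_attr = hasattr(acct, "password")

print(acct, stored, has_password_attr)
```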

KW_ONLY is a module-level sentinel instance. When _process_class encounters an annotation whose value is KW_ONLY, it switches a boolean flag that forces all subsequent Field objects in that class to have kw_only=True. This allows mixing positional and keyword-only fields without relying solely on the class-level kw_only argument.

asdict recursive conversion (lines 1001 to 1100)

cpython 3.14 @ ab2d84fe1023/Lib/dataclasses.py#L1001-1100

def asdict(obj, *, dict_factory=dict):
    if not _is_dataclass_instance(obj):
        raise TypeError("asdict() should be called on dataclass instances")
    return _asdict_inner(obj, dict_factory)

def _asdict_inner(obj, dict_factory):
    if _is_dataclass_instance(obj):
        result = []
        for f in fields(obj):
            value = _asdict_inner(getattr(obj, f.name), dict_factory)
            result.append((f.name, value))
        return dict_factory(result)
    elif isinstance(obj, tuple) and hasattr(obj, '_fields'):
        # namedtuple
        return type(obj)(*[_asdict_inner(v, dict_factory) for v in obj])
    elif isinstance(obj, (list, tuple)):
        return type(obj)(_asdict_inner(v, dict_factory) for v in obj)
    elif isinstance(obj, dict):
        return type(obj)((_asdict_inner(k, dict_factory),
                          _asdict_inner(v, dict_factory))
                         for k, v in obj.items())
    else:
        return copy.deepcopy(obj)

_asdict_inner recurses depth-first. Nested dataclasses become dicts (assembled via dict_factory to allow OrderedDict or similar). Named tuples are reconstructed element-by-element. Plain lists and tuples preserve their concrete type via type(obj)(...). Dicts have both keys and values recursed. Anything else is deep-copied so the returned dict is fully independent of the original object graph. astuple follows the same recursive structure but produces nested tuples instead of dicts.
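The recursion is easiest to see on a small nested structure. The sketch below uses two made-up classes, Point and Path, and checks the nesting, the independence of the result, and a custom dict_factory.

```python
from collections import OrderedDict
from dataclasses import asdict, astuple, dataclass, field


@dataclass
class Point:  # hypothetical example class
    x: int
    y: int


@dataclass
class Path:  # hypothetical example class
    name: str
    points: list = field(default_factory=list)


path = Path("p", [Point(0, 0), Point(1, 2)])

as_d = asdict(path)      # nested dataclasses become nested dicts
as_t = astuple(path)     # same walk, but dataclasses become tuples

# The result is a copy: mutating it leaves the original untouched.
as_d["points"].append("extra")

# dict_factory is applied at every dataclass level, not just the top one.
ordered = asdict(path, dict_factory=OrderedDict)

print(as_d, as_t, len(path.points), type(ordered["points"][0]).__name__)
```

The list inside the astuple result stays a list because containers keep their concrete type; only dataclass instances are converted.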