Python/Python-ast.c
cpython 3.14 @ ab2d84fe1023/Python/Python-ast.c
Python-ast.c is not written by hand. It is generated by
Parser/asdl_c.py from the grammar description in Parser/Python.asdl. The
generator emits one C constructor for every node kind in the ASDL grammar,
plus the recursive validator _PyAST_Validate.
The file serves three audiences:
- The PEG parser (Parser/parser.c and its action helpers) calls the constructors to build the AST; Python/compile.c then consumes the finished tree.
- The ast stdlib module re-exports the same node types as Python classes so user code can inspect or transform trees.
- Third-party tools (linters, formatters, type checkers) work with these node types through the ast module; the underscore-prefixed _PyAST_* constructors are internal and are not part of the stable C API.
Key groups of symbols:
| Group | Example symbols | Notes |
|---|---|---|
| Sequence allocator | _Py_asdl_seq_new, _Py_asdl_generic_seq_new | Arena-backed; no individual frees needed |
| Module constructors | _PyAST_Module, _PyAST_Interactive, _PyAST_Expression | Top-level mod_ty nodes |
| Statement constructors | _PyAST_FunctionDef, _PyAST_AsyncFunctionDef, _PyAST_ClassDef, _PyAST_Return, ... | All ~20 stmt_ty variants |
| Expression constructors | _PyAST_BoolOp, _PyAST_BinOp, _PyAST_UnaryOp, _PyAST_Lambda, _PyAST_Constant, ... | All ~40 expr_ty variants |
| Pattern constructors | _PyAST_MatchValue, _PyAST_MatchOr, ... | PEP 634 structural pattern matching |
| Misc constructors | _PyAST_alias, _PyAST_arg, _PyAST_keyword, _PyAST_withitem | Helper node types |
| Validator | _PyAST_Validate | Recursive sanity check; called before compilation |
Map
| Lines | Symbol | Role | gopy |
|---|---|---|---|
| 1-200 | (includes, arena helpers) | _Py_asdl_seq_new, _Py_asdl_int_seq_new; arena glue | - |
| 201-500 | Module / Interactive / Expression | Top-level mod_ty constructors | - |
| 501-1800 | Statement constructors | _PyAST_FunctionDef through _PyAST_Nonlocal (all ~20 stmt kinds) | - |
| 1801-4500 | Expression constructors | _PyAST_BoolOp through _PyAST_Constant (all ~40 expr kinds) | - |
| 4501-5800 | Pattern constructors | _PyAST_MatchValue through _PyAST_MatchAs | - |
| 5801-6500 | Misc node constructors | _PyAST_alias, _PyAST_arg, _PyAST_keyword, _PyAST_withitem, _PyAST_match_case | - |
| 6501-7500 | Python type objects | PyAST_type, per-node PyTypeObject definitions for the ast module | - |
| 7501-8000 | _PyAST_Validate | Recursive validator; checks invariants before compile.c consumes the tree | - |
Reading
Arena-backed sequence allocation
All ASDL sequences are backed by the compile-time PyArena. The allocator
returns a typed asdl_seq * and registers the backing block with the arena so
it is freed when the arena is torn down, with no per-item cleanup:
/* Python/Python-ast.c:42 _Py_asdl_seq_new */
asdl_seq *
_Py_asdl_seq_new(Py_ssize_t size, PyArena *arena)
{
    asdl_seq *seq = NULL;
    size_t n;
    /* check for overflow */
    if (size == 0) {
        n = sizeof(asdl_seq);
    } else {
        if ((size_t)size > (SIZE_MAX - sizeof(asdl_seq)) / sizeof(void *)) {
            PyErr_NoMemory();
            return NULL;
        }
        n = sizeof(asdl_seq) + (size_t)size * sizeof(void *);
    }
    seq = (asdl_seq *)PyArena_Malloc(arena, n);
    if (!seq) {
        PyErr_NoMemory();
        return NULL;
    }
    seq->size = size;
    return seq;
}
Because every allocation is arena-owned, the compiler can build an entire AST without tracking individual node lifetimes.
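The ownership model can be tried outside CPython with a minimal sketch. Everything named toy_* below is hypothetical and stands in for PyArena and asdl_seq; the real arena allocates in larger pooled blocks, whereas this one simply chains individual malloc calls so that a single teardown frees them all.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical arena block header; the union keeps the payload aligned. */
typedef union toy_block {
    union toy_block *next;
    max_align_t align;
} toy_block;

/* Hypothetical arena: a linked list of every block handed out. */
typedef struct {
    toy_block *blocks;
} toy_arena;

/* Allocate n bytes owned by the arena; returns NULL on failure. */
static void *
toy_arena_malloc(toy_arena *arena, size_t n)
{
    assert(arena != NULL);
    if (n > SIZE_MAX - sizeof(toy_block)) {
        return NULL;                      /* header + payload would overflow */
    }
    toy_block *b = malloc(sizeof(toy_block) + n);
    if (b == NULL) {
        return NULL;
    }
    b->next = arena->blocks;
    arena->blocks = b;
    return (void *)(b + 1);               /* payload starts after the header */
}

/* Free every block at once; returns how many blocks were released. */
static size_t
toy_arena_free(toy_arena *arena)
{
    size_t freed = 0;
    while (arena->blocks != NULL) {
        toy_block *next = arena->blocks->next;
        free(arena->blocks);
        arena->blocks = next;
        freed++;
    }
    return freed;
}

/* Hypothetical stand-in for asdl_seq: a length plus pointer slots. */
typedef struct {
    ptrdiff_t size;
    void *elements[];
} toy_seq;

/* Same shape as the allocator above: overflow check, then one arena call. */
static toy_seq *
toy_seq_new(ptrdiff_t size, toy_arena *arena)
{
    if (size < 0) {
        return NULL;
    }
    if ((size_t)size > (SIZE_MAX - sizeof(toy_seq)) / sizeof(void *)) {
        return NULL;                      /* size * sizeof(void *) would overflow */
    }
    size_t n = sizeof(toy_seq) + (size_t)size * sizeof(void *);
    toy_seq *seq = toy_arena_malloc(arena, n);
    if (seq == NULL) {
        return NULL;
    }
    memset(seq, 0, n);                    /* start with all slots NULL */
    seq->size = size;
    return seq;
}
```

A caller creates any number of sequences with toy_seq_new and releases them together with one toy_arena_free call, which is the property the compiler relies on.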
A representative statement constructor: _PyAST_FunctionDef
Every stmt_ty constructor follows the same pattern: check required fields for
NULL, allocate a struct _stmt from the arena, fill its tag and union fields,
copy in the source location, and return. No structural validation is performed
here; that is left to _PyAST_Validate.
/* Python/Python-ast.c:560 _PyAST_FunctionDef */
stmt_ty
_PyAST_FunctionDef(identifier name, arguments_ty args, asdl_stmt_seq *body,
                   asdl_expr_seq *decorator_list, expr_ty returns,
                   string type_comment, asdl_type_param_seq *type_params,
                   int lineno, int col_offset, int end_lineno,
                   int end_col_offset, PyArena *arena)
{
    stmt_ty p;
    if (!name) {
        PyErr_SetString(PyExc_ValueError,
                        "field 'name' is required for FunctionDef");
        return NULL;
    }
    p = (stmt_ty)PyArena_Malloc(arena, sizeof(*p));
    if (!p) return NULL;
    p->kind = FunctionDef_kind;
    p->v.FunctionDef.name = name;
    p->v.FunctionDef.args = args;
    p->v.FunctionDef.body = body;
    p->v.FunctionDef.decorator_list = decorator_list;
    p->v.FunctionDef.returns = returns;
    p->v.FunctionDef.type_comment = type_comment;
    p->v.FunctionDef.type_params = type_params;
    p->lineno = lineno;
    p->col_offset = col_offset;
    p->end_lineno = end_lineno;
    p->end_col_offset = end_col_offset;
    return p;
}
Required fields (non-sequence, non-optional) are checked for NULL before
the arena allocation. The check sets ValueError and returns NULL, but a missing
field indicates a bug in the caller (normally the parser) rather than a problem
in the user's source.
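The constructor pattern itself is small enough to reproduce as a standalone sketch. The toy_* names, the two node kinds, and the toy_last_error variable below are all hypothetical; plain malloc replaces the arena so the example stays self-contained, which is why it also needs an explicit free function that the real constructors do not.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical node kinds, playing the role of the generated *_kind enum. */
typedef enum { Name_kind = 1, BinOp_kind = 2 } toy_expr_kind;

/* Tagged union: 'kind' says which member of 'v' is valid. */
typedef struct toy_expr toy_expr;
struct toy_expr {
    toy_expr_kind kind;
    union {
        struct { const char *id; } Name;
        struct { toy_expr *left; char op; toy_expr *right; } BinOp;
    } v;
    int lineno;
    int col_offset;
};

/* Hypothetical error slot standing in for the Python exception state. */
static const char *toy_last_error = NULL;

static toy_expr *
toy_Name(const char *id, int lineno, int col_offset)
{
    if (id == NULL) {                       /* required-field check first */
        toy_last_error = "field 'id' is required for Name";
        return NULL;
    }
    toy_expr *p = malloc(sizeof(*p));       /* CPython uses the arena here */
    if (p == NULL) {
        return NULL;
    }
    p->kind = Name_kind;
    p->v.Name.id = id;
    p->lineno = lineno;
    p->col_offset = col_offset;
    return p;
}

static toy_expr *
toy_BinOp(toy_expr *left, char op, toy_expr *right, int lineno, int col_offset)
{
    if (left == NULL) {
        toy_last_error = "field 'left' is required for BinOp";
        return NULL;
    }
    if (right == NULL) {
        toy_last_error = "field 'right' is required for BinOp";
        return NULL;
    }
    toy_expr *p = malloc(sizeof(*p));
    if (p == NULL) {
        return NULL;
    }
    p->kind = BinOp_kind;
    p->v.BinOp.left = left;
    p->v.BinOp.op = op;
    p->v.BinOp.right = right;
    p->lineno = lineno;
    p->col_offset = col_offset;
    return p;
}

/* Needed only because this sketch uses malloc instead of an arena. */
static void
toy_expr_free(toy_expr *e)
{
    if (e == NULL) {
        return;
    }
    assert(e->kind == Name_kind || e->kind == BinOp_kind);
    if (e->kind == BinOp_kind) {
        toy_expr_free(e->v.BinOp.left);
        toy_expr_free(e->v.BinOp.right);
    }
    free(e);
}
```

Note that the constructors only reject NULL for required fields; they do not check, for instance, that op is a known operator. Deeper checks of that kind are what a separate validation pass is for.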
The AST validator
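_PyAST_Validate runs on trees that were not produced directly by the parser,
most notably when builtin compile() receives an ast object, and it runs before
the tree reaches the symbol table and code generator. It walks the entire tree
and checks structural invariants that the constructors cannot enforce (for
example, that a BoolOp has at least two values or that an assignment target
carries a Store context):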
/* Python/Python-ast.c:7530 validate_expr (inner helper) */
static int
validate_expr(struct validator *state, expr_ty exp, expr_context_ty ctx)
{
    /* recursion guard */
    if (++state->recursion_depth > state->recursion_limit) {
        PyErr_SetString(PyExc_RecursionError, "AST is too deeply nested");
        return 0;
    }
    int ret = -1;
    switch (exp->kind) {
    case BoolOp_kind:
        ret = validate_exprs(state, exp->v.BoolOp.values, Load, 0);
        break;
    case BinOp_kind:
        ret = validate_expr(state, exp->v.BinOp.left, Load) &&
              validate_expr(state, exp->v.BinOp.right, Load);
        break;
    case Constant_kind:
        ret = validate_constant(exp->v.Constant.value);
        break;
    /* ... ~40 more cases ... */
    default:
        PyErr_Format(PyExc_SystemError, "unknown expr kind: %d", exp->kind);
        ret = 0;
    }
    --state->recursion_depth;
    return ret;
}
The recursion guard uses a struct validator carrying both current depth and a
limit derived from sys.getrecursionlimit(), preventing stack overflows on
adversarially nested sources.
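The depth guard can be exercised in isolation with a small sketch. The toy_* types and the three node kinds below are hypothetical, and the error field stands in for the Python exception state. One deliberate difference from the excerpt above: this version also decrements the counter on the guard's failure path, so the validator struct is left balanced and can be reused after an error.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical expression kinds for the sketch. */
typedef enum { Leaf_kind, Neg_kind, Add_kind } toy_kind;

typedef struct toy_expr {
    toy_kind kind;
    struct toy_expr *left;    /* operand of Neg, left operand of Add */
    struct toy_expr *right;   /* right operand of Add, otherwise NULL */
} toy_expr;

/* Carries the current depth and the limit through every call. */
struct toy_validator {
    int recursion_depth;
    int recursion_limit;
    const char *error;        /* stand-in for the exception state */
};

/* Returns 1 if the tree is well formed and shallow enough, else 0. */
static int
toy_validate(struct toy_validator *state, const toy_expr *e)
{
    if (e == NULL) {
        state->error = "missing operand";
        return 0;
    }
    if (++state->recursion_depth > state->recursion_limit) {
        --state->recursion_depth;            /* keep the counter balanced */
        state->error = "AST is too deeply nested";
        return 0;
    }
    int ret;
    switch (e->kind) {
    case Leaf_kind:
        ret = 1;
        break;
    case Neg_kind:
        ret = toy_validate(state, e->left);
        break;
    case Add_kind:
        ret = toy_validate(state, e->left) &&
              toy_validate(state, e->right);
        break;
    default:
        state->error = "unknown expr kind";
        ret = 0;
    }
    --state->recursion_depth;
    return ret;
}

/* Link 'depth' nodes from caller-provided storage into Neg(Neg(...Leaf)). */
static toy_expr *
toy_build_chain(toy_expr *storage, int depth)
{
    assert(depth >= 1);
    storage[depth - 1].kind = Leaf_kind;
    storage[depth - 1].left = NULL;
    storage[depth - 1].right = NULL;
    for (int i = depth - 2; i >= 0; --i) {
        storage[i].kind = Neg_kind;
        storage[i].left = &storage[i + 1];
        storage[i].right = NULL;
    }
    return &storage[0];
}
```

A chain of eight nodes passes with a limit of 8 and fails with a limit of 7, because the counter is incremented once per node on the path from the root to the deepest leaf.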
gopy mirror
gopy uses its own AST node types defined in the parser/ package (populated
by the PEG parser). Those types are idiomatic Go structs rather than
arena-allocated C unions, and they do not correspond one-to-one to CPython's
Python-ast.c constructors.
Python-ast.c has not been ported and a direct port is not planned. The file
is machine-generated boilerplate; the meaningful logic lives in
Parser/Python.asdl (the grammar) and in _PyAST_Validate (the validator).
If a gopy AST validator is ever needed it would be modeled on the validator
section of this file.
CPython 3.14 changes
- The type_params field on the FunctionDef, AsyncFunctionDef, and ClassDef constructors supports PEP 695 type parameter syntax (def f[T](...)); PEP 695 itself shipped in Python 3.12.
- Pattern-matching node constructors (MatchValue, MatchOr, etc.) were stabilized; a few field names changed between 3.10 and 3.14.
- _PyAST_Validate gained the struct validator recursion-depth tracking (replacing a bare integer passed through every call frame) to support accurate limit checking in sub-interpreters with independent recursion limits.
- _Py_asdl_generic_seq_new was introduced as a typed alias over _Py_asdl_seq_new to give static analysis tools better type information for the untyped void * sequence slots used by pattern nodes.