Python/getargs.c

cpython 3.14 @ ab2d84fe1023/Python/getargs.c

The argument parsing engine for C extensions. Every PyArg_Parse* function in the public C API routes through this file. The design is a format-string mini-language where each character maps to a C type and a conversion function. Parsing is deliberately strict: unknown format characters are rejected at runtime with a SystemError, and type mismatches raise TypeError.

The top of the file defines the public entry points: PyArg_ParseTuple for positional-only functions, PyArg_ParseTupleAndKeywords for functions with keyword support, PyArg_VaParseTupleAndKeywords for the va_list variant, and _PyArg_ParseStack / _PyArg_ParseStackAndKeywords for the Argument Clinic fast path that receives a C array instead of a tuple object.

The positional entry points converge on the internal vgetargs1_impl function, which drives the format-string loop; the keyword-aware entry points go through vgetargskeywords instead. Both call convertsimple once per format unit to convert the corresponding Python argument into the caller's C variable. When the format string contains |, all following arguments are optional; if the Python call did not supply them, the corresponding C variables are left untouched, so the caller must initialize them to their defaults beforehand. $ marks the start of keyword-only arguments (arguments that must be named by the caller and may not be passed positionally).

Two guard functions, _PyArg_NoKeywords and _PyArg_NoPositional, cover the common pattern of rejecting any keywords or any positional arguments entirely, without going through the full format machinery. PyArg_ValidateKeywordArguments takes only the kwargs dict and checks that every key is a string, raising TypeError otherwise; rejecting names that do not appear in the allowed keyword list is done by vgetargskeywords.

Map

| Lines | Symbol | Role | gopy |
|---|---|---|---|
| 1-120 | file header / PyArg_Parse / PyArg_ParseTuple | Public entry points for positional-only parsing. PyArg_ParseTuple validates that args is a tuple, then calls vgetargs1_impl. | pythonrun/getargs.go:ParseTuple |
| 121-320 | vgetargs1_impl | Core loop: iterates format string, advances the argument pointer, calls convertsimple for each format character, handles `\|` optional marker and `$` keyword-only marker. | |
| 321-700 | convertsimple | Per-character dispatch. Converts a single Python argument to the C type indicated by the format character. Raises TypeError on mismatch. | pythonrun/getargs.go:convertSimple |
| 701-900 | PyArg_ParseTupleAndKeywords / PyArg_VaParseTupleAndKeywords / vgetargskeywords | Keyword argument parsing. vgetargskeywords builds a keywords index, scans kwargs, matches names to format positions, then calls convertsimple. | pythonrun/getargs.go:ParseTupleAndKeywords |
| 901-1100 | _PyArg_ParseStack / _PyArg_ParseStackAndKeywords | Clinic-generated fast path: receives a PyObject *const * array and Py_ssize_t nargs instead of a tuple. Avoids tuple allocation on the hot path. | pythonrun/getargs.go:ParseStack |
| 1101-1300 | PyArg_ValidateKeywordArguments / _PyArg_NoKeywords / _PyArg_NoPositional / _PyArg_BadArgument | Argument validation guards and error formatters. | pythonrun/getargs.go:ValidateKeywordArguments |
| 1301-1500 | cleanup_ptr / cleanup_buffer / returncomplex | Cleanup helpers registered during parsing for buffers and complex numbers that were allocated before a later argument fails. | pythonrun/getargs.go:cleanupBuffer |
| 1501-1600 | _PyArg_UnpackKeywords / _PyArg_UnpackKeywordsWithVararg | Low-level helpers used by Argument Clinic for `METH_FASTCALL \| METH_KEYWORDS` functions: repack positional and keyword arrays into a flat slot array. | |

Reading

convertsimple key format characters (lines 321 to 700)

cpython 3.14 @ ab2d84fe1023/Python/getargs.c#L321-700

convertsimple is the dispatch heart of the file. It takes the current format character, the Python argument, and a pointer to the C destination variable, and performs the conversion:

static const char *
convertsimple(PyObject *arg, const char **p_format, va_list *p_va,
              int flags, char *msgbuf, size_t bufsize,
              freelist_t *freelist)
{
    const char *format = *p_format;
    char c = *format++;
    ...
    switch (c) {
    case 'b': /* byte: unsigned char */
        ...
    case 'i': /* int */
    {
        int *p = va_arg(*p_va, int *);
        long ival = PyLong_AsLong(arg);
        if (ival == -1 && PyErr_Occurred()) return converterr("int", ...);
        /* elided: values outside [INT_MIN, INT_MAX] raise OverflowError */
        *p = (int)ival;
        break;
    }
    case 'l': /* long int */
        ...
    case 's': /* str to const char * (UTF-8) */
    {
        const char **p = va_arg(*p_va, const char **);
        Py_ssize_t len;
        *p = PyUnicode_AsUTF8AndSize(arg, &len);
        if (*p == NULL) return converterr("str", ...);
        /* elided: strings with an embedded NUL are rejected */
        ...
        break;
    }
    case 'O': /* object: any type, no conversion */
    {
        PyTypeObject *type;
        PyObject **p;
        if (*format == '!') { /* O!: type check, subclasses accepted */
            type = va_arg(*p_va, PyTypeObject *);
            p = va_arg(*p_va, PyObject **);
            if (!PyObject_TypeCheck(arg, type))
                return converterr(type->tp_name, ...);
            *p = arg;
            format++;
        } else if (*format == '&') { /* O&: converter callback */
            typedef int (*converter)(PyObject *, void *);
            converter convert = va_arg(*p_va, converter);
            void *addr = va_arg(*p_va, void *);
            if (!convert(arg, addr))
                return converterr("(converter)", ...);
            format++;
        } else {
            p = va_arg(*p_va, PyObject **);
            *p = arg; /* borrowed reference */
        }
        break;
    }
    ...
    }
}

Selected format characters and their C types:

| Char | C type | Conversion |
|---|---|---|
| b | unsigned char | PyLong_AsLong, range [0, 255] |
| h | short int | PyLong_AsLong |
| i | int | PyLong_AsLong |
| l | long | PyLong_AsLong |
| L | long long | PyLong_AsLongLong |
| k | unsigned long | PyLong_AsUnsignedLongMask (no overflow check) |
| d | double | PyFloat_AsDouble |
| f | float | (float)PyFloat_AsDouble |
| s | const char * | PyUnicode_AsUTF8AndSize; pointer stays valid only while the str object is alive |
| y | const char * | read-only bytes-like object; str is rejected |
| z | const char * | like s but accepts None (gives NULL) |
| u | const Py_UNICODE * | legacy wide-char buffer; deprecated since 3.3 and removed in 3.12 |
| p | int | PyObject_IsTrue, truth predicate |
| O | PyObject * | any object, borrowed reference |
| O! | PyObject * | type check via PyObject_TypeCheck (subclasses accepted) |
| O& | converter | caller-supplied int conv(PyObject*, void*) |
| * (suffix on s, y, z) | Py_buffer | fills a Py_buffer through the buffer protocol (PyBUF_SIMPLE) |
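
The difference between the range-checked integer units (b, i) and the masking unit (k) in the table can be illustrated without Python.h. The following is a minimal standalone sketch, not CPython code: the toy_convert_* names are hypothetical, plain C integers stand in for the Python argument, and the error strings are illustrative. Returning NULL on success and a message on failure loosely mirrors the convention convertsimple uses.

```c
#include <limits.h>
#include <stddef.h>

/* Toy stand-in for the 'b' unit: store a long into an unsigned char,
 * rejecting anything outside [0, UCHAR_MAX]. Hypothetical helper. */
static const char *
toy_convert_b(long value, unsigned char *out)
{
    if (value < 0)
        return "unsigned byte integer is less than minimum";
    if (value > UCHAR_MAX)
        return "unsigned byte integer is greater than maximum";
    *out = (unsigned char)value;   /* destination written only on success */
    return NULL;
}

/* Toy stand-in for the 'i' unit: narrow a long to int with a range check. */
static const char *
toy_convert_i(long value, int *out)
{
    if (value > INT_MAX)
        return "signed integer is greater than maximum";
    if (value < INT_MIN)
        return "signed integer is less than minimum";
    *out = (int)value;
    return NULL;
}

/* Toy stand-in for the 'k' unit: no range check, the value simply wraps
 * modulo ULONG_MAX + 1 (this is what "mask" means in the table). */
static unsigned long
toy_convert_k(long long value)
{
    return (unsigned long)value;
}
```

With these helpers, toy_convert_b(256, &b) returns an error string and leaves b untouched, while toy_convert_k(-1) silently yields ULONG_MAX.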

The | character is not handled inside convertsimple; it is consumed by vgetargs1_impl before calling convertsimple and sets a flag that makes subsequent missing arguments non-fatal. $ similarly sets a flag that the remaining names must be keyword-only; the C API documentation lists it as meaningful only for PyArg_ParseTupleAndKeywords and its variants.
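
To make the split between markers and format units concrete, here is a small standalone sketch that scans a flat format string and counts units. It is a simplified model, not the CPython implementation: toy_format_info and toy_scan_format are hypothetical names, parenthesised groups are rejected, and only the suffix modifiers !, &, # and * are recognised.

```c
#include <stddef.h>

/* Hypothetical summary of a format string; not a CPython type. */
typedef struct {
    int required;         /* units before '|' (all units if there is no '|') */
    int total;            /* all units */
    int keyword_only;     /* units after '$' */
    const char *fname;    /* text after ':' or NULL */
    const char *message;  /* text after ';' or NULL */
} toy_format_info;

/* Scan a flat format string. Returns 0 on success, -1 if the string
 * contains a parenthesised group, which this sketch does not model. */
static int
toy_scan_format(const char *format, toy_format_info *info)
{
    int seen_bar = 0, seen_dollar = 0;

    info->required = info->total = info->keyword_only = 0;
    info->fname = info->message = NULL;

    for (const char *p = format; *p != '\0'; p++) {
        char c = *p;
        if (c == '|') { seen_bar = 1; continue; }     /* marker, not a unit */
        if (c == '$') { seen_dollar = 1; continue; }  /* marker, not a unit */
        if (c == ':') { info->fname = p + 1; break; }
        if (c == ';') { info->message = p + 1; break; }
        if (c == '(' || c == ')') return -1;

        /* Suffix modifiers belong to the preceding unit: O!, O&, s#, s* */
        while (p[1] == '!' || p[1] == '&' || p[1] == '#' || p[1] == '*')
            p++;

        info->total++;
        if (!seen_bar) info->required++;
        if (seen_dollar) info->keyword_only++;
    }
    return 0;
}
```

For the format string "iO!|s#$p:func" this reports four units (i, O!, s#, p), two of them required, one keyword-only, and the function name "func".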

vgetargs1_impl loop (lines 121 to 320)

cpython 3.14 @ ab2d84fe1023/Python/getargs.c#L121-320

static int
vgetargs1_impl(PyObject *compat_args, PyObject *const *stack,
               Py_ssize_t nargs, const char *format, va_list *p_va, int flags)
{
    int min = -1, max = INT_MAX;
    int i = 0;
    const char *fname = NULL, *message = NULL;
    ...
    for (;;) {
        const char *thisgroupformat;
        int ngroup = 0;
        ...
        /* peek only: convertsimple and skipitem consume the unit themselves */
        c = *format;
        if (c == '\0') break;
        if (c == '|') { format++; min = i; continue; }
        if (c == '$') { format++; /* keyword-only start */ continue; }
        if (c == ':') { fname = format + 1; break; }
        if (c == ';') { message = format + 1; break; }

        if (i >= nargs) {
            if (min == -1) { /* no '|' seen: required arg missing */ ... error ... }
            /* optional: skip va_arg slots */
            skipitem(&format, p_va, flags);
            continue;
        }

        arg = (stack != NULL) ? stack[i] : PyTuple_GET_ITEM(compat_args, i);
        msg = convertsimple(arg, &format, p_va, flags, ...);
        if (msg != NULL) { /* conversion failed */ ... error ... }
        i++;
    }
    ...
}

The loop tracks i (number of positional arguments consumed) against nargs. When i >= nargs and the optional marker | has been seen (min != -1), missing arguments are silently skipped by calling skipitem which advances p_va without reading from args. If the | has not been seen and i >= nargs, the function raises TypeError: function takes exactly N arguments (M given).
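
The interplay between i, nargs, the | marker and skipitem can be modelled in a few lines of standalone C. This is a toy sketch under stated assumptions, not CPython code: toy_parse is a hypothetical function, the "arguments" are plain C longs, and only the i and l units are supported. The point it illustrates is that a skipped optional unit still consumes its destination pointer from the va_list, and that the destination keeps whatever value the caller put there.

```c
#include <stdarg.h>
#include <stddef.h>

/* Parse nargs longs from args according to format.
 * Supported: 'i' (int *), 'l' (long *), and the '|' marker.
 * Returns 1 on success, 0 on failure. Hypothetical helper. */
static int
toy_parse(const long *args, int nargs, const char *format, ...)
{
    va_list va;
    int i = 0;          /* positional arguments consumed so far */
    int optional = 0;   /* set once '|' has been seen */
    int ok = 1;

    va_start(va, format);
    for (const char *p = format; *p != '\0'; p++) {
        if (*p == '|') { optional = 1; continue; }
        if (*p != 'i' && *p != 'l') { ok = 0; break; }  /* unknown unit */

        if (i >= nargs) {
            if (!optional) { ok = 0; break; }  /* required argument missing */
            /* Like skipitem: consume the destination pointer, write nothing. */
            if (*p == 'i') (void)va_arg(va, int *);
            else           (void)va_arg(va, long *);
            continue;
        }

        if (*p == 'i') *va_arg(va, int *) = (int)args[i];
        else           *va_arg(va, long *) = args[i];
        i++;
    }
    va_end(va);

    if (ok && i < nargs)
        ok = 0;         /* more arguments supplied than the format accepts */
    return ok;
}
```

Calling toy_parse with one argument and the format "i|l" fills the int and leaves the long at its caller-supplied default, which is why optional destinations have to be initialized before the call.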

fname and message come from :name and ;message suffixes in the format string. :name is used to give the function name in error messages; ;message replaces the entire generated error message with a literal string.
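
As a rough illustration of how the two suffixes influence the error text, the sketch below builds an arity error message. It is an assumption-laden model rather than the real formatter: toy_arity_error is a hypothetical name, and the wording only approximates the "takes exactly N arguments (M given)" message quoted above.

```c
#include <stdio.h>
#include <stddef.h>

/* Build an arity error message into buf.
 * fname:   text taken from a ":name" suffix, or NULL.
 * message: text taken from a ";message" suffix, or NULL.
 * Hypothetical helper, not part of CPython. */
static void
toy_arity_error(char *buf, size_t bufsize,
                const char *fname, const char *message,
                int expected, int given)
{
    if (message != NULL) {
        /* ";message" replaces the generated text entirely. */
        snprintf(buf, bufsize, "%s", message);
        return;
    }
    /* ":name" only changes how the function is named in the message. */
    snprintf(buf, bufsize, "%s%s takes exactly %d argument%s (%d given)",
             fname != NULL ? fname : "function",
             fname != NULL ? "()" : "",
             expected, expected == 1 ? "" : "s", given);
}
```

With fname set to "add" the message names add(); with neither suffix it falls back to the generic word "function".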

Keyword argument handling (lines 701 to 900)

cpython 3.14 @ ab2d84fe1023/Python/getargs.c#L701-900

vgetargskeywords is the keyword-aware variant. It receives the positional args as a tuple (or stack array), the keyword dict, and a NULL-terminated char *kwlist[] naming every accepted parameter in format-string order.

static int
vgetargskeywords(PyObject *args, PyObject *kwargs,
                 const char *format, char **kwlist, va_list *p_va, int flags)
{
    Py_ssize_t nargs = PyTuple_GET_SIZE(args);
    Py_ssize_t nkwargs = (kwargs == NULL) ? 0 : PyDict_GET_SIZE(kwargs);
    ...
    /* First pass: positional arguments */
    for (i = 0; kwlist[i] && i < nargs; i++) {
        ...
        msg = convertsimple(PyTuple_GET_ITEM(args, i), &format, p_va, ...);
    }
    /* Second pass: keyword arguments */
    for (; kwlist[i]; i++) {
        PyObject *current_arg = NULL;
        if (nkwargs > 0 && kwlist[i][0]) {
            current_arg = PyDict_GetItemString(kwargs, kwlist[i]);
        }
        if (current_arg != NULL) {
            ...
            msg = convertsimple(current_arg, &format, p_va, ...);
        } else if (required) {
            PyErr_Format(PyExc_TypeError, "%.200s() missing required "
                         "argument: '%.200s'", ...);
            return 0;
        } else {
            skipitem(&format, p_va, flags);
        }
    }
    /* Third pass: reject unknown kwargs */
    if (nkwargs > 0) {
        PyObject *key;
        Py_ssize_t pos = 0;
        while (PyDict_Next(kwargs, &pos, &key, NULL)) {
            /* check key is in kwlist */
            ...
        }
    }
    return 1;
}

The three-pass structure guarantees that positional arguments fill the leading slots, then keyword arguments fill remaining named slots, and finally any keyword that does not appear in kwlist raises TypeError. The $ marker sets keyword_only_start; parameters after it are never filled from the positional pass, so a call that supplies more positional arguments than there are parameters before the marker fails with TypeError.
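
The three passes can be sketched as a standalone binding routine that works on names only, with no conversion. This is a simplified model under stated assumptions, not the CPython algorithm: toy_bind, the TOY_* codes and the source array are hypothetical, keyword names are plain C strings, and the order of checks is chosen for readability.

```c
#include <string.h>
#include <stddef.h>

enum {
    TOY_OK = 0,
    TOY_TOO_MANY_POSITIONAL,
    TOY_MISSING_REQUIRED,
    TOY_UNKNOWN_KEYWORD,
    TOY_DUPLICATE
};

/* kwlist:       NULL-terminated parameter names, in format-string order.
 * required:     number of leading parameters that must be supplied.
 * kwonly_start: index of the first keyword-only parameter
 *               (equal to the parameter count if there is no '$').
 * source[i]:    set to the positional index that fills parameter i,
 *               to nargs + k if keyword k fills it, or to -1 if it
 *               is left at its default. */
static int
toy_bind(const char *const *kwlist, int required, int kwonly_start,
         int nargs, const char *const *kwnames, int nkw, int *source)
{
    int nparams = 0;
    while (kwlist[nparams] != NULL)
        nparams++;

    if (nargs > kwonly_start)
        return TOY_TOO_MANY_POSITIONAL;

    /* First pass: positional arguments fill the leading slots. */
    for (int i = 0; i < nargs; i++)
        source[i] = i;

    /* Second pass: look up each remaining parameter by name. */
    for (int i = nargs; i < nparams; i++) {
        source[i] = -1;
        for (int k = 0; k < nkw; k++) {
            if (strcmp(kwnames[k], kwlist[i]) == 0) {
                source[i] = nargs + k;
                break;
            }
        }
        if (source[i] < 0 && i < required)
            return TOY_MISSING_REQUIRED;
    }

    /* Third pass: every supplied keyword must name a parameter that
     * was not already filled positionally. */
    for (int k = 0; k < nkw; k++) {
        int found = -1;
        for (int i = 0; i < nparams; i++) {
            if (strcmp(kwnames[k], kwlist[i]) == 0) {
                found = i;
                break;
            }
        }
        if (found < 0)
            return TOY_UNKNOWN_KEYWORD;
        if (found < nargs)
            return TOY_DUPLICATE;
    }
    return TOY_OK;
}
```

For a parameter list x, y, flag with flag keyword-only, a call with one positional argument and the keyword flag binds slots 0 and 2 and leaves y at its default, while three positional arguments are rejected before any slot is filled.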

CPython 3.14 changes worth noting

_PyArg_UnpackKeywordsWithVararg was added in 3.12 for Argument Clinic functions declared with *args. It separates the fixed named parameters from the star-args slice during fast-call dispatch. The p format character has been part of the documented format set since 3.3. The u and u# format characters, deprecated since 3.3, were removed in 3.12 together with Z and Z#, because they relied on the legacy Py_UNICODE representation. _PyArg_BadArgument was added in 3.11 to provide a consistent, structured error message format used by Argument Clinic; it is unchanged in 3.14.