
Modules/struct module

Source:

cpython 3.14 @ ab2d84fe1023/Modules/_struct.c

The struct module converts between Python values and C structs packed into bytes objects. It is used heavily for binary protocol parsing (network packets, file formats, system calls).

Map

| Lines | Symbol | Role |
| --- | --- | --- |
| 1-300 | Format character table | b, h, i, l, q, f, d, s, p, P, etc. |
| 301-700 | s_pack, s_unpack | Core pack/unpack implementations |
| 701-1100 | Struct type | Cached compiled format object |
| 1101-1400 | calcsize, pack_into, unpack_from | Buffer variants |
| 1401-2000 | iter_unpack | Iterator over repeated structure |

Reading

Format string prefix

// CPython: Modules/_struct.c:95 format_prefixes
/* byte order / alignment prefix:
@ native order, native size, native alignment (default)
= native order, standard size, no alignment
< little-endian, standard size
> big-endian, standard size
! big-endian (network), standard size
*/

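Without a prefix, @ is assumed. Standard size means the fixed sizes that the struct module itself defines for each format character, independent of the platform's C compiler (h = 2 bytes, i = 4, l = 4, q = 8). The =, <, > and ! modes also insert no alignment padding; only @ pads fields the way the native C compiler would.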
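A short Python sketch of how the prefix changes size and byte order. The format "bi" and the value 0x0102 are illustrative choices, not taken from the source; the native size is platform-dependent, so only the standard-size results are stated as fixed.

```python
import struct

# One signed char followed by one int, under different prefixes.
fmt = "bi"

# Native mode: the int is aligned, so padding is usually inserted after the char.
native = struct.calcsize("@" + fmt)

# Standard mode: no padding, int fixed at 4 bytes, so 1 + 4 = 5.
standard = struct.calcsize("=" + fmt)

# Byte order is visible in the packed bytes of a 16-bit value.
little = struct.pack("<H", 0x0102)   # low byte first
big = struct.pack(">H", 0x0102)      # high byte first
network = struct.pack("!H", 0x0102)  # same layout as ">"
```

On most platforms native comes out larger than standard because of the padding before the int.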

Pack implementation

// CPython: Modules/_struct.c:840 s_pack
static PyObject *
s_pack(PyObject *self, PyObject *const *args, Py_ssize_t nargs)
{
    PyStructObject *soself = (PyStructObject *)self;
    PyObject *result = PyBytes_FromStringAndSize(NULL, soself->s_size);
    char *buf = PyBytes_AS_STRING(result);
    for (Py_ssize_t i = 0; i < soself->s_len; i++) {
        formatcode *code = &soself->s_codes[i];
        PyObject *v = args[i];
        code->fmtdef->pack(buf + code->offset, v, code->size, code->repeat);
    }
    return result;
}

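Each format character maps to a formatdef entry with pack and unpack function pointers. The Struct object parses the format string once, recording an offset, size, and repeat count for each code, so pack only has to walk that precomputed list. The excerpt is condensed: the real function also checks that nargs matches s_len and propagates errors from the allocation and from each pack call.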
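The table-driven design can be modelled in a few lines of Python. This is a toy sketch, not the CPython implementation: FORMATDEFS, compile_format, and toy_pack are hypothetical names, only three unsigned little-endian codes are supported, and repeat counts and alignment are ignored.

```python
import struct


def _int_def(size):
    """Build one toy formatdef entry: (size, pack callable, unpack callable)."""
    return (
        size,
        lambda value: value.to_bytes(size, "little"),
        lambda raw: int.from_bytes(raw, "little"),
    )


# Hypothetical stand-in for the formatdef table: one entry per format character.
FORMATDEFS = {"B": _int_def(1), "H": _int_def(2), "I": _int_def(4)}


def compile_format(fmt):
    """Parse the format once into (offset, size, pack, unpack) codes plus a total size."""
    codes, offset = [], 0
    for ch in fmt:
        size, pack, unpack = FORMATDEFS[ch]
        codes.append((offset, size, pack, unpack))
        offset += size
    return codes, offset


def toy_pack(compiled, *values):
    """Walk the precomputed codes and write each value at its offset."""
    codes, total = compiled
    if len(values) != len(codes):
        raise ValueError(f"expected {len(codes)} items, got {len(values)}")
    buf = bytearray(total)
    for (offset, size, pack, _unpack), value in zip(codes, values):
        buf[offset:offset + size] = pack(value)
    return bytes(buf)


compiled = compile_format("BHI")
toy_result = toy_pack(compiled, 1, 2, 3)
real_result = struct.pack("<BHI", 1, 2, 3)
```

Parsing once and reusing the compiled list is the same trade the Struct object makes: the per-call cost is a loop over codes, not a re-parse of the format string.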

iter_unpack

// CPython: Modules/_struct.c:1680 s_iter_unpack
static PyObject *
s_iter_unpack(PyObject *self, PyObject *buffer)
{
    /* Returns an iterator yielding one tuple per s_size-byte chunk of buffer */
    unpackiterobject *it = PyObject_New(unpackiterobject, &unpackiter_type);
    it->s = (PyStructObject *)self;
    it->buf = buffer;
    it->index = 0;
    return (PyObject *)it;
}

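iter_unpack yields one tuple at a time instead of building a list of all tuples up front. The buffer length must be an exact multiple of the struct size, which makes it a good fit for walking a buffer of fixed-size records such as a batch of packets.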
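A minimal Python sketch of that behaviour. The 6-byte record layout "<HI" is a made-up example, not something the source defines.

```python
import struct

# Hypothetical record: a uint16 id followed by a uint32 value, little-endian.
record = struct.Struct("<HI")
data = record.pack(1, 100) + record.pack(2, 200) + record.pack(3, 300)

# Each iteration unpacks the next record.size bytes into one tuple.
rows = []
for row in record.iter_unpack(data):
    rows.append(row)

# A buffer whose length is not a multiple of record.size is rejected.
truncated_error = None
try:
    record.iter_unpack(data[:-1])
except struct.error as exc:
    truncated_error = str(exc)
```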

calcsize

// CPython: Modules/_struct.c:760 calcsize
static PyObject *
calcsize(PyObject *module, PyObject *format)
{
    PyStructObject *s_object = (PyStructObject *)cache_struct(module, format);
    ...
    return PyLong_FromSsize_t(s_object->s_size);
}

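struct.calcsize('iHd') returns the number of bytes the format occupies, including any alignment padding, without allocating an output buffer. It still compiles the format (or fetches it from the cache), so the size is read straight from the resulting Struct object.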
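A brief Python illustration, assuming the same 'iHd' format. The native result depends on the platform's alignment rules, so only the standard-size value is stated as fixed; the packed values are arbitrary.

```python
import struct

# Native mode ('@' implied): padding is inserted so the double is aligned.
native_size = struct.calcsize("iHd")

# Standard little-endian mode: 4 + 2 + 8 bytes with no padding.
packed_size = struct.calcsize("<iHd")

# calcsize is the usual way to size a buffer for pack_into / unpack_from.
buf = bytearray(packed_size)
struct.pack_into("<iHd", buf, 0, -1, 2, 0.5)
values = struct.unpack_from("<iHd", buf, 0)
```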

gopy notes

struct is needed for binary I/O in the stdlib (e.g., zipfile, tarfile, wave, gzip). In gopy it is implemented in module/struct/ using Go's encoding/binary. The format-string parser must handle repeat counts (4s, 10H) and all byte-order prefixes; note that a count before s or p gives the byte length of a single string field, while a count before any other code repeats the field. Struct objects cache the parsed format for repeated use.
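The parsing requirement can be sketched as follows. This is written in Python for consistency with the other examples and is not the gopy code: parse_format, PREFIXES, and TOKEN are hypothetical names, and the sketch does not check that each letter is a valid format character.

```python
import re

PREFIXES = "@=<>!"

# Optional repeat count immediately followed by one format character.
# Whitespace is allowed between tokens but not between a count and its code.
TOKEN = re.compile(r"\s*(\d*)([a-zA-Z?])")


def parse_format(fmt):
    """Split a format string into (prefix, [(repeat, code), ...])."""
    prefix = "@"  # native mode is the default when no prefix is given
    if fmt and fmt[0] in PREFIXES:
        prefix, fmt = fmt[0], fmt[1:]
    fmt = fmt.rstrip()
    codes, pos = [], 0
    while pos < len(fmt):
        match = TOKEN.match(fmt, pos)
        if match is None:
            raise ValueError(f"bad char in struct format at offset {pos}")
        count, code = match.groups()
        codes.append((int(count) if count else 1, code))
        pos = match.end()
    return prefix, codes


with_prefix = parse_format("<4s10H")
default_prefix = parse_format("iHd")
```

A later pass would turn each (repeat, code) pair into offsets and sizes, treating the count on s and p as a byte length and expanding it for every other code.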