`Lib/struct.py`

cpython 3.14 @ ab2d84fe1023/Lib/struct.py

Lib/struct.py is a thin shim around the C extension module:

from _struct import *
from _struct import _clearcache, pack, pack_into, unpack, unpack_from, \
    iter_unpack, calcsize, error, Struct

All real logic lives in Modules/_struct.c. The Python file's only job is to expose the C module's symbols under the struct namespace. The explicit import guarantees that error and the private _clearcache function are available: a star import skips underscore-prefixed names unless the module lists them in __all__, so _clearcache would otherwise be missing.
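The re-export can be checked from the interpreter. This is a minimal sketch, assuming a CPython build where the _struct extension module is importable; the variable names are illustrative only.

```python
import struct
import _struct  # C extension module; assumed importable (CPython)

# The public callables on `struct` are the very objects defined in C.
same_pack = struct.pack is _struct.pack
same_struct = struct.Struct is _struct.Struct
same_error = struct.error is _struct.error

# The private cache-clearing helper is reachable through the shim as well.
has_clearcache = callable(struct._clearcache)

print(same_pack, same_struct, same_error, has_clearcache)
```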

This annotation therefore covers Modules/_struct.c directly, treating the C source as the authoritative implementation.

Struct is a compiled format object. Calling Struct(fmt) parses the format string once and stores the parsed representation on the instance; subsequent pack/unpack calls reuse it. The module-level functions pack, unpack, etc. keep an internal cache of Struct objects keyed by format string, so repeated calls with the same format skip the parse step and pay only a cache lookup. The cache is bounded and is emptied wholesale when it fills up rather than evicting entries one at a time, so it is not a true LRU.

Map

Lines	Symbol	Role	gopy
1-50	`from _struct import *`, `__all__`	Full re-export; the Python layer adds no logic. All symbols originate in `Modules/_struct.c`.	`(stdlib pending)`
`_struct.c` format parser	byte-order prefixes, format codes	`@`, `=`, `<`, `>`, `!` set byte order and alignment; `x c b B h H i I l L q Q e f d s p P n N ?` are the format codes with defined size and alignment.	`(stdlib pending)`
`_struct.c` Struct type	`Struct.__init__`, `Struct.pack`, `Struct.unpack`, `Struct.pack_into`, `Struct.unpack_from`, `Struct.iter_unpack`, `Struct.size`, `Struct.format`	Compiled format object; `size` is computed once at compile time; `iter_unpack` returns an iterator that advances through a buffer in `size`-byte steps.	`(stdlib pending)`
`_struct.c` module-level	`pack`, `unpack`, `pack_into`, `unpack_from`, `iter_unpack`, `calcsize`	Convenience wrappers that look up (or compile) the format string in an internal cache and delegate to the matching `Struct` method.	`(stdlib pending)`

Reading

Byte-order prefix semantics

cpython 3.14 @ ab2d84fe1023/Lib/struct.py#L1-50

The first character of a format string selects byte order and alignment:

Prefix	Byte order	Size	Alignment
`@` (default)	native	native	native
`=`	native	standard	none
`<`	little-endian	standard	none
`>`	big-endian	standard	none
`!`	network (big-endian)	standard	none

"Native" size means the C compiler's sizeof for the corresponding type. "Standard" size is the fixed width mandated by the format code table. "Native alignment" inserts padding bytes before each field so it falls on its natural boundary. No prefix other than @ inserts padding.

import struct

# native byte order, native sizes, native alignment
struct.pack('@ii', 1, 2)   # size follows the platform's sizeof(int); two ints need no padding between them

# little-endian, standard sizes (4 bytes each), no padding
struct.pack('<ii', 1, 2)   # always 8 bytes

# big-endian (network order), same as >
struct.pack('!H', 0x0102)  # b'\x01\x02'
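Padding only becomes visible when fields of different sizes are mixed. The sketch below compares the prefixes using calcsize; the exact native size is platform-dependent, so it is not asserted as a fixed number.

```python
import struct

# Standard sizes: a 1-byte char followed by a 4-byte int, no padding.
std_size = struct.calcsize('=ci')        # 5 on every platform
le_size = struct.calcsize('<ci')         # 5 as well

# Native mode may pad the int onto its natural boundary (commonly 8 in total).
native_size = struct.calcsize('@ci')

# '!' and '>' produce identical bytes; '<' reverses them.
big = struct.pack('>H', 0x0102)          # b'\x01\x02'
net = struct.pack('!H', 0x0102)          # b'\x01\x02'
little = struct.pack('<H', 0x0102)       # b'\x02\x01'

# Omitting the prefix is the same as writing '@'.
default_matches_native = struct.calcsize('ci') == native_size
```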

Format code table

Code	C type	Python type	Standard size (bytes)
`x`	pad byte	no value	1
`c`	`char`	bytes of length 1	1
`b`	`signed char`	int	1
`B`	`unsigned char`	int	1
`?`	`_Bool`	bool	1
`h`	`short`	int	2
`H`	`unsigned short`	int	2
`i`	`int`	int	4
`I`	`unsigned int`	int	4
`l`	`long`	int	4
`L`	`unsigned long`	int	4
`q`	`long long`	int	8
`Q`	`unsigned long long`	int	8
`n`	`ssize_t`	int	(native only)
`N`	`size_t`	int	(native only)
`e`	half-float	float	2
`f`	`float`	float	4
`d`	`double`	float	8
`s`	`char[]`	bytes	count (1 per byte)
`p`	Pascal string	bytes	count (first byte holds the length)
`P`	`void *`	int	(native only)

A repeat count may precede any code: 4H is four unsigned shorts. For s and p the count is the total byte length of the field, not a repeat: 10s is one 10-byte field that unpacks to a single bytes object.
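The difference between a repeat count and a field length is easiest to see side by side. A short sketch using only the codes from the table:

```python
import struct

# A count in front of H repeats the field: four values in, four values out.
four_shorts = struct.pack('<4H', 1, 2, 3, 4)
values = struct.unpack('<4H', four_shorts)        # (1, 2, 3, 4)

# A count in front of s is a byte length: one argument, one result.
padded = struct.pack('5s', b'ab')                 # b'ab\x00\x00\x00'
(text,) = struct.unpack('5s', padded)             # b'ab\x00\x00\x00'

# Values longer than the field are truncated to fit.
truncated = struct.pack('2s', b'abcdef')          # b'ab'

# p stores the length in the first byte, so 5p holds at most 4 data bytes.
pascal = struct.pack('5p', b'ab')                 # b'\x02ab\x00\x00'
(pascal_text,) = struct.unpack('5p', pascal)      # b'ab'
```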

`pack` and `unpack` usage

import struct

# module-level convenience (implemented in _struct.c):
struct.pack('>IH', 0xDEAD, 0xBEEF)    # => b'\x00\x00\xde\xad\xbe\xef'
struct.unpack('>IH', b'\x00\x00\xde\xad\xbe\xef')  # => (0xDEAD, 0xBEEF)
struct.calcsize('>IH')                 # => 6

# pack_into writes into an existing writable buffer at offset
buf = bytearray(8)
struct.pack_into('<I', buf, 2, 0xDEADBEEF)
# buf[2:6] == b'\xef\xbe\xad\xde'

# unpack_from reads from offset without slicing
(val,) = struct.unpack_from('<I', buf, 2)
# val == 0xDEADBEEF

pack returns a bytes object whose length equals calcsize(fmt). unpack always returns a tuple, even for a single-field format, and requires the buffer to be exactly calcsize(fmt) bytes long. pack_into writes into any writable object that supports the buffer protocol (bytearray, a writable memoryview). unpack_from takes an optional offset and needs only calcsize(fmt) bytes to be available from that offset, so callers avoid an intermediate slice.
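A common use of unpack_from is reading a fixed header off the front of a longer message. The sketch below uses a hypothetical 6-byte header layout (HEADER) purely for illustration.

```python
import struct

HEADER = '>IH'                      # hypothetical 6-byte header layout
packet = struct.pack(HEADER, 7, 42) + b'payload'

# unpack insists on an exact-length buffer ...
try:
    struct.unpack(HEADER, packet)
    exact_required = False
except struct.error:
    exact_required = True

# ... while unpack_from reads only the bytes it needs.
seq, kind = struct.unpack_from(HEADER, packet)    # offset defaults to 0
payload = packet[struct.calcsize(HEADER):]

# pack_into accepts a writable memoryview as its target.
storage = bytearray(8)
struct.pack_into('<H', memoryview(storage), 6, 0xABCD)

# A single-field format still yields a 1-tuple.
single = struct.unpack('<H', storage[6:8])        # (0xABCD,)
```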

`Struct` compiled format cache

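import struct

s = struct.Struct('>IH')            # parse the format once
data = s.pack(0xDEAD, 0xBEEF)
values = s.unpack(data)             # (0xDEAD, 0xBEEF)
offset = 2
buf = bytearray(offset + s.size)    # room for the record at the offset
s.pack_into(buf, offset, 0xDEAD, 0xBEEF)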

Constructing a Struct parses the format string in C and stores an array of per-field records (format definition, offset, size, and repeat count) plus the total size. Struct.size equals calcsize(fmt). pack, unpack, pack_into, and unpack_from on a Struct instance skip both the parse step and the module-level cache lookup, which makes them the better choice for hot paths.
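In practice a Struct is usually created once at module level and reused. The following sketch assumes a hypothetical 5-byte record (POINT) and a hypothetical helper (encode_points); neither comes from the CPython source.

```python
import struct

POINT = struct.Struct('<hhB')       # hypothetical record: x, y, flags

def encode_points(points):
    """Pack a list of (x, y, flags) tuples into one bytes object."""
    out = bytearray(POINT.size * len(points))
    for index, point in enumerate(points):
        # Each record lands at a multiple of the precomputed size.
        POINT.pack_into(out, index * POINT.size, *point)
    return bytes(out)

blob = encode_points([(1, -2, 0), (300, 400, 255)])
first = POINT.unpack_from(blob, 0)
second = POINT.unpack_from(blob, POINT.size)
```

Because size and the field layout are fixed when POINT is created, the loop does no format parsing at all.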

`iter_unpack` streaming

fmt = struct.Struct('<HH')   # compile once
data = b'\x01\x00\x02\x00\x03\x00\x04\x00'

for a, b in fmt.iter_unpack(data):
    print(a, b)
# 1 2
# 3 4

iter_unpack(fmt, buffer) returns a lazy iterator that yields successive unpack results, stepping by calcsize(fmt) bytes. The buffer length must be an exact multiple of the step size; otherwise struct.error is raised by the iter_unpack call itself, before any item is produced. A format whose size is zero is rejected in the same way. The iterator holds a view on the original buffer rather than a copy, so in-place changes made to a mutable buffer between iterations show up in later results. The module-level struct.iter_unpack looks up the format string in the internal cache before delegating to the same mechanism.
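A short sketch of the length check and of lazy consumption, using the same '<HH' format as the example above (the name pair is illustrative):

```python
import struct

pair = struct.Struct('<HH')         # 4 bytes per record

# Seven bytes is not a multiple of four, so the call is rejected.
try:
    pair.iter_unpack(b'\x00' * 7)
    rejected = False
except struct.error:
    rejected = True

# The iterator is lazy: records are decoded as it is consumed.
records = pair.iter_unpack(b'\x01\x00\x02\x00\x03\x00\x04\x00')
first = next(records)               # (1, 2)
rest = list(records)                # [(3, 4)]

# The module-level function gives the same results.
same = list(struct.iter_unpack('<HH', b'\x01\x00\x02\x00')) == [(1, 2)]
```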

gopy mirror

encoding/binary in Go handles fixed-size integer encoding and decoding with explicit byte-order values (binary.LittleEndian, binary.BigEndian). For a full struct port, gopy needs to implement the format-string parser, the format-code table above, padding/alignment logic for @-prefixed formats, and the Struct compiled cache. iter_unpack maps naturally to a Go iterator over a byte slice. The e (half-float) code requires IEEE 754 binary16 conversion since Go's encoding/binary does not handle float16 natively; a small bit-manipulation helper is needed.
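The binary16 decoding step is small enough to sketch. It is written here in Python rather than Go so that it can be checked against struct's own e code; half_bits_to_float is a hypothetical helper name, only decoding is shown, and a Go port would follow the same bit layout.

```python
import math
import struct

def half_bits_to_float(bits):
    """Decode a 16-bit IEEE 754 binary16 pattern into a Python float."""
    sign = -1.0 if bits & 0x8000 else 1.0
    exponent = (bits >> 10) & 0x1F      # 5 exponent bits, bias 15
    fraction = bits & 0x3FF             # 10 fraction bits
    if exponent == 0:
        # Zero and subnormals: no implicit leading 1.
        return sign * fraction * 2.0 ** -24
    if exponent == 0x1F:
        # All-ones exponent encodes infinity or NaN.
        return sign * math.inf if fraction == 0 else math.nan
    return sign * (1.0 + fraction / 1024.0) * 2.0 ** (exponent - 15)

def reference(bits):
    """Decode the same pattern with struct's built-in 'e' code."""
    return struct.unpack('<e', struct.pack('<H', bits))[0]

samples = [0x0000, 0x0001, 0x3C00, 0xC000, 0x7BFF, 0x7C00, 0xFC00]
agrees = all(half_bits_to_float(b) == reference(b) for b in samples)
```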
