Objects/bytesobject.c

Source:

cpython 3.14 @ ab2d84fe1023/Objects/bytesobject.c

Objects/bytesobject.c implements the bytes type. Bytes objects are immutable sequences of integers in [0, 255]. The file provides the single-byte intern table (each of the 256 possible one-byte values is a pre-allocated singleton), hash caching, the full string-like method suite (find, replace, split, join, etc.), and the buffer protocol.

Map

| Lines | Symbol | Role |
|---|---|---|
| 1-100 | single-byte intern table | characters[256]; singleton allocation |
| 101-300 | PyBytes_FromStringAndSize, _PyBytes_FromHex | Constructors |
| 301-600 | bytes_find, bytes_count, bytes_index | Search methods |
| 601-1000 | bytes_split, bytes_rsplit, bytes_join | Splitting and joining |
| 1001-1400 | bytes_replace, bytes_translate | Replacement and translation |
| 1401-1800 | bytes_strip, bytes_lower, bytes_upper | Whitespace and case |
| 1801-2500 | bytes_decode, bytes_hex, bytes_fromhex | Codec integration; hex encoding |
| 2501-3500 | buffer protocol, hash, type wiring | getbuffer; ob_shash; PyBytes_Type |

Reading

Single-byte intern table

For a one-byte request (size == 1 with a non-NULL source pointer), PyBytes_FromStringAndSize returns the matching entry from the intern table instead of allocating. Each of the 256 single-byte values is allocated once at interpreter init and reused. This avoids allocation and comparison overhead for common values such as b'\n', b' ', and b'\x00'.

// Objects/bytesobject.c:1 PyBytes_FromStringAndSize
PyObject *
PyBytes_FromStringAndSize(const char *str, Py_ssize_t size)
{
    /* Fast path: one byte with a non-NULL source returns the singleton. */
    if (size == 1 && str != NULL) {
        return (PyObject *)characters[(unsigned char)*str];
    }
    /* allocate new PyBytesObject */
    ...
}
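The effect of the intern table can be observed from Python. The following is a minimal sketch, assuming a CPython interpreter where one-byte slices are built through PyBytes_FromStringAndSize; object identity here is an implementation detail, not a language guarantee. The helper names one_byte and two_bytes are hypothetical and exist only for this illustration.

```python
def one_byte(data: bytes, index: int) -> bytes:
    """Return data[index:index+1], a bytes object of length one."""
    return data[index:index + 1]


def two_bytes(data: bytes, index: int) -> bytes:
    """Return data[index:index+2], a bytes object of length two."""
    return data[index:index + 2]


# Two newline bytes sliced out of unrelated source objects.
newline_a = one_byte(b"alpha\nbeta", 5)
newline_b = one_byte(b"x\ny", 1)

# Equal content and, on CPython, the very same object.
print(newline_a == newline_b, newline_a is newline_b)

# Two-byte results are equal but allocated separately.
pair_a = two_bytes(b"alpha\nbeta", 0)
pair_b = two_bytes(b"alpha\nbeta", 0)
print(pair_a == pair_b, pair_a is pair_b)
```

The slices go through helper functions rather than literal subscripts so that the compiler cannot fold them into shared constants, which would hide the difference between the one-byte and two-byte cases.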

Hash caching

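ob_shash caches the hash after the first hash() call. A value of -1 means the hash has not been computed yet; to keep that sentinel unambiguous, a computed hash that would be -1 is stored as -2 instead. The hash itself is computed over the byte content by _Py_HashBytes, which uses SipHash-1-3 in default builds.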

// Objects/bytesobject.c:2501 bytes_hash
static Py_hash_t
bytes_hash(PyBytesObject *a)
{
    /* -1 marks "not computed yet": hash once, then reuse the cached value. */
    if (a->ob_shash == -1) {
        a->ob_shash = _Py_HashBytes(a->ob_sval, Py_SIZE(a));
    }
    return a->ob_shash;
}
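The sentinel pattern can be modelled in a few lines of Python. This is a toy sketch, not CPython code: the class CachedHash and its computations counter are hypothetical, and the real work is delegated to the built-in hash() of the wrapped bytes object. It relies on the fact that hash() never returns -1 in CPython, which is what makes -1 safe to use as "not computed yet".

```python
class CachedHash:
    """Toy model of the ob_shash pattern (hypothetical; not CPython code)."""

    UNSET = -1  # sentinel: hash not computed yet

    def __init__(self, payload: bytes) -> None:
        self.payload = payload
        self.cached = self.UNSET
        self.computations = 0  # counts how often the underlying hash ran

    def __hash__(self) -> int:
        if self.cached == self.UNSET:
            self.computations += 1
            # hash() of a bytes object never returns -1, so the
            # sentinel cannot collide with a real cached value.
            self.cached = hash(self.payload)
        return self.cached


obj = CachedHash(b"abc")
first, second = hash(obj), hash(obj)

# Same value both times, computed only once.
print(first == second, obj.computations)

# CPython reserves -1 at the C level, so even the integer -1 hashes to -2.
print(hash(-1))
```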

Buffer protocol

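bytes_buffer_getbuf exports the raw byte array as a read-only Py_buffer. This gives buffer-protocol consumers zero-copy access to the bytes data, including memoryview, struct.unpack_from, and C extension modules that read through a const char *. Consumers that need a writable buffer, such as struct.pack_into, reject bytes objects.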

// Objects/bytesobject.c:2520 bytes_buffer_getbuf
static int
bytes_buffer_getbuf(PyBytesObject *self, Py_buffer *view, int flags)
{
    return PyBuffer_FillInfo(view, (PyObject*)self,
                             (void *)self->ob_sval, Py_SIZE(self),
                             1, /* readonly */
                             flags);
}
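The read-only export is visible from Python through any buffer-protocol consumer. The sketch below uses only the standard library (memoryview and the struct module). The helper can_pack_into is a hypothetical name introduced for this example. It shows that reading from a bytes buffer works, while struct.pack_into, which needs a writable destination, rejects bytes and accepts a bytearray.

```python
import struct

payload = struct.pack("<HI", 7, 1234)  # a 6-byte bytes object

# memoryview obtains the buffer exported by bytes; no data is copied.
view = memoryview(payload)
is_readonly = view.readonly

# Reading through the buffer protocol works on bytes.
fields = struct.unpack_from("<HI", payload, 0)


def can_pack_into(buffer) -> bool:
    """Report whether struct.pack_into accepts this buffer as a destination."""
    try:
        struct.pack_into("<H", buffer, 0, 9)
    except TypeError:
        # Raised when the buffer is not writable.
        return False
    return True


print(is_readonly, fields)
print(can_pack_into(payload), can_pack_into(bytearray(payload)))
```

Copying into a bytearray is the usual way to get a writable buffer when in-place packing is needed; the original bytes object is left unchanged.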

gopy notes

The gopy bytes type wraps a Go []byte. Single-byte interning maps to a [256]*Bytes array initialized at startup. Hash caching uses a -1 sentinel on a hash int64 field. The buffer protocol maps to exposing the underlying []byte slice for zero-copy operations.