Objects/memoryobject.c (part 2)

Source:

cpython 3.14 @ ab2d84fe1023/Objects/memoryobject.c

This annotation covers data extraction and shape manipulation. See objects_memoryobject_detail for memoryview.__new__, Py_buffer layout, format strings, and nbytes.

Map

| Lines | Symbol | Role |
| --- | --- | --- |
| 1-100 | memoryview.cast | Reinterpret the memory as a different element type |
| 101-240 | memoryview.tolist | Convert to a (nested) list of Python scalars |
| 241-380 | memoryview.tobytes | Copy the memory to a new bytes object |
| 381-540 | memoryview.__getitem__ | Index or slice the view |
| 541-700 | Multi-dimensional views | ndim, shape, strides, suboffsets |
| 701-800 | memoryview.__setitem__ | Write elements or slices into the underlying buffer |

Reading

memoryview.cast

// CPython: Objects/memoryobject.c:1280 memoryview_cast_impl
static PyObject *
memoryview_cast_impl(PyMemoryViewObject *self, PyObject *format, PyObject *shape)
{
    /* Return a new memoryview reinterpreting the bytes as 'format'.
       The total byte count must be compatible.
       Only allowed for C-contiguous views, and the cast must go
       1-D -> N-D or N-D -> 1-D. */
    Py_buffer *src = &self->view;
    if (!PyBuffer_IsContiguous(src, 'C')) {
        PyErr_SetString(PyExc_TypeError,
                        "memoryview: casts are restricted to C-contiguous views");
        return NULL;
    }
    /* Compute new itemsize from format string */
    char new_fmt;  /* receives the single native format character */
    Py_ssize_t new_itemsize = get_native_fmtchar(&new_fmt, PyUnicode_AsUTF8(format));
    if (src->len % new_itemsize != 0) {
        PyErr_SetString(PyExc_TypeError,
                        "memoryview: length is not a multiple of itemsize");
        return NULL;
    }
    ...
}

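m.cast('H') reinterprets a byte view as native 16-bit unsigned shorts without copying. This is useful for binary protocols: read the raw bytes, then cast to the element type of the payload. Only native single-character formats are accepted, and either the source or the target format must be a byte format ('B', 'b', or 'c').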
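The behaviour described above can be observed from Python. The following is a minimal sketch, not part of the annotated source; the variable names are illustrative, and it assumes a platform where a native unsigned short is 2 bytes.

```python
import struct

# Pack four unsigned shorts in native byte order, then view them as raw bytes.
raw = bytearray(struct.pack("4H", 1, 2, 3, 4))
byte_view = memoryview(raw)          # format 'B', itemsize 1

# Reinterpret the same memory as unsigned shorts; no bytes are copied.
shorts = byte_view.cast("H")
values = shorts.tolist()

# Writes through the cast view land in the original bytearray.
shorts[0] = 500
first_after_write = struct.unpack_from("H", raw, 0)[0]

# 1-D -> N-D: the optional shape argument reshapes while casting.
grid = byte_view.cast("B", shape=[2, 4])

# A length that is not a multiple of the new itemsize is rejected.
try:
    memoryview(b"abc").cast("H")
    cast_error = None
except TypeError as exc:
    cast_error = str(exc)
```

Because cast always uses native byte order and alignment-free reinterpretation, payloads with a fixed wire endianness are usually better decoded with the struct module.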

memoryview.tolist

// CPython: Objects/memoryobject.c:1380 memoryview_tolist
static PyObject *
memoryview_tolist(PyMemoryViewObject *mv, PyObject *noargs)
{
    /* Convert the view to a nested Python list.
       0-D: return the single scalar.
       1-D: return flat list of scalars.
       N-D: return nested lists of depth ndim. */
    Py_buffer *view = &mv->view;
    if (view->ndim == 0) {
        return unpack_single(view->buf, view->format);
    }
    return tolist_rec(view->buf, view, 0);
}

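For data that already lives in a NumPy array, calling ndarray.tolist() directly avoids the detour through a memoryview. memoryview.tolist() works for any buffer exporter with a supported native format, including array.array, bytearray, and mmap.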
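A short sketch of the 1-D and N-D cases using only standard-library exporters. The names are illustrative; the 2-D view is built by casting through a byte view, since one side of a cast must use a byte format.

```python
from array import array

# 1-D view over an array of doubles: tolist() returns a flat list.
doubles = array("d", [1.5, 2.5, 3.5, 4.5, 5.5, 6.5])
flat = memoryview(doubles)
flat_list = flat.tolist()

# Reshape to 2 x 3 (via a byte view): tolist() now nests one level per dimension.
nested = flat.cast("B").cast("d", shape=[2, 3])
nested_list = nested.tolist()

# A bytearray exports format 'B', so its elements come back as ints.
byte_list = memoryview(bytearray(b"abc")).tolist()
```

The nesting depth of the result equals ndim, which mirrors the recursion over dimensions in the C code.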

memoryview.__getitem__

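// CPython: Objects/memoryobject.c:1520 memoryview_subscript
static PyObject *
memoryview_subscript(PyMemoryViewObject *mv, PyObject *key)
{
    Py_buffer *view = &mv->view;
    if (view->ndim == 1) {
        if (PyIndex_Check(key)) {
            /* Single element */
            Py_ssize_t index = PyNumber_AsSsize_t(key, PyExc_IndexError);
            if (index == -1 && PyErr_Occurred())
                return NULL;
            /* Negative indices count from the end */
            if (index < 0) index += view->shape[0];
            /* Reject indices that are still out of range after wrapping */
            if (index < 0 || index >= view->shape[0]) {
                PyErr_Format(PyExc_IndexError,
                             "index out of bounds on dimension %d", 1);
                return NULL;
            }
            /* buf is void *, so step in bytes through a char * */
            return unpack_single((char *)view->buf + index * view->strides[0],
                                 view->format);
        }
        /* Slice: return a new memoryview */
        ...
    }
    /* Multi-dimensional: a tuple of ndim integers selects one element */
    ...
}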

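mv[0] on a 1-D view with format 'H' returns a Python int. mv[1:3] returns a new memoryview sharing the same buffer. mv[0, 1] on a 2-D view indexes both dimensions and returns a single element; a lone integer index on a multi-dimensional view raises NotImplementedError, because sub-views are not implemented.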
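The indexing cases can be exercised from Python as follows. This is an illustrative sketch with made-up names, assuming an array.array of unsigned shorts as the exporter.

```python
from array import array

backing = array("H", [10, 20, 30, 40, 50, 60])
shorts = memoryview(backing)

# Integer index on a 1-D view: one element, unpacked to a Python int.
first = shorts[0]
last = shorts[-1]            # negative indices wrap from the end

# Slice: a new memoryview over the same exporter, no copy.
window = shorts[1:3]
same_exporter = window.obj is backing
window[0] = 21               # visible through the original array
changed = backing[1]

# 2-D view: a tuple with one integer per dimension selects an element.
grid = shorts.cast("B").cast("H", shape=[2, 3])
cell = grid[1, 2]

# A single integer on a 2-D view would need a sub-view, which is unsupported.
try:
    grid[0]
    subview_error = None
except NotImplementedError as exc:
    subview_error = exc

# Out-of-range indices raise IndexError.
try:
    shorts[6]
    bounds_error = None
except IndexError as exc:
    bounds_error = exc
```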

Multi-dimensional strides

// CPython: Objects/memoryobject.c:680 strides
/* For a 2-D array with shape (rows, cols) and itemsize s:
     strides = (cols * s, s)   -- C-contiguous (row-major)
     strides = (s, rows * s)   -- Fortran-contiguous (column-major)
   Arbitrary strides allow non-contiguous views (transposed matrix, etc.) */

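m.strides gives the byte step for each dimension. A transposed view has its shape and strides swapped. suboffsets is needed only for indirect (PIL-style) arrays, where a dimension holds pointers that must be dereferenced before stepping further; NumPy arrays, including object arrays, are plain strided buffers and export no suboffsets.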
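The stride formula for the C-contiguous case can be checked from Python. In this sketch the helper element_offset is hypothetical (it is not a CPython function); it only illustrates how a byte offset follows from strides and an index tuple.

```python
from array import array

rows, cols = 3, 4
data = array("i", range(rows * cols))
flat = memoryview(data)
item = flat.itemsize

# Reshape the flat buffer into a C-contiguous rows x cols view.
grid = flat.cast("B").cast("i", shape=[rows, cols])
c_strides = grid.strides     # expected to equal (cols * item, item)


def element_offset(strides, index):
    """Byte offset of an element: sum of stride * index over all dimensions."""
    return sum(s * i for s, i in zip(strides, index))


# Locate element (2, 1) by hand and read it back through the flat view.
offset = element_offset(grid.strides, (2, 1))
value_via_offset = flat[offset // item]

# A stepped slice keeps the buffer but doubles the stride,
# so the result is no longer contiguous.
every_other = flat[::2]
```

Here grid.suboffsets is an empty tuple, since array.array exports a plain strided buffer with no pointer indirection.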

gopy notes

memoryview.cast is objects.MemoryViewCast in objects/memoryview.go. memoryview.tolist converts using a Go recursion matching the ndim depth. memoryview.__getitem__ uses pointer arithmetic on the underlying []byte. Multi-dimensional strides are stored as []int64 on objects.MemoryView.