CPython internals
These pages are a self-contained tour of CPython 3.14. They are written so that you can finish them with the CPython source tree open and understand what every important piece does, without ever opening the gopy source.
CPython is the reference implementation of Python. It is a tree
of C source under python/cpython
plus a Python-side standard library under Lib/. The interpreter
runs in four big phases.
The four phases
| Phase | Input | Output | Source root |
|---|---|---|---|
| Parse | .py source bytes | An AST. | Parser/ |
| Compile | An AST. | A PyCodeObject with bytecode. | Python/ |
| Run | A PyCodeObject and arguments. | Program output, side effects. | Python/ceval.c |
| Garbage collect | Live objects. | Released cycles, finalisers run. | Python/gc.c (gc module API in Modules/gcmodule.c) |
The "Pipeline" sidebar covers Parse and Compile. The "Execution" sidebar covers Run. The "Runtime" sidebar covers the object model and the garbage collector. The pages are arranged so that reading them in order follows the order in which CPython runs the phases.
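The four phases can be observed from Python itself. The following is a minimal sketch, not a description of how the interpreter is wired internally: it uses the standard-library ast, compile, exec, and gc entry points as stand-ins for the C code in each source root. The one-line program in SOURCE and the Node class are hypothetical and exist only for this demo.

```python
import ast
import gc
import types
import weakref

SOURCE = b"x = 6 * 7\n"  # hypothetical one-line program used only for this demo

# Parse: .py source bytes -> AST.
tree = ast.parse(SOURCE, filename="<demo>")

# Compile: AST -> code object (the Python-level view of a PyCodeObject).
code = compile(tree, filename="<demo>", mode="exec")

# Run: execute the code object in a fresh namespace.
namespace = {}
exec(code, namespace)

# Garbage collect: build a reference cycle, drop it, and let the collector find it.
class Node:
    pass

a, b = Node(), Node()
a.peer, b.peer = b, a
probe = weakref.ref(a)   # lets us observe when the first node is really freed
del a, b                 # the cycle keeps both refcounts above zero
gc.collect()             # the cycle collector releases both nodes

print(type(tree).__name__, type(code).__name__, namespace["x"], probe() is None)
```

Each intermediate value (tree, code, namespace) corresponds to one row of the table, which makes the hand-off between phases easy to inspect in a REPL.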
What to read
If you are new to the source, read the pages in this order:
- Pipeline overview
- Parser -> AST -> Symbol table
- Compiler -> Flow graph -> Assembler
- VM -> Frames -> Specializer -> Tier-2
- Objects -> Types -> GC
- Imports -> Generators -> Exceptions -> GIL -> Monitor
Every page cites C files with file:line references that point at
the current main branch of upstream CPython. The citations are
deliberately not pinned to a commit: line numbers drift as upstream
refactors, so treat them as approximate.
Reading the C
The CPython source uses a handful of conventions that are worth internalising before diving in.
- Naming. Public C API functions start with Py, internal ones start with _Py. Static helpers have no prefix. Type objects (instances of PyTypeObject) end in _Type, e.g. PyLong_Type.
- Refcount discipline. Every PyObject * carries a reference count. Functions document whether they "steal" (take ownership of) a reference, "borrow" (do not change it), or "return a new reference" (the caller owns the result).
- The GIL. In the default build, the interpreter holds a single lock while running Python code. C code releases it explicitly via Py_BEGIN_ALLOW_THREADS / Py_END_ALLOW_THREADS.
- Include/cpython/. This directory holds the public-but-not-stable surface. Include/internal/ holds private headers. Source files end in .c and the declarations CPython itself ships end in .h; files ending in .h.in are templates from which headers are generated.
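The refcount discipline can be watched from Python with sys.getrefcount. This is a rough sketch rather than a statement about the C API: the function name refcount_deltas is hypothetical, and the absolute counts are an implementation detail, so the code reports only changes relative to a baseline.

```python
import sys


def refcount_deltas():
    """Return how an object's refcount changes as references come and go."""
    obj = object()
    base = sys.getrefcount(obj)      # baseline; the absolute value is not meaningful

    alias = obj                      # a second name owns a new reference
    with_alias = sys.getrefcount(obj) - base

    holder = [obj]                   # the list owns another reference
    with_list = sys.getrefcount(obj) - base

    del alias                        # drop the second name
    holder.clear()                   # the list gives up its reference
    released = sys.getrefcount(obj) - base

    return with_alias, with_list, released


deltas = refcount_deltas()
print(deltas)
```

In C, the same bookkeeping is done by hand, which is why every function documents whether it steals, borrows, or returns a new reference.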
Bibliography
- The CPython developer guide.
- Brandt Bucher's PEP 659 explainer.
- Mark Shannon's PEP 744 trace projector notes.
- The "Python under the hood" series on the Tenthousandmeters blog.