Assembler
The assembler is the last stage of compilation. It takes the
optimised flow graph and produces a PyCodeObject ready to be cached
or executed.
Source map
| File | Role |
|---|---|
| Python/assemble.c | The assembler. |
| Objects/codeobject.c | PyCodeObject allocation and slots. |
| Include/cpython/code.h | Public-internal PyCodeObject definition. |
| Include/internal/pycore_code.h | Inline-cache layouts, magic numbers. |
Stage 1: jump resolution
Each block has a list of instructions; each jump has a block
pointer as its target. The assembler walks the blocks in
emit-order, assigning each instruction a byte offset (each
instruction is two bytes, plus two bytes for each inline cache
entry, plus two bytes for each EXTENDED_ARG prefix needed to fit
its operand).
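The size rule above can be sketched in a few lines. This is an illustrative model, not the code in Python/assemble.c: the names extended_arg_count, instruction_size, and assign_offsets are hypothetical, and each instruction is represented only by its operand and its number of inline cache entries.

```python
CODE_UNIT = 2  # bytes: one opcode byte plus one operand byte


def extended_arg_count(oparg: int) -> int:
    """Number of EXTENDED_ARG prefixes needed for oparg (8 bits per unit)."""
    if oparg < 0:
        raise ValueError("operands are unsigned")
    count = 0
    oparg >>= 8
    while oparg:
        count += 1
        oparg >>= 8
    return count


def instruction_size(oparg: int, cache_entries: int = 0) -> int:
    """Size in bytes: the instruction, its prefixes, and its inline caches."""
    return CODE_UNIT * (1 + extended_arg_count(oparg) + cache_entries)


def assign_offsets(instrs):
    """instrs is a list of (oparg, cache_entries) pairs in emit order.

    Returns the byte offset of each instruction and the total length.
    """
    offsets = []
    pos = 0
    for oparg, cache_entries in instrs:
        offsets.append(pos)
        pos += instruction_size(oparg, cache_entries)
    return offsets, pos
```

For example, an operand of 300 needs one EXTENDED_ARG prefix, so the instruction occupies four bytes before any caches are counted.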
After every offset is known, the assembler rewrites each jump
operand from a block pointer to a distance, counted in two-byte
code units rather than raw bytes. JUMP_FORWARD, JUMP_BACKWARD, the
POP_JUMP_IF_* family, FOR_ITER, and SEND all use relative offsets,
measured from the end of the jump instruction (after its inline
caches). The direction is implied by the opcode, so the operand
itself is unsigned. RESUME is not a jump and carries no offset.
If a jump's displacement does not fit in the 8-bit operand,
the assembler inserts an EXTENDED_ARG prefix and retries the
layout. Because inserting bytes shifts every following offset,
this is iterated to a fixed point.
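The fixed-point iteration can be illustrated with a small model. This is a sketch under stated assumptions, not the assembler's actual loop: resolve_jumps and ext_args are hypothetical names, distances are counted in code units and measured from the end of the jump instruction, inline caches are ignored, and instruction sizes are only ever allowed to grow so that termination is obvious.

```python
def ext_args(oparg):
    """EXTENDED_ARG prefixes needed to hold oparg in 8-bit pieces."""
    count = 0
    oparg >>= 8
    while oparg:
        count += 1
        oparg >>= 8
    return count


def resolve_jumps(instrs):
    """Lay out instructions and compute jump operands to a fixed point.

    instrs is a list of dicts. A jump carries "target" (the index of the
    instruction it jumps to); any other instruction may carry "arg".
    Returns (offsets, opargs, sizes), all measured in code units.
    """
    sizes = [1 + ext_args(ins.get("arg", 0)) for ins in instrs]
    while True:
        # Pass 1: assign offsets from the current size estimates.
        offsets = []
        pos = 0
        for size in sizes:
            offsets.append(pos)
            pos += size
        # Pass 2: recompute operands; grow any instruction that no longer fits.
        opargs = []
        stable = True
        for i, ins in enumerate(instrs):
            if "target" in ins:
                end_of_jump = offsets[i] + sizes[i]
                arg = abs(offsets[ins["target"]] - end_of_jump)
            else:
                arg = ins.get("arg", 0)
            opargs.append(arg)
            needed = 1 + ext_args(arg)
            if needed > sizes[i]:
                sizes[i] = needed
                stable = False
        if stable:
            return offsets, opargs, sizes


# A forward jump over 300 one-unit instructions needs one EXTENDED_ARG prefix.
program = (
    [{"op": "JUMP_FORWARD", "target": 301}]
    + [{"op": "NOP"} for _ in range(300)]
    + [{"op": "RETURN_VALUE"}]
)
offsets, opargs, sizes = resolve_jumps(program)
```

In this example the first pass finds a distance of 300, which does not fit in eight bits, so the jump grows by one unit and the layout is recomputed; the second pass confirms that nothing else changed.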
Stage 2: location table
co_linetable is encoded in the PEP 657 compact format. Each
entry is one header byte followed by zero or more payload bytes,
mostly varint deltas. The header byte's high bit is always set and
marks the start of an entry; the next 4 bits are the form code
(short, one-line, no column, long, "no location at all"); the
bottom 3 bits hold the length covered by this entry, in code units,
minus one.
The compact format keeps the table small: a typical entry costs one byte per instruction, with column information in two or three bytes per change. Trace tooling decodes the table by walking it linearly and accumulating the line and column deltas.
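The linear walk over varint deltas can be sketched as follows. The chunk layout used here (6 data bits per byte, bit 6 as a continuation flag, least-significant chunk first, and a sign carried in the lowest bit of signed values) is assumed to match the location table's varints; the function names are hypothetical, and the walk is simplified to a bare stream of signed line deltas with no header bytes or column fields.

```python
def encode_varint(value):
    """Unsigned varint: 6-bit chunks, least significant first."""
    if value < 0:
        raise ValueError("use encode_svarint for signed values")
    out = bytearray()
    while value >= 64:
        out.append(0x40 | (value & 0x3F))  # bit 6 set: more chunks follow
        value >>= 6
    out.append(value)
    return bytes(out)


def encode_svarint(value):
    """Signed varint: the sign is stored in the lowest bit."""
    unsigned = ((-value) << 1) | 1 if value < 0 else value << 1
    return encode_varint(unsigned)


def decode_varint(data, pos=0):
    """Return (value, next_pos) for the unsigned varint starting at pos."""
    value = 0
    shift = 0
    while True:
        byte = data[pos]
        pos += 1
        value |= (byte & 0x3F) << shift
        shift += 6
        if not byte & 0x40:
            return value, pos


def decode_svarint(data, pos=0):
    unsigned, pos = decode_varint(data, pos)
    value = -(unsigned >> 1) if unsigned & 1 else unsigned >> 1
    return value, pos


def walk_lines(first_line, data):
    """Accumulate a stream of signed line deltas into absolute line numbers."""
    lines = []
    line = first_line
    pos = 0
    while pos < len(data):
        delta, pos = decode_svarint(data, pos)
        line += delta
        lines.append(line)
    return lines


deltas = [0, 1, -2, 70]
stream = b"".join(encode_svarint(d) for d in deltas)
lines = walk_lines(10, stream)  # [10, 11, 9, 79]
```

Small deltas fit in a single byte each, which is where the compactness comes from; only the delta of 70 needs a second byte here.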
Stage 3: exception table
co_exceptiontable is also varint-encoded. Each entry packs:
- The offset where the protected region starts, in code units (varint).
- The length of the region, in code units (varint).
- The offset to jump to on exception, in code units (varint).
- A composite "depth + lasti" value (varint) that holds the stack depth to unwind to in its upper bits, plus a low-bit flag indicating whether the offset of the raising instruction (lasti) should be pushed onto the stack before the exception.
The eval loop binary-searches this table on every exception raised inside the code object's body.
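A round trip through the entry format, followed by a lookup, might look like the sketch below. It assumes big-endian 6-bit varints with bit 6 as a continuation flag and bit 7 marking the first byte of an entry, and it assumes entries are sorted by start and do not overlap, so that a bisect on the decoded start offsets is enough. All function names are hypothetical, and the lookup works on a decoded list rather than on the raw bytes as the eval loop does.

```python
import bisect


def encode_varint(value, first=False):
    """Big-endian varint, 6 bits per byte; bit 6 means more bytes follow."""
    chunks = []
    while True:
        chunks.append(value & 0x3F)
        value >>= 6
        if not value:
            break
    chunks.reverse()
    out = bytearray(chunks)
    for i in range(len(out) - 1):
        out[i] |= 0x40
    if first:
        out[0] |= 0x80  # marks the first byte of a table entry
    return bytes(out)


def encode_entry(start, length, target, depth, lasti):
    depth_lasti = (depth << 1) | int(lasti)
    return (
        encode_varint(start, first=True)
        + encode_varint(length)
        + encode_varint(target)
        + encode_varint(depth_lasti)
    )


def parse_table(table):
    """Decode a well-formed table into (start, end, target, depth, lasti)."""
    it = iter(table)

    def varint():
        byte = next(it)
        value = byte & 0x3F
        while byte & 0x40:
            byte = next(it)
            value = (value << 6) | (byte & 0x3F)
        return value

    entries = []
    while True:
        try:
            start = varint()
        except StopIteration:
            return entries
        length = varint()
        target = varint()
        depth_lasti = varint()
        entries.append(
            (start, start + length, target, depth_lasti >> 1, bool(depth_lasti & 1))
        )


def find_handler(entries, offset):
    """Binary search for the entry whose [start, end) range covers offset."""
    starts = [entry[0] for entry in entries]
    i = bisect.bisect_right(starts, offset) - 1
    if i >= 0 and entries[i][0] <= offset < entries[i][1]:
        return entries[i]
    return None


table = encode_entry(2, 10, 40, 1, False) + encode_entry(20, 100, 200, 3, True)
entries = parse_table(table)
```

An offset that falls in no range returns None, which corresponds to the exception propagating out of the frame.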
Stage 4: pools
The code object carries several parallel pools, each a tuple indexed by oparg:
| Pool | Holds |
|---|---|
| co_consts | Compile-time constants. Includes nested code objects. |
| co_names | Names used by name-lookup opcodes (LOAD_GLOBAL, LOAD_ATTR, ...). |
| co_varnames | Local variable names, indexed by LOAD_FAST. |
| co_cellvars | Cell variable names (locals captured by inner scopes). |
| co_freevars | Free variable names (locals captured from enclosing scopes). |
Cell variables and free variables share the slot space of the
fast locals: co_nlocalsplus = nlocals + ncellvars + nfreevars,
and LOAD_DEREF / STORE_DEREF operate on that combined space.
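The pools are visible from Python as attributes of a function's code object, which makes them easy to inspect. The functions outer and inner below are made up for illustration; only the co_* attributes come from the text above.

```python
import types


def outer(a, b):
    x = a + b  # captured by inner, so x is a cell variable of outer

    def inner(y):
        # x is a free variable here; len is looked up by name
        return x + y + len("abc")

    return inner


outer_code = outer.__code__
# The nested function's code object is stored as a constant of outer.
inner_code = next(
    const for const in outer_code.co_consts if isinstance(const, types.CodeType)
)

pools = {
    "outer.co_varnames": outer_code.co_varnames,
    "outer.co_cellvars": outer_code.co_cellvars,
    "inner.co_varnames": inner_code.co_varnames,
    "inner.co_freevars": inner_code.co_freevars,
    "inner.co_names": inner_code.co_names,
    "inner.co_consts": inner_code.co_consts,
}
```

The same name x appears in co_cellvars of the defining scope and in co_freevars of the capturing scope, which is how the two code objects agree on which cell to share.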
Stage 5: emit the PyCodeObject
_PyCode_New allocates the code object and copies in:
- co_code: the bytecode bytes as a bytes object.
- The pools.
- co_argcount, co_kwonlyargcount, co_posonlyargcount.
- co_stacksize from the flow-graph pass.
- co_flags (generator, coroutine, async generator, varargs, etc.).
- co_firstlineno, co_linetable, co_exceptiontable.
- co_filename, co_name, co_qualname.
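Most of these fields can be read back from a finished code object. The sketch below compiles a small generator function (the source text and the "<demo>" filename are made up for illustration) and collects a few of the fields; co_qualname is read with a fallback because older interpreters do not expose it.

```python
import inspect

source = "def gen(a, /, b, *, c):\n    yield a + b + c\n"
module_code = compile(source, "<demo>", "exec")

namespace = {}
exec(module_code, namespace)
code = namespace["gen"].__code__

summary = {
    "co_argcount": code.co_argcount,              # positional parameters: a, b
    "co_posonlyargcount": code.co_posonlyargcount,  # a
    "co_kwonlyargcount": code.co_kwonlyargcount,    # c
    "co_stacksize": code.co_stacksize,
    "is_generator": bool(code.co_flags & inspect.CO_GENERATOR),
    "co_firstlineno": code.co_firstlineno,
    "co_filename": code.co_filename,
    "co_name": code.co_name,
    "co_qualname": getattr(code, "co_qualname", code.co_name),
    "co_code_type": type(code.co_code).__name__,
}
```

Note that co_argcount counts positional-only parameters but not keyword-only ones, so the three counts are not simply additive.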
Seen from Python, co_code is immutable after this point. The eval
loop reads the code object's internal instruction array directly,
and the specializer rewrites that array in place once the code is warm.
Magic numbers
The .pyc magic number lives in Include/internal/pycore_magic_number.h.
It changes whenever the on-disk format of a code object's
serialisation changes, which forces a recompile of all .pyc
files. In practice the number changes with every feature release
(often several times during its development), not with bugfix releases.
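From Python, the running interpreter's magic number is available as importlib.util.MAGIC_NUMBER, and a .pyc file begins with those four bytes. The sketch below shows the kind of check that decides whether a cached file must be recompiled; the helper names are hypothetical, and the reading of the first two bytes as a little-endian integer followed by "\r\n" is an assumption about the current layout.

```python
import importlib.util

MAGIC = importlib.util.MAGIC_NUMBER  # 4 bytes for the running interpreter


def pyc_matches_interpreter(header: bytes) -> bool:
    """True if a .pyc header starts with this interpreter's magic number."""
    return header[:4] == MAGIC


def magic_as_int(magic: bytes) -> int:
    """The numeric part of a magic number (first two bytes, little-endian)."""
    return int.from_bytes(magic[:2], "little")


# A header produced by this interpreter matches; an arbitrary one does not.
current_header = MAGIC + bytes(12)
foreign_header = b"\x00\x00\r\n" + bytes(12)
```

A mismatch does not say which format is newer, only that the cached file was written for a different bytecode format and has to be regenerated from source.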
Reading order
The output PyCodeObject is the input to the VM.