Assembler

The assembler is the last stage of compilation. It takes the optimised flow graph and produces a PyCodeObject ready to be cached or executed.

Source map

File                                  Role
Python/assemble.c                     The assembler.
Objects/codeobject.c                  PyCodeObject allocation and slots.
Include/cpython/code.h                Public-internal PyCodeObject definition.
Include/internal/pycore_code.h        Inline-cache layouts, magic numbers.

Stage 1: jump resolution

Each block has a list of instructions; each jump has a block pointer as its target. The assembler walks the blocks in emit-order, assigning each instruction a byte offset (each instruction is two bytes, plus two bytes for each inline cache entry, plus two bytes for each EXTENDED_ARG prefix needed to fit its operand).
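The size rule in the paragraph above can be sketched as a toy model. This is an illustration, not the code in Python/assemble.c: the function names and the `(oparg, cache_entries)` tuple shape are hypothetical, and the only facts taken from the text are the two-byte instruction, the two-byte cache entry, and the two-byte EXTENDED_ARG prefix.

```python
def extended_args_needed(oparg: int) -> int:
    """Number of EXTENDED_ARG prefixes needed so that oparg fits (0 to 3)."""
    if not 0 <= oparg <= 0xFFFFFFFF:
        raise ValueError("oparg must fit in 32 bits")
    count = 0
    while oparg > 0xFF:      # each prefix supplies eight more operand bits
        oparg >>= 8
        count += 1
    return count


def instr_size_bytes(oparg: int, cache_entries: int) -> int:
    """Two bytes for the instruction, per cache entry, and per prefix."""
    return 2 * (1 + cache_entries + extended_args_needed(oparg))


def assign_offsets(instrs):
    """instrs is a list of (oparg, cache_entries) pairs in emit order.

    Returns the byte offset of each instruction and the total size.
    """
    offsets = []
    position = 0
    for oparg, cache_entries in instrs:
        offsets.append(position)
        position += instr_size_bytes(oparg, cache_entries)
    return offsets, position
```

For example, `assign_offsets([(0, 0), (300, 1), (5, 0)])` places the second instruction at byte 2 and the third at byte 8, because the operand 300 needs one prefix and the instruction carries one cache entry.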

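After every offset is known, the assembler rewrites each jump operand into a distance measured in code units (one code unit is two bytes), counted from the instruction that follows the jump. JUMP_FORWARD, JUMP_BACKWARD, the POP_JUMP_IF_* family, FOR_ITER, and SEND all use relative offsets; the direction is fixed by the opcode, so the operand is never negative. RESUME is not a jump: its operand records what kind of resume point it marks.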

If a jump's distance does not fit in the one-byte operand, the assembler adds EXTENDED_ARG prefixes (up to three, each supplying eight more bits) and recomputes the layout. Because inserting bytes shifts every following offset and can push other jumps over the limit, the layout is repeated until no instruction changes size.
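The retry loop can be sketched as a fixed-point iteration over a toy instruction list. This is a simplified model under stated assumptions, not the assembler's actual code: instructions are hypothetical `(name, target_index)` pairs, sizes are counted in code units, inline caches are ignored, and the distance is measured from the end of the jump to the start of the target.

```python
def ext_args(oparg: int) -> int:
    """EXTENDED_ARG prefixes needed for an operand (one per extra byte)."""
    count = 0
    while oparg > 0xFF:
        oparg >>= 8
        count += 1
    return count


def resolve_jumps(instrs):
    """instrs: list of (name, target_index or None), in emit order.

    Returns (offsets, opargs, passes), with offsets in code units.
    """
    opargs = [0] * len(instrs)
    passes = 0
    while True:
        passes += 1
        # Lay out every instruction using the current operand sizes.
        sizes = [1 + ext_args(arg) for arg in opargs]
        offsets, position = [], 0
        for size in sizes:
            offsets.append(position)
            position += size
        # Recompute each jump distance against the new layout.
        new_opargs = list(opargs)
        for i, (_name, target) in enumerate(instrs):
            if target is not None:
                end_of_jump = offsets[i] + sizes[i]
                new_opargs[i] = abs(offsets[target] - end_of_jump)
        # Fixed point: no instruction needs a different number of prefixes.
        if [1 + ext_args(arg) for arg in new_opargs] == sizes:
            return offsets, new_opargs, passes
        opargs = new_opargs
```

A backward jump over 255 one-unit instructions shows why the loop is needed: the first pass computes a distance of 256, which needs a prefix, and the prefix itself lengthens the jump so the second pass settles on 257.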

Stage 2: location table

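co_linetable uses the compact location format introduced in Python 3.11, which also carries the column information added by PEP 657. Each entry starts with a header byte, followed by zero or more payload bytes. The header byte's high bit is always set and marks the start of an entry; the next 4 bits are the form code (short forms, one-line forms, no column info, long form, "no location at all"); the bottom 3 bits hold the number of code units covered by the entry, minus one. Depending on the form, the payload is either raw column bytes or varints built from 6-bit chunks.

The compact format keeps the table small: the common case, an instruction on the same line as its predecessor with small column numbers, takes two bytes and can cover up to eight code units. Trace tooling decodes the table by walking it linearly from co_firstlineno and accumulating the line deltas; columns are stored as absolute values.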
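A linear walk over the table might look like the sketch below. It assumes the layout described in CPython's internal notes on the location table: a header with a marker bit, a 4-bit form code and a 3-bit length-minus-one, form codes 0 to 9 for short entries, 10 to 12 for one-line entries, 13 for "no column", 14 for the long form and 15 for "no location", with little-endian 6-bit varints. The function names are hypothetical, and the details should be checked against the interpreter version in use.

```python
def read_varint(data, i):
    """6-bit chunks, least significant first; bit 6 flags a continuation."""
    byte = data[i]
    i += 1
    value, shift = byte & 63, 0
    while byte & 64:
        byte = data[i]
        i += 1
        shift += 6
        value |= (byte & 63) << shift
    return value, i


def read_svarint(data, i):
    """Signed varint: the lowest bit of the unsigned value is the sign."""
    unsigned, i = read_varint(data, i)
    return (-(unsigned >> 1) if unsigned & 1 else unsigned >> 1), i


def decode_header(byte):
    """Split a header byte into (form_code, length_in_code_units)."""
    if not byte & 0x80:
        raise ValueError("not the first byte of an entry")
    return (byte >> 3) & 0x0F, (byte & 0x07) + 1


def decode_locations(table, first_line):
    """Yield-style walk returning (length, line, end_line, col, end_col)."""
    line, i, out = first_line, 0, []
    while i < len(table):
        code, length = decode_header(table[i])
        i += 1
        if code == 15:                      # no location at all
            out.append((length, None, None, None, None))
        elif code == 14:                    # long form
            delta, i = read_svarint(table, i)
            end_delta, i = read_varint(table, i)
            col, i = read_varint(table, i)
            end_col, i = read_varint(table, i)
            line += delta
            # Columns are stored plus one; zero means "unknown".
            out.append((length, line, line + end_delta,
                        col - 1 if col else None,
                        end_col - 1 if end_col else None))
        elif code == 13:                    # line only, no columns
            delta, i = read_svarint(table, i)
            line += delta
            out.append((length, line, line, None, None))
        elif code >= 10:                    # one-line form, two column bytes
            line += code - 10
            out.append((length, line, line, table[i], table[i + 1]))
            i += 2
        else:                               # short form, one payload byte
            payload = table[i]
            i += 1
            col = (code << 3) | ((payload >> 4) & 7)
            out.append((length, line, line, col, col + (payload & 15)))
    return out
```

On Python 3.11 and later, the result for a real code object can be cross-checked against `code.co_positions()`, which yields one position tuple per code unit.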

Stage 3: exception table

co_exceptiontable is also varint-encoded. Offsets and lengths are in code units, and the first byte of each entry has its high bit set so a reader can find entry boundaries. Each entry packs:

  • The offset where the protected region starts (varint).
  • The length of the region (varint).
  • The offset of the handler to jump to on exception (varint).
  • A composite "depth + lasti" varint, (depth << 1) | lasti, that holds the stack depth to unwind to plus a flag indicating whether the offset of the raising instruction should be pushed onto the stack.

The eval loop searches this table, by bisection followed by a short linear scan, whenever an exception is raised inside the code object's body.
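The entry layout can be exercised with a small encoder and parser. This is a sketch under stated assumptions: it follows the varint scheme used by `dis` when it prints exception tables (6-bit chunks, most significant first, bit 6 as the continuation flag), keeps offsets in code units, and uses a plain linear scan for the lookup. `Entry`, `encode_entry`, and `find_handler` are hypothetical names.

```python
from typing import NamedTuple, Optional


class Entry(NamedTuple):
    start: int      # first protected code unit
    end: int        # one past the last protected code unit
    target: int     # handler offset
    depth: int      # stack depth to unwind to
    lasti: bool     # push the raising instruction's offset?


def encode_varint(value: int, first: bool = False) -> bytes:
    """6-bit chunks, most significant first; bit 7 marks an entry start."""
    chunks = [value & 63]
    value >>= 6
    while value:
        chunks.append(value & 63)
        value >>= 6
    chunks.reverse()
    out = [chunk | 64 for chunk in chunks[:-1]] + [chunks[-1]]
    if first:
        out[0] |= 128
    return bytes(out)


def encode_entry(start, length, target, depth, lasti) -> bytes:
    return (encode_varint(start, first=True) + encode_varint(length)
            + encode_varint(target)
            + encode_varint((depth << 1) | int(lasti)))


def parse_varint(data, i):
    byte = data[i]
    i += 1
    value = byte & 63               # masks off the entry-start bit too
    while byte & 64:
        byte = data[i]
        i += 1
        value = (value << 6) | (byte & 63)
    return value, i


def parse_exception_table(table: bytes):
    entries, i = [], 0
    while i < len(table):
        start, i = parse_varint(table, i)
        length, i = parse_varint(table, i)
        target, i = parse_varint(table, i)
        packed, i = parse_varint(table, i)
        entries.append(
            Entry(start, start + length, target, packed >> 1, bool(packed & 1)))
    return entries


def find_handler(entries, offset) -> Optional[Entry]:
    """Linear scan for the entry whose region covers offset."""
    for entry in entries:
        if entry.start <= offset < entry.end:
            return entry
    return None
```

Round-tripping `encode_entry(2, 100, 120, 1, True)` through `parse_exception_table` gives back a region covering code units 2 to 101 with its handler at 120.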

Stage 4: pools

The code object carries several parallel pools, each a tuple indexed by oparg:

Pool            Holds
co_consts       Compile-time constants. Includes nested code objects.
co_names        Names used by name-lookup opcodes (LOAD_GLOBAL, LOAD_ATTR, ...).
co_varnames     Local variable names, indexed by LOAD_FAST.
co_cellvars     Cell variable names (locals captured by inner scopes).
co_freevars     Free variable names (locals captured from enclosing scopes).

Cell variables and free variables share the slot space of the fast locals: co_nlocalsplus = nlocals + ncellvars + nfreevars (an argument that is also a cell is counted only once), and LOAD_DEREF / STORE_DEREF operate on that combined space.
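The pools are visible from Python as attributes of a function's code object. The short example below uses two hypothetical functions, `outer` and `inner`, to show where a captured variable lands on each side of the closure.

```python
import types


def outer(a):
    x = a + 1            # captured by inner, so it is a cell variable of outer
    y = 2                # never captured, so it stays a plain local

    def inner(b):
        return x + b     # x is a free variable of inner

    return inner, y


outer_code = outer.__code__
inner_code = outer(0)[0].__code__

# The nested function's code object is stored as a constant of the outer one.
nested = [c for c in outer_code.co_consts if isinstance(c, types.CodeType)]

print(outer_code.co_cellvars)   # names captured by inner scopes
print(outer_code.co_varnames)   # plain locals, arguments first
print(inner_code.co_freevars)   # names captured from enclosing scopes
print(nested[0].co_name)
```

The same name, `x`, appears in `co_cellvars` of the defining scope and in `co_freevars` of the scope that uses it.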

Stage 5: emit the PyCodeObject

_PyCode_New allocates the code object and copies in:

  • co_code -- the bytecode bytes as a bytes object.
  • The pools.
  • co_argcount, co_kwonlyargcount, co_posonlyargcount.
  • co_stacksize from the flow-graph pass.
  • co_flags (generator, coroutine, async generator, varargs, etc.).
  • co_firstlineno, co_linetable, co_exceptiontable.
  • co_filename, co_name, co_qualname.

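The bytes object supplied as co_code is not modified after this point. The eval loop executes a copy of the instructions stored inside the code object itself, and the specializer rewrites that copy in place once the code is warm; reading co_code from Python returns the unspecialised bytecode.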
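Most of the copied fields can be read back from any function's code object. The sketch below uses a hypothetical function, `sample`, whose signature touches each of the argument counters and two of the flags; it needs Python 3.8 or later for the positional-only marker.

```python
import inspect


def sample(a, b, /, c, *args, d=0, **kwargs):
    yield a


code = sample.__code__

summary = {
    # Positional parameters, including positional-only ones (a, b, c).
    "argcount": code.co_argcount,
    "posonlyargcount": code.co_posonlyargcount,   # a, b
    "kwonlyargcount": code.co_kwonlyargcount,     # d
    "is_generator": bool(code.co_flags & inspect.CO_GENERATOR),
    "has_varargs": bool(code.co_flags & inspect.CO_VARARGS),
    "has_varkeywords": bool(code.co_flags & inspect.CO_VARKEYWORDS),
    "name": code.co_name,
    "code_is_bytes": isinstance(code.co_code, bytes),
}

for field, value in summary.items():
    print(f"{field}: {value}")
```

Note that `*args` and `**kwargs` are not counted in `co_argcount`; their presence is recorded only in `co_flags`.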

Magic numbers

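The .pyc magic number lives in Include/internal/pycore_magic_number.h. It changes whenever the bytecode or the on-disk serialisation of a code object changes, which invalidates existing .pyc files and forces a recompile. In practice it is bumped several times while a feature release is in development and then normally stays fixed across that release's bug-fix versions.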
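From Python, the running interpreter's magic number is exposed as `importlib.util.MAGIC_NUMBER`, and it is the first four bytes of every .pyc file that interpreter writes. The helper names below are hypothetical; the example compiles a throwaway module into a temporary directory and compares the two values.

```python
import importlib.util
import os
import py_compile
import tempfile


def pyc_magic(path):
    """Return the first four bytes of a .pyc file."""
    with open(path, "rb") as handle:
        return handle.read(4)


def compile_and_read_magic(source="x = 1\n"):
    """Compile a temporary module and return the magic bytes of its .pyc."""
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "m.py")
        with open(src, "w") as handle:
            handle.write(source)
        # py_compile.compile returns the path of the file it wrote.
        pyc = py_compile.compile(
            src, cfile=os.path.join(tmp, "m.pyc"), doraise=True)
        return pyc_magic(pyc)


if __name__ == "__main__":
    print(importlib.util.MAGIC_NUMBER)
    print(compile_and_read_magic() == importlib.util.MAGIC_NUMBER)
```

The importer performs the same comparison: a .pyc whose header does not match is ignored and the source is recompiled.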

Reading order

The output PyCodeObject is the input to the VM.