Skip to main content

Flow graph

After codegen, the instruction sequence is a flat array with symbolic jump targets. The flow-graph pass slices the sequence into basic blocks linked by jump edges, runs the optimiser, analyses stack depth, and builds the exception table.

Source map

FileRole
Python/flowgraph.cThe basic block builder, the passes, the analyser.
Python/instruction_sequence.cThe instruction sequence (input to the flow graph).
Include/internal/pycore_flowgraph.hStruct definitions.

Building the graph

_PyCfg_FromInstructionSequence walks the instruction sequence. A new basic block starts at every:

  • Position that is the target of a jump (entry block of an arm).
  • Position immediately after a jump, return, raise, or RAISE_VARARGS.
  • Position immediately after a SETUP_* pseudo-instruction.

Each block holds its instructions inline plus pointers to its successor blocks. The graph is built bottom up: leaf blocks first, then jumps resolved to block pointers.

Passes

The passes run in a fixed order. Each iterates until it reaches a fixed point.

Jump threading

A jump whose target is an unconditional jump is rewritten to skip the intermediate block. Repeated until no chain remains. This catches the long jump chains that codegen emits for nested loops and short-circuit booleans.

Eliminate unconditional jumps

A block that ends in JUMP to a fall-through successor drops the jump. A JUMP_IF_* whose two arms point at the same block is replaced with a POP_TOP.

Constant folding

Compile-time numeric, string, and tuple operations are folded. Examples:

  • LOAD_CONST 2; LOAD_CONST 3; BINARY_OP + becomes LOAD_CONST 5.
  • BUILD_TUPLE 0 is left alone (the empty tuple is a singleton).
  • Numeric UNARY_NEGATIVE on a constant is folded.
  • String + with both operands constants is folded.

The optimiser is conservative: it only folds when the result fits the language's value type and when the operation cannot raise.

Dead code elimination

A block unreachable from the entry block is removed. A RETURN followed by code that no jump targets is trimmed.

Inline cache reservation

The Tier-1 instruction set has opcodes with inline caches (LOAD_ATTR, LOAD_GLOBAL, BINARY_OP, STORE_ATTR, COMPARE_OP, CALL, SEND, FOR_ITER, LOAD_SUPER_ATTR, STORE_SUBSCR, UNPACK_SEQUENCE, TO_BOOL, CONTAINS_OP, CALL_KW). The flow-graph pass reserves the cache slot count declared in pycore_code.h for each such opcode by inserting CACHE no-ops after it.

Pseudo-opcode resolution

SETUP_FINALLY, SETUP_CLEANUP, SETUP_WITH, POP_BLOCK are pseudo-opcodes that exist only at compile time. The flow-graph pass turns each SETUP_* / POP_BLOCK pair into an entry in the exception table, then drops the pseudo-instructions.

Stack-depth analysis

Each Tier-1 opcode has a static stack effect declared in bytecodes.c. _PyCfg_StackSize walks the block graph in breadth-first order, tracking the maximum stack depth observed. That maximum lands in co_stacksize and the eval loop uses it to pre-allocate the value stack inside a frame.

The analyser also detects stack underflow at compile time: a path that pops more than it pushed since entry triggers a SystemError, because the language is supposed to make stack underflow impossible.

Exception table

Each try / except / finally / with region produces an entry in co_exceptiontable. An entry is a quadruple:

FieldMeaning
startInclusive byte offset where the protected region starts.
endExclusive byte offset where it ends.
targetWhere to jump if an exception is raised in the region.
depth, lastiStack depth to unwind to plus whether to push lasti.

The format is a varint-packed table; the eval loop binary-searches it on exception unwind.

Location table

The instruction stream carries a parallel location table (co_linetable) that maps byte offsets to source positions. The flow-graph pass keeps the location of each instruction synced as blocks move. After assembly, the linearised location stream is packed into the PEP 657 compact format.

Reading order

The output is fed into Assembler.