
Lib/dis.py

cpython 3.14 @ ab2d84fe1023/Lib/dis.py

dis is the bytecode disassembler. Given any Python function, method, class, module, or code object, it produces an annotated listing of the bytecode instructions. The listing format is the one displayed in the Python REPL when you call dis.dis(f); it is also the format CPython's own test suite uses in many places to assert that the compiler produces a specific instruction sequence.
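As a small illustration of both uses described above, the sketch below captures a listing into a string and then checks an instruction sequence the way a test might. The function greet is a made-up example, and the exact opcodes in the listing vary between CPython versions, so only the final instruction is checked.

```python
import dis
import io

def greet(name):  # hypothetical function, used only for illustration
    return "hello " + name

# dis.dis writes to sys.stdout by default; file= redirects the listing.
buffer = io.StringIO()
dis.dis(greet, file=buffer)
listing = buffer.getvalue()
print(listing)

# A test-style check on the instruction stream rather than the text.
opnames = [instr.opname for instr in dis.get_instructions(greet)]
assert opnames[-1] == "RETURN_VALUE"
```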

Beyond display, dis exposes the opcode metadata tables that static analysis tools rely on: opname (opcode number to name), opmap (name to number), hasfree, haslocal, hasname, hasjabs, hasjrel, hasconst, hascompare, and others. It also provides stack_effect, a function that returns the net stack depth change for a single opcode without executing it. The function is backed by the same generated metadata that the compiler's own stack-depth calculation uses.
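A minimal sketch of how a tool might query these tables. The helper classify is hypothetical (it is not part of dis), and it uses getattr with a default in case a given table is absent in some CPython version.

```python
import dis

# opmap and opname are inverse views of the same table.
load_const = dis.opmap["LOAD_CONST"]
assert dis.opname[load_const] == "LOAD_CONST"

def classify(opcode):
    """Return the names of the has* tables that contain this opcode."""
    tables = ("hasconst", "hasname", "haslocal", "hasfree",
              "hasjrel", "hasjabs", "hascompare")
    return [name for name in tables if opcode in getattr(dis, name, ())]

print(classify(load_const))               # LOAD_CONST indexes co_consts
print(classify(dis.opmap["LOAD_NAME"]))   # LOAD_NAME indexes co_names
```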

dis integrates with the co_positions() iterator (PEP 657, available since CPython 3.11), so get_instructions can report column offsets as well as line numbers for each instruction.

Map

| Lines | Symbol | Role | gopy |
|---|---|---|---|
| 1-100 | opname, opmap, hasarg, HAVE_ARGUMENT, hasjrel, hasjabs, haslocal, hasfree, hasname, hasconst, hascompare, hasjump, Instruction | Opcode tables imported from the opcode module; the Instruction namedtuple. | (stdlib pending) |
| 100-250 | _get_instructions_bytes, _disassemble_bytes, dis, disassemble | Core disassembly path: iterate instructions, format width-adjusted columns, mark jump targets and line starts. | (stdlib pending) |
| 250-400 | Bytecode, get_instructions | Bytecode is an iterable wrapper over a code object; get_instructions is the underlying generator yielding Instruction values. | (stdlib pending) |
| 400-550 | code_info, _get_code_object, findlinestarts, findlabels | code_info formats the code object header (name, argument counts, flags, constants, locals). findlinestarts enumerates line-number transitions; findlabels enumerates jump targets. | (stdlib pending) |
| 550-700 | stack_effect, _inline_cache_entries, _parse_exception_table, _ExceptionTableEntry | stack_effect returns the net stack change per opcode; _parse_exception_table decodes co_exceptiontable into _ExceptionTableEntry records. | (stdlib pending) |

Reading

Instruction namedtuple fields

cpython 3.14 @ ab2d84fe1023/Lib/dis.py#L1-100

_Instruction = collections.namedtuple(
    'Instruction',
    'opname opcode arg argval argrepr offset start_offset cache_offset '
    'starts_line line_number end_line_number col_offset end_col_offset '
    'is_jump_target positions'
)

Key fields:

  • opname / opcode are the string name and integer opcode number.
  • arg is the raw operand after EXTENDED_ARG widening. argval is the resolved value (e.g., the actual constant for LOAD_CONST, the variable name for LOAD_FAST). argrepr is a human-readable string for the listing.
  • offset is the byte offset of the instruction in co_code. CPython has used word code (every instruction is 2 bytes) since 3.6, so offset is always even. start_offset is the offset of the first EXTENDED_ARG prefix when one is present and equals offset otherwise; cache_offset is the first byte of the inline-cache words that follow.
  • starts_line is a boolean: True when this instruction begins a new source line. line_number is the 1-based source line (None for instructions with no line info). end_line_number, col_offset, and end_col_offset come from PEP 657 and are None when the code object was compiled without fine-grained position info.
  • is_jump_target is True when any other instruction can jump to this offset, making it a label in the CFG sense.
  • positions is a dis.Positions namedtuple (lineno, end_lineno, col_offset, end_col_offset) populated from co_positions(). It carries the same location data as the individual fields above, grouped into a single object.
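The fields can be inspected directly by iterating get_instructions. The sketch below sticks to attributes that are available across many CPython versions; add_one is a made-up function, and the printed opcodes differ between releases.

```python
import dis

def add_one(x):  # hypothetical function, used only for illustration
    return x + 1

instrs = list(dis.get_instructions(add_one))
for instr in instrs:
    print(f"{instr.offset:>4} {instr.opname:<36} arg={instr.arg!r:<6} "
          f"argval={instr.argval!r:<8} jump_target={instr.is_jump_target}")

# Word code: every instruction starts at an even byte offset.
assert all(instr.offset % 2 == 0 for instr in instrs)
# RETURN_VALUE takes no operand, so its arg is None.
assert instrs[-1].opname == "RETURN_VALUE" and instrs[-1].arg is None
```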

get_instructions and co_positions integration

cpython 3.14 @ ab2d84fe1023/Lib/dis.py#L250-400

def get_instructions(x, *, first_line=None, show_caches=False, adaptive=False):
    co = _get_code_object(x)
    linestarts = dict(findlinestarts(co))
    if first_line is not None:
        line_offset = first_line - co.co_firstlineno
    else:
        line_offset = 0
    return _get_instructions_bytes(
        co.co_code,
        co.co_varnames, co.co_names,
        co.co_consts, co.co_cellvars + co.co_freevars,
        linestarts, line_offset=line_offset,
        exception_entries=_parse_exception_table(co),
        co_positions=co.co_positions(),
        show_caches=show_caches,
        adaptive=adaptive,
    )

co.co_positions() returns an iterator that yields one (lineno, end_lineno, col_offset, end_col_offset) tuple per 2-byte code unit, inline-cache entries included. The tuples are decoded from the compact co_linetable, which since 3.11 stores the PEP 626 line information and the PEP 657 column ranges in a single encoding. _get_instructions_bytes consumes co_positions() in lock-step with the instruction stream, so each Instruction receives its full location data without a separate lookup.

The adaptive flag requests the live, specialized bytecode (the code object's internal _co_code_adaptive buffer, which the adaptive interpreter rewrites in place) rather than the canonical form stored in co_code. This is primarily used by CPython's internal test suite to verify that the adaptive interpreter is choosing the expected specializations.

stack_effect static analysis

cpython 3.14 @ ab2d84fe1023/Lib/dis.py#L550-700

def stack_effect(opcode, oparg=None, /, *, jump=None):
    if opcode in hasarg:
        if oparg is None:
            raise ValueError("stack_effect: oparg is required for opcode "
                             f"with arg {opname[opcode]!r}")
    else:
        oparg = 0
    return _opcode.stack_effect(opcode, oparg, jump=jump)

stack_effect delegates to _opcode.stack_effect, a C function in Modules/_opcode.c that reads from the generated opcode metadata (_PyOpcode_num_pushed and _PyOpcode_num_popped from pycore_opcode_metadata.h). The jump parameter selects between the effect on the taken-branch path (jump=True) and the effect on the fall-through path (jump=False) for jumping instructions; it defaults to None, which returns the maximum of the two.
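A short sketch of the call patterns. The three fixed effects below are stable properties of those opcodes; the per-branch values for FOR_ITER differ between CPython versions, so only the documented relationship between the three jump settings is checked.

```python
import dis

# Effects that are fixed or depend only on oparg.
assert dis.stack_effect(dis.opmap["POP_TOP"]) == -1         # pops one value
assert dis.stack_effect(dis.opmap["LOAD_CONST"], 0) == 1    # pushes one value
assert dis.stack_effect(dis.opmap["BUILD_LIST"], 3) == -2   # pops 3, pushes 1

# For a jumping instruction, ask about each path separately.
for_iter = dis.opmap["FOR_ITER"]
taken = dis.stack_effect(for_iter, 0, jump=True)
fall_through = dis.stack_effect(for_iter, 0, jump=False)
worst_case = dis.stack_effect(for_iter, 0)  # jump=None

print(taken, fall_through, worst_case)
assert worst_case == max(taken, fall_through)
```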

The compiler does not call the Python-level function. Its stack-depth pass (in Python/flowgraph.c in recent versions) walks the control-flow graph with the same per-opcode effects and records the maximum depth, which becomes co_stacksize. Tools that rebuild a code object from a modified instruction list (such as coverage transformers and bytecode optimizers) call stack_effect to recompute co_stacksize after their edits.
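To make the idea concrete, here is a deliberately simplified sketch that handles straight-line code only. A real recomputation has to follow every branch of the control-flow graph and account for exception handlers; the helper straight_line_depth is hypothetical and is not how CPython or any particular tool implements it.

```python
import dis

def straight_line_depth(func):
    """Peak stack depth for code with no jumps (illustrative simplification)."""
    depth = peak = 0
    for instr in dis.get_instructions(func):
        # instr.arg is None for opcodes that take no operand.
        depth += dis.stack_effect(instr.opcode, instr.arg)
        peak = max(peak, depth)
    return peak

def add(a, b):  # hypothetical jump-free function
    return a + b

estimate = straight_line_depth(add)
print(estimate, add.__code__.co_stacksize)
```

For jump-free code the estimate should not exceed the compiler's co_stacksize; anything with branches needs a worklist over basic blocks, using jump=True and jump=False for the two successors.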

HAVE_ARGUMENT is the threshold constant below which opcodes carry no argument. Its value is version-specific (it was 90 through 3.12), so read it from dis.HAVE_ARGUMENT rather than hard-coding it. Instructions without an argument have arg=None and argval=None in their Instruction; the others always have a meaningful arg. CPython has used word code (all instructions are 2 bytes) since 3.6, so HAVE_ARGUMENT no longer marks an instruction-width boundary; it is retained for backward compatibility with code that uses it as a filter. Since 3.12 the hasarg collection is the more reliable test, because pseudo-instructions do not follow the threshold rule.

_parse_exception_table

cpython 3.14 @ ab2d84fe1023/Lib/dis.py#L550-700

def _parse_exception_table(code):
    iterator = iter(code.co_exceptiontable)
    entries = []
    try:
        while True:
            start, end, target, depth_lasti = _read_exception_table_entry(iterator)
            lasti = bool(depth_lasti & 1)
            depth = depth_lasti >> 1
            entries.append(_ExceptionTableEntry(start, end, target, depth, lasti))
    except StopIteration:
        return entries

co_exceptiontable is a varint-packed byte string. Each entry encodes four values: the start offset of the guarded region, its length (not its end offset), the handler offset, and a packed (depth << 1 | lasti) value. Offsets and lengths are stored in 2-byte code units, so the decoder doubles them to obtain byte offsets into co_code. depth is the value stack depth at the handler entry point; lasti indicates whether the last instruction offset should be pushed onto the stack before jumping to the handler (used by RERAISE to recover the original raise location).

The same encoding is decoded in C by get_exception_handler in Python/ceval.c:1628, which uses a two-stage binary-then-linear search. dis decodes it sequentially since disassembly is not on the hot path.

gopy mirror

gopy does not yet ship a dis module. The equivalent functionality is split across two Go packages:

  • compile/flowgraph_stackdepth.go implements the co_stacksize computation that stack_effect underlies, using the same per-opcode push/pop tables derived from Python/bytecodes.c.
  • compile/flowgraph.go and compile/flowgraph_jumps.go implement findlabels and findlinestarts as internal passes in the compiler pipeline.

When the stdlib port reaches dis, the implementation will import a Go-side opcode metadata package rather than depending on _opcode.c, keeping the stdlib layer pure Python over a Go extension.