Lib/dis.py
cpython 3.14 @ ab2d84fe1023/Lib/dis.py
dis is the bytecode disassembler. Given any Python function, method,
class, module, or code object, it produces an annotated listing of the
bytecode instructions. The listing format is the one displayed in the
Python REPL when you call dis.dis(f); it is also the format CPython's
own test suite uses in many places to assert that the compiler produces
a specific instruction sequence.
Beyond display, dis exposes the opcode metadata tables that static
analysis tools rely on: opname (opcode number to name), opmap
(name to number), hasfree, haslocal, hasname, hasjabs,
hasjrel, hasconst, hascompare, and others. It also provides
stack_effect, a function that returns the net stack depth change for
a single opcode without executing it. The compiler derives
co_stacksize from the same per-opcode data.
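As a small illustration of the metadata tables, the sketch below round-trips an opcode name through opmap and opname and checks membership in hasconst. LOAD_CONST is chosen only because it is a long-standing opcode; the helper name describe_opcode is invented for this example.

```python
import dis

def describe_opcode(name):
    """Return (number, round-tripped name, is-constant-load) for an opcode name."""
    number = dis.opmap[name]                 # name -> number
    round_trip = dis.opname[number]          # number -> name
    return number, round_trip, number in dis.hasconst

# The numeric value differs between CPython releases, so it is printed
# rather than assumed.
print(describe_opcode("LOAD_CONST"))
```

Because opcode numbers are reassigned between releases, tools should look them up through opmap at runtime rather than hard-code them.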
Since CPython 3.11, dis has consumed the co_positions() iterator
(PEP 657), so get_instructions can report column offsets as well
as line numbers for each instruction.
Map
| Lines | Symbol | Role | gopy |
|---|---|---|---|
| 1-100 | opname, opmap, hasarg, HAVE_ARGUMENT, hasjrel, hasjabs, haslocal, hasfree, hasname, hasconst, hascompare, hasjump, Instruction | Opcode tables imported from opcode module; Instruction namedtuple (15 fields in the listing under Reading). | (stdlib pending) |
| 100-250 | _get_instructions_bytes, _disassemble_bytes, dis, disassemble | Core disassembly path: iterate instructions, format width-adjusted columns, mark jump targets and line starts. | (stdlib pending) |
| 250-400 | Bytecode, get_instructions | Bytecode is an iterable wrapper over a code object; get_instructions is the underlying generator yielding Instruction values. | (stdlib pending) |
| 400-550 | code_info, _get_code_object, findlinestarts, findlabels | code_info formats the code object header (name, argument counts, flags, constants, locals). findlinestarts and findlabels enumerate jump targets and line-number transitions. | (stdlib pending) |
| 550-700 | stack_effect, _inline_cache_entries, _parse_exception_table, _ExceptionTableEntry | stack_effect returns the net stack change per opcode; _parse_exception_table decodes co_exceptiontable into _ExceptionTableEntry records. | (stdlib pending) |
Reading
Instruction namedtuple fields
cpython 3.14 @ ab2d84fe1023/Lib/dis.py#L1-100
_Instruction = collections.namedtuple(
    'Instruction',
    'opname opcode arg argval argrepr offset start_offset cache_offset '
    'starts_line line_number end_line_number col_offset end_col_offset '
    'is_jump_target positions'
)
Key fields:
- opname / opcode are the string name and integer opcode number.
- arg is the raw operand after EXTENDED_ARG widening.
- argval is the resolved value (e.g., the actual constant for LOAD_CONST, the variable name for LOAD_FAST).
- argrepr is a human-readable string for the listing.
- offset is the byte offset of the instruction in co_code. Instructions have been 2-byte code units (wordcode) since CPython 3.6, so offset is always even.
- start_offset is the offset of the first EXTENDED_ARG prefix, if any, and otherwise equals offset; cache_offset is the first byte of the inline-cache words that follow.
- starts_line is a boolean: True when this instruction begins a new source line.
- line_number is the 1-based source line (None for instructions with no line info).
- end_line_number, col_offset, and end_col_offset come from PEP 657 and are None when the code object was compiled without fine-grained position info.
- is_jump_target is True when any other instruction can jump to this offset, making it a label in the CFG sense.
- positions is a dis.Positions namedtuple (lineno, end_lineno, col_offset, end_col_offset) populated from co_positions(). It duplicates the individual fields above but is the canonical form passed to coverage and debugger hooks.
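A short sketch of reading these fields from a live function may help. It sticks to attributes that are part of the public Instruction API (opname, opcode, offset, argval, is_jump_target) and reads positions through getattr, since that attribute only exists from 3.11. The sample function is invented for the example.

```python
import dis

def sample(x):
    if x:
        return x + 1
    return 0

for instr in dis.get_instructions(sample):
    marker = ">>" if instr.is_jump_target else "  "
    positions = getattr(instr, "positions", None)   # absent before 3.11
    print(marker, instr.offset, instr.opname, repr(instr.argval), positions)
```

The exact opcode names printed depend on the CPython release, so no expected output is shown.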
get_instructions and co_positions integration
cpython 3.14 @ ab2d84fe1023/Lib/dis.py#L250-400
def get_instructions(x, *, first_line=None, show_caches=False, adaptive=False):
    co = _get_code_object(x)
    linestarts = dict(findlinestarts(co))
    if first_line is not None:
        line_offset = first_line - co.co_firstlineno
    else:
        line_offset = 0
    return _get_instructions_bytes(
        co.co_code,
        co.co_varnames, co.co_names,
        co.co_consts, co.co_cellvars + co.co_freevars,
        linestarts, line_offset=line_offset,
        exception_entries=_parse_exception_table(co),
        co_positions=co.co_positions(),
        show_caches=show_caches,
        adaptive=adaptive,
    )
co.co_positions() returns an iterator that yields one
(lineno, end_lineno, col_offset, end_col_offset) tuple per 2-byte
code unit, inline cache entries included. The iterator decodes the
compact co_linetable encoding, which carries the line numbers
(PEP 626) together with the column information added by PEP 657.
_get_instructions_bytes advances co_positions() in lock-step with
the instruction stream, so each Instruction receives its full
location data without a separate lookup.
The adaptive flag requests the live, specialized bytecode (read from
the code object's _co_code_adaptive attribute) rather than the
canonical form stored in co_code. This is primarily used by CPython's
internal test suite to verify that the adaptive interpreter is choosing
the expected specializations.
stack_effect static analysis
cpython 3.14 @ ab2d84fe1023/Lib/dis.py#L550-700
def stack_effect(opcode, oparg=None, /, *, jump=None):
    if opcode in hasarg:
        if oparg is None:
            raise ValueError("stack_effect: oparg is required for opcode "
                             f"with arg {opname[opcode]!r}")
    else:
        oparg = 0
    return _opcode.stack_effect(opcode, oparg, jump=jump)
stack_effect delegates to _opcode.stack_effect, a C function in
Modules/_opcode.c that reads from the generated opcode metadata table
(_PyOpcode_num_pushed and _PyOpcode_num_popped from
pycore_opcode_metadata.h). The jump parameter selects between
the effect on the taken-branch path and the effect on the not-taken
path for conditional jumps; it defaults to None which returns the
worst-case (maximum) effect.
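The three jump modes can be compared directly. The sketch below uses FOR_ITER as an example of a conditional jump; the concrete numbers differ between CPython releases, so they are printed rather than stated. The only property relied on is the documented one: the default is the maximum of the two explicit cases.

```python
import dis

for_iter = dis.opmap["FOR_ITER"]

taken = dis.stack_effect(for_iter, 0, jump=True)         # branch taken
fallthrough = dis.stack_effect(for_iter, 0, jump=False)  # branch not taken
worst = dis.stack_effect(for_iter, 0)                    # jump=None

print(taken, fallthrough, worst)
```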
The compiler does not call the Python-level function. Its stack-depth
pass (in recent CPython versions, in Python/flowgraph.c) walks the
control-flow graph with the same per-opcode push and pop counts and
records the maximum depth reached, which becomes co_stacksize.
Tools that rebuild a code object from a modified instruction list (such
as coverage transformers and bytecode optimizers) call stack_effect to
recompute co_stacksize after their edits.
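To make the co_stacksize relationship concrete, here is a minimal sketch that sums stack_effect over a function with no branches and tracks the peak. The helper name peak_stack_depth and the sample function are invented for illustration, and the linear scan is a deliberate simplification: a real tool has to follow jumps and exception handlers through the control-flow graph.

```python
import dis

def peak_stack_depth(func):
    """Peak operand-stack depth of straight-line code (jumps are not followed)."""
    depth = peak = 0
    for instr in dis.get_instructions(func):
        if instr.arg is None:
            # Opcodes without an argument must be queried without one.
            depth += dis.stack_effect(instr.opcode)
        else:
            depth += dis.stack_effect(instr.opcode, instr.arg)
        peak = max(peak, depth)
    return peak

def arithmetic(a, b):
    return a + b * 2

print(peak_stack_depth(arithmetic), arithmetic.__code__.co_stacksize)
```

For branch-free code the computed peak should not exceed the co_stacksize the compiler recorded, which makes this a cheap consistency check.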
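HAVE_ARGUMENT is the historical threshold constant: real opcodes
numbered below it take no argument. Its numeric value has changed
between releases, so read it from dis rather than hard-coding it.
Instructions that take no argument have arg=None and argval=None in
their Instruction; the others carry a meaningful arg. CPython has used
2-byte wordcode since 3.6, so HAVE_ARGUMENT no longer marks an
instruction-width boundary; it is retained for backward compatibility
with code that uses it as a filter. The threshold test does not hold
for pseudo-instructions, and since 3.12 the hasarg collection is the
more reliable way to ask whether an opcode uses its argument.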
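A filter that works across releases can prefer hasarg and fall back to the threshold. This is a sketch under the assumption that hasarg is missing on older interpreters; the helper names uses_argument and no_arg_instructions are hypothetical.

```python
import dis

def uses_argument(opcode):
    """True if the opcode uses its argument, preferring dis.hasarg when present."""
    hasarg = getattr(dis, "hasarg", None)      # not available on older releases
    if hasarg is not None:
        return opcode in hasarg
    return opcode >= dis.HAVE_ARGUMENT         # historical threshold test

def no_arg_instructions(func):
    """Names of the instructions in func that carry no argument."""
    return [i.opname for i in dis.get_instructions(func) if i.arg is None]

def add(a, b):
    return a + b

print(no_arg_instructions(add))
```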
_parse_exception_table
cpython 3.14 @ ab2d84fe1023/Lib/dis.py#L550-700
def _parse_exception_table(code):
    iterator = code.co_exceptiontable.__iter__()
    entries = []
    try:
        while True:
            start, end, target, depth_lasti = _read_exception_table_entry(iterator)
            lasti = bool(depth_lasti & 1)
            depth = depth_lasti >> 1
            entries.append(_ExceptionTableEntry(start, end, target, depth, lasti))
    except StopIteration:
        return entries
co_exceptiontable is a varint-packed byte string. Each entry encodes
four values: the start offset of the guarded region, its length (not
its end offset, which the decoder computes as start + length), the
handler offset, and a packed (depth << 1 | lasti) value. depth is the
value stack depth at the handler entry point; lasti indicates whether
the offset of the last executed instruction should be pushed onto the
stack before jumping to the handler (used by RERAISE to recover the
original raise location).
The same encoding is decoded in C by get_exception_handler in
Python/ceval.c:1628, which uses a two-stage binary-then-linear search.
dis decodes it sequentially since disassembly is not on the hot path.
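The sequential decode can be reproduced in a few lines. The sketch below assumes the varint layout used by CPython 3.11 and later: six payload bits per byte, bit 6 as the continuation flag, and offsets stored in 2-byte code units. The function names are invented here, and the getattr fallback only keeps the example from failing on interpreters that predate co_exceptiontable.

```python
def parse_varint(iterator):
    """Decode one value: 6 payload bits per byte, bit 6 marks continuation."""
    byte = next(iterator)
    value = byte & 63
    while byte & 64:
        value <<= 6
        byte = next(iterator)
        value |= byte & 63
    return value

def exception_entries(code):
    """Return (start, end, target, depth, lasti) tuples using byte offsets."""
    iterator = iter(getattr(code, "co_exceptiontable", b""))
    entries = []
    while True:
        try:
            start = parse_varint(iterator) * 2     # code units -> bytes
        except StopIteration:
            return entries                         # clean end of table
        length = parse_varint(iterator) * 2
        target = parse_varint(iterator) * 2
        depth_lasti = parse_varint(iterator)
        entries.append((start, start + length, target,
                        depth_lasti >> 1, bool(depth_lasti & 1)))

def guarded(x):
    try:
        return 1 / x
    except ZeroDivisionError:
        return 0

for entry in exception_entries(guarded.__code__):
    print(entry)
```

Only the first varint of an entry is allowed to hit the end of the table; running out of bytes in the middle of an entry would indicate a corrupt table and is left to propagate.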
gopy mirror
gopy does not yet ship a dis module. The equivalent functionality is
split across two Go packages:
- compile/flowgraph_stackdepth.go implements the co_stacksize computation that stack_effect underpins, using the same per-opcode push/pop tables derived from Python/bytecodes.c.
- compile/flowgraph.go and compile/flowgraph_jumps.go implement findlabels and findlinestarts as internal passes in the compiler pipeline.
When the stdlib port reaches dis, the implementation will import a
Go-side opcode metadata package rather than depending on _opcode.c,
keeping the stdlib layer pure Python over a Go extension.