Include/internal/pycore_opcode_metadata.h

cpython 3.14 @ ab2d84fe1023/Include/internal/pycore_opcode_metadata.h

Include/internal/pycore_opcode_metadata.h is a machine-generated header that encodes per-opcode properties used at runtime. It is produced by Tools/cases_generator/ from the opcode definitions in Python/bytecodes.c. Every entry in the _PyOpcode_opcode_metadata array describes one opcode: its stack effect (inputs consumed, outputs produced), its instruction format (how many cache entries follow), and whether it is a pseudo-opcode or a specialization of a base opcode.

Map

| Lines | Symbol | Role |
| --- | --- | --- |
| 1-30 | guard and includes | #ifndef guard, pycore_opcode.h inclusion |
| 31-120 | _PyOpcode_Caches array | Number of cache words each opcode consumes |
| 121-300 | _PyOpcode_Deopt array | Maps specialized opcodes back to their base opcode |
| 301-500 | _PyOpcode_opcode_metadata array | Per-opcode _PyOpcodeMetadata structs |
| 501-600 | _PyOpcode_num_popped, _PyOpcode_num_pushed | Stack-depth helpers |
| 601-680 | _PyOpcode_macro_expansion | Macro expansion table for superinstructions |

Reading

_PyOpcodeMetadata struct

Each array entry is a _PyOpcodeMetadata value carrying four fields used by the interpreter and tools.

// CPython: Include/internal/pycore_opcode_metadata.h:310 _PyOpcodeMetadata
struct _PyOpcodeMetadata {
    int8_t n_popped;
    int8_t n_pushed;
    uint8_t valid_entry;
    uint8_t instr_format;
};

n_popped and n_pushed encode the static stack effect. valid_entry is 1 for real opcodes and 0 for unused slots. instr_format names the operand layout (e.g., IF_AB means two 8-bit operands).
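To make the stack-effect fields concrete, the following is a minimal sketch of how a tool might derive a net stack effect from such a table. Everything prefixed with demo_ or DEMO_ is hypothetical: the struct is a stand-in with the same four fields as above, the opcode numbers are invented for illustration, and demo_net_stack_effect is not a CPython function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in with the same four fields as the struct above. */
struct demo_opcode_metadata {
    int8_t n_popped;
    int8_t n_pushed;
    uint8_t valid_entry;
    uint8_t instr_format;
};

/* Invented opcode numbers; these are NOT CPython's real values. */
enum { DEMO_POP_TOP = 1, DEMO_BINARY_OP = 2, DEMO_TABLE_SIZE = 4 };

/* Slots without an initializer are zero-filled, so valid_entry stays 0. */
static const struct demo_opcode_metadata demo_metadata[DEMO_TABLE_SIZE] = {
    [DEMO_POP_TOP]   = { .n_popped = 1, .n_pushed = 0, .valid_entry = 1 },
    [DEMO_BINARY_OP] = { .n_popped = 2, .n_pushed = 1, .valid_entry = 1 },
};

/* Stores (pushed - popped) in *effect and returns 0.
 * Returns -1 for out-of-range opcodes and for unused table slots. */
static int demo_net_stack_effect(int opcode, int *effect)
{
    assert(effect != NULL);
    if (opcode < 0 || opcode >= DEMO_TABLE_SIZE) {
        return -1;
    }
    if (!demo_metadata[opcode].valid_entry) {
        return -1;
    }
    *effect = demo_metadata[opcode].n_pushed - demo_metadata[opcode].n_popped;
    return 0;
}
```

Checking valid_entry before reading the other fields matters because an unused slot is zero-filled and would otherwise look like an opcode with no stack effect.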

_PyOpcode_Caches: cache-word counts

The specializing adaptive interpreter stores inline cache words immediately after an instruction's own code unit in the bytecode stream. _PyOpcode_Caches[opcode] gives the number of 16-bit cache words for that opcode. LOAD_ATTR carries 9 cache words; CALL carries 3; simple opcodes like POP_TOP carry 0.

// CPython: Include/internal/pycore_opcode_metadata.h:40 _PyOpcode_Caches
extern const uint8_t _PyOpcode_Caches[256];
// e.g. _PyOpcode_Caches[LOAD_ATTR] == 9
// _PyOpcode_Caches[CALL] == 3
// _PyOpcode_Caches[POP_TOP] == 0
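One practical use of a cache-count table is stepping over a bytecode stream without decoding the cache words themselves. The sketch below illustrates the idea under stated assumptions: the WALK_ opcode numbers and the walk_caches table are invented for the example, and the code-unit layout (opcode in the low byte of each 16-bit unit) is assumed rather than taken from the header.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Invented opcode numbers; these are NOT CPython's real values. */
enum { WALK_POP_TOP = 1, WALK_CALL = 2, WALK_NUM_OPCODES = 3 };

/* Hypothetical miniature of a cache-count table. */
static const uint8_t walk_caches[WALK_NUM_OPCODES] = {
    [WALK_POP_TOP] = 0,
    [WALK_CALL]    = 3,
};

/* Assumed layout: the opcode sits in the low byte of a 16-bit code unit. */
static uint8_t walk_opcode_of(uint16_t unit)
{
    return (uint8_t)(unit & 0xFF);
}

/* Counts the instructions in code[0..len), skipping the inline cache
 * words that follow each one. Returns -1 if an opcode is outside the
 * table or if an instruction's cache words run past the end. */
static int walk_count_instructions(const uint16_t *code, size_t len)
{
    int count = 0;
    size_t i = 0;

    assert(code != NULL || len == 0);
    while (i < len) {
        uint8_t op = walk_opcode_of(code[i]);
        if (op >= WALK_NUM_OPCODES) {
            return -1;
        }
        /* One unit for the instruction plus its cache words. */
        size_t step = 1 + (size_t)walk_caches[op];
        if (step > len - i) {
            return -1;
        }
        i += step;
        count++;
    }
    return count;
}
```

Advancing by 1 + cache count is what keeps the walker aligned on instruction boundaries; treating a cache word as an instruction would misread arbitrary cache data as an opcode.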

_PyOpcode_Deopt: specialization base mapping

When a specialized opcode (e.g., LOAD_ATTR_MODULE) needs to deoptimize back to the generic form, the interpreter looks up _PyOpcode_Deopt[specialized] to find the base opcode (LOAD_ATTR). This avoids a switch statement in the deoptimization path.

// CPython: Include/internal/pycore_opcode_metadata.h:180 _PyOpcode_Deopt
extern const uint8_t _PyOpcode_Deopt[256];
// e.g. _PyOpcode_Deopt[LOAD_ATTR_MODULE] == LOAD_ATTR
// _PyOpcode_Deopt[BINARY_OP_ADD_INT] == BINARY_OP
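The table lookup described above can be sketched as a single array index. This is an illustration only: the DEOPT_ opcode numbers, the deopt_table array, and deopt_rewrite are hypothetical names, the sketch assumes that base opcodes map to themselves, and it assumes the opcode occupies the low byte of a 16-bit code unit with the oparg in the high byte.

```c
#include <assert.h>
#include <stdint.h>

/* Invented opcode numbers; these are NOT CPython's real values. */
enum {
    DEOPT_LOAD_ATTR = 1,
    DEOPT_LOAD_ATTR_MODULE = 2,
    DEOPT_BINARY_OP = 3,
    DEOPT_BINARY_OP_ADD_INT = 4,
    DEOPT_NUM_OPCODES = 5
};

/* Hypothetical miniature of a deopt table: specialized forms map to
 * their base opcode, and base opcodes are assumed to map to themselves. */
static const uint8_t deopt_table[DEOPT_NUM_OPCODES] = {
    [DEOPT_LOAD_ATTR]         = DEOPT_LOAD_ATTR,
    [DEOPT_LOAD_ATTR_MODULE]  = DEOPT_LOAD_ATTR,
    [DEOPT_BINARY_OP]         = DEOPT_BINARY_OP,
    [DEOPT_BINARY_OP_ADD_INT] = DEOPT_BINARY_OP,
};

/* Replaces the opcode byte of one code unit with its base opcode and
 * leaves the oparg byte (assumed to be the high byte) untouched. */
static uint16_t deopt_rewrite(uint16_t unit)
{
    uint8_t op = (uint8_t)(unit & 0xFF);
    assert(op < DEOPT_NUM_OPCODES);
    return (uint16_t)((unit & 0xFF00u) | deopt_table[op]);
}
```

Because base opcodes map to themselves in this sketch, deopt_rewrite is idempotent: applying it to an already generic instruction changes nothing, so callers do not need to check whether an instruction is specialized first.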

gopy notes

gopy's compile/ package maintains its own opcode metadata in compile/flowgraph_passes.go; the instruction format tables live in vm/eval_gen.go. The _PyOpcode_Caches data is replicated in compile/codegen_stmt.go as per-opcode cacheSize constants. When porting new opcodes, treat the generated header as the authoritative source for these constants.

CPython 3.14 changes

3.14 added cache entries for CALL_KW and extended _PyOpcode_macro_expansion to cover new superinstruction combinations. The instr_format encoding was widened from 4 to 5 bits to accommodate new instruction shapes introduced by the Tier-2 optimizer.