pycore_global_strings.h
Pre-interned string storage for CPython's runtime. Every identifier or
special literal that the interpreter needs repeatedly (dunder names, keyword
strings, codec names) is statically allocated once inside _PyRuntime, under
static_objects.singletons.strings, and accessed through the _Py_ID() or
_Py_STR() macros. No allocation happens at call sites; callers take the
address of a singleton (for example &_Py_ID(__init__)) that lives for the
lifetime of the runtime.
Map
| Lines | Symbol | Role |
|---|---|---|
| 18–26 | STRUCT_FOR_ASCII_STR, STRUCT_FOR_STR, STRUCT_FOR_ID | Layout macros that embed a PyASCIIObject plus inline data |
| 31–823 | struct _Py_global_strings | Aggregate of literals and identifiers sub-structs, auto-generated |
| 33–56 | literals | Named special strings: <module>, utf-8, <lambda>, etc. |
| 58–814 | identifiers | Dunder names and other identifiers: __init__ through zstd_dict |
| 815–823 | ascii[128], latin1[128] | Single-character string tables for code points 0–127 and 128–255 |
| 830–831 | _Py_ID(NAME) | Macro naming the embedded PyObject for a known identifier; call sites write &_Py_ID(NAME) to get a PyObject* |
| 832–833 | _Py_STR(NAME) | Macro naming the embedded PyObject for a known literal; call sites write &_Py_STR(NAME) to get a PyObject* |
| 834–837 | _Py_LATIN1_CHR(CH) | Macro returning a pre-interned single-character string |
| 849 | _Py_DECLARE_STR(name, str) | Documentation-only macro; expands to nothing |
Reading
Struct layout macros (lines 18–26)
Each string is stored as an anonymous struct embedding the full
PyASCIIObject header followed by inline character data. Because header and
data are members of the same struct, the compiler lays the characters out
directly after the header, which matches the layout of a compact ASCII str
and needs no separate allocation.
// CPython: Include/internal/pycore_global_strings.h:18 STRUCT_FOR_ASCII_STR
#define STRUCT_FOR_ASCII_STR(LITERAL) \
    struct { \
        PyASCIIObject _ascii; \
        uint8_t _data[sizeof(LITERAL)]; \
    }
#define STRUCT_FOR_STR(NAME, LITERAL) \
    STRUCT_FOR_ASCII_STR(LITERAL) _py_ ## NAME;
#define STRUCT_FOR_ID(NAME) \
    STRUCT_FOR_ASCII_STR(#NAME) _py_ ## NAME;
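STRUCT_FOR_ID stringifies NAME with the # operator, so sizeof sizes the
struct's _data array to exactly the identifier text plus its NUL terminator
(9 bytes for __init__). STRUCT_FOR_STR takes the text as a separate LITERAL
argument, which covers strings such as utf-8 or <module> that are not valid
C identifiers.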
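The sizing rule can be checked outside CPython. What follows is a minimal sketch under stated assumptions, not the real header: FakeAsciiHeader is a hypothetical stand-in for PyASCIIObject, and every DEMO_ or demo_ name is invented for illustration. Only the macro shapes mirror the source.

```c
#include <assert.h>   /* for the checks shown in the usage note */
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for PyASCIIObject; the real header has more fields. */
typedef struct {
    size_t length;
    unsigned int ascii;
} FakeAsciiHeader;

/* Same shape as STRUCT_FOR_ASCII_STR: a header plus inline bytes. */
#define DEMO_STRUCT_FOR_ASCII_STR(LITERAL) \
    struct { \
        FakeAsciiHeader _ascii; \
        uint8_t _data[sizeof(LITERAL)]; \
    }

/* Same shape as STRUCT_FOR_ID: #NAME turns the token into a string literal,
   and _py_ ## NAME builds the member name. */
#define DEMO_STRUCT_FOR_ID(NAME) \
    DEMO_STRUCT_FOR_ASCII_STR(#NAME) _py_ ## NAME;

/* Two members, named _py___init__ and _py_zstd_dict after expansion. */
struct demo_identifiers {
    DEMO_STRUCT_FOR_ID(__init__)
    DEMO_STRUCT_FOR_ID(zstd_dict)
};

static struct demo_identifiers demo_ids;

/* Size in bytes of the inline buffer reserved for "__init__". */
size_t demo_init_data_size(void) {
    return sizeof(demo_ids._py___init__._data);
}

/* Size in bytes of the inline buffer reserved for "zstd_dict". */
size_t demo_zstd_dict_data_size(void) {
    return sizeof(demo_ids._py_zstd_dict._data);
}

/* Byte offset of the inline data from the start of its enclosing struct. */
size_t demo_init_data_offset(void) {
    return (size_t)((const char *)demo_ids._py___init__._data
                    - (const char *)&demo_ids._py___init__);
}
```

In this model demo_init_data_size() is 9 (eight characters plus the NUL), demo_zstd_dict_data_size() is 10, and demo_init_data_offset() equals sizeof(FakeAsciiHeader), because a uint8_t array needs no alignment padding after the header.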
Access macros (lines 830–837)
// CPython: Include/internal/pycore_global_strings.h:830 _Py_ID
#define _Py_ID(NAME) \
    (_Py_SINGLETON(strings.identifiers._py_ ## NAME._ascii.ob_base))
#define _Py_STR(NAME) \
    (_Py_SINGLETON(strings.literals._py_ ## NAME._ascii.ob_base))
#define _Py_LATIN1_CHR(CH) \
    ((CH) < 128 \
     ? (PyObject*)&_Py_SINGLETON(strings).ascii[(CH)] \
     : (PyObject*)&_Py_SINGLETON(strings).latin1[(CH) - 128])
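_Py_SINGLETON(NAME) expands to _PyRuntime.static_objects.singletons.NAME, so
_Py_ID(__init__) is a direct field access into the global runtime struct,
resolved at compile time, not a hash-table lookup. The macro names the
embedded PyObject itself, so call sites write &_Py_ID(__init__) to obtain a
PyObject*.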
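The access pattern can be modelled in a few lines of stand-alone C. This is a sketch under stated assumptions, not CPython code: FakeObject and FakeAscii are hypothetical stand-ins for PyObject and PyASCIIObject, demo_rt plays the role of _PyRuntime with the nesting flattened, and both character tables use the same simplified element type. All DEMO_ and demo_ names are invented for illustration.

```c
#include <assert.h>   /* for the checks shown in the usage note */
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for PyObject and PyASCIIObject. */
typedef struct { int refcnt; } FakeObject;
typedef struct { FakeObject ob_base; size_t length; } FakeAscii;

#define DEMO_STRUCT_FOR_ID(NAME) \
    struct { FakeAscii _ascii; uint8_t _data[sizeof(#NAME)]; } _py_ ## NAME;

/* Hypothetical miniature of the statically allocated string storage. */
struct demo_runtime {
    struct {
        struct {
            DEMO_STRUCT_FOR_ID(__init__)
            DEMO_STRUCT_FOR_ID(__name__)
        } identifiers;
        struct { FakeAscii _ascii;  uint8_t _data[2]; } ascii[128];
        struct { FakeAscii _latin1; uint8_t _data[2]; } latin1[128];
    } strings;
};

static struct demo_runtime demo_rt;

#define DEMO_SINGLETON(NAME) demo_rt.NAME

/* Same shape as _Py_ID: token pasting selects a struct member by name. */
#define DEMO_ID(NAME) \
    (DEMO_SINGLETON(strings.identifiers._py_ ## NAME._ascii.ob_base))

/* Same shape as _Py_LATIN1_CHR: pick a table, then index into it. */
#define DEMO_LATIN1_CHR(CH) \
    ((CH) < 128 \
     ? (FakeObject *)&DEMO_SINGLETON(strings).ascii[(CH)] \
     : (FakeObject *)&DEMO_SINGLETON(strings).latin1[(CH) - 128])

/* Address of the __init__ singleton, obtained the way a call site would. */
FakeObject *demo_get_init(void) {
    return &DEMO_ID(__init__);
}

/* Index within latin1[] that DEMO_LATIN1_CHR selects for ch (0..255),
   or -1 when the character is served from ascii[] instead. */
int demo_latin1_slot(int ch) {
    FakeObject *p = DEMO_LATIN1_CHR(ch);
    for (int i = 0; i < 128; i++) {
        if (p == (FakeObject *)&demo_rt.strings.latin1[i]) {
            return i;
        }
    }
    return -1;
}
```

Here demo_get_init() always yields the same address, fixed when the program is built, and demo_latin1_slot(0xE9) is 105 (0xE9 is 233, minus 128) while demo_latin1_slot('A') is -1 because 'A' falls in the ASCII table.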