`Objects/longobject.c`

cpython 3.14 @ ab2d84fe1023/Objects/longobject.c

CPython integers use a variable-length digit array. Each digit is 30 bits wide on 64-bit platforms. Small integers (-5 to 256) are cached as immortal singletons at interpreter startup. In 3.12 and later, compact integers (those that fit in one digit) use a two-word layout where the digit is stored inline in ob_digit[0] without a separate heap allocation. The sign is encoded in the lv_tag field of PyLongObject. Arithmetic dispatches through the nb_* numeric slots. String conversion uses Knuth-style radix algorithms. The hash reduces the integer modulo _PyHASH_MODULUS (2^61 - 1 on 64-bit) so that equal integers and floats always produce equal hashes.

Map

Lines	Symbol	Role	gopy
1-200	`_PyLong_IsNonNegativeCompact`, `_PyLong_IsCompact`, `_PyLong_CompactValue`, digit layout macros	Tag-bit layout and inline compact-int accessors.	`objects/long.go`
200-600	`_PyLong_New`, `_PyLong_Copy`, `PyLong_FromLong`, `PyLong_FromUnsignedLong`, `PyLong_FromDouble`, `PyLong_FromVoidPtr`	Allocation and construction from C types.	`objects/long.go:NewLong`
600-1200	`PyLong_AsLong`, `PyLong_AsUnsignedLong`, `PyLong_AsLongLong`, `PyLong_AsDouble`, `PyLong_AsVoidPtr`, `_PyLong_AsInt`	Extraction to C numeric types with overflow checking.	`objects/long.go`
1200-1800	`PyLong_FromString`, `PyLong_FromUnicodeObject`	String parsing with arbitrary radix.	`objects/long.go:LongFromString`
1800-2400	`long_to_decimal_string`, `long_format`	Number-to-string conversion for `repr` and `str`.	`objects/long.go:longFormat`
2400-3000	`long_add`, `long_sub`, `long_mul`, `long_neg`, `long_abs`	Digit-level addition, subtraction, multiplication, negation, absolute value.	`objects/long.go:longAdd`
3000-3600	`long_divrem`, `long_div`, `long_mod`, `long_divmod`	Multi-precision division using Knuth Algorithm D.	`objects/long.go:longDivrem`
3600-4200	`long_pow`	Binary exponentiation (square-and-multiply).	`objects/long.go:longPow`
4200-4800	`long_lshift`, `long_rshift`, `long_and`, `long_xor`, `long_or`	Bit operations on the digit array.	`objects/long.go:longLshift`
4800-5400	`long_richcompare`, `long_hash`	Comparison and hash; hash reduces mod 2^61-1.	`objects/long.go:longHash`
5400-6871	`long_repr`, `long_new`, `_PyLong_Init`, `PyUnstable_Long_IsCompact`, `PyLong_GetSign`	Repr, constructor, small-int cache init, public compact-int API.	`objects/long.go:longRepr`

Reading

Compact integers (lines 1 to 200)

cpython 3.14 @ ab2d84fe1023/Objects/longobject.c#L1-200

Since 3.12, integers whose absolute value is less than 2^30 use a two-word PyLongObject layout. The digit value is stored directly in ob_digit[0] and the sign is encoded in lv_tag. No separate heap allocation is needed for the digit array.

static inline int
_PyLong_IsCompact(const PyLongObject *op) {
    return op->long_value.lv_tag < (2 << NON_SIZE_BITS);
}

static inline Py_ssize_t
_PyLong_CompactValue(const PyLongObject *op) {
    Py_ssize_t sign = 1 - (op->long_value.lv_tag & 3);
    return sign * (Py_ssize_t)op->long_value.ob_digit[0];
}

The bottom two bits of lv_tag encode the sign: 0 means positive, 1 means zero, and 2 means negative, so 1 - (lv_tag & 3) yields +1, 0, or -1. The bits from NON_SIZE_BITS upward hold the digit count for every integer, compact or not, which is why the compact check (fewer than two digits) is a single unsigned compare.
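The tag arithmetic can be tried outside CPython. The following is a minimal standalone sketch, not the CPython source: make_tag, tag_is_compact, tag_sign, and compact_value are hypothetical names, the tag is modeled as a bare uintptr_t, and NON_SIZE_BITS is assumed to be 3.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the real macros; NON_SIZE_BITS = 3 is assumed. */
#define SIGN_MASK      3u   /* bottom two bits of the tag hold the sign code */
#define NON_SIZE_BITS  3    /* the digit count starts above these bits */
#define SIGN_POSITIVE  0u
#define SIGN_ZERO      1u
#define SIGN_NEGATIVE  2u

/* Pack a digit count and a sign code into one tag word. */
static uintptr_t
make_tag(uintptr_t ndigits, unsigned sign_code)
{
    assert(sign_code <= SIGN_NEGATIVE);
    return (ndigits << NON_SIZE_BITS) | sign_code;
}

/* Compact means fewer than two digits: a single unsigned compare. */
static int
tag_is_compact(uintptr_t tag)
{
    return tag < ((uintptr_t)2 << NON_SIZE_BITS);
}

/* Map sign codes 0, 1, 2 to +1, 0, -1. */
static int
tag_sign(uintptr_t tag)
{
    return 1 - (int)(tag & SIGN_MASK);
}

/* Value of a compact integer given its tag and its single digit. */
static long
compact_value(uintptr_t tag, uint32_t digit0)
{
    assert(tag_is_compact(tag));
    return (long)tag_sign(tag) * (long)digit0;
}
```

With these definitions, compact_value(make_tag(1, SIGN_NEGATIVE), 42) evaluates to -42, and a tag built with two digits fails tag_is_compact.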

Small-int cache: `_PyLong_Init` (lines 5400 to 6871)

cpython 3.14 @ ab2d84fe1023/Objects/longobject.c#L5400-6871

At interpreter startup _PyLong_Init pre-allocates integer objects for the range -5 to 256. These objects are immortal: their reference counts are never decremented. PyLong_FromLong checks the requested value against this range and returns the cached pointer without touching the allocator:

#define NSMALLNEGINTS 5
#define NSMALLPOSINTS 257

static PyLongObject small_ints[NSMALLNEGINTS + NSMALLPOSINTS];

PyObject *
PyLong_FromLong(long ival)
{
    if (ival >= -NSMALLNEGINTS && ival < NSMALLPOSINTS) {
        PyLongObject *v = &small_ints[ival + NSMALLNEGINTS];
        return (PyObject *)Py_NewRef(v);
    }
    /* slow path: allocate */
    ...
}

The cache covers the most common loop counters, boolean-like integers, and ASCII code points. Returning a cached pointer avoids both allocation and deallocation for the lifetime of the interpreter.
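The lookup pattern is easy to reproduce in isolation. The sketch below assumes a hypothetical BoxedInt type in place of PyLongObject and hypothetical helpers boxed_init, boxed_from_long, and boxed_release; it illustrates the cache idea only and says nothing about how CPython lays out or initializes its own table.

```c
#include <assert.h>
#include <stdlib.h>

#define NSMALLNEGINTS 5
#define NSMALLPOSINTS 257

/* Hypothetical boxed integer; stands in for PyLongObject. */
typedef struct {
    long value;
    int cached;     /* 1 if the object lives in the static cache */
} BoxedInt;

static BoxedInt small_ints[NSMALLNEGINTS + NSMALLPOSINTS];
static int cache_ready = 0;

/* Fill the cache once; plays the role the text assigns to _PyLong_Init. */
static void
boxed_init(void)
{
    for (int i = 0; i < NSMALLNEGINTS + NSMALLPOSINTS; i++) {
        small_ints[i].value = (long)i - NSMALLNEGINTS;
        small_ints[i].cached = 1;
    }
    cache_ready = 1;
}

/* Return the cached object for -5..256, otherwise heap-allocate. */
static BoxedInt *
boxed_from_long(long ival)
{
    assert(cache_ready);
    if (ival >= -NSMALLNEGINTS && ival < NSMALLPOSINTS) {
        return &small_ints[ival + NSMALLNEGINTS];
    }
    BoxedInt *v = malloc(sizeof(*v));
    if (v == NULL) {
        return NULL;
    }
    v->value = ival;
    v->cached = 0;
    return v;
}

/* Release an object; cached objects are never freed. */
static void
boxed_release(BoxedInt *v)
{
    if (v != NULL && !v->cached) {
        free(v);
    }
}
```

Two calls to boxed_from_long(256) return the same pointer, while two calls with 257 return distinct heap objects that the caller must release.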

`long_add` (lines 2400 to 2600)

cpython 3.14 @ ab2d84fe1023/Objects/longobject.c#L2400-2600

Addition dispatches on whether the two operands have the same sign. For same-sign operands it calls x_add, which walks the digit arrays from the least-significant digit to the most-significant, adding corresponding digits plus any carry. For different-sign operands it subtracts the smaller absolute value from the larger and takes the sign of the larger:

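static PyLongObject *
x_add(PyLongObject *a, PyLongObject *b)
{
    /* Digit counts come from lv_tag; Py_SIZE is not valid on ints in 3.12+. */
    Py_ssize_t size_a = _PyLong_DigitCount(a);
    Py_ssize_t size_b = _PyLong_DigitCount(b);
    digit carry = 0;
    Py_ssize_t i;

    /* Ensure a is the longer operand; the second loop reads only a. */
    if (size_a < size_b) {
        PyLongObject *tmp = a; a = b; b = tmp;
        Py_ssize_t size_tmp = size_a; size_a = size_b; size_b = size_tmp;
    }
    /* One extra digit for a possible carry out of the top. */
    PyLongObject *z = _PyLong_New(size_a + 1);
    if (z == NULL) {
        return NULL;
    }
    /* Digits present in both operands. */
    for (i = 0; i < size_b; ++i) {
        carry += a->long_value.ob_digit[i] + b->long_value.ob_digit[i];
        z->long_value.ob_digit[i] = carry & PyLong_MASK;
        carry >>= PyLong_SHIFT;
    }
    /* Remaining digits of the longer operand. */
    for (; i < size_a; ++i) {
        carry += a->long_value.ob_digit[i];
        z->long_value.ob_digit[i] = carry & PyLong_MASK;
        carry >>= PyLong_SHIFT;
    }
    z->long_value.ob_digit[i] = carry;
    return long_normalize(z);
}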

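PyLong_SHIFT is 30 on 64-bit platforms, so PyLong_MASK is (1 << 30) - 1. The sum of two digits plus a carry is at most 2^31 - 1, which still fits in a 32-bit digit. long_normalize trims zero digits from the most-significant end and updates the digit count stored in lv_tag.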
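The carry propagation can be exercised on plain arrays. This is a standalone sketch under stated assumptions: digits_add is a hypothetical name, digits are uint32_t values holding 30 bits each in little-endian order, and the caller supplies the output buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SHIFT 30
#define MASK  (((uint32_t)1 << SHIFT) - 1)

/*
 * Add two magnitudes stored as little-endian arrays of 30-bit digits.
 * out must have room for max(na, nb) + 1 digits.
 * Returns the number of significant digits written.
 */
static size_t
digits_add(const uint32_t *a, size_t na,
           const uint32_t *b, size_t nb,
           uint32_t *out)
{
    /* Make a the longer operand so the second loop only reads a. */
    if (na < nb) {
        const uint32_t *tp = a; a = b; b = tp;
        size_t tn = na; na = nb; nb = tn;
    }
    uint32_t carry = 0;
    size_t i;
    for (i = 0; i < nb; i++) {
        assert(a[i] <= MASK && b[i] <= MASK);
        carry += a[i] + b[i];      /* at most 2^31 - 1, fits in 32 bits */
        out[i] = carry & MASK;
        carry >>= SHIFT;
    }
    for (; i < na; i++) {
        assert(a[i] <= MASK);
        carry += a[i];
        out[i] = carry & MASK;
        carry >>= SHIFT;
    }
    out[i] = carry;
    /* Trim zero digits at the top, in the spirit of long_normalize. */
    size_t n = na + 1;
    while (n > 0 && out[n - 1] == 0) {
        n--;
    }
    return n;
}
```

Adding the one-digit values 2^30 - 1 and 1 produces the two-digit result {0, 1}, which is 2^30: the low digit wraps to zero and the carry becomes the new top digit.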

`long_divrem`: Knuth Algorithm D (lines 3000 to 3200)

cpython 3.14 @ ab2d84fe1023/Objects/longobject.c#L3000-3200

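Multi-precision division uses Knuth's Algorithm D (TAOCP Vol. 2, Section 4.3.1). The divisor is left-shifted until its most-significant digit is at least 2^29, that is, until the top bit of its 30-bit top digit is set, and the dividend is shifted by the same amount. This normalization bounds the error of each trial quotient digit: at most two too large before Knuth's refinement test, and at most one too large after it. After division the remainder is right-shifted by the same amount. The single-digit helper, inplace_divrem1, is the simplest piece to read first: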

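static digit
inplace_divrem1(digit *pout, digit *pin, Py_ssize_t size, digit n)
{
    twodigits rem = 0;   /* running remainder, always < n between steps */

    /* Walk from the most-significant digit down to the least. */
    pin += size;
    pout += size;
    while (--size >= 0) {
        digit hi;
        rem = (rem << PyLong_SHIFT) | *--pin;   /* bring down the next digit */
        *--pout = hi = (digit)(rem / n);        /* quotient digit */
        rem -= (twodigits)hi * n;               /* remainder for the next step */
    }
    return (digit)rem;
}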

Single-digit divisors use this fast path. For multi-digit divisors the full Algorithm D loop normalizes, computes trial digits, and corrects by at most one step.
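The normalization step described above can be sketched on plain arrays. This is an illustration under assumptions, not the CPython code: digit_bit_length, normalization_shift, and digits_lshift are hypothetical names, and digits are uint32_t values holding 30 bits each in little-endian order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SHIFT 30
#define MASK  (((uint32_t)1 << SHIFT) - 1)

/* Number of bits needed to represent d (0 for d == 0). */
static int
digit_bit_length(uint32_t d)
{
    int bits = 0;
    while (d != 0) {
        bits++;
        d >>= 1;
    }
    return bits;
}

/* Shift count that moves the top digit's highest set bit to bit 29. */
static int
normalization_shift(uint32_t top_digit)
{
    assert(top_digit != 0 && top_digit <= MASK);
    return SHIFT - digit_bit_length(top_digit);
}

/*
 * Shift a little-endian digit array left by d bits (0 <= d < SHIFT).
 * Returns the bits shifted out of the top digit.
 */
static uint32_t
digits_lshift(uint32_t *out, const uint32_t *in, size_t n, int d)
{
    assert(d >= 0 && d < SHIFT);
    uint32_t carry = 0;
    for (size_t i = 0; i < n; i++) {
        /* Widen to 64 bits so the shifted digit cannot overflow. */
        uint64_t acc = ((uint64_t)in[i] << d) | carry;
        out[i] = (uint32_t)(acc & MASK);
        carry = (uint32_t)(acc >> SHIFT);
    }
    return carry;
}
```

For a divisor with digits {0, 3}, normalization_shift(3) is 28 and the shifted top digit becomes 3 << 28, which is at least 2^29. Shifting the divisor by this count never produces a carry out of the top digit; shifting the dividend can, which is why the dividend needs one extra digit of room.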

`long_hash` (lines 4800 to 5000)

cpython 3.14 @ ab2d84fe1023/Objects/longobject.c#L4800-5000

The hash of an integer equals hash() of the float with the same value. To achieve this, long_hash reduces the magnitude modulo M = _PyHASH_MODULUS (2^61 - 1 on 64-bit). Each 30-bit digit at position i contributes digit * 2^(30*i) mod M. Because 2^61 ≡ 1 (mod M), multiplying the accumulator by 2^30 modulo M is a 30-bit left rotation within 61 bits, so the loop needs no division:

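static Py_hash_t
long_hash(PyLongObject *v)
{
    Py_uhash_t x = 0;
    Py_ssize_t i = _PyLong_DigitCount(v);
    while (--i >= 0) {
        /* x = x * 2^30 mod M, done as a rotate within _PyHASH_BITS bits */
        x = ((x << PyLong_SHIFT) & _PyHASH_MODULUS) |
            (x >> (_PyHASH_BITS - PyLong_SHIFT));
        x += v->long_value.ob_digit[i];
        if (x >= (Py_uhash_t)_PyHASH_MODULUS)
            x -= _PyHASH_MODULUS;
    }
    /* hash(-n) == -hash(n): negate the result, do not take M - x */
    if (_PyLong_IsNegative(v))
        x = (Py_uhash_t)0 - x;
    /* -1 is reserved as the error return of tp_hash */
    if (x == (Py_uhash_t)-1)
        x = (Py_uhash_t)-2;
    return (Py_hash_t)x;
}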

The rotate-and-add loop accumulates digits from most-significant to least-significant. The final check avoids returning -1, which CPython reserves as the error sentinel for tp_hash.
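The reduction is easy to check on small inputs. The sketch below is standalone and makes its own assumptions: digits_mod_mersenne61 and int_hash are hypothetical names, the magnitude is a little-endian array of 30-bit digits, and the sign is passed as a separate flag.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SHIFT     30
#define HASH_BITS 61
#define HASH_MOD  ((((uint64_t)1) << HASH_BITS) - 1)

/*
 * Reduce a magnitude (little-endian 30-bit digits) modulo M = 2^61 - 1.
 * Multiplying by 2^30 becomes a 61-bit rotate because 2^61 is
 * congruent to 1 modulo M.
 */
static uint64_t
digits_mod_mersenne61(const uint32_t *digits, size_t n)
{
    uint64_t x = 0;
    while (n-- > 0) {
        assert(digits[n] < ((uint32_t)1 << SHIFT));
        /* Rotate x left by SHIFT within HASH_BITS bits. */
        x = ((x << SHIFT) & HASH_MOD) | (x >> (HASH_BITS - SHIFT));
        x += digits[n];
        if (x >= HASH_MOD) {
            x -= HASH_MOD;
        }
    }
    return x;
}

/* Signed hash: negate for negative inputs, then avoid the -1 sentinel. */
static int64_t
int_hash(const uint32_t *digits, size_t n, int negative)
{
    int64_t h = (int64_t)digits_mod_mersenne61(digits, n);
    if (negative) {
        h = -h;
    }
    return h == -1 ? -2 : h;
}
```

The three-digit value {0x3FFFFFFF, 0x3FFFFFFF, 1} is 2^61 - 1 and reduces to 0, while {0, 0, 2} is 2^61 and reduces to 1. A negative one hashes to -2 because -1 is reserved.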

gopy mirror

objects/long.go. Compact representation is preserved. Digit width is 30 bits. The small-int cache is a []*Long slice indexed by value + 5. Arithmetic delegates to math/big.Int for values outside the compact fast path, with the digit-level algorithms ported directly for the hot cases.

CPython 3.14 changes

The compact integer layout has been stable since 3.12. PyUnstable_Long_IsCompact and PyUnstable_Long_CompactValue, both available since 3.12, let C extensions read compact integers without touching the internal layout; as PyUnstable functions they may change between minor releases. PyLong_GetSign was added in 3.14: it stores -1, 0, or 1 through an int * out-parameter and returns 0 on success, or -1 with an exception set on failure. PEP 649 has no effect on integer objects.
