Skip to main content

Lib/random.py (part 2)

Source:

cpython 3.14 @ ab2d84fe1023/Lib/random.py

This annotation covers sampling and distribution methods. See lib_random_detail for Random.__init__, Mersenne Twister, random(), randint, seed, and getstate/setstate.

Map

LinesSymbolRole
1-100random.sampleSample k unique elements from a population
101-220random.choicesSample with replacement, optional weights
221-360random.shuffleShuffle a list in place (Fisher-Yates)
361-480random.gaussGaussian distribution via the Marsaglia-Bray algorithm
481-600SystemRandomOS-provided randomness via os.urandom

Reading

random.sample

# CPython: Lib/random.py:380 sample
def sample(self, population, k, *, counts=None):
"""Return k unique elements from population (no replacement)."""
if counts is not None:
population = list(_chain.from_iterable(_repeat(v, c) for v, c in zip(population, counts)))
n = len(population)
if not 0 <= k <= n:
raise ValueError("Sample larger than population or is negative")
# Small k: pick and exclude selected indices
result = [None] * k
setsize = 21 # Cutoff between two algorithms
if k > 5:
setsize += 4 ** ceil(log(k * 3, 4))
pool = list(population)
for i in range(k):
j = int(self.random() * (n - i))
result[i] = pool[j]
pool[j] = pool[n - i - 1] # Partial Fisher-Yates
return result

random.sample uses a partial Fisher-Yates shuffle when k is small relative to n. For larger k, it uses a set to track selected indices. The counts parameter allows sampling from a population with repetitions.

random.choices

# CPython: Lib/random.py:480 choices
def choices(self, population, weights=None, *, cum_weights=None, k=1):
"""Return k elements with replacement; weights adjust probability."""
if cum_weights is None:
if weights is None:
n = len(population)
return [population[int(self.random() * n)] for _ in _repeat(None, k)]
cum_weights = list(_accumulate(weights))
total = cum_weights[-1]
bisect = _bisect
return [population[bisect(cum_weights, self.random() * total)]
for _ in _repeat(None, k)]

Weighted sampling uses cumulative weights and binary search (bisect.bisect). choices(['a','b','c'], weights=[1,2,3]) samples 'c' ~50% of the time.

random.gauss

# CPython: Lib/random.py:580 gauss
def gauss(self, mu=0.0, sigma=1.0):
"""Gaussian distribution. Uses the Marsaglia polar method.
Faster than normalvariate; not thread-safe due to gauss_next caching."""
z = self.gauss_next
self.gauss_next = None
if z is None:
x2pi = self.random() * TWOPI
g2rad = _sqrt(-2.0 * _log(1.0 - self.random()))
z = _cos(x2pi) * g2rad
self.gauss_next = _sin(x2pi) * g2rad
return mu + z * sigma

The polar method generates two Gaussian values per call and caches the second (gauss_next). This is why gauss() is not thread-safe: two threads sharing a Random instance may both read and write gauss_next.

gopy notes

random.sample is module/random.Sample in module/random/module.go. It wraps Go's math/rand for the partial Fisher-Yates shuffle. random.choices uses sort.Search on cumulative weights. random.gauss caches the second value in module/random.Random.gaussNext. SystemRandom uses crypto/rand.