Lib/hashlib.py

cpython 3.14 @ ab2d84fe1023/Lib/hashlib.py

hashlib is a thin dispatcher. It tries to source each hash algorithm from _hashlib (the OpenSSL binding built at compile time). When an algorithm is missing from _hashlib, it falls back to the dedicated C extension modules (_md5, _sha1, _sha2, _sha3, _blake2) and raises ValueError if none of them provides it. The public surface is hashlib.new(name, data=b"", **kwargs), a set of constructor shortcuts (md5, sha1, sha256, sha512, blake2b, blake2s, and the other guaranteed algorithms), algorithms_guaranteed, algorithms_available, pbkdf2_hmac, scrypt, and file_digest.

Map

| Lines | Symbol | Role | gopy |
| --- | --- | --- | --- |
| 1-100 | __get_builtin_constructor, __get_openssl_constructor, __get_hash, new | Algorithm lookup chain: try _hashlib.new, fall back to __get_builtin_constructor, which imports the algorithm-specific C module on demand. | module/hashlib/ (pending) |
| 100-200 | algorithms_guaranteed, algorithms_available, constructor shortcuts | algorithms_guaranteed is a set built from a tuple of names hardcoded in the source; algorithms_available is built at import time by adding the names in _hashlib.openssl_md_meth_names. | module/hashlib/ (pending) |
| 200-300 | pbkdf2_hmac, scrypt, file_digest | pbkdf2_hmac and scrypt delegate to _hashlib when available. file_digest reads a file-like object in 262144-byte (2**18) chunks and feeds them to the hash object's update(). | module/hashlib/ (pending) |
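
To make the pbkdf2_hmac row more concrete, the sketch below cross-checks hashlib.pbkdf2_hmac against a plain-Python PBKDF2 loop built on the hmac module. The helper pbkdf2_reference and the sample password, salt, and iteration count are hypothetical and chosen only for illustration; the sketch assumes a build in which pbkdf2_hmac is available.

```python
import hashlib
import hmac


def pbkdf2_reference(hash_name, password, salt, iterations, dklen):
    """Slow, plain-Python PBKDF2 used only to cross-check the fast path."""
    digest_size = hashlib.new(hash_name).digest_size
    derived = b""
    block_index = 1
    while len(derived) < dklen:
        # U_1 = HMAC(password, salt || INT_32_BE(block_index))
        u = hmac.new(password, salt + block_index.to_bytes(4, "big"), hash_name).digest()
        t = int.from_bytes(u, "big")
        # U_n = HMAC(password, U_{n-1}); the block is the XOR of all U_n.
        for _ in range(iterations - 1):
            u = hmac.new(password, u, hash_name).digest()
            t ^= int.from_bytes(u, "big")
        derived += t.to_bytes(digest_size, "big")
        block_index += 1
    return derived[:dklen]


# Illustrative inputs only; real code needs a random salt and far more iterations.
fast = hashlib.pbkdf2_hmac("sha256", b"password", b"salt", 1000, dklen=40)
slow = pbkdf2_reference("sha256", b"password", b"salt", 1000, 40)
keys_match = fast == slow
```

The two results should agree byte for byte; the practical difference is speed, which is why delegating to the OpenSSL-backed implementation matters for realistic iteration counts.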

Reading

hashlib.new algorithm lookup (lines 1 to 100)

cpython 3.14 @ ab2d84fe1023/Lib/hashlib.py#L1-100

def __get_builtin_constructor(name):
    cache = __builtin_constructor_cache
    constructor = cache.get(name)
    if constructor is not None:
        return constructor
    try:
        if name in {'SHA1', 'sha1'}:
            import _sha1
            cache['SHA1'] = cache['sha1'] = _sha1.sha1
        elif name in {'MD5', 'md5'}:
            import _md5
            cache['MD5'] = cache['md5'] = _md5.md5
        ...
    except ImportError:
        pass
    constructor = cache.get(name)
    if constructor is not None:
        return constructor
    raise ValueError('unsupported hash type ' + name)

def new(name, data=b'', **kwargs):
    return __get_hash(name, data, **kwargs)

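The new() excerpt above is condensed. In the source, new is bound to __hash_new when _hashlib imports successfully and to __py_new otherwise. __hash_new first calls _hashlib.new(name, ...); if that raises ValueError (the algorithm is not available from OpenSSL), it calls __get_builtin_constructor(name) to obtain a constructor from one of the algorithm-specific C modules. Resolved constructors are stored in __builtin_constructor_cache, so subsequent lookups skip the import. Top-level names like hashlib.md5 and hashlib.sha256 are created at import time by calling __get_hash once per supported name; __get_hash is __get_openssl_constructor when _hashlib is available and __get_builtin_constructor otherwise.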
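
The observable effect of this lookup chain can be sketched from the caller's side. The helper try_new below is hypothetical and not part of hashlib; it only wraps hashlib.new to show that a known name yields the same digest as its constructor shortcut and that an unknown name ends in ValueError.

```python
import hashlib


def try_new(name, data=b""):
    """Return a hash object for name, or None if no backend supports it."""
    try:
        return hashlib.new(name, data)
    except ValueError:
        # Reached only after every backend has failed to supply the algorithm.
        return None


via_new = try_new("sha256", b"abc")
via_shortcut = hashlib.sha256(b"abc")
same_digest = via_new.hexdigest() == via_shortcut.hexdigest()

# A made-up algorithm name falls through the whole chain.
missing = try_new("no-such-hash")
```

Which backend actually served the request is not visible here, and that is the point of the dispatcher: callers see one interface regardless of where the constructor came from.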

algorithms_available and algorithms_guaranteed (lines 100 to 200)

cpython 3.14 @ ab2d84fe1023/Lib/hashlib.py#L100-200

algorithms_guaranteed is a set of the names the module guarantees on all platforms, regardless of OpenSSL version: md5, sha1, sha224, sha256, sha384, sha512, sha3_224, sha3_256, sha3_384, sha3_512, shake_128, shake_256, blake2b, and blake2s. algorithms_available starts as a copy of the same names and is then extended with every name in _hashlib.openssl_md_meth_names, so it is always a superset of algorithms_guaranteed; the extra entries are whatever the linked OpenSSL offers. Membership in either set does not guarantee that a constructor succeeds: on FIPS-restricted builds, md5 and sha1 can be disabled in the OpenSSL provider while the built-in C fallback still supplies them.
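
A small sketch of how the two sets might be used when choosing an algorithm at runtime. The function pick_algorithm and the preference list are hypothetical, written only to illustrate the superset relationship described above.

```python
import hashlib


def pick_algorithm(preferred):
    """Return the first name in preferred that this build advertises."""
    for name in preferred:
        if name in hashlib.algorithms_available:
            return name
    raise ValueError("none of the preferred algorithms is available")


# Names contributed only by the OpenSSL binding; may be empty on some builds.
openssl_only = sorted(hashlib.algorithms_available - hashlib.algorithms_guaranteed)

# The made-up name is skipped; sha3_256 is in the guaranteed set.
chosen = pick_algorithm(["no-such-hash", "sha3_256", "sha256"])
```

Since an advertised name can still be blocked by the active security policy, code that must survive restricted builds would likely still wrap the hashlib.new call in a try/except ValueError.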

file_digest chunked hashing (lines 200 to 300)

cpython 3.14 @ ab2d84fe1023/Lib/hashlib.py#L200-300

def file_digest(fileobj, digest, /, *, _bufsize=2**18):
    # digest is either a constructor or an algorithm name
    digestobj = digest() if callable(digest) else new(digest)
    if hasattr(fileobj, "getbuffer"):
        # io.BytesIO: hash the underlying buffer without copying
        digestobj.update(fileobj.getbuffer())
        return digestobj
    buf = bytearray(_bufsize)  # reused for every read
    view = memoryview(buf)
    while True:
        size = fileobj.readinto(view)
        if size == 0:
            break  # EOF
        digestobj.update(view[:size])
    return digestobj

file_digest accepts either an algorithm name or a callable (a hash constructor) as the digest argument. It avoids materialising the whole file in memory by calling readinto on a reused bytearray buffer; the private _bufsize default in the source is 2**18 bytes (256 KiB). For objects that expose getbuffer(), such as BytesIO, it takes the shortcut of hashing that buffer directly. The function was added in Python 3.11.
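
The readinto loop can be tried in isolation. The sketch below is a simplified stand-in, not the library's implementation: chunked_digest is a hypothetical name, the buffer sizes are arbitrary, and it takes only an algorithm name. It streams a BytesIO payload through a reused buffer and compares the result with a one-shot hash.

```python
import hashlib
import io


def chunked_digest(fileobj, name, bufsize=64 * 1024):
    """Hash a binary file-like object without reading it all at once."""
    digestobj = hashlib.new(name)
    buf = bytearray(bufsize)   # allocated once, reused for every read
    view = memoryview(buf)     # slicing a memoryview does not copy
    while True:
        size = fileobj.readinto(buf)
        if size is None:
            # Non-blocking stream with no data ready.
            raise BlockingIOError("I/O operation would block.")
        if size == 0:
            break              # EOF
        digestobj.update(view[:size])
    return digestobj


payload = b"x" * 200_000
# A small buffer forces many iterations of the loop.
streamed = chunked_digest(io.BytesIO(payload), "sha256", bufsize=4096)
one_shot = hashlib.sha256(payload)
digests_match = streamed.hexdigest() == one_shot.hexdigest()
```

On Python 3.11 and later, hashlib.file_digest(io.BytesIO(payload), "sha256") should produce the same digest; the hand-written loop is mainly useful for seeing why the buffer and the memoryview are allocated outside the loop.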

gopy mirror

The algorithms served by the OpenSSL binding (_hashlib) map cleanly to Go's crypto/sha256, crypto/sha512, crypto/md5, crypto/sha1, golang.org/x/crypto/sha3, golang.org/x/crypto/blake2b, and golang.org/x/crypto/blake2s. pbkdf2_hmac maps to golang.org/x/crypto/pbkdf2 and scrypt to golang.org/x/crypto/scrypt. file_digest is pure Python and ports directly. The main design question for gopy is whether to expose a single module/hashlib/ package that wraps all backends, or to mirror CPython's pattern of separate packages (module/_sha256/ and so on) assembled by a top-level dispatcher.