Lib/hashlib.py
cpython 3.14 @ ab2d84fe1023/Lib/hashlib.py
hashlib is a thin dispatcher. It tries to source every hash algorithm from
_hashlib (the OpenSSL binding built at compile time). When an algorithm is
missing from _hashlib it falls back to dedicated C extension modules
(_md5, _sha1, _sha2, _sha3, _blake2) or raises ValueError. The
public surface is hashlib.new(name, data=b"", **kwargs), a small set of
constructor shortcuts (md5, sha1, sha256, sha512, blake2b,
blake2s), algorithms_guaranteed, algorithms_available, pbkdf2_hmac,
scrypt, and file_digest.
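As a quick illustration of that public surface, the sketch below hashes the same bytes through the generic entry point and through a constructor shortcut, then probes an unsupported name. The payload and the name "no-such-hash" are arbitrary illustration values.

```python
import hashlib

payload = b"gopy"

# Generic entry point: the algorithm is selected by name at run time.
via_new = hashlib.new("sha256", payload).hexdigest()

# Constructor shortcut: the same algorithm, bound as a module attribute.
via_shortcut = hashlib.sha256(payload).hexdigest()

same_digest = via_new == via_shortcut

# An unknown name surfaces as ValueError once every backend has rejected it.
try:
    hashlib.new("no-such-hash")
    unknown_rejected = False
except ValueError:
    unknown_rejected = True
```

Both routes should yield the same hex digest, since they differ only in how the constructor is located.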
Map
| Lines | Symbol | Role | gopy |
|---|---|---|---|
| 1-100 | __get_builtin_constructor, __get_openssl_constructor, __get_hash, new | Algorithm lookup chain: try _hashlib.new, fall back to __get_builtin_constructor which imports the algorithm-specific C module on demand. | module/hashlib/ (pending) |
| 100-200 | algorithms_guaranteed, algorithms_available, constructor shortcuts | algorithms_guaranteed is a set built from the hardcoded __always_supported tuple; algorithms_available starts from the same names and is extended at import time with _hashlib.openssl_md_meth_names. | module/hashlib/ (pending) |
| 200-300 | pbkdf2_hmac, scrypt, file_digest | pbkdf2_hmac and scrypt delegate to _hashlib when available. file_digest reads a file-like object in 262144-byte (256 KiB) chunks and feeds them to an update-capable hash object. | module/hashlib/ (pending) |
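The pbkdf2_hmac row can be checked against a plain reimplementation. The sketch below is a comparison aid, not the code path hashlib uses: pbkdf2_reference is a hypothetical helper that follows the PBKDF2 construction from RFC 8018 on top of the standard hmac module, and the password, salt, and iteration count are arbitrary illustration values. It assumes a build where hashlib.pbkdf2_hmac is present, which normally means OpenSSL support.

```python
import hashlib
import hmac


def pbkdf2_reference(hash_name, password, salt, iterations, dklen):
    """Slow, readable PBKDF2 (hypothetical helper, for comparison only)."""
    digest_size = hashlib.new(hash_name).digest_size
    block_count = -(-dklen // digest_size)  # ceiling division
    derived = b""
    for block_index in range(1, block_count + 1):
        # U_1 = HMAC(password, salt || INT_32_BE(block_index))
        u = hmac.new(password, salt + block_index.to_bytes(4, "big"), hash_name).digest()
        acc = int.from_bytes(u, "big")
        for _ in range(iterations - 1):
            # U_n = HMAC(password, U_{n-1}); the block is the XOR of all U values.
            u = hmac.new(password, u, hash_name).digest()
            acc ^= int.from_bytes(u, "big")
        derived += acc.to_bytes(digest_size, "big")
    return derived[:dklen]


fast = hashlib.pbkdf2_hmac("sha256", b"password", b"salt", 1000, dklen=40)
slow = pbkdf2_reference("sha256", b"password", b"salt", 1000, 40)
```

A dklen of 40 is chosen so that the output spans two SHA-256 blocks and exercises the truncation step.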
Reading
hashlib.new algorithm lookup (lines 1 to 100)
cpython 3.14 @ ab2d84fe1023/Lib/hashlib.py#L1-100
def __get_builtin_constructor(name):
    cache = __builtin_constructor_cache
    constructor = cache.get(name)
    if constructor is not None:
        return constructor
    try:
        if name in {'SHA1', 'sha1'}:
            import _sha1
            cache['SHA1'] = cache['sha1'] = _sha1.sha1
        elif name in {'MD5', 'md5'}:
            import _md5
            cache['MD5'] = cache['md5'] = _md5.md5
        ...
    except ImportError:
        pass
    constructor = cache.get(name)
    if constructor is not None:
        return constructor
    raise ValueError('unsupported hash type ' + name)

def new(name, data=b'', **kwargs):
    return __get_hash(name, data, **kwargs)
new() tries the OpenSSL backend first by calling
_hashlib.new(name, ...). If that raises ValueError (algorithm not in
OpenSSL) it calls __get_builtin_constructor(name) to get a constructor
from one of the algorithm-specific C modules. The constructor is cached in
__builtin_constructor_cache under both its upper-case and lower-case
spellings, so subsequent calls skip the import. Top-level names like
hashlib.md5 and hashlib.sha256 are bound at import time: the module looks
each guaranteed name up through __get_hash and stores the resulting
constructor in its globals.
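The two-stage lookup with a constructor cache can be sketched in isolation. This is a toy model under stated assumptions, not the hashlib source: _PRIMARY stands in for the OpenSSL backend, _FALLBACK_LOADERS stands in for the on-demand imports of the algorithm-specific modules, and every name defined here (including the stats counter) is hypothetical.

```python
import hashlib

# Hypothetical stand-in for the OpenSSL backend: it only knows two names.
_PRIMARY = {"sha256": hashlib.sha256, "sha512": hashlib.sha512}

# Hypothetical stand-in for the lazily imported fallback modules.
_FALLBACK_LOADERS = {
    "blake2b": lambda: hashlib.blake2b,
    "sha3_256": lambda: hashlib.sha3_256,
}
_fallback_cache = {}
stats = {"loads": 0}  # counts how often a loader actually runs


def get_fallback_constructor(name):
    constructor = _fallback_cache.get(name)
    if constructor is not None:
        return constructor  # cache hit: no loader call
    loader = _FALLBACK_LOADERS.get(name)
    if loader is None:
        raise ValueError("unsupported hash type " + name)
    stats["loads"] += 1
    constructor = _fallback_cache[name] = loader()
    return constructor


def primary_new(name, data=b""):
    try:
        return _PRIMARY[name](data)
    except KeyError:
        raise ValueError("unsupported hash type " + name) from None


def new(name, data=b""):
    try:
        return primary_new(name, data)  # first choice
    except ValueError:
        return get_fallback_constructor(name)(data)  # second choice


first = new("blake2b", b"x").hexdigest()
second = new("blake2b", b"x").hexdigest()
```

After the two blake2b calls the loader should have run only once, which is the effect the cache is there to produce.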
algorithms_available and algorithms_guaranteed (lines 100 to 200)
cpython 3.14 @ ab2d84fe1023/Lib/hashlib.py#L100-200
algorithms_guaranteed is a set listing the names that every CPython
build is expected to provide, regardless of OpenSSL version:
md5, sha1, sha224, sha256, sha384, sha512,
sha3_224, sha3_256, sha3_384, sha3_512, shake_128, shake_256,
blake2b, and blake2s. algorithms_available is
built by starting with a copy of algorithms_guaranteed and then adding
every name in _hashlib.openssl_md_meth_names, so it is always a superset
and usually gains OpenSSL-only names. Membership does not promise that a
constructor succeeds: on FIPS-restricted builds the OpenSSL provider can
disable md5 and sha1 even though both names remain listed, and whether they
still work then depends on the built-in fallback modules being present.
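The relationship between the two sets can be inspected at run time. The sketch below only reads public attributes and makes no assumption about which extra names a given OpenSSL build contributes, so openssl_only, usable, and blocked will vary between machines.

```python
import hashlib

guaranteed = hashlib.algorithms_guaranteed
available = hashlib.algorithms_available

# Every guaranteed name is expected to appear in algorithms_available.
is_superset = guaranteed <= available

# Names contributed only by the OpenSSL backend (build dependent).
openssl_only = sorted(available - guaranteed)

# Being listed is not the same as being constructible, so probe each name.
usable, blocked = [], []
for name in sorted(guaranteed):
    try:
        hashlib.new(name)
    except ValueError:
        blocked.append(name)
    else:
        usable.append(name)
```

On a typical non-FIPS build blocked should come out empty; a non-empty list points at names that are listed but refused by the active backend.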
file_digest chunked hashing (lines 200 to 300)
cpython 3.14 @ ab2d84fe1023/Lib/hashlib.py#L200-300
def file_digest(fileobj, digest, /, *, _bufsize=2**18):
    digestobj = digest() if callable(digest) else new(digest)
    if hasattr(fileobj, "getbuffer"):
        # io.BytesIO: hash the underlying buffer without copying
        digestobj.update(fileobj.getbuffer())
        return digestobj
    buf = bytearray(_bufsize)  # reused across reads
    view = memoryview(buf)
    while True:
        size = fileobj.readinto(view)
        if size == 0:
            break  # EOF
        digestobj.update(view[:size])
    return digestobj
file_digest accepts either a name string or a callable (a hash
constructor) as the digest argument. It avoids materialising the whole
file in memory by using readinto with a reused bytearray buffer of 256
KiB (2**18 bytes). For BytesIO objects it takes the shortcut of calling
getbuffer() directly. The function was added in Python 3.11.
gopy mirror
The algorithms behind the OpenSSL binding (_hashlib) map cleanly to Go's
crypto/md5, crypto/sha1, crypto/sha256, crypto/sha512,
golang.org/x/crypto/sha3, golang.org/x/crypto/blake2b, and
golang.org/x/crypto/blake2s. pbkdf2_hmac maps to
golang.org/x/crypto/pbkdf2 and scrypt to golang.org/x/crypto/scrypt.
file_digest is pure Python and ports directly. The main design question
for gopy is whether to expose a single module/hashlib/ package that
wraps all backends, or to mirror CPython's pattern of separate module/_sha2/
etc. packages assembled by a top-level dispatcher.