Lib/zipfile/__init__.py

Source:

cpython 3.14 @ ab2d84fe1023/Lib/zipfile/__init__.py

zipfile reads and writes ZIP archives. ZipFile is the main class; ZipInfo holds per-entry metadata; ZipPath (the public zipfile.Path class, Python 3.8+) provides a pathlib-compatible interface to ZIP contents. The module handles stored (no compression), deflate, bzip2, lzma, and, from Python 3.14, zstandard compression.
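As a quick orientation, the sketch below writes a small in-memory archive and reads it back. The entry names and contents are made up for illustration; only public zipfile calls are used.

```python
import io
import zipfile

# Write two entries into an in-memory archive.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
    # Uses the archive-wide default (deflate).
    archive.writestr("docs/readme.txt", "hello zip\n" * 100)
    # Overrides the default for a single entry: stored, no compression.
    archive.writestr("raw.bin", b"\x00\x01\x02", compress_type=zipfile.ZIP_STORED)

# Reopen the same buffer for reading.
with zipfile.ZipFile(buffer) as archive:
    names = archive.namelist()
    info = archive.getinfo("docs/readme.txt")   # a ZipInfo instance
    text = archive.read("docs/readme.txt").decode("utf-8")
    raw = archive.read("raw.bin")
```

The ZipInfo returned by getinfo carries the metadata listed in the map below, such as compress_type, compress_size, and file_size.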

Map

| Lines | Symbol | Role |
| --- | --- | --- |
| 1-100 | Constants, compression methods | ZIP_STORED, ZIP_DEFLATED, ZIP_BZIP2, ZIP_LZMA, ZIP_ZSTANDARD |
| 101-400 | ZipInfo | Per-entry metadata: filename, dates, CRC, compress size, flags |
| 401-800 | ZipFile.__init__, _RealGetContents | Open and parse the central directory |
| 801-1200 | ZipFile.open, _open_to_write | Extract and compress entry streams |
| 1201-1600 | ZipFile.read, ZipFile.write, ZipFile.writestr | High-level read/write |
| 1601-1900 | ZipFile.extractall, ZipFile.extract | Extract to filesystem |
| 1901-2100 | ZipFile.close, _write_end_record | Write the central directory on close |
| 2101-2400 | ZipPath, Path | pathlib-compatible ZIP browsing |

Reading

Central directory parsing

ZIP archives keep their central directory at the end of the file, which lets new entries be appended without rewriting existing data. _RealGetContents locates the end-of-central-directory record (EOCD) by searching backward from the end of the file (the search itself lives in the helper _EndRecData), then reads the central directory headers to populate self.filelist and self.NameToInfo.

# CPython: Lib/zipfile/__init__.py:1398 ZipFile._RealGetContents
def _RealGetContents(self):
    ...
    # Find the EOCD signature
    size = os.path.getsize(self.filename)
    fp.seek(-22, 2)  # the EOCD is exactly 22 bytes when the archive has no comment
    data = fp.read()
    start = data.rfind(stringEndArchive)
    ...
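The backward search can be sketched outside of zipfile as follows. This is a minimal illustration, not the CPython code: find_eocd and its returned dictionary are hypothetical names, ZIP64 records are ignored, and a signature that happens to appear inside the comment is not guarded against. The struct layout is the standard 22-byte EOCD record.

```python
import io
import struct
import zipfile

EOCD_SIGNATURE = b"PK\x05\x06"
EOCD_FORMAT = "<4s4H2LH"                 # fixed part of the record
EOCD_SIZE = struct.calcsize(EOCD_FORMAT)  # 22 bytes
MAX_COMMENT = (1 << 16) - 1               # comment length is a 16-bit field


def find_eocd(fp):
    """Locate and decode the EOCD record of a seekable binary file (hypothetical helper)."""
    fp.seek(0, io.SEEK_END)
    file_size = fp.tell()
    # The record can sit at most EOCD_SIZE + MAX_COMMENT bytes from the end.
    window_start = max(file_size - EOCD_SIZE - MAX_COMMENT, 0)
    fp.seek(window_start)
    window = fp.read()
    pos = window.rfind(EOCD_SIGNATURE)
    if pos < 0 or len(window) - pos < EOCD_SIZE:
        raise ValueError("end-of-central-directory record not found")
    (_sig, _disk, _cd_disk, _entries_on_disk, entries_total,
     cd_size, cd_offset, comment_len) = struct.unpack(
        EOCD_FORMAT, window[pos:pos + EOCD_SIZE])
    return {
        "eocd_offset": window_start + pos,
        "entries": entries_total,
        "cd_size": cd_size,
        "cd_offset": cd_offset,
        "comment_len": comment_len,
    }


# Build a small archive with a trailing comment and inspect its EOCD.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("a.txt", "alpha")
    archive.writestr("b.txt", "beta")
    archive.comment = b"demo"
record = find_eocd(buffer)
```

For a plain (non-ZIP64) archive that starts at offset 0, the EOCD immediately follows the central directory, so cd_offset + cd_size equals the EOCD offset.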

open for reading: decompression pipeline

# CPython: Lib/zipfile/__init__.py:1695 ZipFile.open
def open(self, name, mode='r', pwd=None, *, force_zip64=False):
    ...
    zinfo = self.getinfo(name)
    ...
    zef_file = _SharedFile(self.fp, zinfo.header_offset, ...)
    ...
    if zinfo.compress_type == ZIP_DEFLATED:
        zd = zlib.decompressobj(-15)  # negative wbits: raw deflate, no zlib header
    fileobj = ZipExtFile(zef_file, ...)  # read-side wrapper; _ZipWriteFile is the write side
    ...

Each entry is decompressed through a streaming decompressor: the returned file object reads compressed bytes from a _SharedFile and feeds them to the zlib, bz2, or lzma decompressor selected by the entry's compress_type.
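The streaming step for deflate entries can be illustrated with zlib alone. The sketch below is not the zipfile implementation; compress_raw_deflate and iter_decompressed are hypothetical helper names, and an io.BytesIO stands in for the _SharedFile. It uses the same raw-deflate setting (wbits of -15) shown in the excerpt above.

```python
import io
import zlib


def compress_raw_deflate(data):
    """Produce a raw deflate stream (no zlib header), as stored in ZIP entries."""
    compressor = zlib.compressobj(zlib.Z_DEFAULT_COMPRESSION, zlib.DEFLATED, -15)
    return compressor.compress(data) + compressor.flush()


def iter_decompressed(fileobj, chunk_size=4096):
    """Yield decompressed chunks while reading the source in small pieces."""
    decompressor = zlib.decompressobj(-15)
    while True:
        chunk = fileobj.read(chunk_size)
        if not chunk:
            break
        out = decompressor.decompress(chunk)
        if out:
            yield out
    # Emit anything still buffered inside the decompressor.
    tail = decompressor.flush()
    if tail:
        yield tail


payload = b"streaming " * 1000
compressed = io.BytesIO(compress_raw_deflate(payload))
restored = b"".join(iter_decompressed(compressed, chunk_size=64))
```

Reading in fixed-size chunks keeps memory use bounded regardless of the entry size, which is the point of wrapping the file object rather than decompressing a whole buffer at once.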

Encryption

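Password-protected ZIPs use the legacy ZIP encryption scheme, a weak stream cipher built on three 32-bit keys that are updated with CRC32 after every byte. The standard library can decrypt such entries but cannot create them. AES encryption is not supported by the standard library; it requires a third-party package such as zipfile-aes or pyzipper.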

ZipPath

# CPython: Lib/zipfile/__init__.py:2180 ZipPath.__truediv__
class ZipPath:
    def __init__(self, root, at=''):
        self._root = root if isinstance(root, ZipFile) else ZipFile(root)
        self.at = at

    def __truediv__(self, add):
        # Join with posixpath so a "/" separates the components
        return self.__class__(self._root, posixpath.join(self.at, add))

ZipPath wraps a ZipFile and a path string inside the archive, and exposes iterdir(), open(), read_text(), read_bytes(), and related pathlib-style methods.
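A short usage sketch with the public zipfile.Path class follows. The archive layout and file names are invented for the example.

```python
import io
import zipfile

# Build an in-memory archive with a nested directory layout.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("pkg/__init__.py", "")
    archive.writestr("pkg/data/config.txt", "answer=42\n")

archive = zipfile.ZipFile(buffer)
root = zipfile.Path(archive)

# The "/" operator builds paths inside the archive, as with pathlib.
config = root / "pkg" / "data" / "config.txt"

top_level = sorted(child.name for child in root.iterdir())
pkg_children = sorted(child.name for child in (root / "pkg").iterdir())
text = config.read_text(encoding="utf-8")
pkg_is_dir = (root / "pkg").is_dir()
config_is_file = config.is_file()

archive.close()
```

Directories such as pkg/ and pkg/data/ are never written as explicit entries here; zipfile.Path infers them from the entry names.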

gopy notes

Status: not yet ported. Go's archive/zip covers reading and writing ZIP archives. A gopy port would need ZipInfo as a Python-visible struct, a streaming decompressor pipeline, and the central directory reader/writer. ZipPath can be built on top of those primitives.