Lib/shutil.py

shutil.py is the standard library's high-level file and archive utility module. It layers on top of os to provide copy-with-metadata, recursive tree operations, cross-device moves, and a pluggable archive registry. Feature flags set at module load time record which fast-copy mechanisms the platform offers (sendfile, copy_file_range, fcopyfile), and the copy routines fall back to plain read/write buffering when none applies.

Source:

cpython 3.14 @ ab2d84fe1023/Lib/shutil.py

Map

| Lines | Symbol | Role |
| --- | --- | --- |
| 1-72 | imports, feature flags | _ZLIB_SUPPORTED, _USE_CP_SENDFILE, _HAS_FCOPYFILE, etc. |
| 75-96 | exception classes | Error, SameFileError, SpecialFileError, ReadError, RegistryError |
| 98-... | _fastcopy_fcopyfile | macOS fcopyfile(3) path |
| 118-... | _fastcopy_sendfile | Linux/Android os.sendfile path |
| ~200 | _copyfileobj_readinto | fallback buffered copy |
| 493-531 | copy2 | copy data plus full metadata via copystat |
| 533-543 | ignore_patterns | factory returning a glob-filter callable |
| 545-... | _copytree | recursive tree worker used by copytree |
| 611-657 | copytree | public entry point; acquires scandir context manager |
| 810-856 | rmtree | recursive delete with onerror/onexc handler |
| 876-... | move | rename-or-copy-then-delete with cross-device fallback |
| ~1100 | archive format registry | _ARCHIVE_FORMATS, register_archive_format |
| 1184-... | make_archive | dispatch through registry, optional root_dir chdir |
| 1386-... | unpack_archive | dispatch through unpack registry |
| 1435-1468 | disk_usage | POSIX statvfs branch and Windows nt._getdiskusage branch |

Reading

copy2 and metadata preservation

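copy2 is the recommended copy function when full fidelity matters. It copies file content through copyfile (which takes the fast-copy path where one is available), then calls copystat to replay timestamps, permission bits, and, where the platform supports them, file flags and extended attributes. On Windows a fast path calls _winapi.CopyFile2 directly, which also carries over file attributes; it falls back to the generic path only on specific errors (a privilege error while copying a symlink, or access denied), and any other OSError propagates.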

# CPython: Lib/shutil.py:493 copy2
def copy2(src, dst, *, follow_symlinks=True):
    if os.path.isdir(dst):
        dst = os.path.join(dst, os.path.basename(src))

    if hasattr(_winapi, "CopyFile2"):
        src_ = os.fsdecode(src)
        dst_ = os.fsdecode(dst)
        flags = _winapi.COPY_FILE_ALLOW_DECRYPTED_DESTINATION
        if not follow_symlinks:
            flags |= _winapi.COPY_FILE_COPY_SYMLINK
        try:
            _winapi.CopyFile2(src_, dst_, flags)
            return dst
        except OSError as exc:
            # fall through on specific winerrors
            ...

    copyfile(src, dst, follow_symlinks=follow_symlinks)
    copystat(src, dst, follow_symlinks=follow_symlinks)
    return dst
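
To make the difference between a content-only copy and copy2 concrete, here is a small sketch that copies the same file both ways and compares modification times. The helper name copy_and_compare and the file names are made up for illustration; only shutil.copyfile, shutil.copy2, and os.utime are real APIs.

```python
import os
import shutil
import tempfile


def copy_and_compare(src, dst_dir):
    """Copy src twice: content only, then content plus metadata."""
    # copyfile needs a full destination file name.
    plain = shutil.copyfile(src, os.path.join(dst_dir, "plain.txt"))
    # copy2 accepts a directory and appends the source basename itself.
    full = shutil.copy2(src, dst_dir)
    return plain, full


with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "report.txt")
    with open(src, "w") as f:
        f.write("data")

    old = 1_000_000_000  # an arbitrary timestamp far in the past
    os.utime(src, (old, old))

    out_dir = os.path.join(tmp, "out")
    os.mkdir(out_dir)
    plain, full = copy_and_compare(src, out_dir)

    # copy2 replays the source mtime; copyfile leaves the fresh one.
    mtime_preserved = int(os.stat(full).st_mtime) == old
    plain_is_new = int(os.stat(plain).st_mtime) != old
    print(mtime_preserved, plain_is_new)
```

The directory-destination handling shown here is the os.path.isdir(dst) branch at the top of the excerpt.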

copytree, ignore_patterns, and dirs_exist_ok

copytree opens the source with os.scandir as a context manager so the directory handle is closed promptly. The actual recursion lives in _copytree, which receives the already-collected entries list. The ignore callable (often produced by ignore_patterns) is invoked once per directory with the directory path and the list of entry names, returning a set of names to skip. The dirs_exist_ok flag, added in Python 3.8, allows merging into an existing destination tree without raising FileExistsError.

# CPython: Lib/shutil.py:611 copytree
def copytree(src, dst, symlinks=False, ignore=None, copy_function=copy2,
             ignore_dangling_symlinks=False, dirs_exist_ok=False):
    sys.audit("shutil.copytree", src, dst)
    with os.scandir(src) as itr:
        entries = list(itr)
    return _copytree(entries=entries, src=src, dst=dst, symlinks=symlinks,
                     ignore=ignore, copy_function=copy_function,
                     ignore_dangling_symlinks=ignore_dangling_symlinks,
                     dirs_exist_ok=dirs_exist_ok)
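
As a usage sketch of the ignore protocol and dirs_exist_ok, the following copies a small tree into a destination that already exists while skipping log files. The names skip_logs and merge_tree are hypothetical helpers, not part of shutil.

```python
import os
import shutil
import tempfile


def skip_logs(directory, names):
    """ignore callable: called once per directory, returns names to skip."""
    return {name for name in names if name.endswith(".log")}


def merge_tree(src, dst):
    """Copy src into dst even if dst already exists (Python 3.8+)."""
    return shutil.copytree(src, dst, ignore=skip_logs, dirs_exist_ok=True)


with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "src")
    dst = os.path.join(tmp, "dst")
    os.makedirs(os.path.join(src, "sub"))
    os.makedirs(dst)  # the destination exists before the copy starts

    for rel in ("keep.txt", "debug.log", os.path.join("sub", "nested.txt")):
        with open(os.path.join(src, rel), "w") as f:
            f.write(rel)

    merge_tree(src, dst)
    copied = sorted(os.listdir(dst))
    nested_copied = os.path.exists(os.path.join(dst, "sub", "nested.txt"))
    print(copied)  # ['keep.txt', 'sub']
```

Without dirs_exist_ok=True the same call would raise FileExistsError, because dst was created up front.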

ignore_patterns is a simple factory: it returns a closure that runs fnmatch.filter against each pattern and unions the results.

# CPython: Lib/shutil.py:533 ignore_patterns
def ignore_patterns(*patterns):
    def _ignore_patterns(path, names):
        ignored_names = []
        for pattern in patterns:
            ignored_names.extend(fnmatch.filter(names, pattern))
        return set(ignored_names)
    return _ignore_patterns
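
Because the factory returns an ordinary callable, it can be exercised on its own, outside copytree. A minimal sketch with made-up patterns and names:

```python
import shutil

# Build the filter once; the patterns here are arbitrary examples.
ignore = shutil.ignore_patterns("*.pyc", "tmp*")

names = ["main.py", "main.pyc", "tmp_build", "README"]

# The first argument (the directory path) is not used by the closure itself.
skipped = ignore("/any/dir", names)
print(sorted(skipped))  # ['main.pyc', 'tmp_build']
```

The same object is what gets passed as ignore= to copytree, which calls it once per directory visited.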

rmtree error handler protocol

rmtree translates ignore_errors, onerror, and onexc into a single onexc callable before delegating to _rmtree_impl. The onerror protocol (receiving a sys.exc_info() triple) has been deprecated since Python 3.12, when onexc was added, but is still supported via an adapter closure. When ignore_errors is false and neither handler is supplied, the default onexc simply re-raises the active exception with a bare raise.

# CPython: Lib/shutil.py:810 rmtree
def rmtree(path, ignore_errors=False, onerror=None, *, onexc=None, dir_fd=None):
    sys.audit("shutil.rmtree", path, dir_fd)
    if ignore_errors:
        def onexc(*args):
            pass
    elif onerror is None and onexc is None:
        def onexc(*args):
            raise
    elif onexc is None:
        # delegate to deprecated onerror
        def onexc(*args):
            func, path, exc = args
            exc_info = type(exc), exc, exc.__traceback__
            return onerror(func, path, exc_info)
    _rmtree_impl(path, dir_fd, onexc)
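
From the caller's side, the handler protocol can be sketched as below: a handler that records failures instead of raising. The wrapper rmtree_collecting is a hypothetical helper; it uses onexc on Python 3.12+ and adapts to the older onerror signature otherwise, mirroring in reverse the adapter shown in the excerpt.

```python
import os
import shutil
import sys
import tempfile


def rmtree_collecting(path):
    """Remove a tree; return (path, exception type name) for each failure."""
    failures = []

    def on_exc(func, failed_path, exc):
        # onexc protocol: the failing function, the path, the exception.
        failures.append((failed_path, type(exc).__name__))

    if sys.version_info >= (3, 12):
        shutil.rmtree(path, onexc=on_exc)
    else:
        def on_error(func, failed_path, exc_info):
            # onerror protocol: third argument is a sys.exc_info() triple.
            on_exc(func, failed_path, exc_info[1])
        shutil.rmtree(path, onerror=on_error)
    return failures


with tempfile.TemporaryDirectory() as tmp:
    tree = os.path.join(tmp, "tree")
    os.makedirs(os.path.join(tree, "nested"))

    clean = rmtree_collecting(tree)  # nothing goes wrong here
    missing = rmtree_collecting(os.path.join(tmp, "never-created"))
    print(clean, missing)
```

Removing a path that never existed is used here only because it triggers the handler reliably; a real handler might instead clear a read-only bit and retry func(path).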

move cross-device handling

move tries os.rename first. If that raises OSError (the typical signal for a cross-device move), it falls back: symlinks are recreated, directories are copied with copytree then removed with rmtree, and plain files are copied then unlinked. The copy_function parameter lets callers substitute a different copy routine (for example one that skips metadata).

# CPython: Lib/shutil.py:876 move (abridged)
def move(src, dst, copy_function=copy2):
    sys.audit("shutil.move", src, dst)
    # real_dst becomes dst/<basename of src> when dst is an existing
    # directory; that adjustment and its checks are elided here
    real_dst = dst
    try:
        os.rename(src, real_dst)
    except OSError:
        if os.path.islink(src):
            linkto = os.readlink(src)
            os.symlink(linkto, real_dst)
            os.unlink(src)
        elif os.path.isdir(src):
            copytree(src, real_dst, copy_function=copy_function, symlinks=True)
            rmtree(src)
        else:
            copy_function(src, real_dst)
            os.unlink(src)
    return real_dst
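
A short usage sketch, assuming a same-filesystem move so that os.rename succeeds: the return value is the final path, and copy_function is only consulted if the rename fails. The helper name move_into is made up for illustration.

```python
import os
import shutil
import tempfile


def move_into(src, dest_dir):
    """Move src into dest_dir and return the final path."""
    # copy_function matters only on the fallback path (for example a
    # cross-device move); shutil.copy copies data and mode, not timestamps.
    return shutil.move(src, dest_dir, copy_function=shutil.copy)


with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "a.txt")
    with open(src, "w") as f:
        f.write("payload")

    dest_dir = os.path.join(tmp, "archive")
    os.mkdir(dest_dir)

    final_path = move_into(src, dest_dir)
    expected_path = os.path.join(dest_dir, "a.txt")
    moved_ok = os.path.exists(final_path) and not os.path.exists(src)
    print(final_path == expected_path, moved_ok)
```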

Archive registry and disk_usage

make_archive looks up the requested format name in _ARCHIVE_FORMATS, a dict populated at import time and extended by register_archive_format. tar is always registered; zip, gztar, bztar, xztar, and zstdtar are registered only when the corresponding compression module is available. The format function receives base_name and base_dir; if the format does not support a root_dir argument natively, make_archive performs a chdir around the call and restores the working directory in a finally block.
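
The registry lookup can be seen from the public API with a round trip through the always-available tar format. This is a minimal sketch; the helper name roundtrip and the file names are illustrative only.

```python
import os
import shutil
import tempfile


def roundtrip(src_dir, work_dir):
    """Archive src_dir as an uncompressed tar, then unpack it elsewhere."""
    base = os.path.join(work_dir, "backup")  # extension is added for us
    archive = shutil.make_archive(base, "tar", root_dir=src_dir)
    restored = os.path.join(work_dir, "restored")
    # The unpack registry picks the handler from the .tar extension.
    shutil.unpack_archive(archive, restored)
    return archive, restored


# Names of the formats registered on this interpreter.
format_names = [name for name, _description in shutil.get_archive_formats()]

with tempfile.TemporaryDirectory() as src_dir, \
        tempfile.TemporaryDirectory() as work_dir:
    with open(os.path.join(src_dir, "hello.txt"), "w") as f:
        f.write("hi")

    archive, restored = roundtrip(src_dir, work_dir)
    archive_name = os.path.basename(archive)
    with open(os.path.join(restored, "hello.txt")) as f:
        restored_text = f.read()
    print(archive_name, restored_text)
```

The working directory is kept separate from src_dir so the archive does not end up inside the tree it is archiving.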

disk_usage is defined conditionally. On POSIX it calls os.statvfs and computes free space from f_bavail * f_frsize. On Windows it calls nt._getdiskusage. Both branches return the same namedtuple with total, used, and free fields. The name disk_usage is appended to __all__ inside each branch, so it only appears when the platform supports it.

# CPython: Lib/shutil.py:1443 disk_usage (POSIX)
def disk_usage(path):
    st = os.statvfs(path)
    free = st.f_bavail * st.f_frsize
    total = st.f_blocks * st.f_frsize
    used = (st.f_blocks - st.f_bfree) * st.f_frsize
    return _ntuple_diskusage(total, used, free)

# CPython: Lib/shutil.py:1460 disk_usage (Windows)
def disk_usage(path):
    total, free = nt._getdiskusage(path)
    used = total - free
    return _ntuple_diskusage(total, used, free)
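
Callers see the same named tuple on either platform. A small sketch, where free_fraction is a hypothetical helper that guards against a zero-sized filesystem:

```python
import os
import shutil


def free_fraction(path):
    """Return the share of the filesystem at path that is free (0.0 to 1.0)."""
    usage = shutil.disk_usage(path)
    if usage.total == 0:
        return 0.0  # avoid dividing by zero on unusual filesystems
    return usage.free / usage.total


usage = shutil.disk_usage(os.getcwd())
fraction = free_fraction(os.getcwd())
print(usage._fields)  # ('total', 'used', 'free')
print(round(fraction, 3))
```

On POSIX, used plus free can be less than total, because free is computed from f_bavail (blocks available to unprivileged users) while used is computed from f_bfree.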

gopy notes

Status: not yet ported.

Planned package path: module/shutil/.

The fast-copy selection logic (sendfile, copy_file_range, fcopyfile) maps to Go's io.Copy with optional syscall.Sendfile on Linux. copystat needs os.Chtimes plus os.Chmod; the extended-attribute path (os.getxattr/os.setxattr) can be stubbed initially. rmtree maps to os.RemoveAll for the happy path, with the onexc callback protocol added as a thin wrapper. The archive registry can be a Go map[string]archiveFormat with the same register/unregister API. disk_usage maps to syscall.Statfs (filling a syscall.Statfs_t) on POSIX and to GetDiskFreeSpaceEx (available through golang.org/x/sys/windows) on Windows, mirroring the two CPython branches.