Lib/mimetypes.py

cpython 3.14 @ ab2d84fe1023/Lib/mimetypes.py

mimetypes translates between filenames (or URLs) and MIME content-type strings. The module maintains two parallel dictionaries: one for strict IANA types and one for non-strict (common but unofficial) types. Both map lowercase file extensions to type strings such as text/html or image/png.

At import time the module builds its default tables from hard-coded dictionaries (types_map for strict entries, common_types for non-strict ones). Initialization then optionally reads system mime.types files (e.g., /etc/mime.types on Linux) and, on Windows, the registry keys under HKEY_CLASSES_ROOT. The MimeTypes class encapsulates a single such database instance. Module-level convenience functions delegate to a global MimeTypes instance that is initialized lazily on first use.

The inverse direction, from type string to extension, is handled by guess_extension (returns one canonical extension) and guess_all_extensions (returns every known extension for that type). The module also exposes a command-line interface for ad-hoc queries.
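
The two lookup directions can be tried with a short sketch. It uses a fresh MimeTypes instance rather than the module-level functions, on the assumption that a new instance is seeded only from the built-in defaults, so the results should not vary with system mime.types files. The filename report.html is just an illustration.

```python
import mimetypes

# A fresh instance starts from the built-in default tables.
db = mimetypes.MimeTypes()

# Forward lookup: filename -> (type, encoding)
mime_type, encoding = db.guess_type("report.html")

# Reverse lookups: type -> one preferred extension, or every known one
preferred = db.guess_extension("text/html")
all_exts = db.guess_all_extensions("text/html")

print(mime_type, encoding)
print(preferred, all_exts)
```

The preferred extension is always one of the entries returned by guess_all_extensions; the exact order of that list may differ between Python versions.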

Map

| Lines | Symbol | Role | gopy |
|---|---|---|---|
| 1-60 | module header, knownfiles | Constants: well-known system mime.types paths | - |
| 61-130 | types_map | Built-in extension-to-type table (strict and non-strict) | - |
| 131-200 | guess_type | Guess MIME type from URL or filename | - |
| 201-240 | guess_all_extensions, guess_extension | Reverse lookup: type to extension(s) | - |
| 241-340 | MimeTypes.__init__, MimeTypes.guess_type | Per-instance database and type resolution | - |
| 341-430 | MimeTypes.read, MimeTypes.readfp | Parse mime.types file format | - |
| 431-500 | MimeTypes.read_windows_registry | Harvest types from Windows registry | - |
| 501-560 | init, _db | Lazy global initialization | - |
| 561-600 | __main__ block | CLI entry point | - |

Reading

Module constants and system file list (lines 1 to 60)

cpython 3.14 @ ab2d84fe1023/Lib/mimetypes.py#L1-60

The module opens with a list called knownfiles that enumerates the canonical paths where POSIX systems store MIME mappings (/etc/mime.types, /usr/local/etc/httpd/conf/mime.types, etc.). These are tried in order during init(). The boolean inited flag and the _db module global together gate the one-time initialization.

knownfiles = [
    "/etc/mime.types",
    "/etc/httpd/mime.types",
    ...
]
inited = False
_db = None
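
The effect of the lazy gate can be observed from outside the module. This is a minimal sketch; example.txt is an arbitrary name and does not need to exist on disk.

```python
import mimetypes

# Any module-level lookup goes through the shared database and
# runs init() first if that has not happened yet.
mimetypes.guess_type("example.txt")

# Once a lookup has run, the flag is set and later calls skip the setup.
print(mimetypes.inited)
```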

Built-in types_map (lines 61 to 130)

cpython 3.14 @ ab2d84fe1023/Lib/mimetypes.py#L61-130

At module level, types_map is a dict holding the strictly IANA-registered mappings, and its sibling common_types holds common but non-standard ones. Keys are lowercase dotted extensions (.html, .jpg). Each new MimeTypes instance is seeded from these defaults before any OS files are consulted. The instance stores them as a two-element tuple, self.types_map, indexed by the strict flag: index 1 (True) is the strict table and index 0 (False) is the non-strict one.

types_map = {'.html': 'text/html', '.png': 'image/png', ...}  # strict
common_types = {'.mid': 'audio/midi', ...}  # non-strict
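
The difference between the strict and non-strict tables shows up when a type is registered as non-strict. In this sketch the extension .hyp and the type application/x-hypothetical are made up for illustration and are not part of the built-in tables.

```python
import mimetypes

db = mimetypes.MimeTypes()

# Hypothetical mapping, registered in the non-strict table only
db.add_type("application/x-hypothetical", ".hyp", strict=False)

# strict=True (the default) consults only the strict table
strict_result = db.guess_type("sample.hyp")

# strict=False also falls back to the non-strict table
loose_result = db.guess_type("sample.hyp", strict=False)

print(strict_result)
print(loose_result)
```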

guess_type (lines 131 to 200)

cpython 3.14 @ ab2d84fe1023/Lib/mimetypes.py#L131-200

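guess_type(url, strict=True) uses urllib.parse to separate the URL scheme from the path, then posixpath.splitext to isolate the file extension. Shorthand suffixes listed in suffix_map are expanded first (.tgz becomes .tar.gz), and a compression suffix found in encodings_map (.gz, .bz2) is stripped and reported as the encoding. The remaining extension is looked up in the active database: the strict table first, then the non-strict table if strict is false. The return value is a (type, encoding) pair; type is None when the extension is unknown, and encoding is None unless a compression suffix was present.

def guess_type(url, strict=True):
    ...
    base, ext = posixpath.splitext(url)
    while ext in suffix_map:  # expand shorthand, e.g. ".tgz" -> ".tar.gz"
        base, ext = posixpath.splitext(base + suffix_map[ext])
    if ext in encodings_map:  # compression suffix such as ".gz"
        encoding = encodings_map[ext]
        base, ext = posixpath.splitext(base)
    else:
        encoding = None
    ...
    # strict table first; the non-strict table only when strict is False
    return mime_type, encoding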
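
A few calls illustrate how the (type, encoding) pair behaves for plain, compressed, shorthand, and unknown extensions. The URL and filenames are placeholders, and a fresh MimeTypes instance is used so that only the built-in tables are involved.

```python
import mimetypes

db = mimetypes.MimeTypes()

# Plain extension taken from the path part of a URL
plain = db.guess_type("https://example.com/docs/index.html")

# Compression suffix: ".gz" is reported as the encoding, ".tar" decides the type
compressed = db.guess_type("archive.tar.gz")

# Shorthand suffix: ".tgz" is expanded to ".tar.gz" before the lookup
shorthand = db.guess_type("backup.tgz")

# Unknown extension: both elements are None
unknown = db.guess_type("notes.hypothetical-ext")

for result in (plain, compressed, shorthand, unknown):
    print(result)
```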

MimeTypes.read and readfp (lines 341 to 430)

cpython 3.14 @ ab2d84fe1023/Lib/mimetypes.py#L341-430

readfp parses a mime.types-format stream line by line. A token starting with # begins a comment that runs to the end of the line. Each data line has the MIME type as the first token, followed by zero or more extensions written without the leading dot. read(filename, strict) simply opens the file and delegates to readfp. Both methods update self.types_map and the reverse table self.types_map_inv through add_type.

def readfp(self, fp, strict=True):
    while True:
        line = fp.readline()
        if not line:
            break
        words = line.split()
        ...
        # words[0] is the MIME type; the rest are extensions without the dot
        for ext in words[1:]:
            self.add_type(words[0], '.' + ext, strict)
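
The file format can be exercised without touching the filesystem by feeding readfp an in-memory stream. In this sketch the types application/x-demo and text/x-notes and their extensions are invented for illustration.

```python
import io
import mimetypes

# Hypothetical mime.types content: type first, then extensions without dots
SAMPLE = """\
# comment lines are skipped
application/x-demo    demo dm
text/x-notes          notes   # trailing comments are dropped too
"""

db = mimetypes.MimeTypes()
db.readfp(io.StringIO(SAMPLE))

demo_type = db.guess_type("file.demo")
notes_type = db.guess_type("todo.notes")
demo_exts = db.guess_all_extensions("application/x-demo")

print(demo_type, notes_type, demo_exts)
```

Because add_type maintains the reverse table as well, the extensions registered from the stream become available to guess_all_extensions immediately.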

Windows registry reader (lines 431 to 500)

cpython 3.14 @ ab2d84fe1023/Lib/mimetypes.py#L431-500

read_windows_registry(strict=True) iterates subkeys of HKEY_CLASSES_ROOT using winreg. For each key whose name starts with a dot it queries the Content Type value and calls add_type. The method is a no-op on non-Windows platforms because the import of winreg is guarded.

def read_windows_registry(self, strict=True):
    ...
    with winreg.OpenKey(winreg.HKEY_CLASSES_ROOT, '') as hkcr:
        for subkeyname in enum_types(hkcr):
            ...
            self.add_type(mimetype, subkeyname, strict)
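
Since the method is safe to call on any platform, a caller does not need its own platform check. The following sketch counts the entries in both tables before and after the call; on non-Windows systems the counts are expected to match, while on Windows the registry may contribute additional mappings.

```python
import mimetypes

db = mimetypes.MimeTypes()

def total_entries(database):
    # Sum the sizes of the strict and non-strict tables
    return sum(len(table) for table in database.types_map)

before = total_entries(db)
db.read_windows_registry()  # returns immediately when registry access is unavailable
after = total_entries(db)

print(before, after)
```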

gopy mirror

Not yet ported.