Lib/codecs.py

cpython 3.14 @ ab2d84fe1023/Lib/codecs.py

Lib/codecs.py provides the codec infrastructure for encoding and decoding text. It wraps _codecs (C extension) and defines the base classes for stream codecs and incremental codecs.

Map

| Lines | Symbol | Role |
|---|---|---|
| 1-100 | Re-exports from _codecs | lookup, register, encode, decode, charmap_encode |
| 101-300 | CodecInfo, Codec | Base codec descriptor and stateless encoder/decoder |
| 301-550 | IncrementalEncoder, IncrementalDecoder | Stateful codec base classes |
| 551-750 | StreamWriter, StreamReader, StreamReaderWriter | File-like codec wrappers |
| 751-1100 | open, EncodedFile, iterencode, iterdecode | Convenience entry points |

Reading

CodecInfo named tuple

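lookup(encoding) returns a CodecInfo object, a tuple subclass whose four items are (encode, decode, streamreader, streamwriter). The same callables are available as attributes, alongside name, incrementalencoder, and incrementaldecoder, so the 4-tuple alone is not the codec's complete API.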
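A minimal sketch of inspecting the object returned by lookup, assuming the standard "utf-8" codec. The variable names are illustrative only.

```python
import codecs

info = codecs.lookup("utf-8")

# CodecInfo unpacks like a 4-tuple: two stateless functions and two
# stream classes.
encode, decode, streamreader, streamwriter = info

# The stateless functions return (output, length_consumed).
encoded, consumed = info.encode("h\u00e9llo")
decoded, used = info.decode(encoded)

# Attributes that are not part of the 4-tuple.
name = info.name
inc_encoder = info.incrementalencoder()
inc_decoder = info.incrementaldecoder()

print(name, encoded, consumed, decoded, used)
```

Note that the two lengths differ: the encoder reports characters consumed, the decoder reports bytes consumed.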

Error handler protocol

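All encoding/decoding functions accept an errors parameter. Built-in values include 'strict', 'ignore', 'replace', 'backslashreplace', 'surrogateescape', and 'surrogatepass', plus 'xmlcharrefreplace' and 'namereplace', which apply to encoding only. Custom handlers are registered with codecs.register_error(name, handler). A handler receives the UnicodeEncodeError or UnicodeDecodeError instance and returns (replacement, new_position), where new_position is the index in the input at which processing resumes.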
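The handler protocol can be sketched as follows. The handler name hex_placeholder and its output format are hypothetical, chosen only for illustration; register_error and the exception attributes (object, start, end) are the standard ones.

```python
import codecs

def hex_placeholder(exc):
    # Hypothetical handler: replaces each unencodable character with a
    # "<U+XXXX>" marker and resumes right after the failing span.
    if isinstance(exc, UnicodeEncodeError):
        bad = exc.object[exc.start:exc.end]
        replacement = "".join(f"<U+{ord(ch):04X}>" for ch in bad)
        return replacement, exc.end
    # Anything else (e.g. decode errors) is not handled in this sketch.
    raise exc

codecs.register_error("hex_placeholder", hex_placeholder)

result = "na\u00efve caf\u00e9".encode("ascii", errors="hex_placeholder")
builtin = "\u00e9".encode("ascii", errors="xmlcharrefreplace")
print(result, builtin)
```

When encoding, a str replacement is itself passed through the codec, so the replacement here is kept to ASCII.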

IncrementalDecoder

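IncrementalDecoder.decode(input, final=False) keeps any trailing incomplete sequence in the decoder's state and prepends it to the next call's input. final=True signals end-of-stream: a remaining incomplete multi-byte sequence is handed to the error handler, which raises UnicodeDecodeError under 'strict' and substitutes or drops the bytes under handlers such as 'replace' or 'ignore'.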
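A short sketch of the buffering and end-of-stream behaviour described above, assuming the built-in UTF-8 incremental decoder. The variable names are illustrative.

```python
import codecs

decoder = codecs.getincrementaldecoder("utf-8")()

euro = "\u20ac".encode("utf-8")        # three bytes: e2 82 ac
first = decoder.decode(euro[:2])       # incomplete, so nothing is emitted
second = decoder.decode(euro[2:])      # final byte completes the character

# A sequence still incomplete at end-of-stream is an error under 'strict'.
decoder.reset()
decoder.decode(b"\xe2\x82")
try:
    decoder.decode(b"", final=True)
    truncated_error = None
except UnicodeDecodeError as exc:
    truncated_error = exc

# With errors='replace' the truncated tail becomes U+FFFD instead.
lenient = codecs.getincrementaldecoder("utf-8")(errors="replace")
tail = lenient.decode(b"\xe2\x82", final=True)

print(repr(first), repr(second), truncated_error, repr(tail))
```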

iterencode / iterdecode

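iterencode(iterator, encoding, errors='strict') and iterdecode(iterator, encoding, errors='strict') apply an incremental codec to a stream of chunks, yielding each non-empty encoded/decoded chunk as it is produced and flushing the codec with final=True once the iterator is exhausted. This avoids materialising the full input in memory.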
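As an illustration, the sketch below feeds UTF-8 bytes to iterdecode in 3-byte chunks so that some multi-byte characters are split across chunk boundaries. The byte_chunks generator and the chunk size are assumptions made for the example.

```python
import codecs

SAMPLE = "h\u00e9llo w\u00f6rld \u20ac"

def byte_chunks(size=3):
    # Hypothetical source: yields the encoded sample a few bytes at a time,
    # deliberately cutting through multi-byte sequences.
    data = SAMPLE.encode("utf-8")
    for i in range(0, len(data), size):
        yield data[i:i + size]

# iterdecode stitches split sequences back together chunk by chunk.
pieces = list(codecs.iterdecode(byte_chunks(), "utf-8"))
text = "".join(pieces)

# iterencode works the other way, one str chunk at a time.
encoded = b"".join(codecs.iterencode(["h\u00e9llo ", "w\u00f6rld"], "utf-8"))

print(pieces, text, encoded)
```

Because only the current chunk and any pending partial sequence are held at once, the same pattern applies to chunks read from a file or socket.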

gopy notes

Not yet ported. codecs is needed by io.TextIOWrapper for text file reading. The most critical paths are utf-8 encode/decode and the error handler dispatch. Planned path: module/codecs/. The _codecs C extension wraps the codec search functions directly; Go will use golang.org/x/text for non-UTF-8 encodings.