string.py: String Constants, Formatter, and Template

cpython 3.14 @ ab2d84fe1023/

string exposes named character sets, a Formatter class that implements str.format-style formatting in Python, and a Template class for $-style substitution. Formatter delegates format-string parsing to the C-level _string module (formatter_parser and formatter_field_name_split), so the Python class is mostly a set of customisation hooks.

Map

Lines | Symbol | Kind | Notes
1-30 | module header | setup | imports, __all__
32-55 | constants | data | ascii_letters, ascii_lowercase, ascii_uppercase, digits, hexdigits, octdigits, printable, punctuation, whitespace
57-90 | Formatter | class | delegates to _string.formatter_* C helpers
91-115 | Formatter.format | method | entry point, calls vformat
116-140 | Formatter.vformat | method | recursion guard, calls _vformat
141-175 | Formatter._vformat | method | iterates literal text and field specs
176-200 | Formatter.get_value | method | positional index (int key) vs keyword (str key) lookup
201-215 | Formatter.check_unused_args | method | no-op hook for subclasses
216-225 | Formatter.format_field | method | calls format(value, format_spec)
226-235 | Formatter.convert_field | method | handles !r, !s, !a
236-250 | Template | class | $-substitution, pattern class attribute, safe_substitute

Reading

String constants

The constants are plain module-level assignments. They are frequently imported directly and used in str.translate tables or regex character classes.

# Lib/string.py:32
ascii_lowercase = 'abcdefghijklmnopqrstuvwxyz'
ascii_uppercase = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'
ascii_letters = ascii_lowercase + ascii_uppercase
digits = '0123456789'
hexdigits = digits + 'abcdef' + 'ABCDEF'
octdigits = '01234567'
punctuation = r"""!"#$%&'()*+,-./:;<=>?@[\]^_`{|}~"""
whitespace = ' \t\n\r\x0b\x0c'
printable = digits + ascii_letters + punctuation + whitespace
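
The two uses mentioned above can be sketched as follows. This is a minimal illustration, not code from Lib/string.py; the names strip_punct and hex_run and the sample strings are made up for the example.

```python
import re
import string

# Translation table that deletes every ASCII punctuation character.
strip_punct = str.maketrans('', '', string.punctuation)
cleaned = "Hello, world! (test)".translate(strip_punct)

# Regex character class built from a constant. re.escape is a no-op for
# hexdigits but is a good habit for constants such as punctuation.
hex_run = re.compile('[' + re.escape(string.hexdigits) + ']+')
match = hex_run.fullmatch('DeadBeef42')

print(cleaned)              # Hello world test
print(match is not None)    # True
```

Because the constants are ordinary str objects, they can be passed anywhere a string is accepted without any conversion.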

Formatter._vformat field iteration

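_vformat is the loop that alternates between literal text and replacement fields. For each field it resolves the object through get_field, applies any conversion, and then recurses to expand a nested format spec before formatting. Field names are split with the C helper _string.formatter_field_name_split.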

# Lib/string.py:141 (simplified; auto-numbering of empty '{}' fields omitted)
def _vformat(self, format_string, args, kwargs, used_args,
             recursion_depth, auto_arg_index=0):
    if recursion_depth < 0:
        raise ValueError('Max string recursion exceeded')
    result = []
    for literal_text, field_name, format_spec, conversion in \
            self.parse(format_string):
        # output the literal text that precedes the field
        if literal_text:
            result.append(literal_text)
        if field_name is not None:
            # find the referenced object and the argument it came from
            obj, arg_used = self.get_field(field_name, args, kwargs)
            used_args.add(arg_used)
            # apply !r, !s or !a if present
            obj = self.convert_field(obj, conversion)
            # recurse for nested format specs
            format_spec, auto_arg_index = self._vformat(
                format_spec, args, kwargs, used_args,
                recursion_depth - 1, auto_arg_index)
            result.append(self.format_field(obj, format_spec))
    return ''.join(result), auto_arg_index
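
A short sketch of how this loop behaves from the caller's side: a nested format spec, and a subclass that overrides one hook. DefaultingFormatter is a hypothetical class written for this example and is not part of the standard library.

```python
import string


class DefaultingFormatter(string.Formatter):
    """Hypothetical subclass: a missing keyword renders as '<name?>'."""

    def get_value(self, key, args, kwargs):
        # Integer keys index into args; string keys are looked up in kwargs.
        if isinstance(key, str) and key not in kwargs:
            return f'<{key}?>'
        return super().get_value(key, args, kwargs)


fmt = string.Formatter()

# The inner {width} is expanded by the recursive _vformat call first,
# so format_field receives the spec '>6'.
padded = fmt.format('{:{width}}|', 'ab', width='>6')

# get_field calls the overridden get_value, so 'name' does not raise KeyError.
fallback = DefaultingFormatter().format('{greeting}, {name}!', greeting='Hi')

print(repr(padded))   # '    ab|'
print(fallback)       # Hi, <name?>!
```

Overriding get_value rather than _vformat keeps the parsing and recursion logic untouched, which is the customisation style the class is designed for.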

Template and safe_substitute

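Template builds a single regex per class from the delimiter, idpattern, braceidpattern, and flags attributes and exposes it as the class-level pattern attribute. substitute raises KeyError on a missing key and ValueError on an invalid placeholder; safe_substitute leaves the placeholder text untouched in both cases.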

# Lib/string.py:236 (simplified; __init__ and pattern compilation omitted)
import re

_sentinel_dict = {}

class Template:
    delimiter = '$'
    idpattern = r'(?a:[_a-z][_a-z0-9]*)'
    braceidpattern = None
    flags = re.IGNORECASE

    def substitute(self, mapping=_sentinel_dict, /, **kws):
        # raises KeyError on a missing key, ValueError on a bad placeholder
        ...

    def safe_substitute(self, mapping=_sentinel_dict, /, **kws):
        # keyword arguments take precedence over the mapping
        if mapping is _sentinel_dict:
            mapping = kws
        elif kws:
            from collections import ChainMap
            mapping = ChainMap(kws, mapping)

        def convert(mo):
            named = mo.group('named') or mo.group('braced')
            if named is not None:
                try:
                    return str(mapping[named])
                except KeyError:
                    return mo.group()  # leave placeholder intact
            if mo.group('escaped') is not None:
                return self.delimiter
            if mo.group('invalid') is not None:
                return mo.group()
            raise ValueError('Unrecognized named group in pattern')
        return self.pattern.sub(convert, self.template)
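
The difference between the two methods, and the effect of overriding a class attribute, can be seen in a small sketch. PercentTemplate and the sample strings are hypothetical names chosen for illustration.

```python
from string import Template

t = Template('$user owes $$${amount} to $payee')

# safe_substitute fills what it can and leaves '$payee' as written.
partial = t.safe_substitute(user='ann', amount=5)

# substitute raises KeyError for the missing 'payee' key.
try:
    t.substitute(user='ann', amount=5)
    missing = None
except KeyError as exc:
    missing = exc.args[0]


class PercentTemplate(Template):
    """Hypothetical subclass: the pattern is rebuilt with '%' as delimiter."""
    delimiter = '%'


percent = PercentTemplate('%name is 100%% sure').substitute(name='Bo')

print(partial)   # ann owes $5 to $payee
print(missing)   # payee
print(percent)   # Bo is 100% sure
```

Changing delimiter alone is enough because the regex is derived from the class attributes when the subclass is created.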

gopy notes

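  • The nine string constants map directly to Go string constants. Concatenating constant strings is itself a constant expression in Go, so the assembled values (ascii_letters, hexdigits, printable) can be declared const as well; a var is not required.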
  • Formatter in CPython leans on _string.formatter_field_name_split and _string.formatter_parser (both C-level). The gopy port must implement these helpers in Go before Formatter.parse and get_field will work.
  • Template.pattern is a class attribute that subclasses can override. In Go this becomes a field on the Template struct set during construction, with a package-level default compiled regex as the fallback.
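  • Python 3.11 added Template.get_identifiers() (returns the valid placeholder names in order of first appearance) and Template.is_valid() (returns False if the template contains a placeholder that would make substitute raise ValueError). Both are thin wrappers over the compiled pattern and should be ported together.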