string.py: String Constants, Formatter, and Template
string exposes named character sets, a Formatter class that reimplements
str.format logic in pure Python, and a Template class for $-style
substitution. Most of Formatter is delegated to the C-level
string.Formatter built into the interpreter, making the Python class mostly
a customisation hook.
Map
| Lines | Symbol | Kind | Notes |
|---|---|---|---|
| 1-30 | module header | setup | imports, __all__ |
| 32-55 | constants | data | ascii_letters, ascii_lowercase, ascii_uppercase, digits, hexdigits, octdigits, printable, punctuation, whitespace |
| 57-90 | Formatter | class | delegates to _string.formatter_* C helpers |
| 91-115 | Formatter.format | method | entry point, calls vformat |
| 116-140 | Formatter.vformat | method | recursion guard, calls _vformat |
| 141-175 | Formatter._vformat | method | iterates literal text and field specs |
| 176-200 | Formatter.get_value | method | index vs attribute lookup |
| 201-215 | Formatter.check_unused_args | method | no-op hook for subclasses |
| 216-225 | Formatter.format_field | method | calls format(value, format_spec) |
| 226-235 | Formatter.convert_field | method | handles !r, !s, !a |
| 236-250 | Template | class | $-substitution, pattern class attribute, safe_substitute |
Reading
String constants
The constants are plain module-level assignments. They are frequently imported
directly and used in str.translate tables or regex character classes.
# Lib/string.py:32
ascii_lowercase = 'abcdefghijklmnopqrstuvwxyz'
ascii_uppercase = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'
ascii_letters = ascii_lowercase + ascii_uppercase
digits = '0123456789'
hexdigits = digits + 'abcdef' + 'ABCDEF'
octdigits = '01234567'
punctuation = r"""!"#$%&'()*+,-./:;<=>?@[\]^_`{|}~"""
whitespace = ' \t\n\r\x0b\x0c'
printable = digits + ascii_letters + punctuation + whitespace
Formatter._vformat field iteration
_vformat is the loop that alternates between literal text and format fields.
It calls the C helper _string.formatter_field_name_split to parse field
names, then recurses for nested format specs.
# Lib/string.py:141
def _vformat(self, format_string, args, kwargs, used_args,
recursion_depth, auto_arg_index=0):
if recursion_depth < 0:
raise ValueError('Max string formatting recursion exceeded')
result = []
for literal_text, field_name, format_spec, conversion in \
self.parse(format_string):
if literal_text:
result.append(literal_text)
if field_name is not None:
obj, arg_used = self.get_field(field_name, args, kwargs)
used_args.add(arg_used)
obj = self.convert_field(obj, conversion)
# recurse for nested format specs
format_spec, auto_arg_index = self._vformat(
format_spec, args, kwargs, used_args,
recursion_depth - 1, auto_arg_index)
result.append(self.format_field(obj, format_spec))
return ''.join(result), auto_arg_index
Template and safe_substitute
Template compiles a single regex from the class-level pattern attribute.
substitute raises KeyError on missing keys; safe_substitute leaves the
placeholder text untouched instead.
# Lib/string.py:236 (simplified)
class Template:
delimiter = '$'
idpattern = r'(?a:[_a-z][_a-z0-9]*)'
braceidpattern = None
flags = re.IGNORECASE
def substitute(self, mapping=None, /, **kws):
# raises KeyError / ValueError on bad placeholder
...
def safe_substitute(self, mapping=None, /, **kws):
def convert(mo):
named = mo.group('named') or mo.group('braced')
if named is not None:
try:
return str(mapping[named])
except KeyError:
return mo.group() # leave placeholder intact
if mo.group('escaped') is not None:
return self.delimiter
if mo.group('invalid') is not None:
return mo.group()
raise ValueError('Unrecognized named group in pattern')
return self.pattern.sub(convert, self.template)
gopy notes
- The nine string constants are straightforward Go
constorvarstrings.printableis avarbecause it is assembled from other constants. Formatterin CPython leans on_string.formatter_field_name_splitand_string.formatter_parser(both C-level). The gopy port must implement these helpers in Go beforeFormatter.parseandget_fieldwill work.Template.patternis a class attribute that subclasses can override. In Go this becomes a field on theTemplatestruct set during construction, with a package-level default compiled regex as the fallback.- 3.14 added
Template.get_identifiers()(returns all valid placeholder names) andTemplate.is_valid(). Both are thin wrappers over the compiled pattern and should be ported together.