doctest: extraction, execution, and directive handling

doctest finds Python interactive examples embedded in docstrings and runs them, comparing actual output to expected output. Two public entry points, testmod and testfile, drive the full pipeline: find, compile, run, and report.
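
As a rough illustration of that pipeline, the sketch below runs testmod against a throwaway module. The module name demo_mod, the add function, and the use of types.ModuleType to build the module in memory are assumptions made for this example; real callers normally pass an imported module or call testmod() with no arguments.

```python
import doctest
import types

# Hypothetical module source with one doctest in a function docstring.
SOURCE = '''
def add(a, b):
    """Return a + b.

    >>> add(2, 3)
    5
    """
    return a + b
'''

# Build an in-memory module so the sketch does not depend on any file.
demo_mod = types.ModuleType("demo_mod")
exec(SOURCE, demo_mod.__dict__)

# testmod finds the docstring, runs its example, and returns a summary
# that exposes .failed and .attempted counts.
results = doctest.testmod(demo_mod, verbose=False)
print(results.failed, results.attempted)
```

With verbose=False and no failures, testmod prints nothing itself; the returned summary is the only signal that the example ran.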

Map

| Lines | Symbol | Role |
|---|---|---|
| 1-100 | module constants, _EXAMPLE_RE | Regex that recognises >>> and ... continuation lines |
| 101-250 | Example | Class holding source, want, lineno, and indent |
| 251-450 | DocTest | Container of Example objects plus name and filename |
| 451-750 | DocTestFinder | Recursively walks modules, classes, and functions to collect DocTest objects |
| 751-900 | DocTestParser | Parses a docstring into a list of Example and plain-text items |
| 901-1200 | DocTestRunner | Executes examples, captures stdout, and checks results |
| 1201-1450 | OutputChecker | Compares got/want with optional directives; produces diff on failure |
| 1451-1600 | DebugRunner | Subclass that raises DocTestFailure instead of recording failures |
| 1601-1800 | directive constants (ELLIPSIS, etc.), register_optionflag | Bitmask flags for per-example behaviour |
| 1801-2000 | DocTestSuite, DocFileSuite | unittest.TestSuite wrappers |
| 2001-2200 | testmod, testfile | Top-level convenience functions |
| 2201-2600 | DocTestCase, _DocTestRunner, script mode | unittest.TestCase adapter and __main__ runner |

Reading

DocTestFinder.find: recursive extraction

find walks a live Python object graph. For a module it inspects __dict__; for a class it recurses into each method and nested class. Each docstring is handed to DocTestParser.get_doctest:

# Lib/doctest.py (simplified)
def _find(self, tests, obj, name, module, source_lines, globs, seen):
    # Skip objects that have already been visited
    if id(obj) in seen:
        return
    seen[id(obj)] = True
    # Build a DocTest from this object's docstring, if it has one
    test = self._get_test(obj, name, module, globs, source_lines)
    if test is not None:
        tests.append(test)
    # Recurse into members
    for valname, val in self._find_lineno_iter(obj, source_lines):
        self._find(tests, val, f'{name}.{valname}',
                   module, source_lines, globs, seen)

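The seen dict, keyed by id(obj), prevents infinite loops when objects refer to each other through their __dict__, and stops an object that is reachable under several names from being collected twice. source_lines is the module source split into lines; it is used to find the line on which each docstring starts, so that the lineno values reported for examples point at the right place in the file.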
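
The walk can be observed from the outside with the public DocTestFinder API. The sketch below is only an illustration: the Counter class, the Alias binding, and the in-memory demo_mod module are hypothetical names invented for it. Alias points at the same class object, so the seen check should keep it from being reported a second time, and members without a docstring are skipped because DocTestFinder excludes empty tests by default.

```python
import doctest
import types

# Hypothetical module source: one class, two docstrings with examples,
# one method without a docstring, and a second name for the same class.
SOURCE = '''
class Counter:
    """A tiny counter.

    >>> Counter().value
    0
    """

    def __init__(self):
        self.value = 0

    def bump(self):
        """Increment and return the new value.

        >>> c = Counter()
        >>> c.bump()
        1
        """
        self.value += 1
        return self.value

    def undocumented(self):
        return None

Alias = Counter
'''

demo_mod = types.ModuleType("demo_mod")
exec(SOURCE, demo_mod.__dict__)

# find() returns DocTest objects sorted by name.
tests = doctest.DocTestFinder().find(demo_mod)
names = [t.name for t in tests]
counts = {t.name: len(t.examples) for t in tests}
print(names)
print(counts)
```

Only demo_mod.Counter and demo_mod.Counter.bump are collected: the class is visited once under its first name, and bump contributes two examples because every >>> line is a separate Example.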

DocTestRunner.run: output capture and diff

run sets up a temporary sys.stdout replacement, compiles each example's source with compile(..., 'single'), and execs it in the test's globs dict:

# Lib/doctest.py (simplified)
def __run(self, test, compileflags, out):
    for i, example in enumerate(test.examples):
        # Redirect stdout so printed text and displayhook output land in the buffer
        save_stdout = sys.stdout
        sys.stdout = self._fakeout
        try:
            # The trailing True is dont_inherit: ignore the caller's __future__ flags
            exec(compile(example.source, '<doctest>', 'single',
                         compileflags, True), test.globs)
            got = self._fakeout.getvalue()
        finally:
            # Always restore stdout and reset the buffer for the next example
            sys.stdout = save_stdout
            self._fakeout.truncate(0)
        # The checker returns True when got matches example.want
        if not self._checker.check_output(example.want, got, self.optionflags):
            self.report_failure(out, test, example, got)

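When got does not match example.want, OutputChecker.output_difference returns the text of the failure report. By default this lists the expected and actual output under Expected: and Got: headings. A diff is produced only when REPORT_UDIFF, REPORT_CDIFF, or REPORT_NDIFF is set (and, for the first two, only when both outputs span several lines).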
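
The capture-and-report cycle can be exercised through the public API without touching the private __run method. The sketch below is a minimal illustration under stated assumptions: the docstring text, the name sketch, and the filename <sketch> are invented for the example, and the out callback simply collects report text in a list.

```python
import doctest

# Hypothetical docstring text: the first example passes, the second does not.
TEXT = '''
>>> print("hello")
hello
>>> 6 * 7
41
'''

parser = doctest.DocTestParser()
test = parser.get_doctest(TEXT, globs={}, name="sketch",
                          filename="<sketch>", lineno=0)

# Collect the failure report in a list instead of writing it to stdout.
chunks = []
runner = doctest.DocTestRunner(verbose=False)
results = runner.run(test, out=chunks.append)
report = "".join(chunks)

print(results.failed, results.attempted)
print(report)
```

The second example never calls print, yet its value still reaches the captured output, because 'single' mode routes expression results through sys.displayhook. With no REPORT_* flag set, the report uses the plain Expected:/Got: layout rather than a diff.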

ELLIPSIS and NORMALIZE_WHITESPACE directives

Directives are written as comments on an example's source line, after the code, e.g. # doctest: +ELLIPSIS; they are not read from the expected output. Each directive adjusts optionflags for that single example:

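# Lib/doctest.py (simplified)
_OPTION_DIRECTIVE_RE = re.compile(
    r'#\s*doctest:\s*([^\n\'"]*)$', re.MULTILINE)

def _find_options(self, source):
    options = {}
    for m in _OPTION_DIRECTIVE_RE.finditer(source):
        # Options may be separated by commas, whitespace, or both
        for option in m.group(1).replace(',', ' ').split():
            # Each option must be '+NAME' or '-NAME' with a registered NAME
            if option[0] not in '+-' or option[1:] not in OPTIONFLAGS_BY_NAME:
                raise ValueError(f'invalid doctest option: {option!r}')
            # '+' switches the flag on for this example, '-' switches it off
            options[OPTIONFLAGS_BY_NAME[option[1:]]] = (option[0] == '+')
    return options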
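
The parsed result is visible on each Example as its options dict, which maps flag values to True or False. The following sketch reads it back through the public DocTestParser; the sample text and the flag_names helper are hypothetical and exist only for this illustration.

```python
import doctest

# Hypothetical docstring text: only the first example carries a directive.
TEXT = '''
>>> print(list(range(20)))  # doctest: +ELLIPSIS, +NORMALIZE_WHITESPACE
[0, 1, ..., 19]
>>> print("plain")
plain
'''

examples = doctest.DocTestParser().get_examples(TEXT)


def flag_names(example):
    """Translate the numeric flags in example.options back to their names."""
    by_value = {value: name
                for name, value in doctest.OPTIONFLAGS_BY_NAME.items()}
    return {by_value[flag]: enabled
            for flag, enabled in example.options.items()}


first, second = (flag_names(e) for e in examples)
print(first)
print(second)
```

The second example has an empty options dict, which shows that a directive applies only to the example whose source line carries it.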

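With ELLIPSIS active, OutputChecker.check_output treats each ... in want as a wildcard that matches any substring of got, including the empty string and text that spans several lines. The match is done by a dedicated helper, _ellipsis_match, which splits want on ... and looks for the literal pieces in order; no regular expression is built, so regex metacharacters in want stay literal. With NORMALIZE_WHITESPACE, any run of whitespace in both got and want is collapsed to a single space before comparison.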
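
Both flags can be tried directly against OutputChecker.check_output, which takes want, got, and a flag bitmask and returns a bool. The strings below are made up for the illustration; the variable names are not part of doctest.

```python
import doctest

checker = doctest.OutputChecker()

got = str(list(range(20))) + "\n"
want = "[0, 1, ..., 19]\n"

# Without the flag the three dots are ordinary characters.
exact = checker.check_output(want, got, 0)
with_ellipsis = checker.check_output(want, got, doctest.ELLIPSIS)

# A single '.' is not a wildcard: only the three-dot marker is special.
dot_is_literal = checker.check_output("a.c\n", "abc\n", doctest.ELLIPSIS)

spaced_want = "a   b\n    c\n"
spaced_got = "a b c\n"
strict_ws = checker.check_output(spaced_want, spaced_got, 0)
loose_ws = checker.check_output(spaced_want, spaced_got,
                                doctest.NORMALIZE_WHITESPACE)

print(exact, with_ellipsis, dot_is_literal, strict_ws, loose_ws)
```

The flags are plain integers, so they can be combined with | when an example needs both behaviours.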

gopy notes

  • DocTestParser relies on re module features (named groups, multiline flags) that are already ported; no blockers there.
  • exec with 'single' compile mode prints expression results automatically via sys.displayhook. The gopy exec path must honour displayhook for doctest output to match.
  • The _fakeout capture object is a StringIO-like buffer. The gopy port can use the existing io.StringIO implementation in module/io.
  • DocTestSuite and DocFileSuite depend on unittest. Port order: doctest parser and runner first, unittest integration deferred to a later milestone.
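
The displayhook note above can be checked against CPython with a few lines. This is a sketch of the reference behaviour a port would need to reproduce, not gopy code: run_example and the <sketch> filename are hypothetical, and the helper installs sys.__displayhook__ for the duration of the call, which is also what DocTestRunner.run does.

```python
import io
import sys
from contextlib import redirect_stdout


def run_example(source, globs, mode):
    """Compile source in the given mode, exec it, and return captured stdout."""
    buf = io.StringIO()
    saved_hook = sys.displayhook
    sys.displayhook = sys.__displayhook__  # default hook writes to sys.stdout
    try:
        with redirect_stdout(buf):
            exec(compile(source, "<sketch>", mode), globs)
    finally:
        sys.displayhook = saved_hook
    return buf.getvalue()


globs = {}
assign_out = run_example("x = 6 * 7\n", globs, "single")  # statement: silent
single_out = run_example("x\n", globs, "single")          # expression: echoed
exec_out = run_example("x\n", globs, "exec")              # 'exec' mode: silent
none_out = run_example("None\n", globs, "single")         # None is not echoed

print(repr(assign_out), repr(single_out), repr(exec_out), repr(none_out))
```

Only the bare expression compiled in 'single' mode produces output, and a None result is suppressed. Those are the cases an exec path has to get right for got to line up with want.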