Lib/xml/etree/ElementTree.py

Source:

cpython 3.14 @ ab2d84fe1023/Lib/xml/etree/ElementTree.py

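xml.etree.ElementTree is the standard library's lightweight XML processing API. An Element represents an XML node with a tag, an attribute dict, text, tail, and a list of child elements. When the C extension _elementtree can be imported, the pure-Python core classes (such as Element and XMLParser) are replaced by their C equivalents at import time for performance; otherwise the pure-Python definitions stay in place.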

Map

Lines     | Symbol                                                   | Role
1-100     | Element class                                            | Tree node: tag, attrib, text, tail, children
101-250   | ElementTree                                              | Document wrapper with parse, write, find*, iter
251-450   | XMLParser, TreeBuilder                                   | SAX-based incremental parser
451-650   | parse, fromstring, fromstringlist, tostring, tostringlist | High-level I/O
651-850   | SubElement, Comment, ProcessingInstruction               | Constructors
851-1050  | XPath: findall, find, findtext, iterfind                 | Subset XPath 1.0
1051-1670 | indent, iterparse, register_namespace, QName             | Utilities

Reading

Element storage

# CPython: Lib/xml/etree/ElementTree.py:112 Element.__init__
def __init__(self, tag, attrib={}, **extra):
    self.tag = tag
    # attrib is copied, so the shared {} default is never mutated
    self.attrib = {**attrib, **extra}
    self.text = None
    self.tail = None
    self._children = []

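text is the character data between the element's start tag and its first child (or its end tag, for a leaf node). tail is the character data that follows the element's end tag, up to the next sibling's start tag or the parent's end tag. Both are None when there is no such text.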
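The text/tail split is easiest to see on mixed content. The following is a small sketch using only the public xml.etree.ElementTree API; the sample markup and variable names are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Mixed content: text is interleaved with child elements.
root = ET.fromstring("<p>Hello <b>bold</b> world<i>!</i> end</p>")
b, i = root  # iterating an Element yields its direct children

# Text before the first child belongs to the parent's .text.
parent_text = root.text      # "Hello "
# Text after a child's end tag is stored on that child as .tail.
b_tail = b.tail              # " world"
i_tail = i.tail              # " end"
# The root has nothing after its end tag, so its tail is None.
root_tail = root.tail

# itertext() walks text and tail in document order.
flattened = "".join(root.itertext())  # "Hello bold world! end"
```

Storing trailing text on the child rather than in a separate text node is what lets children be a plain list of elements.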

parse and XMLParser

# CPython: Lib/xml/etree/ElementTree.py:580 parse
def parse(source, parser=None):
    tree = ElementTree()
    tree.parse(source, parser)
    return tree

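XMLParser wraps xml.parsers.expat: it registers callbacks for expat's start, end, and character-data events and forwards them to a target object, by default a TreeBuilder, which assembles the Element tree. The C _elementtree accelerator supplies faster XMLParser and TreeBuilder implementations but still parses with expat.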
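The parser/target split can be exercised directly through the public API. The sketch below feeds a document in two chunks to a default XMLParser, then swaps in a custom target. DepthTracker is a hypothetical class written for this example; it only needs to provide the start, end, data, and close methods that XMLParser calls.

```python
import xml.etree.ElementTree as ET

# Default target: a TreeBuilder, so close() returns the root Element.
parser = ET.XMLParser()
parser.feed("<root><it")              # chunks may split anywhere, even mid-tag
parser.feed("em>hi</item></root>")
root = parser.close()


class DepthTracker:
    """Hypothetical target that records nesting depth instead of building a tree."""

    def __init__(self):
        self.depth = 0
        self.max_depth = 0

    def start(self, tag, attrib):     # called for each start tag
        self.depth += 1
        self.max_depth = max(self.max_depth, self.depth)

    def end(self, tag):               # called for each end tag
        self.depth -= 1

    def data(self, data):             # called for character data; ignored here
        pass

    def close(self):                  # return value becomes parser.close()'s result
        return self.max_depth


depth_parser = ET.XMLParser(target=DepthTracker())
depth_parser.feed("<a><b><c/></b><d/></a>")
max_depth = depth_parser.close()      # 3
```

Because the target decides what to keep, a custom target can process a document without ever materializing Element objects.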

Subset XPath

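find, findall, findtext, and iterfind support a subset of XPath 1.0: tag and * for child elements, / to step to children, // for descendants at any depth, . and .. for the current and parent element, [@attr] and [@attr='value'] for attribute tests, [tag] for child element tests, and [position] for a 1-based index among matching siblings.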

# CPython: Lib/xml/etree/ElementTree.py:852 Element.findall
def findall(self, path, namespaces=None):
    return list(self.iterfind(path, namespaces))
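To make the supported path syntax concrete, here is a short sketch that runs several expressions against a small document. The library/shelf/book markup is invented for the example; the calls are the standard Element.find, findall, and findtext methods.

```python
import xml.etree.ElementTree as ET

doc = ET.fromstring("""
<library>
  <shelf name="a">
    <book lang="en"><title>One</title></book>
    <book><title>Two</title></book>
  </shelf>
  <shelf name="b">
    <book lang="fr"><title>Trois</title></book>
  </shelf>
</library>
""")

# A bare tag matches direct children only.
shelves = doc.findall("shelf")

# .// searches all descendants, in document order.
titles = [t.text for t in doc.findall(".//title")]

# [@attr] keeps elements that carry the attribute.
with_lang = [b.find("title").text for b in doc.findall(".//book[@lang]")]

# [tag] keeps elements that have a child with that tag.
shelves_with_books = doc.findall("shelf[book]")

# [position] is 1-based and counted among same-tag siblings.
second_title = doc.find("shelf/book[2]/title").text

# findtext returns the text of the first match.
french_title = doc.findtext("shelf[@name='b']/book/title")

# find returns None when nothing matches.
missing = doc.find("magazine")
```

Paths are evaluated relative to the element the method is called on, which is why the descendant search is written .//title rather than //title.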

tostring

# CPython: Lib/xml/etree/ElementTree.py:1032 tostring
def tostring(element, encoding=None, method=None, *,
             xml_declaration=None, default_namespace=None,
             short_empty_elements=True):
    # encoding="unicode" produces str, so it needs a text buffer
    stream = io.StringIO() if encoding == 'unicode' else io.BytesIO()
    ElementTree(element).write(stream, encoding,
                               xml_declaration=xml_declaration,
                               default_namespace=default_namespace,
                               method=method,
                               short_empty_elements=short_empty_elements)
    return stream.getvalue()
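The encoding argument decides both the return type and how non-ASCII characters are written. The sketch below illustrates this with the public tostring function; the note and br elements are made up for the example.

```python
import xml.etree.ElementTree as ET

note = ET.Element("note", {"lang": "fr"})
note.text = "café"

# Default encoding is us-ascii: the result is bytes and non-ASCII
# characters become numeric character references.
as_bytes = ET.tostring(note)                      # b'<note lang="fr">caf&#233;</note>'

# encoding="unicode" returns a str with characters left as-is.
as_text = ET.tostring(note, encoding="unicode")   # '<note lang="fr">café</note>'

# short_empty_elements controls how childless, textless elements are written.
br = ET.Element("br")
empty_short = ET.tostring(br, encoding="unicode")  # '<br />'
empty_long = ET.tostring(br, encoding="unicode",
                         short_empty_elements=False)  # '<br></br>'
```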

iterparse

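Yields (event, element) pairs as the XML is parsed; by default only "end" events are reported. iterparse still builds the full tree, so memory stays bounded on large files only if the caller discards sub-trees it has finished with, for example by calling clear() on the processed element or on the root.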
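Keeping memory bounded is the caller's job. The following is a minimal sketch of the usual pattern, assuming a flat document of repeated record elements; count_entries and the entry tag are hypothetical names chosen for this example.

```python
import io
import xml.etree.ElementTree as ET


def count_entries(source, tag="entry"):
    """Count <tag> elements while discarding each one after it is processed."""
    context = ET.iterparse(source, events=("start", "end"))
    # The first event is the start of the root element; keep a handle on it.
    _, root = next(context)
    total = 0
    for event, elem in context:
        if event == "end" and elem.tag == tag:
            total += 1
            # Drop finished children from the root so the tree does not grow.
            root.clear()
    return total, len(root)


# Synthetic input: 1000 small records inside one root element.
data = b"<log>" + b"".join(
    b"<entry id='%d'>x</entry>" % n for n in range(1000)
) + b"</log>"

total, remaining = count_entries(io.BytesIO(data))
```

Calling elem.clear() alone empties each record but leaves the emptied element attached to its parent; clearing the root also releases those references.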

gopy notes

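Status: not yet ported. Go's encoding/xml handles basic XML marshaling but does not offer an ElementTree-style mutable tree. A faithful ElementTree port needs an Element struct with tag, attrib, text, tail, and a slice of child elements (held by pointer, so that an element returned by find can be mutated in place), plus an incremental parser that fills the role expat plays in CPython. The C _elementtree extension is not needed if Go implements the whole thing natively.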