`. Returns an iterable yielding all
matching elements in document order. *namespaces* is an optional mapping
from namespace prefix to full name.
.. versionadded:: 3.2
.. method:: itertext()
Creates a text iterator. The iterator loops over this element and all
subelements, in document order, and returns all inner text.
.. versionadded:: 3.2
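As a minimal sketch of the behaviour described above (the sample markup is invented for illustration), joining the strings yielded by the iterator recovers the text content with the tags stripped:

```python
import xml.etree.ElementTree as ET

# Hypothetical sample markup, used only for illustration.
para = ET.fromstring("<p>Hello <b>brave</b> new <i>world</i>!</p>")

# itertext() yields the text and tail strings in document order.
pieces = list(para.itertext())
plain = "".join(pieces)
print(plain)
```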
.. method:: makeelement(tag, attrib)
Creates a new element object of the same type as this element. Do not
call this method; use the :func:`SubElement` factory function instead.
.. method:: remove(subelement)
Removes *subelement* from the element. Unlike the find\* methods, this
method compares elements based on the instance identity, not on tag value
or contents.
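The identity-based comparison can be illustrated with a short sketch. The element names are made up for the example; the point is that an element which merely looks like a child is not removed:

```python
import xml.etree.ElementTree as ET

root = ET.fromstring("<root><item/><item/></root>")
first, second = list(root)

# An element with the same tag that is not actually a child of root.
lookalike = ET.Element("item")

try:
    root.remove(lookalike)
    removed_lookalike = True
except ValueError:
    # Raised because lookalike is not one of root's children.
    removed_lookalike = False

# Removing an actual child works, and only that instance is removed.
root.remove(first)
remaining = list(root)
```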
:class:`Element` objects also support the following sequence type methods
for working with subelements: :meth:`~object.__delitem__`,
:meth:`~object.__getitem__`, :meth:`~object.__setitem__`,
:meth:`~object.__len__`.
Caution: Elements with no subelements will test as ``False``. In a future
release of Python, all elements will test as ``True`` regardless of whether
subelements exist. Instead, prefer explicit ``len(elem)`` or
``elem is not None`` tests::
element = root.find('foo')
if not element: # careful!
print("element not found, or element has no subelements")
if element is None:
print("element not found")
.. versionchanged:: 3.12
Testing the truth value of an Element emits :exc:`DeprecationWarning`.
Prior to Python 3.8, the serialisation order of the XML attributes of
elements was artificially made predictable by sorting the attributes by
their name. Based on the now guaranteed ordering of dicts, this arbitrary
reordering was removed in Python 3.8 to preserve the order in which
attributes were originally parsed or created by user code.
In general, user code should try not to depend on a specific ordering of
attributes, given that the `XML Information Set
`_ explicitly excludes the attribute
order from conveying information. Code should be prepared to deal with
any ordering on input. In cases where deterministic XML output is required,
e.g. for cryptographic signing or test data sets, canonical serialisation
is available with the :func:`canonicalize` function.
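As a rough illustration of that point, the following sketch canonicalises two inputs that differ only in attribute order and compares the results. The sample documents are assumptions made for the example:

```python
from xml.etree.ElementTree import canonicalize

# Two serialisations that differ only in attribute order.
doc_a = '<root b="2" a="1"><child/></root>'
doc_b = '<root a="1" b="2"><child/></root>'

# canonicalize() returns the canonical form as a string when no
# output file is given.
canon_a = canonicalize(doc_a)
canon_b = canonicalize(doc_b)

print(canon_a == canon_b)
```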
In cases where canonical output is not applicable but a specific attribute
order is still desirable on output, code should aim for creating the
attributes directly in the desired order, to avoid perceptual mismatches
for readers of the code. In cases where this is difficult to achieve, a
recipe like the following can be applied prior to serialisation to enforce
an order independently from the Element creation::
def reorder_attributes(root):
for el in root.iter():
attrib = el.attrib
if len(attrib) > 1:
# adjust attribute order, e.g. by sorting
attribs = sorted(attrib.items())
attrib.clear()
attrib.update(attribs)
.. _elementtree-elementtree-objects:
ElementTree Objects
^^^^^^^^^^^^^^^^^^^
.. class:: ElementTree(element=None, file=None)
ElementTree wrapper class. This class represents an entire element
hierarchy, and adds some extra support for serialization to and from
standard XML.
*element* is the root element. The tree is initialized with the contents
of the XML *file* if given.
.. method:: _setroot(element)
Replaces the root element for this tree. This discards the current
contents of the tree, and replaces it with the given element. Use with
care. *element* is an element instance.
.. method:: find(match, namespaces=None)
Same as :meth:`Element.find`, starting at the root of the tree.
.. method:: findall(match, namespaces=None)
Same as :meth:`Element.findall`, starting at the root of the tree.
.. method:: findtext(match, default=None, namespaces=None)
Same as :meth:`Element.findtext`, starting at the root of the tree.
.. method:: getroot()
Returns the root element for this tree.
.. method:: iter(tag=None)
Creates and returns a tree iterator for the root element. The iterator
loops over all elements in this tree, in document order. *tag* is the tag
to look for (default is to return all elements).
.. method:: iterfind(match, namespaces=None)
Same as :meth:`Element.iterfind`, starting at the root of the tree.
.. versionadded:: 3.2
.. method:: parse(source, parser=None)
Loads an external XML document into this element tree. *source* is a file
name or :term:`file object`. *parser* is an optional parser instance.
If not given, the standard :class:`XMLParser` parser is used. Returns the
root element of the parsed document.
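A minimal sketch of :meth:`parse`, assuming an in-memory text stream in place of a file on disk (the catalog markup is invented for the example):

```python
import io
from xml.etree.ElementTree import ElementTree

# A file object stands in for a file on disk here.
source = io.StringIO("<catalog><book id='1'/><book id='2'/></catalog>")

tree = ElementTree()
root = tree.parse(source)  # parse() returns the root element

# The returned element is the same object that getroot() gives back.
same_root = root is tree.getroot()
book_ids = [book.get("id") for book in tree.iter("book")]
```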
.. method:: write(file, encoding="us-ascii", xml_declaration=None, \
default_namespace=None, method="xml", *, \
short_empty_elements=True)
Writes the element tree to a file, as XML. *file* is a file name, or a
:term:`file object` opened for writing. *encoding* [1]_ is the output
encoding (default is US-ASCII).
*xml_declaration* controls if an XML declaration should be added to the
file. Use ``False`` for never, ``True`` for always, and ``None`` to add it
only if the encoding is not US-ASCII, UTF-8, or Unicode (default is ``None``).
*default_namespace* sets the default XML namespace (for "xmlns").
*method* is either ``"xml"``, ``"html"`` or ``"text"`` (default is
``"xml"``).
The keyword-only *short_empty_elements* parameter controls the formatting
of elements that contain no content. If ``True`` (the default), they are
emitted as a single self-closed tag, otherwise they are emitted as a pair
of start/end tags.
The output is either a string (:class:`str`) or binary (:class:`bytes`).
This is controlled by the *encoding* argument. If *encoding* is
``"unicode"``, the output is a string; otherwise, it's binary. Note that
this may conflict with the type of *file* if it's an open
:term:`file object`; make sure you do not try to write a string to a
binary stream and vice versa.
.. versionchanged:: 3.4
Added the *short_empty_elements* parameter.
.. versionchanged:: 3.8
The :meth:`write` method now preserves the attribute order specified
by the user.
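The interaction between *encoding* and the type of *file* can be sketched as follows. The in-memory streams are an assumption made to keep the example self-contained; real code would usually pass a file name or an open file:

```python
import io
import xml.etree.ElementTree as ET

tree = ET.ElementTree(ET.fromstring("<root><empty/></root>"))

# encoding="unicode" produces str, so the target must be a text stream.
text_out = io.StringIO()
tree.write(text_out, encoding="unicode")

# Any other encoding produces bytes, so the target must be a binary stream.
bytes_out = io.BytesIO()
tree.write(bytes_out, encoding="utf-8", xml_declaration=True,
           short_empty_elements=False)

print(text_out.getvalue())
print(bytes_out.getvalue())
```

With ``short_empty_elements=False`` the empty element is written as a start/end tag pair rather than as a self-closed tag.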
This is the XML file that is going to be manipulated::
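<html>
    <head>
        <title>Example page</title>
    </head>
    <body>
        <p>Moved to <a href="http://example.org/">example.org</a>
        or <a href="http://example.com/">example.com</a>.</p>
    </body>
</html>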
Example of changing the attribute "target" of every link in first paragraph::
>>> from xml.etree.ElementTree import ElementTree
>>> tree = ElementTree()
>>> tree.parse("index.xhtml")
<Element 'html' at 0x...>
>>> p = tree.find("body/p") # Finds first occurrence of tag p in body
>>> p
<Element 'p' at 0x...>
>>> links = list(p.iter("a")) # Returns list of all links
>>> links
[<Element 'a' at 0x...>, <Element 'a' at 0x...>]
>>> for i in links: # Iterates through all found links
... i.attrib["target"] = "blank"
...
>>> tree.write("output.xhtml")
.. _elementtree-qname-objects:
QName Objects
^^^^^^^^^^^^^
.. class:: QName(text_or_uri, tag=None)
QName wrapper. This can be used to wrap a QName attribute value, in order
to get proper namespace handling on output. *text_or_uri* is a string
containing the QName value, in the form ``{uri}local``, or, if the *tag* argument
is given, the URI part of a QName. If *tag* is given, the first argument is
interpreted as a URI, and this argument is interpreted as a local name.
:class:`QName` instances are opaque.
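A short sketch of both constructor forms, and of a :class:`QName` used as an attribute value, might look like this. The namespace URI and element names are hypothetical:

```python
from xml.etree.ElementTree import Element, QName, tostring

# Hypothetical namespace URI, used only for illustration.
NS = "http://example.com/ns"

one_arg = QName("{http://example.com/ns}item")
two_arg = QName(NS, "item")

# Both forms describe the same qualified name.
same_name = str(one_arg) == str(two_arg)

# Wrapping an attribute value lets the serialiser emit a prefixed name
# together with the matching namespace declaration.
link = Element("link")
link.set("type", two_arg)
serialised = tostring(link, encoding="unicode")
print(serialised)
```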
.. _elementtree-treebuilder-objects:
TreeBuilder Objects
^^^^^^^^^^^^^^^^^^^
.. class:: TreeBuilder(element_factory=None, *, comment_factory=None, \
pi_factory=None, insert_comments=False, insert_pis=False)
Generic element structure builder. This builder converts a sequence of
start, data, end, comment and pi method calls to a well-formed element
structure. You can use this class to build an element structure using
a custom XML parser, or a parser for some other XML-like format.
*element_factory*, when given, must be a callable accepting two positional
arguments: a tag and a dict of attributes. It is expected to return a new
element instance.
The *comment_factory* and *pi_factory* functions, when given, should behave
like the :func:`Comment` and :func:`ProcessingInstruction` functions to
create comments and processing instructions. When not given, the default
factories will be used. When *insert_comments* and/or *insert_pis* is true,
comments/pis will be inserted into the tree if they appear within the root
element (but not outside of it).
.. method:: close()
Flushes the builder buffers, and returns the toplevel document
element. Returns an :class:`Element` instance.
.. method:: data(data)
Adds text to the current element. *data* is a string, either a
bytestring or a Unicode string.
.. method:: end(tag)
Closes the current element. *tag* is the element name. Returns the
closed element.
.. method:: start(tag, attrs)
Opens a new element. *tag* is the element name. *attrs* is a dictionary
containing element attributes. Returns the opened element.
.. method:: comment(text)
Creates a comment with the given *text*. If ``insert_comments`` is true,
this will also add it to the tree.
.. versionadded:: 3.8
.. method:: pi(target, text)
Creates a processing instruction with the given *target* name and *text*.
If ``insert_pis`` is true, this will also add it to the tree.
.. versionadded:: 3.8
In addition, a custom :class:`TreeBuilder` object can provide the
following methods:
.. method:: doctype(name, pubid, system)
Handles a doctype declaration. *name* is the doctype name. *pubid* is
the public identifier. *system* is the system identifier. This method
does not exist on the default :class:`TreeBuilder` class.
.. versionadded:: 3.2
.. method:: start_ns(prefix, uri)
Is called whenever the parser encounters a new namespace declaration,
before the ``start()`` callback for the opening element that defines it.
*prefix* is ``''`` for the default namespace and the declared
namespace prefix name otherwise. *uri* is the namespace URI.
.. versionadded:: 3.8
.. method:: end_ns(prefix)
Is called after the ``end()`` callback of an element that declared
a namespace prefix mapping, with the name of the *prefix* that went
out of scope.
.. versionadded:: 3.8
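The basic ``start()``/``data()``/``end()``/``close()`` protocol of :class:`TreeBuilder` can also be driven by hand, without a parser. The following is a minimal sketch with invented tag names:

```python
from xml.etree.ElementTree import TreeBuilder, tostring

builder = TreeBuilder()

# Each start() must be balanced by an end() with the same tag.
builder.start("greeting", {"lang": "en"})
builder.data("Hello, ")
builder.start("name", {})
builder.data("world")
builder.end("name")
builder.end("greeting")

root = builder.close()  # the top-level Element
xml_text = tostring(root, encoding="unicode")
print(xml_text)
```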
.. class:: C14NWriterTarget(write, *, \
with_comments=False, strip_text=False, rewrite_prefixes=False, \
qname_aware_tags=None, qname_aware_attrs=None, \
exclude_attrs=None, exclude_tags=None)
A `C14N 2.0 `_ writer. Arguments are the
same as for the :func:`canonicalize` function. This class does not build a
tree but translates the callback events directly into a serialised form
using the *write* function.
.. versionadded:: 3.8
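One way to use this class is as the *target* of an :class:`XMLParser`. The sketch below collects the written pieces in a list; the sample document and the variable names are assumptions for illustration:

```python
from xml.etree.ElementTree import C14NWriterTarget, XMLParser

chunks = []

# The write callable receives pieces of canonical text as they are produced.
parser = XMLParser(target=C14NWriterTarget(chunks.append))
parser.feed('<doc z="26" a="1"><!-- dropped by default --><item/></doc>')
parser.close()

canonical = "".join(chunks)
print(canonical)
```

Because *with_comments* defaults to false, the comment does not appear in the output.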
.. _elementtree-xmlparser-objects:
XMLParser Objects
^^^^^^^^^^^^^^^^^
.. class:: XMLParser(*, target=None, encoding=None)
This class is the low-level building block of the module. It uses
:mod:`xml.parsers.expat` for efficient, event-based parsing of XML. It can
be fed XML data incrementally with the :meth:`feed` method, and parsing
events are translated to a push API by invoking callbacks on the *target*
object. If *target* is omitted, the standard :class:`TreeBuilder` is used.
If *encoding* [1]_ is given, the value overrides the
encoding specified in the XML file.
.. versionchanged:: 3.8
Parameters are now :ref:`keyword-only `.
The *html* argument is no longer supported.
.. method:: close()
Finishes feeding data to the parser. Returns the result of calling the
``close()`` method of the *target* passed during construction; by default,
this is the toplevel document element.
.. method:: feed(data)
Feeds data to the parser. *data* is encoded data.
.. method:: flush()
Triggers parsing of any previously fed unparsed data, which can be
used to ensure more immediate feedback, in particular with Expat >=2.6.0.
The implementation of :meth:`flush` temporarily disables reparse deferral
with Expat (if currently enabled) and triggers a reparse.
Disabling reparse deferral has security consequences; please see
:meth:`xml.parsers.expat.xmlparser.SetReparseDeferralEnabled` for details.
:meth:`!flush`
has been backported to some prior releases of CPython as a security fix.
Check for availability using :func:`hasattr` if used in code running
across a variety of Python versions.
.. versionadded:: 3.13
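Incremental feeding combined with the :func:`hasattr` check suggested above might look like the following sketch. The chunk boundaries are arbitrary and chosen only to show that data may be split anywhere:

```python
from xml.etree.ElementTree import XMLParser

# Hypothetical chunks, standing in for data that arrives over time.
chunks = ["<log><entry>fir", "st</entry><entry>sec", "ond</entry></log>"]

parser = XMLParser()
for chunk in chunks:
    parser.feed(chunk)
    # flush() may be missing on older interpreters, so probe for it.
    if hasattr(parser, "flush"):
        parser.flush()

root = parser.close()  # with the default target, the top-level element
entries = [entry.text for entry in root]
print(entries)
```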
:meth:`XMLParser.feed` calls *target*\'s ``start(tag, attrs_dict)`` method
for each opening tag, its ``end(tag)`` method for each closing tag, and data
is processed by method ``data(data)``. For further supported callback
methods, see the :class:`TreeBuilder` class. :meth:`XMLParser.close` calls
*target*\'s method ``close()``. :class:`XMLParser` is not limited to
building a tree structure. The following example computes the maximum depth
of an XML file::
>>> from xml.etree.ElementTree import XMLParser
>>> class MaxDepth: # The target object of the parser
... maxDepth = 0
... depth = 0
... def start(self, tag, attrib): # Called for each opening tag.
... self.depth += 1
... if self.depth > self.maxDepth:
... self.maxDepth = self.depth
... def end(self, tag): # Called for each closing tag.
... self.depth -= 1
... def data(self, data):
... pass # We do not need to do anything with data.
... def close(self): # Called when all data has been parsed.
... return self.maxDepth
...
>>> target = MaxDepth()
>>> parser = XMLParser(target=target)
>>> exampleXml = """
... <a>
...   <b>
...   </b>
...   <b>
...     <c>
...       <d>
...       </d>
...     </c>
...   </b>
... </a>"""
>>> parser.feed(exampleXml)
>>> parser.close()
4
.. _elementtree-xmlpullparser-objects:
XMLPullParser Objects
^^^^^^^^^^^^^^^^^^^^^
.. class:: XMLPullParser(events=None)
A pull parser suitable for non-blocking applications. Its input-side API is
similar to that of :class:`XMLParser`, but instead of pushing calls to a
callback target, :class:`XMLPullParser` collects an internal list of parsing
events and lets the user read from it. *events* is a sequence of events to
report back. The supported events are the strings ``"start"``, ``"end"``,
``"comment"``, ``"pi"``, ``"start-ns"`` and ``"end-ns"`` (the "ns" events
are used to get detailed namespace information). If *events* is omitted,
only ``"end"`` events are reported.
.. method:: feed(data)
Feed the given bytes data to the parser.
.. method:: flush()
Triggers parsing of any previously fed unparsed data, which can be
used to ensure more immediate feedback, in particular with Expat >=2.6.0.
The implementation of :meth:`flush` temporarily disables reparse deferral
with Expat (if currently enabled) and triggers a reparse.
Disabling reparse deferral has security consequences; please see
:meth:`xml.parsers.expat.xmlparser.SetReparseDeferralEnabled` for details.
:meth:`!flush`
has been backported to some prior releases of CPython as a security fix.
Check for availability using :func:`hasattr` if used in code running
across a variety of Python versions.
.. versionadded:: 3.13
.. method:: close()
Signal the parser that the data stream is terminated. Unlike
:meth:`XMLParser.close`, this method always returns :const:`None`.
Any events not yet retrieved when the parser is closed can still be
read with :meth:`read_events`.
.. method:: read_events()
Return an iterator over the events which have been encountered in the
data fed to the
parser. The iterator yields ``(event, elem)`` pairs, where *event* is a
string representing the type of event (e.g. ``"end"``) and *elem* is the
encountered :class:`Element` object, or other context value as follows.
* ``start``, ``end``: the current Element.
* ``comment``, ``pi``: the current comment / processing instruction
* ``start-ns``: a tuple ``(prefix, uri)`` naming the declared namespace
mapping.
* ``end-ns``: :const:`None` (this may change in a future version)
Events provided in a previous call to :meth:`read_events` will not be
yielded again. Events are consumed from the internal queue only when
they are retrieved from the iterator, so multiple readers iterating in
parallel over iterators obtained from :meth:`read_events` will have
unpredictable results.
.. note::
:class:`XMLPullParser` only guarantees that it has seen the ">"
character of a starting tag when it emits a "start" event, so the
attributes are defined, but the contents of the text and tail attributes
are undefined at that point. The same applies to the element children;
they may or may not be present.
If you need a fully populated element, look for "end" events instead.
.. versionadded:: 3.4
.. versionchanged:: 3.8
The ``comment`` and ``pi`` events were added.
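A minimal sketch of the feed/read cycle follows. How many events are already available after the first :meth:`feed` call may vary with the Expat version, so the example only relies on the combined sequence once the parser is closed:

```python
from xml.etree.ElementTree import XMLPullParser

parser = XMLPullParser(events=("start", "end"))
seen = []

# Feed the document in two pieces, draining events after each step.
parser.feed("<root><item>one</item>")
seen.extend((event, elem.tag) for event, elem in parser.read_events())

parser.feed("<item>two</item></root>")
parser.close()
# Events still pending at close() time can be read afterwards.
seen.extend((event, elem.tag) for event, elem in parser.read_events())

print(seen)
```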
Exceptions
^^^^^^^^^^
.. class:: ParseError
XML parse error, raised by the various parsing methods in this module when
parsing fails. The string representation of an instance of this exception
will contain a user-friendly error message. In addition, it will have
the following attributes available:
.. attribute:: code
A numeric error code from the expat parser. See the documentation of
:mod:`xml.parsers.expat` for the list of error codes and their meanings.
.. attribute:: position
A tuple of *line*, *column* numbers, specifying where the error occurred.
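The attributes can be inspected when handling the exception. The following sketch uses deliberately malformed input, invented for the example:

```python
import xml.etree.ElementTree as ET

try:
    # The <unclosed> element is never closed, so parsing fails.
    ET.fromstring("<root><unclosed></root>")
except ET.ParseError as exc:
    message = str(exc)           # human-readable description
    error_code = exc.code        # numeric code from expat
    line, column = exc.position  # where the error was detected
    print(message, error_code, line, column)
```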
.. rubric:: Footnotes
.. [1] The encoding string included in XML output should conform to the
appropriate standards. For example, "UTF-8" is valid, but "UTF8" is
not. See https://www.w3.org/TR/2006/REC-xml11-20060816/#NT-EncodingDecl
and https://www.iana.org/assignments/character-sets/character-sets.xhtml.