Inserting newlines in an XML file generated via xml.etree.ElementTree in Python

I think the easiest solution is to switch to the third-party lxml library. In most cases you can just change your import from "import xml.etree.ElementTree as etree" to "from lxml import etree" or similar.

You can then use the pretty_print option when serializing:

tree.write(filename, pretty_print=True)

(also available on etree.tostring)


Before Python 3.9, ElementTree had no pretty-printing support, but you can use other XML modules from the standard library.

For example, xml.dom.minidom.Node.toprettyxml():

Node.toprettyxml(indent="\t", newl="\n", encoding=None)

Return a pretty-printed version of the document. indent specifies the indentation string and defaults to a tabulator; newl specifies the string emitted at the end of each line and defaults to \n.

Use indent and newl to fit your requirements.
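
For instance, here is a minimal sketch that wraps the minidom round trip in a helper and asks for a two-space indent. The name prettify and its defaults are my own choices for illustration, not part of either module. On current Python 3 versions, elements that contain only text stay on a single line.

```python
from xml.dom import minidom
from xml.etree import ElementTree


def prettify(element, indent="  ", newl="\n"):
    """Return a pretty-printed string for an ElementTree element.

    Hypothetical helper: it serializes the element, reparses it with
    minidom, and lets toprettyxml() apply the indent and newl strings.
    """
    raw = ElementTree.tostring(element, encoding="unicode")
    return minidom.parseString(raw).toprettyxml(indent=indent, newl=newl)


root = ElementTree.XML("<tips><tip>1</tip><tip>2</tip></tips>")
pretty = prettify(root)
print(pretty)
```

Note that this reparses the whole document, so for large files it costs more than indenting the tree in place.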

An example, using the default formatting characters (Python 3, where tostring returns bytes and the default indent is a tab):

>>> from xml.dom import minidom
>>> from xml.etree import ElementTree
>>> tree1 = ElementTree.XML('<tips><tip>1</tip><tip>2</tip></tips>')
>>> ElementTree.tostring(tree1)
b'<tips><tip>1</tip><tip>2</tip></tips>'
>>> print(minidom.parseString(ElementTree.tostring(tree1)).toprettyxml())
<?xml version="1.0" ?>
<tips>
	<tip>1</tip>
	<tip>2</tip>
</tips>


UPDATE 2022 - Python 3.9 and later versions

For Python 3.9 and later, the standard library includes xml.etree.ElementTree.indent, which indents a tree or element in place:

Example:

import xml.etree.ElementTree as ET

root = ET.fromstring("<fruits><fruit>banana</fruit><fruit>apple</fruit></fruits>")
tree = ET.ElementTree(root)

# indent in place, two spaces per level
ET.indent(tree, '  ')
# writing xml
tree.write("example.xml", encoding="utf-8", xml_declaration=True)

Thanks Michał Krzywański for this update!
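
If you want the indented XML as a string instead of a file, a small sketch along these lines should work on Python 3.9 or later. ET.indent also accepts an Element directly, and the space argument sets the indentation used per level (two spaces here, chosen only for illustration).

```python
import xml.etree.ElementTree as ET

root = ET.fromstring("<fruits><fruit>banana</fruit><fruit>apple</fruit></fruits>")

# Requires Python 3.9+; modifies the element in place and returns None.
ET.indent(root, space="  ")

# encoding="unicode" makes tostring return str instead of bytes.
text = ET.tostring(root, encoding="unicode")
print(text)
```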

BEFORE Python 3.9

I found a way that avoids extra libraries and reparsing the XML. You just need to pass your root element to this function (explained below):

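def indent(elem, level=0):
    # Indent elem and its descendants in place, two spaces per level.
    i = "\n" + level*"  "
    if len(elem):
        # Whitespace before the first child, one level deeper.
        if not elem.text or not elem.text.strip():
            elem.text = i + "  "
        if not elem.tail or not elem.tail.strip():
            elem.tail = i
        for child in elem:
            indent(child, level+1)
        # Dedent after the last child so the closing tag lines up.
        if not child.tail or not child.tail.strip():
            child.tail = i
    else:
        if level and (not elem.tail or not elem.tail.strip()):
            elem.tail = i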

xml.etree.ElementTree.Element instances have an attribute named "tail". It holds the text that follows the element's closing tag, up to the next tag:

"<a>text</a>tail"

I found a page from 2004 describing Element Library Functions that use this "tail" to indent an element.
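
To see what "tail" holds in practice, here is a small sketch. The sample document and variable names are made up for illustration. It reads the text and tail of an element, then replaces the tail with a newline plus indentation, which is the same trick the indent function relies on.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring("<root><a>text</a>tail<b>more</b></root>")
a = root.find("a")

original_text = a.text  # content inside <a>...</a>
original_tail = a.tail  # content between </a> and the next tag
print(repr(original_text), repr(original_tail))

# Replacing the tail changes what is written after </a> on serialization.
a.tail = "\n  "
serialized = ET.tostring(root, encoding="unicode")
print(serialized)
```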

Example:

import xml.etree.ElementTree as ET

root = ET.fromstring("<fruits><fruit>banana</fruit><fruit>apple</fruit></fruits>")
tree = ET.ElementTree(root)

indent(root)
# writing xml
tree.write("example.xml", encoding="utf-8", xml_declaration=True)

Result on "example.xml":

<?xml version='1.0' encoding='utf-8'?>
<fruits>
  <fruit>banana</fruit>
  <fruit>apple</fruit>
</fruits>
