Pretty print XML in java 8

I guess that the problem is related to blank text nodes (i.e. text nodes with only whitespaces) in the original file. You should try to programmatically remove them just after the parsing, using the following code. If you don't remove them, the Transformer is going to preserve them.

original.getDocumentElement().normalize();
XPathExpression xpath = XPathFactory.newInstance().newXPath().compile("//text()[normalize-space(.) = '']");
NodeList blankTextNodes = (NodeList) xpath.evaluate(original, XPathConstants.NODESET);

for (int i = 0; i < blankTextNodes.getLength(); i++) {
     blankTextNodes.item(i).getParentNode().removeChild(blankTextNodes.item(i));
}

In reply to Espinosa's comment, here is a solution when "the original xml is not already (partially) indented or contain new lines".

Background

Excerpt from the article (see References below) inspiring this solution:

Based on the DOM specification, whitespaces outside the tags are perfectly valid and they are properly preserved. To remove them, we can use XPath’s normalize-space to locate all the whitespace nodes and remove them first.

Java Code

public static String toPrettyString(String xml, int indent) {
    try {
        // Turn xml string into a document
        Document document = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new ByteArrayInputStream(xml.getBytes("utf-8"))));

        // Remove whitespaces outside tags
        document.normalize();
        XPath xPath = XPathFactory.newInstance().newXPath();
        NodeList nodeList = (NodeList) xPath.evaluate("//text()[normalize-space()='']",
                                                      document,
                                                      XPathConstants.NODESET);

        for (int i = 0; i < nodeList.getLength(); ++i) {
            Node node = nodeList.item(i);
            node.getParentNode().removeChild(node);
        }

        // Setup pretty print options
        TransformerFactory transformerFactory = TransformerFactory.newInstance();
        transformerFactory.setAttribute("indent-number", indent);
        Transformer transformer = transformerFactory.newTransformer();
        transformer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        transformer.setOutputProperty(OutputKeys.INDENT, "yes");

        // Return pretty print xml string
        StringWriter stringWriter = new StringWriter();
        transformer.transform(new DOMSource(document), new StreamResult(stringWriter));
        return stringWriter.toString();
    } catch (Exception e) {
        throw new RuntimeException(e);
    }
}

Sample usage

String xml = "<root>" + //
             "\n   "  + //
             "\n<name>Coco Puff</name>" + //
             "\n        <total>10</total>    </root>";

System.out.println(toPrettyString(xml, 4));

Output

<root>
    <name>Coco Puff</name>
    <total>10</total>
</root>

References

  • Java: Properly Indenting XML String published on MyShittyCode
  • Save new XML node to file