Is there a stylesheet or Windows commandline tool for controllable XML formatting, specifically putting attributes one-per-line?

Here's a PowerShell script to do it. It takes the following input:

<?xml version="1.0" encoding="utf-8"?>
<Node>
    <ChildNode value1="5" value2="6" value3="happy" />
</Node>

...and produces this as output:

<?xml version="1.0" encoding="utf-8"?>
<Node>
  <ChildNode
    value1="5"
    value2="6"
    value3="happy" />
</Node>

Here you go:

param(
    [string] $inputFile = $(throw "Please enter an input file name"),
    [string] $outputFile = $(throw "Please supply an output file name")
)

$data = [xml](Get-Content $inputFile)

$xws = new-object System.Xml.XmlWriterSettings
$xws.Indent = $true
$xws.IndentChars = "  "
$xws.NewLineOnAttributes = $true

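# Close the writer explicitly so the output is flushed and the file handle released.
$writer = [Xml.XmlWriter]::Create($outputFile, $xws)
try { $data.Save($writer) } finally { $writer.Close() }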

Take that script, save it as C:\formatxml.ps1. Then, from a PowerShell prompt type the following:

C:\formatxml.ps1 C:\Path\To\UglyFile.xml C:\Path\To\NeatAndTidyFile.xml

The script only uses .NET Framework classes (XmlDocument and XmlWriterSettings), so you could very easily migrate it into a C# application.

NOTE: If you have not run scripts from PowerShell before, you will have to execute the following command at an elevated PowerShell prompt before you will be able to execute the script:

Set-ExecutionPolicy RemoteSigned

You only have to do this one time though.

I hope that's useful to you.


Here's a small C# sample, which can be used directly by your code, or built into an exe and called at the command line as "myexe from.xml to.xml":

    using System;
    using System.Xml;

    static class Program
    {
        static void Main(string[] args)
        {
            XmlWriterSettings settings = new XmlWriterSettings {
                NewLineHandling = NewLineHandling.Entitize,
                NewLineOnAttributes = true, Indent = true, IndentChars = "  ",
                NewLineChars = Environment.NewLine
            };

            // args[0] is the input file, args[1] is the output file.
            using (XmlReader reader = XmlReader.Create(args[0]))
            using (XmlWriter writer = XmlWriter.Create(args[1], settings)) {
                // Copy the whole document; disposing the writer flushes and closes it.
                writer.WriteNode(reader, false);
            }
        }
    }

Sample input:

<Node><ChildNode value1='5' value2='6' value3='happy' /></Node>

Sample output (note you can remove the <?xml ... with settings.OmitXmlDeclaration):

<?xml version="1.0" encoding="utf-8"?>
<Node>
  <ChildNode
    value1="5"
    value2="6"
    value3="happy" />
</Node>
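To illustrate the OmitXmlDeclaration remark, here is a minimal sketch that formats the sample input and writes it to the console without the <?xml ... ?> line. The class name OmitDeclarationDemo and the hard-coded input string are only placeholders for illustration; the settings are the same ones used above plus OmitXmlDeclaration.

```csharp
using System;
using System.IO;
using System.Xml;

static class OmitDeclarationDemo   // hypothetical name, for illustration only
{
    static void Main()
    {
        var settings = new XmlWriterSettings {
            Indent = true,
            IndentChars = "  ",
            NewLineOnAttributes = true,
            OmitXmlDeclaration = true   // suppress the <?xml ... ?> declaration
        };

        // Placeholder input: the sample document from above.
        const string xml = "<Node><ChildNode value1='5' value2='6' value3='happy' /></Node>";

        using (XmlReader reader = XmlReader.Create(new StringReader(xml)))
        using (XmlWriter writer = XmlWriter.Create(Console.Out, settings))
        {
            // Copy every node from the reader to the writer, reformatting on the way.
            writer.WriteNode(reader, false);
        }
    }
}
```

Because the output is a fragment without a declaration, consumers have to assume the encoding, so this is probably best kept for snippets that get embedded elsewhere.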

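Note that if you want a string rather than a file, just write to a StringBuilder instead (this needs System.IO and System.Text):

StringBuilder sb = new StringBuilder();
using (XmlReader reader = XmlReader.Create(new StringReader(oldXml)))
using (XmlWriter writer = XmlWriter.Create(sb, settings)) {
    writer.WriteNode(reader, false);
}
// Read the builder only after the writer is disposed, so everything is flushed.
// The declaration will say encoding="utf-16" unless OmitXmlDeclaration is set.
string newXml = sb.ToString();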

Try Tidy over on SourceForge. Although it's often used on [X]HTML, I've used it successfully on XML before; just make sure you use the -xml option.

http://tidy.sourceforge.net/#docs

Tidy reads HTML, XHTML and XML files and writes cleaned up markup. ... For generic XML files, Tidy is limited to correcting basic well-formedness errors and pretty printing.

People have ported it to several platforms, and it is available both as an executable and as a callable library.

Tidy has a heap of options including:

http://api.html-tidy.org/tidy/quickref_5.0.0.html#indent

indent-attributes
Type: Boolean
Default: no
Example: y/n, yes/no, t/f, true/false, 1/0
This option specifies if Tidy should begin each attribute on a new line.
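
As a rough sketch, the relevant options could be collected in a Tidy configuration file. The file name tidy-xml.txt is only a placeholder; input-xml is the configuration-file counterpart of the -xml switch, and indent turns on pretty printing so that indent-attributes has something to work with.

```
// tidy-xml.txt (hypothetical file name)
input-xml: yes
indent: yes
indent-attributes: yes
```

Assuming tidy is on your PATH, it would then be run along the lines of: tidy -config tidy-xml.txt -o NeatAndTidyFile.xml UglyFile.xml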

One caveat:

Limited support for XML

XML processors compliant with W3C's XML 1.0 recommendation are very picky about which files they will accept. Tidy can help you to fix errors that cause your XML files to be rejected. Tidy doesn't yet recognize all XML features though, e.g. it doesn't understand CDATA sections or DTD subsets.

But I suspect that unless your XML is really advanced, the tool should work fine.