XSLT - remove whitespace from template

In XSLT, white-space is preserved by default, since it can very well be relevant data.

The best way to prevent unwanted white-space in the output is not to create it in the first place. Don't do:

<xsl:template match="foo">
  foo
</xsl:template>

because that's "\n··foo\n" from the processor's point of view. Rather do:

<xsl:template match="foo">
  <xsl:text>foo</xsl:text>
</xsl:template>

White-space in the stylesheet is ignored only when it forms a text node on its own, that is, when it sits between XML elements with no other text next to it. Simply put: never use "naked" text anywhere in your XSLT code, always enclose it in an element.
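
As a minimal sketch of that rule, the stylesheet below builds one output line per foo element using nothing but xsl:text and xsl:value-of. The name attribute and the text output method are assumptions made for illustration; they are not part of the original question.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output method="text"/>

  <!-- Hypothetical input: <foo name="bar"/> -->
  <xsl:template match="foo">
    <!-- Literal text lives inside xsl:text, so the indentation
         around these instructions never reaches the output. -->
    <xsl:text>foo: </xsl:text>
    <xsl:value-of select="@name"/>
    <!-- &#10; is an explicit line feed. -->
    <xsl:text>&#10;</xsl:text>
  </xsl:template>
</xsl:stylesheet>
```

For the hypothetical input above this should emit "foo: bar" followed by a single line feed, with no leading indentation.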

Also, using an unspecific call without a select attribute:

<xsl:apply-templates />

is problematic, because the default XSLT rule for text nodes says "copy them to the output". This applies to "white-space-only" nodes as well. For instance:

<xml>
  <data> value </data>
</xml>

contains three text nodes:

  1. "\n··" (right after <xml>)
  2. "·value·"
  3. "\n" (right before </xml>)

To keep #1 and #3 from sneaking into the output (the most common cause of unwanted spaces), you can override the default rule for text nodes by declaring an empty template:

<xsl:template match="text()" />

All text nodes are now muted and text output must be created explicitly:

<xsl:value-of select="data" />
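
Put together, the two pieces could look like the following sketch. It assumes the small xml/data sample from above as input and a text output method (my assumption, chosen so that no XML declaration is written).

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output method="text"/>

  <!-- Mute the built-in rule that copies text nodes to the output. -->
  <xsl:template match="text()"/>

  <!-- Emit the value explicitly; "xml" is the root element of the sample. -->
  <xsl:template match="xml">
    <xsl:value-of select="data"/>
  </xsl:template>
</xsl:stylesheet>
```

The output is then just "·value·": the two white-space-only nodes are gone, while the spaces inside data survive because xsl:value-of copies the string value unchanged.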

To remove white-space from a value, you could use the XPath function normalize-space():

<xsl:value-of select="normalize-space(data)" />

But be careful: the function strips leading and trailing white-space and also collapses every internal run of white-space into a single space, e.g. "·value··1·" would become "value·1".
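
A small sketch to make that behaviour visible. The variable name raw and the square brackets around the result are only there for illustration.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- Hypothetical value with leading, trailing and doubled spaces. -->
    <xsl:variable name="raw" select="' value  1 '"/>
    <!-- The brackets mark where the normalized string starts and ends. -->
    <xsl:text>[</xsl:text>
    <xsl:value-of select="normalize-space($raw)"/>
    <xsl:text>]</xsl:text>
  </xsl:template>
</xsl:stylesheet>
```

Run against any input document, this prints [value 1]: the outer spaces are gone and the double space in the middle has become a single one.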

Additionally you can use the <xsl:strip-space> and <xsl:preserve-space> elements, though usually this is not necessary (and personally, I prefer explicit white-space handling as indicated above).
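
For completeness, a sketch of how the two declarations could be combined, again assuming the xml/data sample from above. Both are top-level elements, and they only affect text nodes in the source document that consist entirely of white-space.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output method="text"/>

  <!-- Drop white-space-only text nodes from every source element... -->
  <xsl:strip-space elements="*"/>
  <!-- ...except inside <data>, where white-space may be significant. -->
  <xsl:preserve-space elements="data"/>

  <xsl:template match="data">
    <xsl:value-of select="."/>
  </xsl:template>
</xsl:stylesheet>
```

With the sample input, nodes #1 and #3 are removed before any template runs, so only "·value·" is written. The spaces around "value" stay, because that text node is not white-space-only.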


As for removing tabs while retaining separate lines, I tried the following XSLT 1.0 approach, and it works rather well. Whether you use version 1.0 or 2.0 largely depends on your platform. It looks like the XSLT processor built into .NET still supports only XSLT 1.0, so there you're limited to rather messy templates (see below). If you're using Java or something else with an XSLT 2.0 processor, please refer to the much cleaner XSLT 2.0 approach listed towards the very bottom.

These examples are meant as starting points for you to extend to your specific needs. I'm using tabs here as an example, but the approach is generic enough to handle other characters.

XML:

<?xml version="1.0" encoding="UTF-8"?>
<text>
        adslfjksdaf

                dsalkfjdsaflkj

            lkasdfjlsdkfaj
</text>

...and the XSLT 1.0 template (required if you use .NET):

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">

  <!-- Recursively replaces every occurrence of $search-string
       in $input with $replace-string. -->
  <xsl:template name="search-and-replace">
    <xsl:param name="input"/>
    <xsl:param name="search-string"/>
    <xsl:param name="replace-string"/>
    <xsl:choose>
      <!-- A match remains: emit the text before it, then the
           replacement, then recurse on the text after it. -->
      <xsl:when test="$search-string and
                      contains($input, $search-string)">
        <xsl:value-of
            select="substring-before($input, $search-string)"/>
        <xsl:value-of select="$replace-string"/>
        <xsl:call-template name="search-and-replace">
          <xsl:with-param name="input"
              select="substring-after($input, $search-string)"/>
          <xsl:with-param name="search-string"
              select="$search-string"/>
          <xsl:with-param name="replace-string"
              select="$replace-string"/>
        </xsl:call-template>
      </xsl:when>
      <!-- No match left: emit the remainder unchanged. -->
      <xsl:otherwise>
        <xsl:value-of select="$input"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

  <!-- Strip every tab character from the text of <text>. -->
  <xsl:template match="text">
    <xsl:call-template name="search-and-replace">
      <xsl:with-param name="input" select="text()"/>
      <xsl:with-param name="search-string" select="'&#x9;'"/>
      <xsl:with-param name="replace-string" select="''"/>
    </xsl:call-template>
  </xsl:template>
</xsl:stylesheet>
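
If all you need is to delete single characters such as a tab, XSLT 1.0 also has a shorter route: the XPath 1.0 function translate() drops every character of its second argument that has no counterpart in the third. A minimal sketch, assuming the same text input as above; the text output method is my addition. The recursive template is still needed once the search string is longer than one character.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output method="text"/>

  <xsl:template match="text">
    <!-- The empty third argument maps the tab to nothing,
         so every tab is deleted; line breaks are left alone. -->
    <xsl:value-of select="translate(., '&#x9;', '')"/>
  </xsl:template>
</xsl:stylesheet>
```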

XSLT 2.0 makes this trivial with the replace function:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
      xmlns:xs="http://www.w3.org/2001/XMLSchema"
      exclude-result-prefixes="xs"
      version="2.0">
 <xsl:template match="text">
  <!-- "." is the element's string value, so this also works when
       <text> holds more than one text node. -->
  <xsl:value-of select="replace(., '&#x9;', '')" />
 </xsl:template>
</xsl:stylesheet>
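
Since replace() takes a regular expression, the same idea can be stretched to strip all leading indentation, tabs and spaces alike, while keeping the line breaks. This is only a sketch under the assumption that you want every line left-aligned; the text output method is my addition.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0">
  <xsl:output method="text"/>

  <xsl:template match="text">
    <!-- The 'm' flag makes ^ match at the start of every line, so
         leading tabs and spaces are removed line by line. -->
    <xsl:value-of select="replace(., '^[ \t]+', '', 'm')"/>
  </xsl:template>
</xsl:stylesheet>
```

Line breaks are not part of the character class, so empty lines between the entries are kept as they are.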

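By default, white-space-only text nodes in the source document are preserved (as if <xsl:preserve-space elements="*"/> were set), so they can end up in your output. You can add <xsl:strip-space elements="*"/> as a top-level element to tell the processor in which elements to delete such white-space.

You may also need a template that applies normalize-space() to the text nodes, like so: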

<xsl:template match="text()">
    <xsl:value-of select="normalize-space(.)"/>
</xsl:template> 

W3Schools has examples for both preserve-space and strip-space.
