Dealing with invalid XML hexadecimal characters

The following code removes XML invalid characters from a string and returns a new string without them:

public static string CleanInvalidXmlChars(string text) 
{ 
     // From xml spec valid chars: 
     // #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]     
     // any Unicode character, excluding the surrogate blocks, FFFE, and FFFF. 
     string re = @"[^\x09\x0A\x0D\x20-\xD7FF\xE000-\xFFFD\x10000-x10FFFF]"; 
     return Regex.Replace(text, re, ""); 
}

byte[] toEncodeAsBytes
            = System.Text.ASCIIEncoding.ASCII.GetBytes(toEncode);
      string returnValue
            = System.Convert.ToBase64String(toEncodeAsBytes);

is one way of doing this

Tags:

C#

.Net 3.5

Xml