Stopping XSS when using WebAPI

NOTE: Please read the whole answer; I switched from the AntiXss library to HtmlSanitizer partway through. Also: test, test, test! I'm not a security expert.

According to the official documentation, you can just do the following in your web.config:

<httpRuntime encoderType="System.Web.Security.AntiXss.AntiXssEncoder" /> 

You don't need to install the AntiXss library anymore since it is now included in .NET 4.5.

UPDATE:

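It turned out that setting the encoderType in web.config wasn't enough: it only changes which encoder is used for HTML and URL encoding of output, and does nothing to incoming JSON. What I ended up doing was intercepting the deserialized JSON and validating it like so: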

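using System;
using System.Web.Security.AntiXss;
using Newtonsoft.Json;
using Newtonsoft.Json.Linq;

public class AntiXssConverter : JsonConverter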
{
    public override bool CanConvert(Type objectType)
    {
        return objectType == typeof(string);
    }

    public override object ReadJson(JsonReader reader, Type objectType, object existingValue, JsonSerializer serializer)
    {
        var stringValue = (string) reader.Value;
        ThrowIfForbiddenInput(stringValue);
        return stringValue;
    }

    public override void WriteJson(JsonWriter writer, object value, JsonSerializer serializer)
    {
        var token = JToken.FromObject(value);
        token.WriteTo(writer);
    }

    private static void ThrowIfForbiddenInput(string value)
    {
        if (string.IsNullOrWhiteSpace(value))
        {
            return;
        }

        var encoded = AntiXssEncoder.HtmlEncode(value, true);
        if (value != encoded)
        {
            throw new Exception("Forbidden input. The following characters are not allowed: &, <, >, \", '");
        }
    }
}

Register the converter like this:

config.Formatters.JsonFormatter.SerializerSettings.Converters = new List<JsonConverter>
{
    new AntiXssConverter()
};

If the data contains any illegal characters, I simply throw an exception because I don't want to accept it in my backend. Others might want to simply sanitize the input.

Another thing to do, just in case, is to configure Web API to escape HTML in its JSON output, like so:

config.Formatters.JsonFormatter.SerializerSettings.StringEscapeHandling = 
    StringEscapeHandling.EscapeHtml;
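
To see what that setting does outside of Web API, here is a minimal sketch that applies the same Json.NET option to a standalone serializer call. The EscapeHtmlDemo class and the Comment property are made-up names for illustration; only JsonSerializerSettings.StringEscapeHandling comes from the configuration above.

```csharp
using System;
using Newtonsoft.Json;

public static class EscapeHtmlDemo
{
    // Serializes a value with the same escaping option that the
    // Web API formatter is configured with above.
    public static string Serialize(object value)
    {
        var settings = new JsonSerializerSettings
        {
            StringEscapeHandling = StringEscapeHandling.EscapeHtml
        };
        return JsonConvert.SerializeObject(value, settings);
    }

    public static void Main()
    {
        // Hypothetical payload; the HTML-sensitive characters in the string
        // are written as \uXXXX escapes instead of literally.
        Console.WriteLine(Serialize(new { Comment = "<script>alert('x')</script>" }));
    }
}
```

The JSON is still valid and decodes to the original string on the client, so this only helps when the raw response text might be interpreted as HTML. It does not replace encoding at the point where the value is inserted into a page.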

This covered everything for me.

SECOND UPDATE:

I've decided to switch from the AntiXss library to HtmlSanitizer. AntiXss was way too restrictive: it encoded all foreign characters (ä, ö, etc.), and I couldn't get it to allow them even though the Unicode block was in the whitelist.

The other nice thing about this library is that it's unit tested with the OWASP XSS Filter Evasion Cheat Sheet.
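
The HtmlSanitizer version of the converter isn't shown here, so the following is only a sketch of what it could look like, not the exact code I ended up with. It assumes the HtmlSanitizer NuGet package with the Ganss.Xss namespace (older package versions use Ganss.XSS), and the SanitizingConverter name is made up. It keeps the "reject, don't repair" behaviour: a value is refused whenever sanitizing would change it.

```csharp
using System;
using Ganss.Xss; // assumption: older HtmlSanitizer versions use Ganss.XSS instead
using Newtonsoft.Json;

// Hypothetical converter: rejects any string that HtmlSanitizer would alter.
public class SanitizingConverter : JsonConverter
{
    private readonly HtmlSanitizer _sanitizer = new HtmlSanitizer();

    public override bool CanConvert(Type objectType)
    {
        return objectType == typeof(string);
    }

    public override object ReadJson(JsonReader reader, Type objectType, object existingValue, JsonSerializer serializer)
    {
        var raw = (string)reader.Value;
        if (string.IsNullOrWhiteSpace(raw))
        {
            return raw;
        }

        // If sanitizing changes the value, it contained something the
        // sanitizer strips or re-encodes, so refuse the input.
        var clean = _sanitizer.Sanitize(raw);
        if (!string.Equals(raw, clean, StringComparison.Ordinal))
        {
            throw new JsonSerializationException("Forbidden input: the value contains markup that is not allowed.");
        }

        return raw;
    }

    // Writing is left to Json.NET's default string handling.
    public override bool CanWrite
    {
        get { return false; }
    }

    public override void WriteJson(JsonWriter writer, object value, JsonSerializer serializer)
    {
        throw new NotSupportedException();
    }
}
```

As far as I can tell the sanitizer re-serializes plain text as well, so a bare & would probably come back as &amp; and be rejected too, while characters like ä and ö pass through. Test this against the package version you actually install.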

THIRD UPDATE:

If you go with the JsonConverter approach above, be aware that it can be bypassed: a client can simply send a different Content-Type (such as "application/x-www-form-urlencoded"), in which case the body is read by a different formatter, the converter never runs, and the request still goes through to the server.

In order to avoid this, I cleared all other formatters, leaving only the JSON one, like so:

config.Formatters.Clear();
config.Formatters.Add(new JsonMediaTypeFormatter());

Then, in order to skip my XSS converter on specific properties (password fields, for example), I found a great solution in another answer: create a dummy "NoConverter" class that makes Json.NET fall back to its default handling for those properties:

public class NoConverter : JsonConverter
{
    public override bool CanConvert(Type objectType)
    {
        throw new NotImplementedException();
    }

    public override object ReadJson(JsonReader reader, Type objectType, object existingValue, JsonSerializer serializer)
    {
        throw new NotImplementedException();
    }

    public override void WriteJson(JsonWriter writer, object value, JsonSerializer serializer)
    {
        throw new NotImplementedException();
    }

    public override bool CanRead => false;
    public override bool CanWrite => false;
}

Usage:

[JsonConverter(typeof(NoConverter))]
public string NewPassword { get; set; }

I may still have missed something; I'm by no means an expert web developer, but it has been an interesting ride... :-)


There are two main schools of thought to protect against XSS attacks.

  • Output encoding
  • Input validation

For output encoding, Server.HtmlEncode(p.message) should do the trick. What you currently have in your example will work, and you don't need the Regex replace if you don't want it: the output encoding will prevent XSS. Here I am assuming you want HTML encoding and not URL encoding or the like.
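
As a rough illustration of what output encoding does, here is a small sketch using System.Net.WebUtility.HtmlEncode, which needs no HttpContext. The helper name RenderComment and the surrounding <p> element are assumptions for the example, not part of your code.

```csharp
using System;
using System.Net;

public static class OutputEncodingDemo
{
    // Hypothetical helper: encodes the message at the point where it is
    // written into HTML, so any markup in it is displayed as text.
    public static string RenderComment(string message)
    {
        return "<p>" + WebUtility.HtmlEncode(message) + "</p>";
    }

    public static void Main()
    {
        Console.WriteLine(RenderComment("<script>alert(1)</script>"));
        // prints: <p>&lt;script&gt;alert(1)&lt;/script&gt;</p>
    }
}
```

The stored value is left untouched; encoding happens only when the value is rendered, so the same data can still be encoded differently for other contexts such as a URL or a JavaScript string.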

It looks like you are using the ASP.NET MVC framework. You could use DataAnnotations to perform whitelist validation (allow only known-safe characters) rather than blacklisting. I would look at using the RegularExpressionAttribute. For example:

public class MyModel
{
   [RegularExpression(@"^[a-zA-Z'\s]{1,400}$", ErrorMessage = "Characters are not allowed.")]
   public string Message { get; set; }
}
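
To try the attribute outside of a controller, a sketch along these lines runs the same DataAnnotations check by hand with Validator.TryValidateObject. The CommentModel and WhitelistValidationDemo names are made up, and the pattern (letters, apostrophes and whitespace only) is just an example whitelist.

```csharp
using System;
using System.Collections.Generic;
using System.ComponentModel.DataAnnotations;

// Hypothetical model: allows only letters, apostrophes and whitespace.
public class CommentModel
{
    [RegularExpression(@"^[a-zA-Z'\s]{1,400}$", ErrorMessage = "Characters are not allowed.")]
    public string Message { get; set; }
}

public static class WhitelistValidationDemo
{
    // Runs the model's validation attributes manually, which is roughly
    // what model binding does before ModelState.IsValid is populated.
    public static bool IsValid(CommentModel model)
    {
        var context = new ValidationContext(model);
        var results = new List<ValidationResult>();
        return Validator.TryValidateObject(model, context, results, validateAllProperties: true);
    }

    public static void Main()
    {
        Console.WriteLine(IsValid(new CommentModel { Message = "Hello world" }));               // True
        Console.WriteLine(IsValid(new CommentModel { Message = "<script>alert(1)</script>" })); // False
    }
}
```

Note that RegularExpressionAttribute treats null and empty strings as valid, so add [Required] as well if the field must be present.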

Hope this helps.


As your code stands right now, a user could just inject JavaScript that doesn't use a script tag.

There is a common list of XSS vulnerabilities that could be used.

Right now you accept a 'string', and all you parse out are HTML tags. Unfortunately, there are a lot of XSS attacks that don't rely on HTML.

For instance, adding the following to a GET Request in Firefox: %22onmouseover=prompt%28%29// will allow the person to inject JavaScript.
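
Here is a small sketch of why that payload works and how encoding neutralizes it. It assumes the value ends up inside a double-quoted HTML attribute; the input element and the method names are made up for the example, and WebUtility stands in for whichever encoder you end up using.

```csharp
using System;
using System.Net;

public static class AttributeInjectionDemo
{
    // Unsafe: the value is dropped into the attribute as-is.
    public static string RenderUnsafe(string query)
    {
        return "<input value=\"" + query + "\">";
    }

    // Safer: the value is HTML-encoded before it is written out.
    public static string RenderEncoded(string query)
    {
        return "<input value=\"" + WebUtility.HtmlEncode(query) + "\">";
    }

    public static void Main()
    {
        // The framework URL-decodes query string values before you see them.
        var decoded = WebUtility.UrlDecode("%22onmouseover=prompt%28%29//");

        Console.WriteLine(decoded);                // "onmouseover=prompt()//
        Console.WriteLine(RenderUnsafe(decoded));  // <input value=""onmouseover=prompt()//">
        Console.WriteLine(RenderEncoded(decoded)); // <input value="&quot;onmouseover=prompt()//">
    }
}
```

In the unsafe version the leading quote closes the value attribute and onmouseover becomes a real event handler, with no HTML tag involved, which is why stripping tags alone does not help.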

Your best bet is to use the AntiXss library from Microsoft, and specifically encode the parameters for GET and POST requests.

(I have to head to work, but I'll post more code later on how to do this).