How to prevent JavaScript injection attacks within user-generated HTML

Currently the best option is to use a Content Security Policy header like this:

Content-Security-Policy: default-src 'self';

This blocks inline scripts and styles, and it blocks scripts, styles, images, etc. loaded from other origins, so the browser will only load and execute resources served from the same origin as the page.

However, older browsers that do not support CSP ignore the header, so it cannot be your only line of defense.
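To illustrate, here is a minimal sketch of sending the header above with every response. It assumes a plain WSGI app built on Python's standard library; the names `app` and `CSP_VALUE` are made up for this example, and a real site would normally set the header in its web server or framework configuration instead.

```python
from wsgiref.simple_server import make_server

# The policy from the answer above: only same-origin resources are allowed.
CSP_VALUE = "default-src 'self'"


def app(environ, start_response):
    """Hypothetical WSGI app that attaches the CSP header to its response."""
    body = b"<!doctype html><p>User-generated content goes here.</p>"
    headers = [
        ("Content-Type", "text/html; charset=utf-8"),
        ("Content-Length", str(len(body))),
        ("Content-Security-Policy", CSP_VALUE),
    ]
    start_response("200 OK", headers)
    return [body]


def serve(host="127.0.0.1", port=8000):
    """Run a local development server (not meant for production use)."""
    with make_server(host, port, app) as server:
        server.serve_forever()
```

Calling `serve()` starts a local server whose responses carry the policy, so an injected inline `<script>` would not be executed by a browser that supports CSP.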

The only really safe way to go is to use a whitelist. HTML-encode everything, then convert only the allowed tags back.
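A rough sketch of the "encode everything, then convert the allowed codes back" idea might look like the following. The tag list and the function name `sanitize_by_reencoding` are assumptions for illustration, and only bare tags without attributes are restored.

```python
import html
import re

# Assumed whitelist for this sketch; adjust it to what your site permits.
ALLOWED_TAGS = ("b", "i", "em", "strong", "p", "br", "code", "pre")

# Matches an encoded opening or closing tag with no attributes, e.g. "&lt;/b&gt;".
_ENCODED_TAG_RE = re.compile(
    r"&lt;(/?)(%s)&gt;" % "|".join(re.escape(tag) for tag in ALLOWED_TAGS)
)


def sanitize_by_reencoding(user_html):
    """Encode all markup, then turn only whitelisted bare tags back into HTML."""
    # Step 1: encode everything, so no markup survives at all.
    escaped = html.escape(user_html, quote=True)
    # Step 2: convert the exact whitelisted patterns back into real tags.
    return _ENCODED_TAG_RE.sub(r"<\1\2>", escaped)
```

With this, `<b>hi</b><script>alert(1)</script>` keeps the bold tags while the script tags stay encoded as visible text. A tag carrying attributes, such as `<b onclick="x">`, is not restored at all, which is the safe default here, although its closing tag still is, so the output may contain unbalanced tags.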

I have seen rather advanced attempts to disallow only the dangerous code, and they still don't work well. It's quite a feat to safely catch everything that anyone can think of, and such filters tend to mangle things that aren't dangerous at all.

You think that's it? Check this out.

Whatever approach you take, you definitely need to use a whitelist. It's the only way to even come close to being safe about what you're allowing on your site.

EDIT:

I'm not familiar with .NET, unfortunately, but you can check out Stack Overflow's own battle with XSS (https://blog.stackoverflow.com/2008/06/safe-html-and-xss/) and the code that was written to parse HTML posted on this site: Archive.org link. Obviously you might need to change it because your whitelist is bigger, but that should get you started.

A whitelist for elements and attributes is the only acceptable choice in my opinion. Anything not on your whitelist should be stripped out or encoded (change <>&" to entities). Also be sure to check the values within the attributes you allow.
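As a sketch of a whitelist for elements and attributes that also checks attribute values, something like this could serve as a starting point. It uses Python's standard `html.parser`; the whitelist contents, the allowed URL schemes and all names below are assumptions for illustration. For production use, a maintained sanitizer library is probably a better choice than hand-rolled code.

```python
from html import escape
from html.parser import HTMLParser
from urllib.parse import urlsplit

# Assumed whitelist: element name -> attributes allowed on that element.
ALLOWED = {
    "a": {"href", "title"},
    "b": set(), "i": set(), "em": set(), "strong": set(),
    "p": set(), "br": set(), "code": set(), "pre": set(),
}
URL_ATTRS = {"href"}
SAFE_SCHEMES = {"http", "https", "mailto"}  # relative URLs are rejected too
VOID_TAGS = {"br"}


def _is_safe_url(value):
    """Accept only URLs with an explicitly whitelisted scheme."""
    # Drop whitespace and control characters that could hide a scheme.
    cleaned = "".join(ch for ch in value if ch > " ")
    try:
        return urlsplit(cleaned).scheme.lower() in SAFE_SCHEMES
    except ValueError:
        return False


class WhitelistSanitizer(HTMLParser):
    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []

    def handle_starttag(self, tag, attrs):
        if tag not in ALLOWED:
            return  # strip the tag itself; its text is still encoded below
        kept = []
        for name, value in attrs:
            if name not in ALLOWED[tag] or value is None:
                continue
            if name in URL_ATTRS and not _is_safe_url(value):
                continue
            kept.append(' %s="%s"' % (name, escape(value, quote=True)))
        self.out.append("<%s%s>" % (tag, "".join(kept)))

    def handle_endtag(self, tag):
        if tag in ALLOWED and tag not in VOID_TAGS:
            self.out.append("</%s>" % tag)

    def handle_data(self, data):
        # All text content is re-encoded, so it can never become markup.
        self.out.append(escape(data, quote=False))


def sanitize_with_whitelist(user_html):
    parser = WhitelistSanitizer()
    parser.feed(user_html)
    parser.close()
    return "".join(parser.out)
```

Here a link with `href="javascript:alert(1)"` and an `onclick` handler comes out as a bare `<a>`, because the event attribute is not on the whitelist and the URL scheme fails the value check. Comments, doctype declarations and unknown elements are dropped, while their text content is kept in encoded form.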

Anything less and you are opening yourself up to problems - known exploits or those that will be discovered in the future.


