Is filtering of user input data enough, or should it be parsed?

I agree with @Jorn's answer about the validation.

However, you're still forgetting a very important step here: output encoding.
That means HTML encoding (or attribute encoding, JavaScript encoding, etc., depending on the output context) before outputting anything. In fact, this is arguably even more important than the input validation (arguably, not absolutely, and definitely not in all situations...).
In any event, it shouldn't be either/or; it's definitely both: strict input validation + output encoding.
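To make the output encoding step concrete, here is a minimal sketch in Python, assuming the value ends up in an HTML body or a quoted attribute. The function names `render_comment` and `render_input` are made up for illustration; `html.escape` is from the standard library. Other contexts (JavaScript, URLs, CSS) need their own encoders.

```python
import html


def render_comment(comment: str) -> str:
    # HTML body context: escape &, <, > and (with quote=True) both quote characters.
    return "<p>" + html.escape(comment, quote=True) + "</p>"


def render_input(value: str) -> str:
    # Attribute context: the attribute value must always be quoted,
    # otherwise escaping the quote characters is not enough.
    return '<input type="text" value="' + html.escape(value, quote=True) + '">'


if __name__ == "__main__":
    print(render_comment("<script>alert(1)</script>"))
    print(render_input('" onmouseover="alert(1)'))
```

The point is that the encoding happens at output time, right where the value is written into the page, regardless of what validation was done on the way in.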

Now, if you are referring to including "safe" HTML tags in the output (that's not very clear from your question), then you should still encode everything, and then decode only the specific tags you're looking for, without any of the tags' attributes.
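A rough sketch of that "encode everything, then decode a few tags" idea, in Python. The tag list and the function name `encode_then_restore` are my own assumptions, not from any library. Only bare tags are restored, so anything carrying attributes stays encoded.

```python
import html
import re

# Hypothetical whitelist of tags to allow through; adjust to your needs.
ALLOWED_TAGS = ("b", "i", "em", "strong")

# Matches an *encoded* bare tag such as &lt;b&gt; or &lt;/b&gt;, nothing else.
_ENCODED_TAG = re.compile(r"&lt;(/?)(%s)&gt;" % "|".join(ALLOWED_TAGS))


def encode_then_restore(text: str) -> str:
    # Step 1: encode everything, no exceptions.
    encoded = html.escape(text, quote=True)
    # Step 2: decode only whitelisted tags that have no attributes.
    return _ENCODED_TAG.sub(r"<\1\2>", encoded)


if __name__ == "__main__":
    print(encode_then_restore("<b>bold</b><script>x</script>"))
    print(encode_then_restore('<b onclick="x">hi</b>'))
```

Note that this sketch does not check that the restored tags are balanced: in the second example the opening tag stays encoded while the closing `</b>` is restored. That is harmless from an XSS point of view, but you may want to handle it for well-formed markup.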

P.S. If you're referring to a .NET app, MS's AntiXSS library (in WPL, the Web Protection Library) provides a .GetSafeHTMLXXX set of methods.


For input validation, I recommend a whitelist approach combined with pass-or-reject. So define what is valid, and accept only valid input, reject everything else.
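As an illustration of whitelist plus pass-or-reject, here is a small Python sketch. The username rule (3 to 20 letters, digits or underscores) and the name `validate_username` are assumptions chosen just for the example.

```python
import re

# Hypothetical rule: define exactly what a valid username looks like.
USERNAME_RE = re.compile(r"[A-Za-z0-9_]{3,20}")


def validate_username(raw: str) -> str:
    # fullmatch means the whole string must fit the pattern, not just a part of it.
    if USERNAME_RE.fullmatch(raw) is None:
        # Reject outright; no attempt to strip or repair the bad input.
        raise ValueError("invalid username")
    return raw


if __name__ == "__main__":
    print(validate_username("alice_01"))
    try:
        validate_username("<script>")
    except ValueError as exc:
        print("rejected:", exc)
```

The input is either accepted unchanged or rejected. There is no clean-up step that could get things subtly wrong.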

If you build a rich text editor that sends HTML to your server, you can use JavaScript on the client to sanitize the input, so that pasting HTML from Word can still end up working. The server, however, still won't accept any unsanitized input.

Trying to clean up input is error-prone, as that is usually much more complex than just figuring out whether the input is valid or not.

Whitelisting is essential to good security. You can always decide to accept more input later, while staying in control. With a blacklist approach, you'll always be catching up to the latest hack.

For output, see other answers.


Your question, "User input data, is filtering enough or should it be parsed?", is more general than the XSS case. I guess you wanted to be specific to XSS, but user input can usually lead to many other exploits as well, such as SQL injection and path traversal. Since your question gets more specific about XSS, I would refer to what has already been said and go for input filtering and output encoding (always server side!!) with some known countermeasures like those provided by OWASP.

Still, I would like to point out that the best way to prevent SQL injection caused by unchecked input is to use parameterized queries (see the OWASP SQL Injection Prevention Cheat Sheet). For path traversal vulnerabilities, my approach would be some form of whitelisting.
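Both ideas can be sketched in a few lines of Python using the standard `sqlite3` module. The table layout, the `ALLOWED_REPORTS` mapping and all function names here are hypothetical, just enough to show the shape of the approach.

```python
import sqlite3


def make_demo_db() -> sqlite3.Connection:
    # In-memory database with one row, only for demonstration.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute("INSERT INTO users (name) VALUES (?)", ("alice",))
    return conn


def find_user(conn: sqlite3.Connection, name: str) -> list:
    # The ? placeholder keeps the value out of the SQL text entirely,
    # so quotes in `name` are treated as data, never as SQL.
    cur = conn.execute("SELECT id, name FROM users WHERE name = ?", (name,))
    return cur.fetchall()


# Path traversal: map user-supplied keys to known file names
# instead of building a path from the raw input.
ALLOWED_REPORTS = {"summary": "summary.pdf", "details": "details.pdf"}


def resolve_report(key: str) -> str:
    if key not in ALLOWED_REPORTS:
        raise ValueError("unknown report")
    return ALLOWED_REPORTS[key]


if __name__ == "__main__":
    conn = make_demo_db()
    print(find_user(conn, "alice"))
    print(find_user(conn, "alice' OR '1'='1"))
    print(resolve_report("summary"))
```

With the whitelist mapping, an input like `../../etc/passwd` never reaches the file system at all, because it is simply not one of the accepted keys.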