Difference between validating and sanitizing inputs in an Express.JS app with a Hapi.JS Joi module?

What's the difference between validation and sanitization?

Sanitization

Sanitizing inputs means cleaning or escaping input before storing it in a database or using it for any other purpose, in order to prevent malicious code injection.

A basic example is SQL injection, which you have to take into account whenever you store or verify data. Suppose you are checking login credentials submitted by a user against your database. Your query might be something like

SELECT * FROM `users` WHERE `username`='$user' AND `pass`='$pass'

where $user and $pass are the username and password that the user enters.

If you do not sanitize user input and the user enters something like this:

username -> admin' AND 1=1 OR 1='1
password -> pass

Your query would become:

SELECT * FROM `users` WHERE `username`='admin' AND 1=1 OR 1='1' AND `pass`='pass'

which, on execution, selects the admin row and logs the user in as admin.

But if you are sanitizing user input, your query would be:

SELECT * FROM `users` WHERE `username`='admin\' AND 1=1 OR 1=\'1' AND `pass`='pass'

which will not give the user access to any account unless the username and password match a database entry.
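To make the two queries above concrete, here is a minimal JavaScript sketch that builds both strings. The helper names escapeQuotes and buildLoginQuery are made up for this illustration and are not part of any library. Hand-rolled escaping like this is only meant to show the effect; in a real application, the parameterized queries (placeholders) offered by your database driver are generally the safer choice.

```javascript
// Hypothetical helper: backslash-escapes single quotes and backslashes.
// For illustration only; prefer your driver's parameterized queries.
function escapeQuotes(input) {
  return String(input).replace(/[\\']/g, (ch) => '\\' + ch);
}

// Hypothetical helper: builds the login query by string concatenation,
// optionally escaping the user-supplied values first.
function buildLoginQuery(user, pass, sanitize) {
  const u = sanitize ? escapeQuotes(user) : user;
  const p = sanitize ? escapeQuotes(pass) : pass;
  return "SELECT * FROM `users` WHERE `username`='" + u + "' AND `pass`='" + p + "'";
}

const maliciousUser = "admin' AND 1=1 OR 1='1";

// Without sanitizing, the quote in the input closes the string early.
const unsafeQuery = buildLoginQuery(maliciousUser, 'pass', false);

// With escaping, the whole input stays inside the string literal.
const safeQuery = buildLoginQuery(maliciousUser, 'pass', true);

console.log(unsafeQuery);
console.log(safeQuery);
```

The first line printed matches the injected query shown earlier, and the second matches the escaped one.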

Validation

Validation is checking that incoming data meets the rules you expect (type, format, length, allowed values) before you use it. It helps reject data that is missing, malformed, or otherwise unexpected.

For example, if you take the mobile platform as an argument, you only want to allow Android or iOS as a value; any other value is not valid. Likewise, if some critical input is required from the user and cannot be empty, checking that is also validation.

But if the user gives ANDROID & IOS as input, sanitization will turn that into ANDROID &amp; IOS, which will not allow the user to break the code and logic.
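As a rough sketch of the platform check described above, validation against an allowlist could look like the following. The names ALLOWED_PLATFORMS and isValidPlatform are hypothetical, and treating the value as case-insensitive is an assumption made for this example.

```javascript
// Allowlist of accepted platform names (taken from the example above).
const ALLOWED_PLATFORMS = ['ANDROID', 'IOS'];

// Returns true only when the value is a string that matches one allowed
// platform exactly. Case-insensitive matching is an assumption here.
function isValidPlatform(value) {
  return typeof value === 'string' &&
    ALLOWED_PLATFORMS.includes(value.toUpperCase());
}

console.log(isValidPlatform('android'));       // true
console.log(isValidPlatform('ANDROID & IOS')); // false
console.log(isValidPlatform(''));              // false
console.log(isValidPlatform(undefined));       // false
```

With a strict allowlist like this, a value such as ANDROID & IOS is rejected at the validation step, so it never reaches the code that uses the platform.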

Should I sanitize inputs for an Express API?

Yes, you should always sanitize data. If you expose it through a REST API, a user can insert malicious data into the input of the mobile app. It is better to be ready for all the edge cases, because a user can do anything.

How can I sanitize inputs with Joi or some other Express compatible library?

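With Joi you can pass additional options when validating, such as escapeHtml:

validate(value, schema, {escapeHtml: true}, [callback])

Note, however, that as far as I can tell from the Joi documentation, escapeHtml escapes special characters in the error messages Joi generates, not in the validated value itself, so on its own it does not sanitize the input.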


Sanitising is for preventing malicious code from being injected or executed.

An example of XSS sanitising: <script>alert(1)</script>
is changed to &lt;script&gt;alert(1)&lt;/script&gt; so that it is displayed in the browser as text and not executed.
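A minimal sketch of that kind of HTML escaping in plain JavaScript is shown below. The function name escapeHtml and the exact set of escaped characters are choices made for this example; a maintained library is usually a better option in production.

```javascript
// Map of characters that are special in HTML to their entity equivalents.
const HTML_ENTITIES = {
  '&': '&amp;',
  '<': '&lt;',
  '>': '&gt;',
  '"': '&quot;',
  "'": '&#39;'
};

// Hypothetical helper: replaces each special character with its entity so
// the browser renders the text instead of interpreting it as markup.
function escapeHtml(text) {
  return String(text).replace(/[&<>"']/g, (ch) => HTML_ENTITIES[ch]);
}

console.log(escapeHtml('<script>alert(1)</script>'));
// &lt;script&gt;alert(1)&lt;/script&gt;
```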

Validation, on the other hand, is for general checks, like whether an input is a valid email address, phone number, etc.

An example of email validation:
the length should be greater than 5, an @ should be present, a . should be present after the @, etc.
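The rules listed above can be sketched as a small function. The name looksLikeEmail is hypothetical, and the extra conditions (the @ is not the first character, and the . is neither right after the @ nor the last character) are assumptions added to make the check slightly less permissive. This is a rough plausibility check, not full email address validation.

```javascript
// Hypothetical helper implementing the simple rules from the text:
// length > 5, an '@' is present, and a '.' appears after the '@'.
function looksLikeEmail(value) {
  if (typeof value !== 'string' || value.length <= 5) {
    return false;
  }
  const at = value.indexOf('@');
  if (at < 1) {
    return false; // no '@', or '@' is the first character (assumption)
  }
  // Look for a '.' with at least one character between it and the '@'.
  const dot = value.indexOf('.', at + 2);
  // The '.' must exist and must not be the last character (assumption).
  return dot !== -1 && dot < value.length - 1;
}

console.log(looksLikeEmail('user@example.com')); // true
console.log(looksLikeEmail('a@b.c'));            // false (length is not > 5)
console.log(looksLikeEmail('userexample.com'));  // false (no '@')
console.log(looksLikeEmail('user@examplecom'));  // false (no '.' after '@')
```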

Update for question 2:

It is a really good practice to sanitise all input from the user.
A great rule to remember is to never trust data from the user.


What's the difference between validation and sanitization?

Validation is verifying that the data being submitted meets a rule or set of rules defined by the developer for a particular input field.

// checks that the value is a number and is >= 99 (validation fails here, because 22 < 99)
Joi.validate(22, Joi.number().min(99));

Validation prevents unexpected or bad data entry.

Sanitization only cares about making sure the data being submitted doesn't contain any code, for example by changing all single quotation marks in a string to double quotation marks, or by changing < to &lt;.

Sanitization prevents malicious code injection or execution.

Should I sanitize inputs for an Express API?

Yes, you should.

I'm trying to understand if I should validate as well as sanitize.

Yes, you should validate as well as sanitize your data, as combining these two techniques gives your application defense in depth. One more thing: validation should always happen before sanitization.
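As an illustration of validating first and sanitizing second, here is a minimal sketch of an Express-style middleware function with the usual (req, res, next) signature. The field names platform and comment, the function names, and the error response shape are all assumptions made for this example, and it expects req.body to have been parsed already (for instance by a JSON body parser).

```javascript
const ALLOWED_PLATFORMS = ['ANDROID', 'IOS'];

// Hypothetical helper: escapes characters that are special in HTML.
function escapeHtml(text) {
  const entities = { '&': '&amp;', '<': '&lt;', '>': '&gt;', '"': '&quot;', "'": '&#39;' };
  return String(text).replace(/[&<>"']/g, (ch) => entities[ch]);
}

// Hypothetical middleware: rejects invalid input, then sanitizes what is left.
function checkComment(req, res, next) {
  const { platform, comment } = req.body || {};

  // 1. Validate: reject anything that does not meet the rules.
  const valid = ALLOWED_PLATFORMS.includes(platform) &&
    typeof comment === 'string' &&
    comment.length > 0;
  if (!valid) {
    return res.status(400).json({ error: 'Invalid input' });
  }

  // 2. Sanitize: escape the free-text field before later handlers use it.
  req.body.comment = escapeHtml(comment);
  next();
}
```

In an Express app, a function like this would be placed before the route handler, so the handler only ever sees input that has passed both steps.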

How can I sanitize inputs with Joi or some other Express compatible library?

Joi is a validation library. It is perfect for validating data. But for sanitization I'd rather go with something like string.js, for methods like escapeHTML(), and the xss-filters module for XSS sanitization.