Friday, September 12, 2014

Input validation for web-applications, how to process input safely and securely - Part 1 of 3

In this blog post, the first in a series of three, I will talk about input validation for web applications. Input validation is a process that takes the input from the source (user, database, text file, et cetera), checks it for faulty or malicious content, and then passes it on to the process that needs the input.

Input validation is not only about security. It is also about building user-friendly applications (showing a message when the data entry does not comply) and keeping data consistent (all data is stored in the same format). For example, you can choose to store all dates in yyyy-mm-dd format in your database. When you do that consistently, you can easily analyze the data in your database and generate statistics from it. When a user of the system enters data in the wrong format, you can either change it automatically (sanitization), or send a message asking the user to enter it in the correct format.
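To make the sanitize-or-reject choice for dates a bit more concrete, here is a minimal sketch. The helper name normalizeDate is hypothetical, and the assumption that users type either yyyy-mm-dd or dd-mm-yyyy is mine, purely for illustration. It returns the stored format, or null so the caller can ask the user to correct the input.

```javascript
// Hypothetical helper: normalize a date string to yyyy-mm-dd, or return null.
// Assumption: users type either yyyy-mm-dd or dd-mm-yyyy.
function normalizeDate(input) {
  if (typeof input !== 'string') return null;
  const value = input.trim();

  let year, month, day;
  let match = /^(\d{4})-(\d{2})-(\d{2})$/.exec(value);
  if (match) {
    [, year, month, day] = match;
  } else {
    match = /^(\d{2})-(\d{2})-(\d{4})$/.exec(value);
    if (!match) return null;
    [, day, month, year] = match;
  }

  // Reject impossible dates (such as 31 February) by round-tripping through Date.
  const y = Number(year), m = Number(month), d = Number(day);
  const date = new Date(Date.UTC(y, m - 1, d));
  const valid = date.getUTCFullYear() === y &&
                date.getUTCMonth() === m - 1 &&
                date.getUTCDate() === d;
  return valid ? `${year}-${month}-${day}` : null;
}

console.log(normalizeDate('12-09-2014')); // sanitized to the stored format
console.log(normalizeDate('2014-02-31')); // null: ask the user to correct it
```

Whether you silently convert or show a message is a use-case decision; the sketch only shows that both paths can come out of the same check.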

This post is part 1 of 3 and is all about the process of input validation. Parts 2 and 3 will contain code examples in PHP and JavaScript for every relevant step in the process.

Part 1 - Input validation process
Part 2 - Input validation coding client-side
Part 3 - Input validation coding server-side

But first, let's take a look at OWASP.

OWASP

OWASP stands for Open Web Application Security Project. Their mission is to improve the security of software. That is a short and fairly simple statement, but a very important one. They also keep track of the most commonly exploited vulnerabilities in (sometimes poorly) written software.

This is the list with the top 10 vulnerabilities in 2013.
  1. Injection
  2. Broken Authentication and Session Management
  3. Cross-Site Scripting (XSS)
  4. Insecure Direct Object References
  5. Security Misconfiguration
  6. Sensitive Data Exposure
  7. Missing Function Level Access Control
  8. Cross-Site Request Forgery (CSRF)
  9. Using Components with Known Vulnerabilities
  10. Unvalidated Redirects and Forwards
Source: https://www.owasp.org/index.php/Top10#OWASP_Top_10_for_2013

From that list the following vulnerabilities can be mitigated with proper input validation.
  1. Injection
  2. Cross-Site Scripting (XSS)
Two of the top three most commonly seen vulnerabilities can be mitigated with proper input validation. That is how important this topic is. Please bear in mind that input validation alone will not make you secure, but it is a big step forward.

The input validation process

I believe that you need to follow a fixed protocol for every input you process in your application. In my experience the following steps need to be done.
  1. Check if the input is actually sent and received
    This check is to prevent any "null" or "not defined" errors when you execute step 2 in this process. If there is no (required) value being sent, the process can and should stop here.
  2. Store input in memory, separate it from the source
    Store the input in memory, for example a variable (no permanent storage!). This is to separate the input you are going to check from the actual source of the input. When you don't do this, the attacker might alter the data later in the process.
  3. Check variable for, and remove all scripting
    In this part we check the variable for scripting. Scripts in input variables can wreak havoc on systems by injecting malicious code.
  4. Trim the variable
    Trimming is the process of removing all leading and trailing whitespace (spaces, tabs, and so on) from a variable. This keeps the data in the database clean and helps to keep the value within its expected size, which is a first step in preventing a buffer overflow. A buffer overflow happens when you try to store a variable in a record that is smaller than the variable, and that overflow can be misused by an attacker.
  5. Truncate the variable to the maximum size of expected value
    This is the second step in preventing a buffer overflow. When all the spaces are gone, you will discard all data in the variable that exceeds the maximum storage of that record and its attributes. This step might be optional when you are processing data in text form, but think about this step thoroughly and carefully.
  6. Check if it is the correct variable type and/or format
    This step checks whether you got the type of variable you are expecting and whether it is in the right format. Do you only expect a numeric value? Or do you expect a string? Also consider date formats, URLs, and email addresses. This is the place to check for them. When the input is incorrect, you can either drop it or convert it (sanitization) to the proper type.
  7. Check if it is expected content (also called white listing)
    This is an important step, and probably the most difficult one to work out in code when handling specific data. For example, in some cases you might expect a URL which always starts with "https://www.google.com/". So once you have received and processed your input, the last check is whether the variable meets your requirements. This is a content-specific check to prevent unauthorized data being sent to your application. It can be different for every type of input in your application.
  8. When relevant, check existence of local resources
    This step is probably only relevant when you are processing input that points to local resources. Think about files a user can upload, or a URL you convert to a local resource. In these cases, always check if the resource (file, database, etc.) exists before actually accessing it. The reason that this step comes at the end is to prevent malicious code being executed when you access the local resources on your web or application servers.
  9. And now it is input for the process
    All checks are done, so it is as safe as it can be to process the input. This can mean storing it in the database or presenting it in the Graphical User Interface of your application. Don't forget to close any connections or resources you opened during the process when you don't need them anymore.
Note: You can either check for invalid input and correct it (sanitizing), or reject it. Which approach is best for your application depends entirely on the use-case. Just keep this in mind.
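Steps 1 to 7 above can be sketched for a single URL field as follows. This is only an illustration under a few assumptions: the function name validateUrlField, the 200-character limit, and the result shape are all hypothetical, and the tag removal in step 3 is deliberately naive (it is not a complete defense against scripting). Step 8 is left out because no local resource is involved here.

```javascript
// Sketch of steps 1-7 for one URL field. All names and limits are hypothetical.
const MAX_LENGTH = 200;                            // assumed maximum storage size
const ALLOWED_PREFIX = 'https://www.google.com/';  // expected content (step 7)

function validateUrlField(source, fieldName) {
  // Step 1: check that the input was actually sent and received.
  if (source == null || typeof source[fieldName] !== 'string') {
    return { ok: false, reason: 'missing' };
  }

  // Step 2: copy the input into a local variable; the source is not read again.
  let value = source[fieldName];

  // Step 3: naive removal of anything that looks like a tag (illustration only).
  value = value.replace(/<[^>]*>/g, '');

  // Step 4: trim leading and trailing whitespace.
  value = value.trim();

  // Step 5: truncate to the maximum expected size.
  value = value.slice(0, MAX_LENGTH);

  // Step 6: check the format; the URL constructor throws on invalid URLs.
  let parsed;
  try {
    parsed = new URL(value);
  } catch {
    return { ok: false, reason: 'format' };
  }

  // Step 7: check the normalized URL against the expected content.
  if (!parsed.href.startsWith(ALLOWED_PREFIX)) {
    return { ok: false, reason: 'not allowed' };
  }

  // Step 9: the value is now input for the process.
  return { ok: true, value: parsed.href };
}

console.log(validateUrlField({ url: '  https://www.google.com/search?q=test  ' }, 'url'));
console.log(validateUrlField({ url: 'https://evil.example/' }, 'url'));
```

This sketch sanitizes in steps 3 to 5 and rejects in steps 1, 6 and 7. Depending on the use-case you could just as well reject input that contains tags or is too long instead of correcting it.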

When and where?

When does this need to be done? Well, that is simple. This process needs to be executed every time you receive input. Every time. Never ever trust input and never ever ever trust user-generated input. I cannot emphasize this enough :). It may sound obvious, but again, think about the top 3 OWASP vulnerabilities.

You can do input validation on multiple levels: server-side, but also client-side. For example, when you have an online form that users fill in and send to your application, you can and should validate every field client-side as the user fills it in. When the data is submitted, you validate it again server-side.

This improves the user experience in your application, and it contributes to the layered-defense principle. If the client-side defense layer fails (because the attacker is circumventing the form or regular ways of submitting input), you will have your second layer of defense ready, and that is the server-side input validation.
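A minimal sketch of the two layers, assuming a hypothetical username rule (3 to 20 letters, digits or underscores) and hypothetical function names. The point is only that the server repeats the check and never assumes the client-side check has run.

```javascript
// Hypothetical shared rule: 3-20 letters, digits or underscores.
function isValidUsername(value) {
  return typeof value === 'string' && /^[A-Za-z0-9_]{3,20}$/.test(value);
}

// Client side: give feedback while the user fills in the form.
// Returns an error message, or an empty string when the value is fine.
function onUsernameInput(inputValue) {
  return isValidUsername(inputValue)
    ? ''
    : 'Use 3-20 letters, digits or underscores.';
}

// Server side: validate again, because an attacker can bypass the form.
function handleSubmit(body) {
  if (!body || !isValidUsername(body.username)) {
    return { status: 400, message: 'Invalid username' };
  }
  return { status: 200, message: 'Accepted' };
}

console.log(onUsernameInput('al'));
console.log(handleSubmit({ username: "x'; DROP TABLE users; --" }).status);
```

In a real application the client-side function would be wired to a form event and the server-side function to a request handler; both are reduced to plain functions here so the sketch runs on its own.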

To be continued...

That's about it for the input validation process itself. In parts 2 and 3 of this series I will go further into the actual coding, both client-side (JavaScript) and server-side (PHP).

Thank you for reading my blog!