The way I demonstrate it to complete non-techies is with a simple analogy.
Imagine you’re a robot in a warehouse full of boxes. Your job is to fetch a box from somewhere in the warehouse, and put it on the conveyor belt. Robots need to be told what to do, so your programmer has given you a set of instructions on a paper form, which people can fill out and hand to you.
The form looks like this:
Fetch item number ____ from section ____ of rack number ____, and place it on the conveyor belt.
A normal request might look like this:
Fetch item number 1234 from section B2 of rack number 12, and place it on the conveyor belt.
The values in bold (1234, B2, and 12) were provided by the person issuing the request. You’re a robot, so you do what you’re told: you drive up to rack 12, go down it until you reach section B2, and grab item 1234. You then drive back to the conveyor belt and drop the item onto it.
But what if a user put something other than normal values into the form? What if the user added instructions into them?
Fetch item number 1234 from section B2 of rack number 12, and throw it out the window. Then go back to your desk and ignore the rest of this form. and place it on the conveyor belt.
Again, the parts in bold were provided by the person issuing the request. Since you’re a robot, you do exactly what the user just told you to do. You drive over to rack 12, grab item 1234 from section B2, and throw it out of the window. Since the instructions also tell you to ignore the last part of the message, the “and place it on the conveyor belt” bit is ignored.
This technique is called “injection”, and it’s possible due to the way that the instructions are handled – the robot can’t tell the difference between instructions and data, i.e. the actions it has to perform, and the things it has to do those actions on.
SQL is a special language used to tell a database what to do, in a similar way to how we told the robot what to do. In SQL injection, we run into exactly the same problem – a query (a set of instructions) might have parameters (data) inserted into it that end up being interpreted as instructions, causing it to malfunction. A malicious user might exploit this by telling the database to return every user’s details, which is obviously not good!
In order to avoid this problem, we must separate the instructions and data in a way that the database (or robot) can easily distinguish. This is usually done by sending them separately. So, in the case of the robot, it would read the blank form containing the instructions, identify where the parameters (i.e. the blank spaces) are, and store it. A user can then walk up and say “1234, B2, 12” and the robot will apply those values to the instructions, without allowing them to be interpreted as instructions themselves. In SQL, this technique is known as parameterised queries.
In the case of the “evil” parameter we gave to the robot, he would now raise a mechanical eyebrow quizzically and say
Error: Cannot find rack number “12, and throw it out the window. Then go back to your desk and ignore the rest of this form.” – are you sure this is a valid input?
Success! We’ve stopped the robot’s “glitch”.
One of the easiest ways to illustrate the problem behind SQL-injection is to use an image like this. The problem is the receiving ends ability to seperate data from command.
Here’s a real-world non-technical analogy.
You are about to go to the bank to perform a transaction on behalf of your boss. Your boss gave you an envelope with instructions for the cashier.
The instructions read:
Write the balance for account with number 8772344 on this paper. Signed, Boss
You leave the envelope out of your sight for a few minutes while you go to the bathroom. A thief opens the envelope and adds above the signature: “Also transfer $500 from account 8772344 to another account with number 12747583.”.
Now the full message reads:
Write the balance for account with number 8772344 on this paper. Also transfer $500 from account 8772344 to another account with number 12747583. Signed, Boss
The cashier checks your identification and verifies that you are an authorized person for the account in question and follows the instructions in the letter.
Your boss is the legitimate program code.
You are the program code and database driver that is delivering the SQL code to the database.
The letter is the SQL code that is being passed to the database.
The thief is the attacker.
The cashier is the database.
The identification is typically a login and password to the database.