Ben's Web Programming Pages

Please note that these pages date from 2003 and are near prehistoric in internet terms. It was good stuff when it was written, but old hat now.

These pages are not maintained and I no longer deal with queries about them. They remain here for historical interest.

Website security

Assuming that your service provider is taking care of the basic security of the web server and the database server, there are still a number of vulnerabilities which it is the web programmer's responsibility to protect against. Among the possible attacks on your website are cross-site scripting, SQL injection and unauthorised command execution, each of which is covered below.

The key to protecting against each of these attacks is the same: diligent input validation.

 

Input validation

Trust no one. It is vital to check any user input to your scripts to make sure it matches what you expect to see. If you expect an integer value, check that you receive an integer value. Check that input strings don't contain invalid characters. Remember that any user can type whatever parameters they like into the URL for an HTTP GET, or can use simple tools to fiddle with the parameters for an HTTP POST. Even cookie data can be edited. All user input is potentially malicious! The validation job is made easier if you allow as little arbitrary input as possible.
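The integer and character checks just described might be sketched as follows. This assumes a PHP version that provides filter_var() (5.2 or later, so newer than these pages); the function names, the range limits and the permitted character set are all invented for the illustration.

```php
<?php
// Hypothetical helpers for illustration; names and limits are assumptions.

// Return the value as an int if it is a whole number within range, else null.
function valid_int($value, $min, $max) {
    $result = filter_var($value, FILTER_VALIDATE_INT,
        array('options' => array('min_range' => $min, 'max_range' => $max)));
    return ($result === false) ? null : $result;
}

// Accept only short strings made of letters, digits, underscore and hyphen.
function valid_token($value) {
    return is_string($value)
        && preg_match('/\A[A-Za-z0-9_-]{1,32}\z/', $value) === 1;
}
```

Anything that fails either check can simply be replaced by a default value or rejected with an error page.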

For example, the M'Cheyne calendar is a finite state program in that only a limited number of input choices is valid for each parameter. So I offer the user a choice from select boxes and check boxes and explicitly reject any input that does not match a valid choice. Invalid input could occur if a malicious user had edited the GET parameters or generated spurious POST or cookie parameters.

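$bibles = array (
    'niv'  => 'NIV',
    'esv'  => 'ESV',
    // ...etc...
);
$bible_default = 'niv';

// Read the parameter explicitly rather than relying on register_globals.
$bible = isset($_GET['bible']) ? $_GET['bible'] : null;
if (!valid($bible,$bibles)) { $bible = $bible_default; }

// True only if $value is a string and one of the permitted keys.
function valid($value,$values) {
    return is_string($value) && array_key_exists($value,$values);
}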

There is no way that malicious input can get round this kind of checking. Another example is the GIF server, which uses a regular expression to check the input string (this is Perl):

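# In list context a failed match returns an empty list, so the variables
# stay undefined rather than keeping captures from an earlier match.
my ($type, $rgb) = $ENV{REQUEST_URI} =~ m:.*/(..?)-([0-9a-f]{6})\.gif$:;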

Unless the expression matches, $type and $rgb will not be set, and an image will not be generated.

If you must allow arbitrary input to your scripts then see below for suggestions for sanitising it.

 

Cross-site scripting

Imagine that an attacker entices a victim to click on a malicious URL: perhaps he embedded it in an email, perhaps he posted it on Usenet, perhaps it's on his website. That URL points to a page on your website which has a cross-site scripting (XSS) vulnerability. Because of your vulnerability the malicious URL causes your server to send the victim some malicious JavaScript, or VBScript or ActiveX, which runs on his machine. The attacker then uses this to steal the victim's cookies for your website, to hijack his account at your site or even potentially to exploit security flaws on his machine.

Obviously you want to avoid this kind of thing. XSS vulnerabilities occur when information encoded in the URL for a page is returned to the user embedded in the web page he requests. So, for example, if your program just outputs all the HTTP GET parameters it receives as a web page then it is vulnerable to XSS. There is nothing to stop the attacker embedding JavaScript or other code in the GET parameters. For more on this see the Cross-Site Scripting FAQ.

The principal solution, as ever, is paranoid input validation. In addition it's important to consider output validation too: even legitimate input may not be safe to output raw to a web page.

So, in PHP always htmlspecialchars() any input data before outputting it back to the user. Error messages are a particular potential source of problems.
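A minimal sketch of what this looks like for an error message that echoes back what the user typed. The wrapper name h(), the error_message() function and the UTF-8 character set are assumptions made for the example.

```php
<?php
// Hypothetical wrapper; the name h() and the UTF-8 charset are assumptions.
function h($text) {
    return htmlspecialchars((string) $text, ENT_QUOTES, 'UTF-8');
}

// An error message that repeats the offending input back to the user.
function error_message($user_input) {
    return '<p class="error">Unknown option: ' . h($user_input) . '</p>';
}

echo error_message('<script>alert(1)</script>'), "\n";
// prints: <p class="error">Unknown option: &lt;script&gt;alert(1)&lt;/script&gt;</p>
```

The browser then displays the script text harmlessly instead of running it.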

In some applications I have written I have taken care to store any user-supplied data in the database already htmlspecialchars()d. This means I can later print it anyhow I like without worrying about XSS, which makes the coding less error prone.

In some cases, though, you may not know whether your data have already been htmlspecialchars()d or not. For example, I have a page which displays informational messages for one application. In the normal course of events it is invoked by a URL containing already htmlspecialchars()d data, so it would be tempting not to bother re-encoding the text before printing it. An attacker, however, would be at liberty to supply any malicious input, which would then be printed without translation. This is BAD.

My PHP solution is as follows

echo htmlsafe($GET_text);

function htmlsafe($string) {

    // Legitimate input strings should already be htmlspecialchars()d,
    // so this function will do nothing.  But we must make sure in
    // case of malicious input.

    return htmlspecialchars(
             str_replace(
               array('&lt;','&gt;','&quot;','&#039;','&amp;'),
               array(   '<',   '>',     '"',     "'",    '&'), $string),
             ENT_QUOTES);

}
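
Later PHP releases (5.2.3 onwards, so after these pages were written) added a fourth argument to htmlspecialchars() that has much the same effect. The following is an alternative sketch rather than the function above; the function name is made up and UTF-8 text is assumed.

```php
<?php
// Alternative sketch: the name is hypothetical and UTF-8 is an assumption.
function htmlsafe_noreencode($string) {
    // Fourth argument false = leave existing entities such as &lt; alone,
    // but still encode any raw <, >, quotes and stray ampersands.
    return htmlspecialchars($string, ENT_QUOTES, 'UTF-8', false);
}

echo htmlsafe_noreencode('&lt;b&gt; and <i>'), "\n";
// prints: &lt;b&gt; and &lt;i&gt;
```

One difference is that other existing entities (for example &copy;) are also left alone, whereas the str_replace() version re-encodes their ampersand.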
 

SQL security

SQL injection

If you are not careful with your SQL queries it might be possible for a malicious user to run arbitrary queries on your database, including modifying data and dropping tables. They can do this by sending malicious data to your script which, if your script is vulnerable, gets included in your SQL queries. These data can so corrupt your query strings that it may be possible for the attacker to execute arbitrary operations on your database.

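As mentioned above, I often store data in MySQL already htmlspecialchars()d. The ENT_QUOTES argument to htmlspecialchars() is useful as it forces the single quote character to be encoded as "&#039;", so the data cannot close a quoted string in the query. Note, however, that htmlspecialchars() leaves backslashes alone, so a trailing backslash could still escape the closing quote. It is therefore not a complete defence by itself, and the data should be escaped for SQL as well.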

Other parameters need to be checked for validity. For example, I generally define the first column of my SQL tables to be an auto-incrementing integer index. This can greatly simplify handling the tables. Use indexes to refer to table rows, and before operating on any row check it exists (this code uses the database class on my MySQL page):

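$nums = array();
$db->query("SELECT idx FROM $table");
while ($row = $db->get_row()) {
    array_push($nums, $row['idx']);
}

// in_array() rather than array_search(): the latter returns key 0,
// which tests as false, when the match is the first row.
if (!in_array((string) $idx, array_map('strval', $nums), true)) { die("Error!"); }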

Note that $idx does not appear in the query, and the query and array building typically need to be done only once per web page. This should be pretty fast, especially considering that the database connection will be open already. If you get an error here you can be sure someone is attempting an attack. (Or your code is buggy.)

If you choose not to htmlspecialchars() your data before entering it into the database, you should at least do an addslashes() to it. This precedes quotes, backslashes and NUL bytes with a backslash, which makes an arbitrary string safe for including in a quoted string in an SQL query. The SQL parser consumes the added backslashes, so the stored value is the original string and normally needs no stripslashes() on extraction (unless magic_quotes_runtime is switched on).
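
Escaping is not the only approach. A different technique from the one described here is the parameterised query, where the data never become part of the SQL string at all. The following is a minimal sketch, assuming PHP's PDO extension and its SQLite driver are available. An in-memory SQLite database stands in for MySQL so that the example is self-contained, and the table and column names are invented.

```php
<?php
// Illustrative only: SQLite in memory stands in for MySQL, and the
// table and column names are made up for the example.
$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE notes (idx INTEGER PRIMARY KEY, body TEXT)');

// The ? placeholders keep the data separate from the SQL text.
$insert = $pdo->prepare('INSERT INTO notes (body) VALUES (?)');
$hostile = "x'); DROP TABLE notes; --";
$insert->execute(array($hostile));

// Read the row back: the hostile string was stored as plain data.
$select = $pdo->prepare('SELECT body FROM notes WHERE idx = ?');
$select->execute(array(1));
$stored = $select->fetchColumn();
```

The hostile string comes back from the database unchanged, and the table is still there.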

Note that the PHP configuration parameter "magic_quotes_gpc" is on by default. This means that PHP will automatically do an addslashes() to all user input data (GET, POST or cookie). I actually prefer to turn this off and handle matters for myself, which can be enforced in the directory .htaccess file:

.htaccess

php_flag magic_quotes_gpc off

Error messages

Another important point is not to let users see SQL error messages if you can possibly avoid it. They give away a great deal of information potentially useful to attackers. Instead, test for possible errors and inconsistencies before submitting a query and provide your own friendly error messages.
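
One way to arrange this, sketched with a hypothetical helper function: the real error text goes to the server's log, where only you can read it, and the visitor gets a fixed message. The function name and the wording of the message are assumptions.

```php
<?php
// Hypothetical helper: records the real error privately and returns
// a generic message that is safe to show to the visitor.
function friendly_db_error($detail) {
    error_log('Database error: ' . $detail);   // written to the server log
    return 'Sorry, something went wrong. Please try again later.';
}
```

A script would then print the return value of friendly_db_error() where it might otherwise have printed the database's own error string.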

Database privileges

You should also not grant database users more privileges than they need. The MySQL manual is somewhat dense on this, but here's an example where I create two types of user: normaluser with read-only privileges, and adminuser with some write privileges (but not DROP, for example).

mysql> GRANT SELECT ON database.* TO normaluser@localhost
          IDENTIFIED BY 'yyy';
mysql> GRANT SELECT, UPDATE, INSERT, DELETE ON database.*
          TO adminuser@localhost IDENTIFIED BY 'xxx';
mysql> FLUSH PRIVILEGES;

The users are identified with HTTP authentication as described on my htaccess page, and the database connection is established accordingly:

$server = '';
$database = 'database';
if ($auth->admin) {
    # An account with administrator privileges
    $user = 'adminuser';
    $password = 'xxx';
} else {
    # An account with normal, limited privileges
    $user = 'normaluser';
    $password = 'yyy';
}
if (!mysql_connect($server, $user, $password)) {
    return mysql_error();
}

Your database class, and in particular any authentication routines like these, is best stored outside your normal web-directory hierarchy if possible. This avoids any possibility that some kind of server configuration could be exploited to deliver up your password details to an attacker. The Apache virtual host facility can be used to create an inaccessible area on your website: just point all your domains at subdirectories of your top level directory, and put your sensitive data into the top level directory.

 

Unauthorised command execution

If your website scripts invoke external programs such as Unix commands you need to be particularly careful about how they are called. A program like sendmail, which is often used to return forms to the webmaster for example, could also be abused, say, to send your server's password file to an attacker. Other programs inevitably contain bugs that an attacker could use to gain command line access to the server, or use to attack other machines.

Be especially careful about checking any user-supplied input that is passed to a system command run by your script. Basically, don't do it.

For example, the email us script on our family website calls the server's email client to send messages to us. Limiting the possible recipients to a choice from a select box not only prevents the program from acting as a Spam (UCE) relay, but it also ensures that the call to sendmail is safe. If arbitrary arguments to sendmail were allowed all sorts of havoc could ensue.
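
The select-box approach can be sketched roughly as follows. The keys, the addresses and the function name are invented for the illustration and are not taken from the script itself.

```php
<?php
// Hypothetical address book: the form's select box submits only the key.
$recipients = array(
    'webmaster' => 'webmaster@example.org',
    'family'    => 'family@example.org',
);

// Return the fixed address for a submitted key, or null if it is unknown.
function recipient_for($key, $recipients) {
    if (!is_string($key) || !array_key_exists($key, $recipients)) {
        return null;
    }
    return $recipients[$key];
}
```

Because only the looked-up address is ever handed to the mail program, nothing the user typed reaches its argument list. If user text really must be passed to a command, PHP's escapeshellarg() exists for quoting a single argument.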


Copyright © 2003 Ben Edgington.