
PHP

Safe String Theory for the Web

Apr 03, 2008

One of the major things that really bugs me about the web is how poorly the average web programmer handles strings. Here we are, changing the way the world works on top of text-based protocols and languages like HTTP, MIME, JavaScript and CSS, yet some of the biggest issues that still plague us are cross-site scripting and mangled text due to aggressive filtering, mismatched encodings or overzealous escaping.

Almost two years ago I said I'd write down some formal notes on how to avoid issues like XSS, but I never actually posted anything. See, once I sat down to actually try and untangle the do's and don'ts, I found it extremely hard to build up a big coherent picture.

But here we are now, and I'm going to try anyway. The text is aimed at people who have had to deal with these issues, who are looking for a bit of formalism to frame their own solutions in.

Vancouver PHP Conference

Feb 12, 2007

Ahoy from the Vancouver PHP conference. I gave a talk titled "A Closer Look at Drupal 5" earlier. Overall response was positive, although according to Boris I wouldn't have managed to squeeze everything into one hour if I hadn't put on my zippy fast presentation speaking voice, so there might have been some information overload at times.

Oh well.. I figure anyone generally only remembers at most 50% of a talk, so I might as well blast you with a bunch of things and hope some of it sticks ;).

Thanks to Dries and James for letting me use their earlier presentations as a base.

The slides are no longer available at Dries' request, as he has had problems before with people taking slides without permission. Sorry.

XSS & friends: Text Handling in PHP applications

Jun 26, 2006

Update: I jotted down some initial theory in my Safe String Theory for the Web post.

For a while now, there has been a lot of talk about XSS, aka Cross-Site Scripting. In October 2005, an XSS worm nearly took down MySpace. Most XSS attacks, however, are not as benign as that. They can be used to steal passwords and other sensitive information, perform distributed Denial-of-Service attacks on sites or generate fraudulent advertisement income.

XSS problems are still rampant in many web applications today, with PHP applications being especially vulnerable. This has caused some to conclude that XSS problems are impossible to avoid, or at least impractical to audit for completely. However, from a purely technical standpoint, XSS problems are not unique at all. They belong to a wider class of security problems which stem from incorrect handling of user-supplied data (e.g. SQL command injection or e-mail header injection).
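To make the "wider class" concrete, here is a minimal sketch of the same user-supplied string being handled for two different output contexts. It is written in Python rather than PHP purely for illustration, and the function names, table name and sample data are all made up for the example. The point it tries to show is that the text itself is never "cleaned"; it is only represented correctly for whichever context it is being placed into.

```python
import html
import sqlite3


def render_comment(author, body):
    """HTML context: escape user text at the moment it is placed in markup."""
    return "<p><b>{}</b>: {}</p>".format(html.escape(author), html.escape(body))


def find_user(conn, name):
    """SQL context: pass user text as a bound parameter, never by concatenation."""
    cur = conn.execute("SELECT id, name FROM users WHERE name = ?", (name,))
    return cur.fetchall()


# Hypothetical sample data, used only for this sketch.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO users (name) VALUES (?)", ("alice",))

hostile = "alice' OR '1'='1"
print(render_comment("mallory", "<script>alert(1)</script>"))
print(find_user(conn, hostile))   # the hostile string is treated as a plain name
print(find_user(conn, "alice"))
```

Both functions receive the raw string untouched. What differs is only the final step, which depends on where the string ends up.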

So, what makes the web so tricky to secure? Is it because web programmers are inherently 'stupid' and can't 'code properly'? I don't think so.

However, I do think that most web languages (such as PHP) tend to promote a bad approach to coding and, by extension, to security. By letting the programmer jump in directly, learning as they go, most people never build up a complete overview of the programming environment, but simply tweak code 'until it works'. The same applies to security issues: when a bug is found, those people will just tweak a particular line of code until the problem goes away. They won't see the big picture and will make similar mistakes later.

Another serious problem in my opinion is that there is no well-defined vocabulary for the tools used to solve these problems. Umbrella words such as 'filtering' are all too often used and stand in the way of a more precise description. With only vague notions about 'validation', 'special characters' and 'escaping', you cannot understand what's really going on. Such a lack of insight also prevents people from seeing beyond individual issues.
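As a rough illustration of why the vocabulary matters, the sketch below separates three operations that often get lumped together as 'filtering'. This is one possible reading of the terms, not a definition taken from this post; the function names and the username rule are invented for the example, and Python is used only for brevity.

```python
import html
import re


def validate_username(value):
    """Validation: accept or reject the input, never modify it."""
    return re.fullmatch(r"[A-Za-z0-9_]{3,20}", value) is not None


def filter_tags(value):
    """Filtering: remove anything that looks like a tag (naive and lossy)."""
    return re.sub(r"<[^>]*>", "", value)


def escape_html(value):
    """Escaping: represent the same text safely in HTML (lossless)."""
    return html.escape(value)


text = "if a<b and c>d then ok"
print(validate_username("some_user"))      # a yes/no answer only
print(filter_tags(text))                   # part of the sentence is destroyed
print(escape_html(text))                   # the full sentence survives
print(html.unescape(escape_html(text)) == text)
```

The naive filter mangles perfectly innocent text, while escaping keeps every character and can be reversed.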

So I've decided I want to build up a more formalized explanation of text handling. Expect one or more blog posts about this in the future. At least the next time people "lock up" on me, I can point them somewhere.

BarCamp Fun

Oct 21, 2005

Teh Boris wants Drupal to get more noticeability as a PHP CMS. Nothing wrong with that ;).

Highlights from BarCamp: flock, ascii goatse, free food, anti-phishing and generic hacking fun.

Summer of Code - Ajax Functionality for Drupal

Sep 01, 2005


This past summer I was sponsored by Google as part of their Summer of Code program to work on Drupal. My goal was to introduce various AJAX functionalities to Drupal.

The official project description was:

"Drupal has recently begun to find meaningful ways to introduce AJAX functionality with the goal of improving the user experience. Work with Drupal's usability experts to identify the next steps and help implement new dynamic functions based on interaction with the XMLHttpRequest object."

I focused on the following Ajax-powered features:

  • Inline Editing of posts: Though I built a working prototype module, I decided not to develop this feature further because it is not flexible enough to work as a generic Drupal module. It would break on too many configurations and has limited usefulness anyhow.
  • Uploading of files: allows you to attach files to Drupal nodes (with upload.module) without having to reload the page.
  • Sorting tables inside a page: this changes the sort order of a table without reloading the entire page. It is not client-side sorting as you'd expect at first sight: because most tables in Drupal are spread across multiple pages, client-side sorting is not very useful.
  • Switching between multiple pages: this was implemented on top of the sorting functionality, and only works on paged tables (this covers most of the useful pagers though).
  • Progressbar widget: a typical progressbar that fetches the status from the server through Ajax.
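The reasoning behind the table sorting item can be sketched as follows. This is not the Drupal code, just a small Python illustration with made-up rows and a hypothetical sorted_page helper: when a table is split across pages, sorting has to happen over the whole data set before the page is cut out, which is why the request goes back to the server.

```python
def sorted_page(rows, sort_key, page, per_page=2, descending=False):
    """Sort the complete data set first, then return one page of it."""
    ordered = sorted(rows, key=lambda row: row[sort_key], reverse=descending)
    start = page * per_page
    return ordered[start:start + per_page]


# Hypothetical table contents, in their original (unsorted) order.
rows = [
    {"title": "c", "views": 5},
    {"title": "a", "views": 9},
    {"title": "d", "views": 1},
    {"title": "b", "views": 7},
]

# Server-side: first page of the fully sorted table.
server_side = [row["title"] for row in sorted_page(rows, "title", page=0)]

# Client-side: only the rows already on the first page can be reordered.
client_side = sorted(row["title"] for row in rows[:2])

print(server_side)
print(client_side)
```

The two results differ because the client only ever sees the rows of the current page.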

The resulting code can be found in my sandbox in the Drupal contributions repository. Note however that most of the code is in patches against the (rapidly changing) Drupal HEAD, so they are likely to go out of date soon.

The file uploader is already part of Drupal HEAD, and the tablesorter is sitting in the patch queue being reviewed. I will try to keep them up to date.

A big thanks goes to Google for organising the Summer of Code!

Proposal for Implementing Unicode in PHP

Jun 03, 2005

On the Drupal team, I am known as an encoding nut: whenever there's an encoding issue or a question about Unicode, people tend to knock on my door. Usually any fix or answer from me comes with a lot of cursing at the unfortunate inquirer about how "PHP is horrible when it comes to string handling" and how it seems that "the entire PHP dev team has its head planted firmly in the ground when it comes to Unicode".

To which the reply is, more often than not: "Why don't you fix it yourself?".

Well, I'm not a PHP language developer. To be honest, I have neither the interest nor the time to become one. But I do know a lot about encodings and Unicode, so I decided to write this article describing the problem and possible solutions. That way, maybe others can take some of these ideas and put them into practice. At the very least, it should answer a lot of questions that people have about Unicode and PHP.

Right now, the message from the PHP developers seems to be that "PHP supports Unicode, but some assembly is required". In fact, it is a lot worse. Please, read on.
