XSS & friends: Text Handling in PHP applications
Update: I jotted down some initial theory in my Safe String Theory for the Web post.
For a while now, there has been a lot of talk about XSS, also known as cross-site scripting. In October 2005, an XSS worm nearly took down MySpace. Most XSS attacks, however, are not as benign as that. They can be used to steal passwords and other sensitive information, perform distributed denial-of-service attacks on sites, or generate fraudulent advertising income.
XSS problems are still rampant in many web applications today, with PHP applications being especially vulnerable. This has caused some to conclude that XSS problems are impossible to avoid, or at least impractical to audit for completely. However, from a purely technical standpoint, XSS problems are not unique at all. They belong to a wider class of security problems that stem from incorrect handling of user-supplied data (e.g. SQL command injection or e-mail header injection).
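To make the "wider class" point concrete, here is a minimal sketch of how the same piece of user-supplied text needs different handling depending on where it ends up. The helper names escapeForHtml and isSafeHeaderValue are made up for this illustration; only htmlspecialchars and preg_match are real PHP functions. SQL would be a third context with its own rules, typically handled with bound parameters rather than string concatenation.

```php
<?php
// Illustrative helpers; these names are assumptions, not part of any library.

// HTML context: <, >, & and quotes have special meaning and must be encoded.
function escapeForHtml(string $text): string
{
    return htmlspecialchars($text, ENT_QUOTES, 'UTF-8');
}

// E-mail header context: CR and LF end a header line, so a value that
// contains either one could smuggle in extra headers.
function isSafeHeaderValue(string $text): bool
{
    return preg_match('/[\r\n]/', $text) === 0;
}

$input = "Tom & <b>Jerry</b>";
echo escapeForHtml($input), "\n";
// prints: Tom &amp; &lt;b&gt;Jerry&lt;/b&gt;

var_dump(isSafeHeaderValue("bob@example.com\r\nBcc: spam@example.com"));
// prints: bool(false)
```

Note that neither helper is "the" fix for the other context: text that is perfectly safe in an HTML page can still break an e-mail header, and vice versa.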
So, what makes the web so tricky to secure? Is it because web programmers are inherently 'stupid' and can't 'code properly'? I don't think so.
However, I do think that most web languages (such as PHP) tend to promote a bad approach to coding and, by extension, to security. By letting the programmer jump in directly and learn as they go, they ensure that most people never build up a complete overview of the programming environment, but simply tweak code 'until it works'. The same applies to security issues: when a bug is found, those people will just tweak a particular line of code until the problem goes away. They won't see the big picture and will make similar mistakes later.
Another serious problem, in my opinion, is that there is no well-defined vocabulary for the tools used to solve these problems. Umbrella words such as 'filtering' are used all too often and stand in the way of a more precise description. With only vague notions about 'validation', 'special characters' and 'escaping', you cannot understand what's really going on. Such a lack of insight also prevents people from seeing beyond individual issues.
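As a rough illustration of why the vocabulary matters, the sketch below separates two things that often get lumped together as 'filtering'. It reflects one common reading of the terms, not a definition from this post: validation decides whether a value is acceptable at all, while escaping accepts any text and encodes it for a particular output context. The function name isValidAge and the 0 to 150 range are assumptions made for the example.

```php
<?php
// Validation: is this value acceptable for its purpose at all?
// isValidAge and its range are hypothetical, chosen only for illustration.
function isValidAge(string $raw): bool
{
    $options = ['options' => ['min_range' => 0, 'max_range' => 150]];
    return filter_var($raw, FILTER_VALIDATE_INT, $options) !== false;
}

var_dump(isValidAge("42"));  // prints: bool(true)
var_dump(isValidAge("<3"));  // prints: bool(false)

// Escaping: the text is legitimate as it is, but it contains a character
// that is special in HTML, so it is encoded when written into a page.
$comment = "I <3 PHP";
echo htmlspecialchars($comment, ENT_QUOTES, 'UTF-8'), "\n";
// prints: I &lt;3 PHP
```

The comment "I <3 PHP" is not invalid input and should not be rejected or stripped; it only needs the right encoding at output time. Treating both cases as generic 'filtering' hides that difference.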
So I've decided I want to build up a more formalized explanation of text handling. Expect one or more blog posts about this in the future. At least the next time people "lock up" on me, I can point them somewhere.