Simon Willison’s Weblog


Easier form validation with PHP

17th June 2003

Let’s talk about form validation. Here’s what I would class as the ideal validation system for a form in a web application:

  1. The form is displayed; you fill it in.
  2. You submit the form to the server.
  3. If you missed something out or provided invalid input, the form is redisplayed pre-filled with the valid data you already entered.
  4. The redisplayed form tells you what you got wrong. It also flags the fields that were incorrect.
  5. Loop until you fill the form in correctly.

Writing this once in PHP is trivial, but takes quite a bit of very dull code. Writing this for more than one form quickly becomes a tedious nightmare of duplicating and slightly editing code, which is why so few forms bother.

I’ve been trying to figure out an elegant way of automating as much of this code as possible on and off for more than two years. I’ve tried systems that generate the whole form based on a bunch of criteria (too inflexible), systems that describe the validation rules (fine but they don’t help with the redisplay logic) and I’ve looked at solutions available in other language (such as ASP.NET), all to no avail. Over the past couple of days I’ve been working on the problem again, and for the first time I think I’m actually getting somewhere.

My latest attempt (sparked by this article on Evolt) involves embedding validation and redisplay rules in the markup of the form itself. The form is written in XHTML, but with a number of additional tags and elements. Any form field elements can have a number of additional attributes which specify the validation rules of the form. For example:

<input type="text" name="name" 
  compulsory="yes" validate="alpha" callback="uniqueName" 
  size="20" />

The yellow highlighting in the above line marks the additional elements. In this case, they mean that the field is compulsory, must contain only letters, and should be checked for final validity by calling the user defined uniqueName() function.

There are a few other validation attributes, but the above gives a good idea of how the system works. The second problem is how to display errors. Part of the solution here is my custom <error> element, which can be used to associate an error message with an error in a particular field. In my tests I’ve been using this to display an exclamation mark next to invalid fields:

<input type="text" name="name" 
  compulsory="yes" validate="alpha" callback="uniqueName" 
  size="20" />
<error for="name"> !</error>

This associates the text inside the <error> element (in this case the single exclamation mark) with the “name” field.

So far, all I’ve done is add a bunch of invalid markup to an otherwise valid chunk of XHTML. The key is how this markup is processed. FormML (for want of a better name) is never passed to the browser; instead it is processed by my PHP FormProcessor class before being displayed. This class strips out all of the FormML tags, and also applies logic to the rest of the form based on the information from the tags. Because the whole thing is XHTML, this can be relatively easily achieved using PHP’s built in XML parser. The XML is modified on the fly to create the XHTML that is sent to the browser. In the above example, the contents of the <error> element would only be displayed if the form was being redisplayed and the user had not filled in their name correctly. In addition, the class populates the value attribute of each input element with the previously entered data and adds an “invalid” class to the element to allow it to be styled appropriately.

The next problem is to display a list of descriptive error messages describing the problem. This took slightly longer to work out: I needed a way of setting an error message for each potential problem (remember there are several ways a field can be invalid: it could have been left unfilled, or it could have failed a regular expression check, or it could have failed a callback function), and I also needed some way of indicating how these errors should be displayed. My eventual solution was to introduce a new <errormsg> element for specifying error messages, and a simple templating system (based on a few more custom tags) for indicating where these errors should be displayed. I’m not entirely happy with the way this works at the moment, but here’s what I’m using:

<erroritem> <li><message /></li></erroritem>
<errormsg field="name" test="compulsory">You did not enter your name</errormsg>
<errormsg field="name" test="alpha">Your name must consist <em>only</em> of letters</errormsg>

The <errormsg> elements can be placed anywhere in the markup, and will be removed before display. The <errorlist> fragment defines the template for the list of error messages, with the <erroritem> element enclosing the “template” for each error and the <message> element showing where the actual error message (as defined elsewhere) should be displayed. Note that the <errormsg> elements specify the field and the test that they relate to. It is not necessary to provide custom error messages for every possible field/test combination; the system generates moderately intelligent error messages in the event that one has not been provided.

That’s a lot of custom markup to define the behaviour of a form. The good news is that the actual PHP used to display, validate and redisplay the form is incredibly simple. Here it is:

$formML = '... formML XML code goes here ...';
$processor = new FormProcessor($formML);
if ($processor->validate()) {
    // The form has been submitted and is valid - process the data
    echo 'Form data is OK!';
} else {
    // This will display the form for the first time OR redisplay it
    // with relevant error messages

And that’s all there is to it.

The code is still under very heavy development. At the moment it’s messy, has several minor bugs (and possibly some major ones I haven’t yet uncovered), isn’t fully tested and is almost certainly not ready for deployment. Never-the-less, you can play with a demo form that uses it or grab the code here:

Incidentally, I haven’t mentioned javascript in the above in an attempt to keep things simple. I know client side validation is a great addition to stuff like this (provided it’s backed up by solid server side logic) and one of my longer term aims is to dynamically add the necessary javascript to the form during the processing phase, thus skipping the need to write any boring validation code by hand.

This is Easier form validation with PHP by Simon Willison, posted on 17th June 2003.

Next: Gorgeous CSS Rollovers

Previous: HTML Definition Lists

Previously hosted at