Simon Willison’s Weblog

The pirate’s code

20th September 2003

So, now that “talk like a pirate” day has sadly come to an end, it’s time to reveal the five-minute code hack that rendered my front page semi-legible for the best part of a day. It was actually pretty simple:

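// Piratify only the text found between a > and a <, leaving tags alone.
// The s modifier lets . match newlines, so text spanning lines is covered.
function piratify($text) {
    return preg_replace_callback('/>(.*?)</s', 'aaaar', $text);
}

// Callback: $matches[1] is the text captured between the > and the <.
function aaaar($matches) {
    $text = $matches[1];
    $a = array(
        '/\\bis\\b/' => 'be',            # is => be
        '/\\b([tT])he /' => "\\1'",      # the => t'
        '/\\bam\\b/' => 'be',            # am => be
        '/(\\w)v(\\w)/' => "\\1'\\2",    # v => ' (in words)
        '/ing\\b/' => "in'",             # ing => in'
        '/(\\w)ar(\\w)/' => "\\1aar\\2", # ar => aar (in words)
    );
    // Apply each rule in turn to the captured text.
    foreach ($a as $re => $new) {
        $text = preg_replace($re, $new, $text);
    }
    // Put back the delimiters that the pattern consumed.
    return '>'.$text.'<';
}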

My first attempt simply applied the six regular expressions shown above, but they mangled links within my entries as well. The solution was to use preg_replace_callback to target only text occurring outside of HTML tags (defined as anything between a > and a <). This turned a five-minute hack into half an hour of frenzied debugging, as I'd already posted the change to my site! In fact, the whole lot was written at 2am with my friend Tristan after a night out with Andy. Somehow cider makes for easier construction of regular expressions.
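To make the difference concrete, here is a small self-contained sketch that runs both approaches over the same snippet. The helper names (apply_pirate_rules, pirate_callback, piratify_naive, piratify_text_only) and the sample HTML are made up for illustration; only the six rules and the callback pattern come from the code above.

```php
<?php
// Illustrative sketch: every function name here is hypothetical.

// Apply the six pirate rules to a plain string.
function apply_pirate_rules($text) {
    $rules = array(
        '/\\bis\\b/'     => 'be',          // is => be
        '/\\b([tT])he /' => "\\1'",        // the => t'
        '/\\bam\\b/'     => 'be',          // am => be
        '/(\\w)v(\\w)/'  => "\\1'\\2",     // v => ' (in words)
        '/ing\\b/'       => "in'",         // ing => in'
        '/(\\w)ar(\\w)/' => "\\1aar\\2",   // ar => aar (in words)
    );
    foreach ($rules as $re => $new) {
        $text = preg_replace($re, $new, $text);
    }
    return $text;
}

// First attempt: run the rules over the whole string, tags and all.
function piratify_naive($html) {
    return apply_pirate_rules($html);
}

// Callback for the second attempt: rewrite only the captured text
// and restore the > and < that the pattern consumed.
function pirate_callback($matches) {
    return '>' . apply_pirate_rules($matches[1]) . '<';
}

// Second attempt: only touch text sitting between a > and a <.
function piratify_text_only($html) {
    return preg_replace_callback('/>(.*?)</s', 'pirate_callback', $html);
}

$html = '<p>The sea is <a href="/archive/">calling</a></p>';

echo piratify_naive($html), "\n";
// <p>T'sea be <a href="/archi'e/">callin'</a></p>   (href is mangled)

echo piratify_text_only($html), "\n";
// <p>T'sea be <a href="/archive/">callin'</a></p>   (href survives)
```

The naive version rewrites the v inside the href, which is the kind of link damage described above. One thing worth noting about the callback approach: because the pattern needs both a > and a <, any text before the first tag or after the last one is left untouched.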

I’m not the only person to have written a piratify function: Dougal Campbell has one as well (also mentioned here). I’m looking forward to seeing his released code.

This is The pirate’s code by Simon Willison, posted on 20th September 2003.

Previously hosted at http://simon.incutio.com/archive/2003/09/20/pirateCode