Simon Willison’s Weblog

The pirate’s code

So, now that “talk like a pirate” day has sadly come to an end, it’s time to reveal the five-minute code hack that rendered my front page semi-legible for the best part of a day. It was actually pretty simple:

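# Piratify only the text found between a '>' and the next '<',
# so tag names and attributes (such as link URLs) are left alone.
function piratify($text) {
    return preg_replace_callback('/>(.*?)</s', 'aaaar', $text);
}

# Callback: receives the match array; index 1 is the text between the tags.
function aaaar($text) {
    $text = $text[1];
    $a = array(
        '/\\bis\\b/' => 'be',            # is => be
        '/\\b([tT])he /' => "\\1'",      # the => t'
        '/\\bam\\b/' => 'be',            # am => be
        '/(\\w)v(\\w)/' => "\\1'\\2",    # v => ' (in words)
        '/ing\\b/' => "in'",             # ing => in'
        '/(\\w)ar(\\w)/' => "\\1aar\\2", # ar => aar (in words)
    );
    foreach($a as $re => $new) {
        $text = preg_replace($re, $new, $text);
    }
    # The match consumed the surrounding '>' and '<', so put them back.
    return '>'.$text.'<';
}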

My first attempt simply applied the six regular expressions shown above, but they mangled links within my entries as well. The solution was to use preg_replace_callback to target only text occurring outside of HTML tags (defined as anything between a > and a <). This turned a five-minute hack into half an hour of frenzied debugging, as I’d already posted the change to my site! In fact, the whole lot was written at 2am with my friend Tristan after a night out with Andy. Somehow cider makes for easier construction of regular expressions.
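To see why the first attempt went wrong, here is a minimal sketch that applies a single rule both ways. The helper names naive_piratify() and tag_safe_piratify() and the sample HTML are made up for illustration (they are not part of the code above), and a closure stands in for the named callback.

```php
<?php
// Illustrative only: one rule from the list above, applied two ways.
$rule = '/ing\b/';
$replacement = "in'";

// First attempt: run the regex over the whole HTML string, tags included.
function naive_piratify($html, $rule, $replacement) {
    return preg_replace($rule, $replacement, $html);
}

// Fix: only rewrite the text between a '>' and the next '<'.
function tag_safe_piratify($html, $rule, $replacement) {
    return preg_replace_callback(
        '/>(.*?)</s',
        function ($m) use ($rule, $replacement) {
            // $m[1] is the text between the tags; restore the delimiters.
            return '>' . preg_replace($rule, $replacement, $m[1]) . '<';
        },
        $html
    );
}

// Hypothetical sample input with a link whose URL also matches the rule.
$html = '<p>Keep <a href="/sailing">sailing</a> on</p>';

echo naive_piratify($html, $rule, $replacement), "\n";
// <p>Keep <a href="/sailin'">sailin'</a> on</p>   (the link URL is broken)

echo tag_safe_piratify($html, $rule, $replacement), "\n";
// <p>Keep <a href="/sailing">sailin'</a> on</p>   (the link URL survives)
```

Note that this approach only touches text that has a tag on both sides, so a string with no markup at all passes through unchanged.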

I’m not the only person to have written a piratify function: Dougal Campbell has one as well (also mentioned here). I’m looking forward to seeing his released code.

This is The pirate’s code by Simon Willison, posted on 20th September 2003.

9 comments

  1. You forgot to take off the "Aaaaaaaarrr, ya scurvy landlubbers!" in the header :)

    Jim Dabell - 20th September 2003 00:03 - #

  2. Oops. Actually, I'm tempted to leave it in there - it adds character :) Still, leaving piratisms around your site once "talk like a pirate" day is over is probably bad luck, akin to leaving your Christmas decorations up well into January. Aah well.

    Simon Willison - 20th September 2003 00:07 - #

  3. Yeah, my code still has the mangle-html-tags problem, as I haven't tried integrating your callback idea yet. I've been trying to think of an elegant way to put it in, because my code already has a chain of several functions that depend upon each other, and I hate adding another link in that chain :)

    But, I'll probably do it anyways, just to get it done with and released. I also implemented some other filters. I wrote the pirate filter on my own, but I borrowed the regexps for the other filters from Kalsey's MovableJive plugin.

    Dougal Campbell - 20th September 2003 15:18 - #

  4. nice hack for replacing only text outside of html tags. i used smth similar before, but without the callback, the code wasn't as nice as this ;)

    btw, why didn't you use arrays as parameters to the preg_replace() function? i think it could only work faster, and the code would be shorter

    (and if you want, you can still use an associative array, just pass array_keys() and array_values() of the $a array to the preg_replace() function)

    zombie - 21st September 2003 00:41 - #

  5. Since I run my blog with Blosxom, I was able to use a ready-made plugin, blog-like-a-pirate, by Rob Hague. It works in much the same way as your code.

    Jason Clark - 22nd September 2003 18:34 - #

  6. Pirates don't write blogs, bloggers shouldn't write like pirates. Aaargh, I've said me piece.

    Jonathan Holley - 23rd September 2003 06:47 - #

  7. I finally got my code cleaned up, and I've posted it on my blog. You might be interested in the content matching regex I came up with to replace yours. Mine has two improvements: 1) it doesn't require you to hack the '>' and '<' chars back in at the end of your filter function, and 2) it will match text that is not between HTML tags (so it can filter plain text).

    Dougal Campbell - 4th October 2003 16:57 - #

  8. What Iam Asking? What is the search you are giving? So That India is not developed all ways.You must not talk about other sites. my sify site xhtml you are telling you don't worry myself know how to handle mysite. But What myself is asking . Shall my search yahoo took to the shopping a ship. That affects yahoo site for me and show myself is not seeing the site of www. yahoo.com . If it is send a reply to my mail e_margratteicsc@sify.com. and mention the phone number I made the call and talk to the people whoever it may be. Iam not bather and take care for anybody. My head bow down before God Of Almighty Only Who gave Life to me and He is my Shelter And My Prosperity. So My playing site nobody has the power to talk. yahoo customer care if the "TOS" violation can report in yahoo registered page. It is a human right to open mail in all sites. so who is co-ordinator of www.yahoo.com If you think you are a Good man your heart is pure send a male to me not backlinks. Backlinks is the one satan doing Satan is under my feet. because of htp Iam unable to go to yahoo site. Who know which http there are so many http who is the head let them send a mail if he or she is really human being. Yahoo site also not wrong and Regarding myself is also not wrong.If it is the Journal ofNewyork Times please send his number in Chennai I really Give a big ..... Don't wantto tell. I myself Go and Make Him Blue And White

    Dr.Margrateicsc - 7th November 2003 14:27 - #

  9. This post had become over-run with comments asking for people to help hijack hotmail accounts, so I've closed the thread.

    Simon Willison - 20th April 2005 16:41 - #
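As a follow-up to the suggestion in comment 4: preg_replace() accepts arrays of patterns and replacements, so the foreach loop can be collapsed into a single call. This is a minimal sketch, assuming the same $a table as in the post; the name aaaar_arrays() and the sample sentence are hypothetical, and whether the array form is actually faster has not been measured here.

```php
<?php
// Hypothetical variant of aaaar() using the array form of preg_replace().
function aaaar_arrays($match) {
    $a = array(
        '/\\bis\\b/' => 'be',            // is => be
        '/\\b([tT])he /' => "\\1'",      // the => t'
        '/\\bam\\b/' => 'be',            // am => be
        '/(\\w)v(\\w)/' => "\\1'\\2",    // v => ' (in words)
        '/ing\\b/' => "in'",             // ing => in'
        '/(\\w)ar(\\w)/' => "\\1aar\\2", // ar => aar (in words)
    );
    // Patterns run in array order, each on the result of the previous one,
    // which is the same order the original foreach loop used.
    $text = preg_replace(array_keys($a), array_values($a), $match[1]);
    return '>' . $text . '<';
}

echo preg_replace_callback('/>(.*?)</s', 'aaaar_arrays', '<p>The ship is sailing</p>'), "\n";
// <p>T'ship be sailin'</p>
```

Keeping the associative array means each pattern still sits next to its replacement, which is easier to read than two parallel lists.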

Comments are closed.

Previously hosted at http://simon.incutio.com/archive/2003/09/20/pirateCode
