The pirate’s code
So, now that “talk like a pirate” day has sadly come to an end, it’s time to reveal the five-minute code hack that rendered my front page semi-legible for the best part of a day. It was actually pretty simple:
function piratify($text) {
    return preg_replace_callback('/>(.*?)</s', 'aaaar', $text);
}
function aaaar($text) {
    $text = $text[1];
    $a = array(
        '/\\bis\\b/' => 'be', # is => be
        '/\\b([tT])he /' => "\\1'", # the => t'
        '/\\bam\\b/' => 'be', # am => be
        '/(\\w)v(\\w)/' => "\\1'\\2", # v => ' (in words)
        '/ing\\b/' => "in'", # ing => in'
        '/(\\w)ar(\\w)/' => "\\1aar\\2", # ar => aar (in words)
    );
    foreach ($a as $re => $new) {
        $text = preg_replace($re, $new, $text);
    }
    return '>'.$text.'<';
}
My first attempt simply applied the six regular expressions shown above, but they mangled links within my entries as well. The solution was to use preg_replace_callback to target only text occurring outside of HTML tags (defined as anything between a > and a <). This turned a five-minute hack into half an hour of frenzied debugging, as I’d already posted the change to my site! In fact, the whole lot was written at 2am with my friend Tristan after a night out with Andy. Somehow cider makes for easier construction of regular expressions.
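To see why the callback matters, here is a small sketch that runs both approaches over a link whose URL happens to contain a "v". The function names (apply_pirate_rules, piratify_naive, piratify_text_only and its callback) are made up for this illustration; only the six regular expressions and the '/>(.*?)</s' pattern come from the code above.

```php
<?php
// Hypothetical helper: the six substitutions from the post, applied in order.
function apply_pirate_rules($text) {
    $rules = array(
        '/\\bis\\b/'     => 'be',
        '/\\b([tT])he /' => "\\1'",
        '/\\bam\\b/'     => 'be',
        '/(\\w)v(\\w)/'  => "\\1'\\2",
        '/ing\\b/'       => "in'",
        '/(\\w)ar(\\w)/' => "\\1aar\\2",
    );
    foreach ($rules as $re => $new) {
        $text = preg_replace($re, $new, $text);
    }
    return $text;
}

// First attempt: run the rules over the whole string, tags included.
function piratify_naive($html) {
    return apply_pirate_rules($html);
}

// Callback for the fixed version: $m[1] is the text between '>' and '<',
// so the two delimiters have to be put back around the result.
function piratify_text_only_callback($m) {
    return '>' . apply_pirate_rules($m[1]) . '<';
}

// Fixed version: only touch text that sits between a '>' and a '<'.
function piratify_text_only($html) {
    return preg_replace_callback('/>(.*?)</s', 'piratify_text_only_callback', $html);
}

$html = '<a href="/archive/">the archive</a>';
echo piratify_naive($html), "\n";     // <a href="/archi'e/">t'archi'e</a>
echo piratify_text_only($html), "\n"; // <a href="/archive/">t'archi'e</a>
```

The naive version rewrites the href as well as the link text, which is exactly the kind of mangled link described above. The callback version leaves everything inside the tag alone.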
I’m not the only person to have written a piratify function: Dougal Campbell has one as well (also mentioned here). I’m looking forward to seeing his released code.
You forgot to take off the "Aaaaaaaarrr, ya scurvy landlubbers!" in the header :)
Jim Dabell - 20th September 2003 00:03 - #
Simon Willison - 20th September 2003 00:07 - #
Yeah, my code still has the mangle-html-tags problem, as I haven't tried integrating your callback idea yet. I've been trying to think of an elegant way to put it in, because my code already has a chain of several functions that depend upon each other, and I hate adding another link in that chain :)
But, I'll probably do it anyways, just to get it done with and released. I also implemented some other filters. I wrote the pirate filter on my own, but I borrowed the regexps for the other filters from Kalsey's MovableJive plugin.
Dougal Campbell - 20th September 2003 15:18 - #
nice hack for replacing only text outside of html tags. i used smth similar before, but without the callback, the code wasn't as nice as this ;)
btw, why didn't you use arrays as parameters to the preg_replace() function? i think it could only work faster, and the code would be shorter
(and if you want, you can still use an associative array, just pass array_keys() and array_values() of the $a array to the preg_replace() function)
zombie - 21st September 2003 00:41 - #
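A minimal sketch of what that suggestion might look like, assuming the same six rules as in the post. The function name aaaar_arrays is hypothetical, and unlike the original callback it takes a plain string rather than a match array. When preg_replace() is given arrays, it applies each pattern in order to the result of the previous one, so the behaviour should match the foreach loop.

```php
<?php
// Hypothetical variant of the rule-applying step: no explicit loop.
function aaaar_arrays($text) {
    $a = array(
        '/\\bis\\b/'     => 'be',        // is => be
        '/\\b([tT])he /' => "\\1'",      // the => t'
        '/\\bam\\b/'     => 'be',        // am => be
        '/(\\w)v(\\w)/'  => "\\1'\\2",   // v => ' (in words)
        '/ing\\b/'       => "in'",       // ing => in'
        '/(\\w)ar(\\w)/' => "\\1aar\\2", // ar => aar (in words)
    );
    // array_keys() and array_values() keep the insertion order, so each
    // pattern stays paired with its own replacement.
    return preg_replace(array_keys($a), array_values($a), $text);
}

echo aaaar_arrays('the party is starting'), "\n"; // t'paarty be staartin'
```

Whether this is measurably faster is not something this sketch tests; it mainly saves the loop.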
Jason Clark - 22nd September 2003 18:34 - #
Jonathan Holley - 23rd September 2003 06:47 - #
I finally got my code cleaned up, and I've posted it on my blog. You might be interested in the content matching regex I came up with to replace yours. Mine has two improvements: 1) it doesn't require you to hack the '>' and '<' chars back in at the end of your filter function, and 2) it will match text that is not between HTML tags (so it can filter plain text).
Dougal Campbell - 4th October 2003 16:57 - #
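Dougal's actual regex isn't reproduced here, so the pattern below is not his. It is only one guess at how both properties could be had: match runs of text that start at the beginning of the input or right after a '>', without capturing the delimiters. All function names are hypothetical, and the approach assumes no '>' appears inside an attribute value.

```php
<?php
// Hypothetical helper: the six substitutions from the post, applied in order.
function pirate_words($text) {
    $a = array(
        '/\\bis\\b/'     => 'be',
        '/\\b([tT])he /' => "\\1'",
        '/\\bam\\b/'     => 'be',
        '/(\\w)v(\\w)/'  => "\\1'\\2",
        '/ing\\b/'       => "in'",
        '/(\\w)ar(\\w)/' => "\\1aar\\2",
    );
    return preg_replace(array_keys($a), array_values($a), $text);
}

// Callback: $m[0] is a run of text only, so nothing needs to be glued back on.
function pirate_words_callback($m) {
    return pirate_words($m[0]);
}

// Assumed pattern (not Dougal's): one or more non-'<' characters that begin
// either at the start of the input or immediately after a '>'.
function piratify_anywhere($text) {
    return preg_replace_callback('/(?:^|(?<=>))[^<]+/', 'pirate_words_callback', $text);
}

echo piratify_anywhere('the cat is sleeping'), "\n";
// t'cat be sleepin'
echo piratify_anywhere('<a href="/archive/">the archive</a>'), "\n";
// <a href="/archive/">t'archi'e</a>
```

Because the lookbehind only checks for the '>' rather than consuming it, plain text with no tags at all is filtered too, and the callback can return its result as is.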
Dr.Margrateicsc - 7th November 2003 14:27 - #
Simon Willison - 20th April 2005 16:41 - #