Simon Willison’s Weblog


A few more thoughts on plinks

30th May 2004

From the comments on my plinks entry, it seems some people are seeing ugly green hash marks all over the place. If that includes you, you need to force-reload my stylesheet to ensure you are getting the copy with the plink-hiding styles.

One of the things I missed in last night’s 1am coding frenzy was the idea of globally unique identifiers for every paragraph, as described by Chris Dent. This leads into a fascinating concept called Transclusion, which originated with Ted Nelson (the father of hypertext) and involves content that is managed by reference.

Now, interesting though Transclusion is, I’m not convinced that it’s a useful addition to my blog. However, there is a far more pressing need for globally unique paragraph identifiers that has only just cropped up: my index page. On it, I display a number of different entries at once. IDs in XHTML must be unique for the current document, so if I have two entries on the front page that contain paragraphs with clashing identifiers I lose validity and, most probably, God kills a kitten.
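To make the clash concrete, here is a minimal sketch of how a page of entries could be checked for repeated IDs. The function name find_clashing_ids and the sample markup are assumptions made up for illustration, not code the site runs, and the attribute matching is deliberately naive (it assumes double-quoted id attributes).

```php
<?php
// Hypothetical helper (not from the original post): given the HTML of each
// entry shown on one page, return the id values that occur more than once.
function find_clashing_ids(array $entries): array
{
    $counts = [];
    foreach ($entries as $html) {
        // Naive attribute match; assumes double-quoted id attributes.
        preg_match_all('/\sid="([^"]+)"/', $html, $matches);
        foreach ($matches[1] as $id) {
            $counts[$id] = ($counts[$id] ?? 0) + 1;
        }
    }
    // Keep only the ids seen more than once.
    $clashes = array_filter($counts, function ($n) {
        return $n > 1;
    });
    return array_keys($clashes);
}

// Two sample entries that each number their paragraphs from p-0.
$front_page = [
    '<p id="p-0">First entry.</p><p id="p-1">More text.</p>',
    '<p id="p-0">Second entry.</p>',
];
print_r(find_clashing_ids($front_page));
```

With the two sample entries above, the only ID reported is p-0, since both entries start numbering their paragraphs from the same place.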

There are two ways of solving this. Firstly, I could give every paragraph on the site a globally unique identifier, something Chris calls a Node ID. That doesn’t really tempt me: it’s quite a bit of work, and as I’m not currently interested in Transclusion (although maybe I should be) I don’t gain anything from it other than a valid index page. The second alternative is the one I’ve gone for: I’m simply stripping all paragraph IDs from the entries when they are displayed on the front page of the site (and for the entries-by-day views as well). It’s a little hackish and it means my CMS is now doing a bit of lifting when previously it was blissfully unaware of the numbers, but at least it solves the problem at hand. I kind of like the idea of the addressable paragraphs only existing on the “official” entry page in any case.

Here’s the PHP I use to strip out the IDs:

$entrytext = preg_replace('/<p id="p-[^"]+"/', "<p", $entrytext);

One of the many benefits of writing software for yourself is that you can often take huge liberties: I know for a fact that this naive regular expression (as opposed to a more resilient technique using an XML tool of some sort) will work on all 1420 entries on this site because, well, I wrote them all.
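For comparison, the more resilient route might look something like the sketch below, which uses PHP's DOM extension in place of the regular expression. The function name strip_paragraph_ids is hypothetical, and the sketch assumes each entry is a well-formed XML fragment with no named entities beyond the five XML built-ins; anything that fails to parse is returned untouched.

```php
<?php
// Hypothetical helper (not from the original post): remove plink ids from
// paragraphs using PHP's DOM extension rather than a regular expression.
function strip_paragraph_ids(string $entrytext): string
{
    $previous = libxml_use_internal_errors(true); // collect parse errors quietly
    $doc = new DOMDocument();
    // Wrap the fragment in a root element so it parses as one XML document.
    $ok = $doc->loadXML('<div>' . $entrytext . '</div>');
    libxml_clear_errors();
    libxml_use_internal_errors($previous);
    if (!$ok) {
        // Not well-formed XML (or it uses entities the parser does not know):
        // return the text unchanged rather than guess.
        return $entrytext;
    }
    foreach ($doc->getElementsByTagName('p') as $p) {
        // Only touch ids that start with the "p-" plink prefix.
        if (strpos($p->getAttribute('id'), 'p-') === 0) {
            $p->removeAttribute('id');
        }
    }
    // Serialize the children of the wrapper, dropping the wrapper itself.
    $out = '';
    foreach ($doc->documentElement->childNodes as $child) {
        $out .= $doc->saveXML($child);
    }
    return $out;
}

echo strip_paragraph_ids('<p id="p-0">One</p><p id="intro">Two</p>');
```

Unlike the regular expression, this does not care where the id attribute sits within the tag, and it leaves IDs that do not start with p- alone. The cost is a full parse of every entry and a hard dependency on the markup being well-formed.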

This is A few more thoughts on plinks by Simon Willison, posted on 30th May 2004.
