
Simon Willison’s Weblog

A few more thoughts on plinks

From the comments on my plinks entry, it seems some people are seeing ugly green hash marks all over the place. If that includes you, you need to force-reload my stylesheet to ensure you are getting the copy with the plink hiding styles.

One of the things I missed in last night’s 1am coding frenzy was the idea of globally unique identifiers for every paragraph, as described by Chris Dent. This leads into a fascinating concept called Transclusion, which originated with Ted Nelson (the father of hypertext) and involves content that is managed by reference.

Now, interesting though Transclusion is, I’m not convinced that it’s a useful addition to my blog. However, there is a far more pressing need for globally unique paragraph identifiers that has only just cropped up: my index page. On it, I display a number of different entries at once. IDs in XHTML must be unique within the current document, so if I have two entries on the front page that contain paragraphs with clashing identifiers I lose validity and, most probably, God kills a kitten.

There are two ways of solving this. Firstly, I could give every paragraph on the site a globally unique identifier, something Chris calls a Node ID. That doesn’t really tempt me: it’s quite a bit of work, and as I’m not currently interested in Transclusion (although maybe I should be) I don’t gain anything from it other than a valid index page. The second alternative is the one I’ve gone for: I’m simply stripping all paragraph IDs from the entries when they are displayed on the front page of the site (and for the entries-by-day views as well). It’s a little hackish and it means my CMS is now doing a bit of lifting when previously it was blissfully unaware of the numbers, but at least it solves the problem at hand. I kind of like the idea of the addressable paragraphs only existing on the “official” entry page in any case.

Here’s the PHP I use to strip out the IDs:


$entrytext = preg_replace('/<p id="p-[^"]+"/', "<p", $entrytext);

One of the many benefits of writing software for yourself is that you can often take huge liberties: I know for a fact that this naive regular expression (as opposed to a more resilient technique using an XML tool of some sort) will work on all 1420 entries on this site because, well, I wrote them all.
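
For comparison, here is a minimal sketch of what the "more resilient technique using an XML tool" might look like with PHP's DOM extension. It is not the code running on this site: the function name strip_paragraph_ids is hypothetical, and the sketch assumes each entry is a well-formed XHTML fragment that uses no HTML-only named entities such as &nbsp; (which a plain XML parser rejects). If the fragment fails to parse, the text is returned unchanged.

```php
<?php
// Hypothetical helper (not from the original post): remove id="p-..."
// attributes from <p> elements by parsing the entry instead of pattern-matching.
function strip_paragraph_ids(string $entrytext): string
{
    $doc = new DOMDocument();

    // Collect parse errors internally rather than emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);
    // Wrap the fragment in one root element so it can be parsed as XML.
    $ok = $doc->loadXML('<div>' . $entrytext . '</div>');
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    if (!$ok) {
        // Assumption: leaving unparseable text untouched is the safest fallback.
        return $entrytext;
    }

    foreach ($doc->getElementsByTagName('p') as $p) {
        // Only touch paragraph IDs of the form "p-...", as the regex does.
        if (preg_match('/^p-./', $p->getAttribute('id'))) {
            $p->removeAttribute('id');
        }
    }

    // Serialise the children of the wrapper, dropping the wrapper itself.
    $out = '';
    foreach ($doc->documentElement->childNodes as $child) {
        $out .= $doc->saveXML($child);
    }
    return $out;
}
```

Unlike the regular expression, this does not care where the id attribute appears within the tag or how it is quoted. The trade-off is that the output is re-serialised, so details such as empty elements may come back in a slightly different form from the input.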

This is A few more thoughts on plinks by Simon Willison, posted on 30th May 2004.


17 comments

  1. Joe Clark uses the post title as a part of the unique IDs. He is also ahead of the rest using this for (almost) every element.

    Anne - 30th May 2004 22:36 - #

  2. It may require a tricky combination of server-side code and JavaScript, depending on the exact results you are seeking, but if you're up to it, make each paragraph on the front page link to itself on the archive page.

    For example, you may want to try using classes:

    <p class="p-3">Blah blah blah. <!-- JS-generated link --> <a href="archive/page#p-3">#</a></p>

    Lenny Domnitser - 30th May 2004 23:59 - #

  3. Lenny: that's certainly a possibility.

    Simon Willison - 31st May 2004 00:03 - #

  4. I find the effect too subtle to be noticed. For ages I could not see any plinks at all. I was expecting the whole paragraph to change background colour.

    Also, Opera 7.50 does not show them as links.

    Chris Hester - 31st May 2004 00:20 - #

  5. I have been thinking about this for some time. I really like the Transclusion, as it provides a universal means to identify, link to, and quote direct, or near direct text (the next step would add the verse).

    I like the javascript approach, but ultimately I would like to have PHP add these elements. I am also thinking of implementing my permalink identifier and the paragraph, that way the entry can be parsed along with the paragraph number. Things do get out of whack when edits are made, unless they are entered as updates after the last paragraph.

    What I have been playing with would be <p id="1244p7">. This is the 1244th entry and paragraph 7.

    The use of CSS is brilliant in this manner as it makes it easy, once one knows to look for the faint grey hash. This would make things relatively easy to add and not interfere with the visual reading of the text.

    vanderwal - 31st May 2004 03:12 - #

  6. Vanderwal, that's a nice thought, but it needs amending somewhat. IDs cannot begin with numbers. <p id="e1244p7"> perhaps.

    Lach - 31st May 2004 03:54 - #

  7. Am I the only one to find it ironic that each paragraph has a link to itself, while the comments don't have permalinks (visible ones, at least)?

    Lenny Domnitser - 31st May 2004 05:18 - #

  8. It's funny you should say that, Lenny, because I added those about half an hour before you posted. Older entry pages are cached, though, so it's likely you couldn't see the permalinks on this page when you added your comment, as the page had not yet been rebuilt.

    Simon Willison - 31st May 2004 05:42 - #

  9. Safari's not showing anything up either unfortunately.

    Wesley Mason - 31st May 2004 09:09 - #

  10. There is an interview with Doug Engelbart (the inventor of hyper-text) and his original idea was "you could give somebody a link right into anything, so you could actually have things that point right to a character or a word or something". He's right, why stop at paragraphs, why not sentences, lists, tables etc. Interesting interview and worth a read.

    CpILL - 31st May 2004 15:53 - #

  11. There is indeed value for having each content-structure (headings, lists as well as paragraphs) addressable. Indeed I'm convinced enough that the next cms-type system should have this feature - although something with friendlier link names. Joe Clark is on to something with his basic-esque line number increments, but I'd prefer a system based around friendly identifiers for headings, tables, lists, and friendly + line numbers for paragraphs, table cells (co-ordinates?) and list items.

    I'm not convinced about the approach to displaying these identifiers. You mention your dislike of other approaches as being too website dependent - I can agree. Your solution is an improvement in that regard - hidden until needed.

    Although I think a better solution may be to turn the JavaScript parts of the functionality into a bookmarklet. This has the advantage of one bookmark being usable on sites other than your own. A bookmarklet that works on incutio.com will also automatically work on Joe Clark's material.

    As I understand it, fragment identifiers are done as an id attribute in any element in XHTML, and using the name attribute on anchors in HTML. A bookmarklet that "enabled"/"disabled" the display of fragment identifiers would be useful.

    Do you see any drawbacks with that idea? I guess the one downer is that sites require something obvious to indicate the presence of fragment identifiers. I hesitate to suggest the use of a small icon - partly because of the backlash against the orange XML icon.

    The second drawback I am aware of is the id attribute is heavily used to create hooks into markup for CSS, not particularly for identifying document fragments.

    Isofarro - 31st May 2004 18:22 - #

  12. named anchors by Jesse Ruderman does exactly that.

    Simon Willison - 31st May 2004 18:29 - #

  13. Since you're using JavaScript for the implementation, and most users find the hash marks to be distracting, why not a different approach? My idea: A button would exist on the page that the user can click on to begin the "create a hyperlink" process. The user can then simply click on the paragraph that they wish to hyperlink to. A text box would then appear (either on the page or in a pop-up) with the full url to the hyperlinked item.

    Jonathan Snook - 31st May 2004 19:43 - #

  14. I can see the potential for abuse however, as illustrated by this article from Groklaw. If SCO had just marked this up in one location and then referenced it in various other places via Transclusion, the original content would probably never be found.

    s.oteric - 1st June 2004 11:47 - #

  15. I think that transclusion won't work on the web.

    Michael Day - 2nd June 2004 01:28 - #

  16. I thought about this issue before, and there is a need for this type of thing, BUT, this is not useful for changing content. It is suitable for articles (usually long) that explain something in detail, and referring to paragraphs in those cases is really useful. A JavaScript solution is also the correct solution here.

    Alex - 2nd June 2004 10:59 - #

  17. One question is for whom the plinks are intended. This system seems to me way too unobtrusive for the casual reader. But if the point is to be able to create hypertextual structures within your own writing, the unobtrusiveness makes the kinetic sculpture aspect of hypertextuality seem all the more magical.

    Kathryn Cramer - 3rd June 2004 00:36 - #


Previously hosted at http://simon.incutio.com/archive/2004/05/30/morePlinks
