Simon Willison’s Weblog

Simple mini-languages with PHP

I linked to PDML the other day in my blogmarks, but beyond a cursory glance I hadn’t really dug into what makes it tick. Dumky over at Curiosity is bliss points out that it makes use of an ingenious output buffering trick. To create a PDML document, you add a single line to the top of a page that includes and executes the PDML library (written in PHP). The rest of the document is written in the custom PDML markup language. The library uses output buffering to capture the rest of the page, then runs a callback function that does the actual processing of the page content (see ob_start() for details).
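The trick described above can be sketched roughly as follows. This is not PDML's actual source: the file name minilang.php, the function minilang_render() and the toy syntax (a "# " prefix for headings, *asterisks* for emphasis) are all made up for illustration. The only real API relied on is ob_start() with a callback.

```php
<?php
// minilang.php -- hypothetical library file, NOT the real PDML code.

/**
 * Output-buffer callback: receives everything the page emitted after
 * ob_start() and returns what should actually be sent to the client.
 * The two rewrite rules below are an invented toy syntax.
 */
function minilang_render(string $buffer): string
{
    // Escape first so stray markup in the page cannot leak through.
    $html = htmlspecialchars($buffer, ENT_QUOTES, 'UTF-8');

    // Lines starting with "# " become headings.
    $html = preg_replace('/^# (.+)$/m', '<h1>$1</h1>', $html);

    // *word* becomes emphasised text.
    $html = preg_replace('/\*([^*\n]+)\*/', '<em>$1</em>', $html);

    return $html;
}

// Start buffering. PHP hands the captured output to minilang_render()
// when the buffer is flushed at the end of the request.
ob_start('minilang_render');
```

A page using it would start with a single line such as <?php require 'minilang.php'; ?> and everything after that line would be written in the toy syntax rather than in HTML.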

As Dumky points out, this can be used to implement mini-languages for pretty much anything, and PHP 5’s excellent XML support means most of the parser work is handled for you. It could also act as a neat way of hooking into things like server-side XSLT processors.
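As a rough sketch of letting PHP's XML support do the parsing, the callback below treats the buffered page as a small XML vocabulary and walks it with SimpleXML. The <page> and <note title="..."> elements and the function name xml_minilang_render() are invented for this example, and it assumes the SimpleXML extension is enabled.

```php
<?php
// Hypothetical XML mini-language: <page><note title="...">text</note></page>

/**
 * Output-buffer callback that parses the captured page as XML and
 * renders each <note> element as an HTML heading plus paragraph.
 */
function xml_minilang_render(string $buffer): string
{
    // Collect parse errors quietly instead of emitting PHP warnings.
    $previous = libxml_use_internal_errors(true);
    $doc = simplexml_load_string(trim($buffer));
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    if ($doc === false) {
        // Not well-formed XML: fall back to showing the escaped source.
        return '<pre>' . htmlspecialchars($buffer, ENT_QUOTES, 'UTF-8') . '</pre>';
    }

    $html = '';
    foreach ($doc->note as $note) {
        $title = htmlspecialchars((string) $note['title'], ENT_QUOTES, 'UTF-8');
        $body  = htmlspecialchars(trim((string) $note), ENT_QUOTES, 'UTF-8');
        $html .= "<h2>{$title}</h2>\n<p>{$body}</p>\n";
    }

    return $html;
}

// Everything the page outputs after this point is parsed as XML.
ob_start('xml_minilang_render');
```

The same callback slot is where an XSLT transformation could be applied instead, if the relevant extension is available on the server.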

This is Simple mini-languages with PHP by Simon Willison, posted on 12th May 2004.

12 comments

  1. That's very similar to how I ensure validity on my sites that may have legacy scripts or content. A couple of functions run on the output buffer (like texturize) to pretty up the content and fix obvious XHTML mistakes and problems. I haven't found a problem yet that couldn't be fixed with regular expressions. ;)

    Matt - 13th May 2004 00:35 - #

  2. And now that I think of it, that would be a handy way to implement templates in WordPress...

    Matt - 13th May 2004 00:37 - #

  3. Think Tidy fits into this as well. If you configure Apache to give old .html pages to PHP, use output buffering and Tidy and there's pretty much nothing you can't do.

    Harry Fuecks - 13th May 2004 01:41 - #

  4. Harry Fuecks: You could use a .htaccess file to process the .html files as PHP script, and prepend a PHP include:
    # .htaccess in directory you want prepended files
    # Force html files to be of type php:
    <Files *.html>
    ForceType application/x-httpd-php
    </Files>
    
    # prepend 'myPhpInclude.php' to all files:
    php_value auto_prepend_file myPhpInclude.php

    Mathieu 'P 01' HENRI - 13th May 2004 01:51 - #

  5. Hi Mathieu - exactly - have ranted about this before here. The problem I was trying to solve was adding PHP manual-like user comments to generated API docs (e.g. phpDocumentor).

    Harry Fuecks - 13th May 2004 02:04 - #

  6. I do like the output buffering mechanism used here, but I'd also like to hear some interesting ways to implement PDML in a site. I've listed a few thoughts on my PHP blog site. Using this library with Smarty templates gives me a couple good ideas right off the bat.

    John Herren - 13th May 2004 06:33 - #

  7. I use output_buffering for catching XML and then putting it through XSLT, indeed. About the PDML: it's just a pity that there's already a W3C standard that does the same thing: XSL-FO. Thus, PDML is reinventing the wheel...

    Manuzhai - 13th May 2004 06:42 - #

  8. it's just a pity that there's already a W3C standard that does the same thing: XSL-FO. Thus, PDML is reinventing the wheel...

    On the one hand yes but on the other, XSL-FO reminds me of a joke about asking a Yorkshire farmer for directions and getting the answer "You don't want to start from here".

    I'd argue that XSL-FO is re-inventing the wheel. (X)HTML is already a good spec for marking up documents (particularly because many people know it) and it's largely possible to translate its formatting directly to PDF. My main criticism of PDML is that it adds additional tags to HTML, but it's certainly going to be easier to use than XSL-FO.

    Also think such "transformations" should be completely independent of source formatting syntax - it should be possible to translate RTF, Word, HTML or whatever to PDF, for example.

    Harry Fuecks - 13th May 2004 08:51 - #

  9. I'm not saying XSL-FO is flawless... It's rather bloated and very hard to use (I have tried to do some stuff with it, but I just couldn't get it right). I agree that it should be way easier and more like XHTML, but then it adds a layer that is probably very useful in print businesses and the like. PDML, on the other hand, seems too simplistic and limited to me. The stuff in their examples isn't even XML! Parsing it would probably be a whole lot easier if it was.

    Manuzhai - 13th May 2004 09:48 - #

  10. it's just a pity

    qq,q,q - 13th May 2004 14:15 - #

  11. # .htaccess in directory you want prepended files
    # Force html files to be of type php:
    <Files *.html>
    ForceType application/x-httpd-php
    </Files>
    
    # prepend 'myPhpInclude.php' to all files:
    php_value auto_prepend_file myPhpInclude.php

    Even better, just turn this on for files with a .pdml extension. Then you can still use plain html for html, and if you want other prepending / appending files with PHP you can set them up to have a special file extension too.

    # .htaccess in directory you want prepended files
    # Force pdml files to be of type php:
    <Files *.pdml>
    ForceType application/x-httpd-php
    php_value auto_prepend_file myPhpInclude.php
    </Files>

    Lach - 14th May 2004 02:33 - #

  12. doh!

    Mathieu 'P 01' HENRI - 14th May 2004 10:54 - #

Previously hosted at http://simon.incutio.com/archive/2004/05/12/simpleMiniLanguages
