
Simon Willison’s Weblog

Supporting Conditional GET in PHP

This site’s RSS feeds now support Conditional GET. Since the feeds are dynamically generated on every request, adding support took a bit of hacking around with PHP. Here’s the function I came up with (based on the excellent description provided by Charles Miller in the article linked above):

function doConditionalGet($timestamp) {
    // A PHP implementation of conditional get, see 
    //   http://fishbowl.pastiche.org/archives/001132.html
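    // Note: date('r') uses the server's local time zone, so relabelling it
    // as GMT is only accurate when that zone is GMT (see comments 14 and 16)
    $last_modified = substr(date('r', $timestamp), 0, -5).'GMT';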
    $etag = '"'.md5($last_modified).'"';
    // Send the headers
    header("Last-Modified: $last_modified");
    header("ETag: $etag");
    // See if the client has provided the required headers
    $if_modified_since = isset($_SERVER['HTTP_IF_MODIFIED_SINCE']) ?
        stripslashes($_SERVER['HTTP_IF_MODIFIED_SINCE']) :
        false;
    $if_none_match = isset($_SERVER['HTTP_IF_NONE_MATCH']) ?
        stripslashes($_SERVER['HTTP_IF_NONE_MATCH']) : 
        false;
    if (!$if_modified_since && !$if_none_match) {
        return;
    }
    // At least one of the headers is there - check them
    if ($if_none_match && $if_none_match != $etag) {
        return; // etag is there but doesn't match
    }
    if ($if_modified_since && $if_modified_since != $last_modified) {
        return; // if-modified-since is there but doesn't match
    }
    // Nothing has changed since their last request - serve a 304 and exit
    header('HTTP/1.0 304 Not Modified');
    exit;
}

Usage is simple: work out the timestamp at which the page content was last modified and call doConditionalGet($timestamp) before sending any output. It will send the 304 header for you and exit if the client claims to have seen the content already; otherwise control returns to your main script and you can serve content as normal. Slightly inelegant, but it does the job.
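As a rough sketch of the "work out the timestamp" step: for a feed, one reasonable choice is the newest modification time among the entries being served. The helper name latestModified() and the shape of the $entries array are assumptions made up for this illustration, not part of the code above, and the sketch assumes PHP 7 or later.

```php
<?php
// Hypothetical helper: returns the newest 'modified' Unix timestamp among
// the entries, or 0 when the list is empty.
function latestModified(array $entries): int
{
    $latest = 0;
    foreach ($entries as $entry) {
        $latest = max($latest, (int) $entry['modified']);
    }
    return $latest;
}

// Example data standing in for rows loaded from a database (made up).
$entries = [
    ['title' => 'First post', 'modified' => 1051000000],
    ['title' => 'Second post', 'modified' => 1051117200],
];

$timestamp = latestModified($entries);

// In the real feed script the next step would be:
//   doConditionalGet($timestamp);
// followed by the code that builds and prints the RSS.
```

Any edit to an entry then changes the timestamp, which in turn changes both the Last-Modified and ETag values the function sends.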

Unfortunately I don’t have a Conditional-GET supporting RSS aggregator to hand so I have no idea if it works or not (so far I’ve only tested it by watching the headers sent with LiveHTTPHeaders). I’d be grateful if someone could confirm that this has had the desired effect.
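One way to check the logic without an aggregator is to pull the comparison out of the header() and exit calls so it can be run from the command line. The following is a minimal sketch of that idea, assuming PHP 7 or later; the names httpDate() and shouldSend304() are hypothetical. It formats the date with gmdate(), as suggested in the comments below, rather than date('r'), so the result does not depend on the server's time zone.

```php
<?php
// Format a Unix timestamp as an HTTP date in GMT.
function httpDate(int $timestamp): string
{
    return gmdate('D, d M Y H:i:s', $timestamp) . ' GMT';
}

// Mirrors the checks in doConditionalGet(): returns true when a 304 would
// be sent. $ifModifiedSince and $ifNoneMatch are the raw request header
// values, or null when the client did not send them.
function shouldSend304(int $timestamp, ?string $ifModifiedSince, ?string $ifNoneMatch): bool
{
    $lastModified = httpDate($timestamp);
    $etag = '"' . md5($lastModified) . '"';

    if (!$ifModifiedSince && !$ifNoneMatch) {
        return false; // neither validator was sent
    }
    if ($ifNoneMatch && $ifNoneMatch !== $etag) {
        return false; // ETag is there but doesn't match
    }
    if ($ifModifiedSince && $ifModifiedSince !== $lastModified) {
        return false; // If-Modified-Since is there but doesn't match
    }
    return true; // every validator the client sent matches
}
```

Calling shouldSend304() with the values a client would echo back from a previous response should return true, and changing the timestamp by even one second should return false.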

Update: I’ve changed the above code sample (and my implementation) to send the ETag header as ETag rather than etag.
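HTTP header names are case-insensitive, so either spelling is valid on the wire, but some clients compare names exactly. On the client side, a lookup that ignores case avoids the problem. This is only a sketch, assuming PHP 7.1 or later and an array of headers keyed by name as received; findHeader() is a hypothetical name.

```php
<?php
// Hypothetical client-side helper: find a header value by name, ignoring
// case. $headers maps header names (as received) to string values.
function findHeader(array $headers, string $name): ?string
{
    foreach ($headers as $key => $value) {
        if (strcasecmp((string) $key, $name) === 0) {
            return $value;
        }
    }
    return null; // header not present
}
```

With this, findHeader($headers, 'ETag') finds the value whether the server sent it as "etag", "ETag" or "ETAG".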

This is Supporting Conditional GET in PHP by Simon Willison, posted on 23rd April 2003.

17 comments

  1. I submitted code to do this to the HTTP PEAR module maintainer last year, but my email was ignored. I think the best place to do it is in Apache anyway - see what the client and mod_php provide, and if they match, terminate the process and send the 304. That solves the problem on a wide scale, across multiple languages.

    Jim - 23rd April 2003 17:33 - #

  2. Aggie says "Channel has not changed since last read", NewzCrawler says "Not modified", so I'd say it's working right.

    Phil Ringnalda - 23rd April 2003 17:44 - #

  3. Just curious, why didn't you do funky caching? http://simon.incutio.com/archive/2002/11/16/funkyCachingExplained

    Sam Ruby - 23rd April 2003 19:07 - #

  4. Laziness. For the moment I'm sticking with generating everything dynamically simply because that's what I'm most comfortable with. Funky caching is one of the many things on my "must play with that some day" list - I'll almost certainly use it for a future project and I may even move my blog to it some day but at the moment there isn't really any need to.

    Simon Willison - 23rd April 2003 19:26 - #

  5. It doesn't work with my homegrown aggregator. There are two problems.

    The first is that your etag header is being sent as "etag" and not "ETag". My HTTP Client (from PEAR) implementation is case sensitive, so your etag header is not seen.

    However, your "Last-Modified" header is being seen. Unfortunately, my implementation is partially broken and is sending a blank If-None-Match header to you. I should actually send nothing at all. However, since I am sending a blank header, your implementation is seeing that, and therefore, your conditional ($if_none_match && $if_none_match != $etag) is applying to my case, and so you are sending me the document again.

    If either one of us would fix our ends, it would work for me. However, since we're both slightly, and only slightly, broken... it doesn't.

    I'll fix my end.

    Reverend Jim - 23rd April 2003 22:06 - #

  6. I am now receiving a status code 304 from you.

    Reverend Jim - 23rd April 2003 22:10 - #

  7. I've changed the header I'm sending as well - thanks for the feedback.

    Simon Willison - 23rd April 2003 23:05 - #

  8. shameless plug: try cgi_buffer; it automagically handles ETag validation as well as gives you persistent connections. Works with PHP, Perl and Python.

    Mark Nottingham - 24th April 2003 02:31 - #

  9. Thanks Mark, looks good.

    Jim - 24th April 2003 03:19 - #

  10. Magpie is an aggregator and RSS parser in PHP that uses Snoopy to support doing a conditional GET.

    kellan - 24th April 2003 04:22 - #

  11. How about using Apache's own mod_rewrite to handle this? As in, rewrite the request if the file doesn't exist. Dump the results to the file so it's served statically. Just delete the file if/when the content driving it might have changed. This way the pages are built and saved only on the first request.

    Bill Kearney - 26th April 2003 15:42 - #

  12. Bill - that's exactly what I do.

    Sam Ruby - 26th April 2003 16:30 - #

  13. Just wondering if there was a time when PHP started handling the HTTP request headers for If-Modified-Since and If-None-Match. I took your script (changing to $HTTP_SERVER_VARS for my older PHP) and added it to my news script, which runs when an individual story is hit (for example http://www.digitalhit.com/news/main/2003/7/30/toronto-rocks). I checked it on the cacheability engine, and though it tossed out the Last-Modified and ETag headers, the cacheability engine said it couldn't validate the information... which leads me to believe it's not sending out the 304s when hit again. Any way I can test this on my own?

    Ian Evans - 20th August 2003 09:00 - #

  14. Hey, old posting but I discovered a 'little' bug...

    You are using $last_modified = substr(date('r', $timestamp), 0, -5).'GMT'; to get $last_modified. date('r') sometimes returns only one digit for the day, and this seems to be a problem: the header will be "Jan, 01 1970 ..."
    Instead it would be better to use the following: $last_modified = gmdate("D, d M Y H:i:s", $timestamp).' GMT';

    Jan Piotrowski - 9th April 2004 14:48 - #

  15. WRT Reverend Jim's comment, treating HTTP headers as case-sensitive isn't just slightly broken: what if every header was in a different case from what your client expects?

    However, the blank If-None-Match header raises a good point about conditional GET code. If any error occurs in the conditional GET code, or if the If-Modified-Since is invalid in some way, then it's important that you treat it as if there was no If-Modified-Since (or If-None-Match) header and do the normal 200 OK response. Doing so is always safe in this context and can be used as a shortcut (so you don't have to accept all 3 of the datetime formats RFC 2616 says you should accept).

    If you are doing something like this and you can't be sure that the page that would be generated at that time is octet-for-octet identical to the one that was generated earlier, then you should probably use weak ETags.

    Jon Hanna - 7th May 2004 13:54 - #

  16. There is the gmdate() function in PHP to return a GMT date. In your case:
    $last_modified = gmdate('D, d M Y H:i:s \G\M\T', $timestamp);

    Alex - 29th June 2004 01:14 - #

  17. Surely this (bandwidth issues) wouldn't be such a problem if syndication was used as it was intended?

    tim - 21st July 2004 19:47 - #

Comments are closed.
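Two suggestions from the comments above (treat an unusable If-Modified-Since as if it were absent, and use weak ETags when the output may not be byte-identical) could be sketched roughly as follows. This assumes PHP 7.1 or later; parseIfModifiedSince(), isFresh() and weakEtag() are hypothetical names. Unlike the function in the post, which compares the date strings exactly, isFresh() compares parsed timestamps.

```php
<?php
// Parse an If-Modified-Since value. Returns null when the header is missing,
// blank or unparseable, so the caller falls back to a normal 200 response.
function parseIfModifiedSince(?string $value): ?int
{
    if ($value === null || trim($value) === '') {
        return null;
    }
    $parsed = strtotime($value);
    return $parsed === false ? null : $parsed;
}

// True only when the client sent a usable date that is at least as recent
// as the content's last modification time.
function isFresh(int $lastModified, ?string $ifModifiedSince): bool
{
    $since = parseIfModifiedSince($ifModifiedSince);
    return $since !== null && $lastModified <= $since;
}

// Build a weak ETag (note the W/ prefix) from the modification time.
function weakEtag(int $lastModified): string
{
    return 'W/"' . md5((string) $lastModified) . '"';
}
```

Falling back to a full response whenever the header cannot be understood costs some bandwidth but never serves a stale 304.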

Previously hosted at http://simon.incutio.com/archive/2003/04/23/conditionalGet
