
Simon Willison’s Weblog

Neat tip for clean URLs

Here’s one of the neatest tips for clean URLs I’ve seen yet, from Thijs van der Vossen. He’s come up with a mod_rewrite rule that checks to see if the requested file exists if you add .html on to the end of it, and serves it up if that’s the case. I’m posting the full code snippet here because it’s just too good to risk losing to link-rot in the distant future:

RewriteEngine on 
RewriteBase /
RewriteCond %{REQUEST_FILENAME}.html -f
RewriteRule (.*) $1\.html [L] 

This is Neat tip for clean URLs by Simon Willison, posted on 6th August 2003.


18 comments

  1. It's better to map it the other way and add an [R] on the end. That way the canonical URL is the neater one, and requests for the .html version are redirected - which means only one HTTP object, and thus far better cachability. Better yet, serve a 301 Moved Permanently so that software can pick up the change to point to the new URL.

    For what it's worth, I prefer to use a .content extension for things that are presented to the end-user, and an .action extension for things that just process data (along with .style, .script and so on).

    Jim Dabell - 6th August 2003 20:37 - #

  2. Simon, one of the reasons for using clean URLs is my intention to keep all items online forever, regardless of the server technology used at any given time. Link-rot is therefore extremely unlikely. ;-)

    Thijs van der Vossen - 6th August 2003 22:22 - #

  3. This is unnecessary; just add Options +MultiViews and let Apache's fast content negotiation engine (as opposed to mod_rewrite) handle this.

    Matt - 6th August 2003 22:41 - #
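
    As a rough sketch of the MultiViews approach suggested above: the fragment below assumes an .htaccess file in the site's document root, that mod_negotiation is loaded, and that the main server config permits overriding Options. None of these details come from the original comment.

    ```apache
    # .htaccess (assumed location: the document root)
    # With MultiViews on, a request for /foo that matches no file makes
    # Apache look for foo.* (for example foo.html) and serve the best match.
    Options +MultiViews
    ```

    MultiViews has to be enabled by name; it is not switched on by Options All.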

  4. Jim, either I cannot follow your comment, or you do not fully understand the rule. Why is it better to 'map it the other way'? The intention of the rule is to serve the file .../2003/07/clean_url.html from the server filesystem when a request for the url http://www.vandervossen.net/2003/07/clean_url is made. Adding a [R] to the last rule would redirect clients requesting http://www.vandervossen.net/2003/07/clean_url to http://www.vandervossen.net/2003/07/clean_url.html, but that's exactly what I don't want to do.

    Thijs van der Vossen - 6th August 2003 22:58 - #

  5. Matt is right. This special case is handled very well by +MultiViews. I have been thinking about the URL issue for months and have finally come up with the following scheme:

    RewriteEngine On

    # / or index.html is requested so call delegator.php with a special parameter
    RewriteRule ^$ delegator.php?url=index/&host=%{HTTP_HOST}&from=%{REQUEST_URI}index/ [QSA,L]
    RewriteRule ^index\.html$ delegator.php?url=index/&host=%{HTTP_HOST}&from=%{REQUEST_URI} [QSA,L]

    # then if we have a request which is neither file nor dir send the request to delegator.php
    RewriteCond %{REQUEST_FILENAME} !-f
    RewriteCond %{REQUEST_FILENAME} !-d
    RewriteRule (.*) delegator.php?url=$1&host=%{HTTP_HOST}&from=%{REQUEST_URI} [QSA,L]


    This catches any request that is neither a file nor a directory, provided you put the rules in the .htaccess of the directory in question. Please see the discussion on sitepointforums for more details: http://www.sitepointforums.com/showthread.php?threadid=117103

    PS: Simon, the textarea for posting is way too small and no PRE tag is allowed :(

    Sasa Velickovic - 6th August 2003 23:05 - #

  6. There are a few issues here. Firstly, there's the thought that a .html extension is not portable when you want to switch to PHP or something. This is not true. File extensions mean nothing in terms of HTTP. You can configure a server to run .html files through PHP easily.

    So now it comes down to aesthetics. A lot of people prefer URLs without extensions as they are "cleaner". That's understandable, I moved to the scheme I described above for similar reasons.

    This does not mean you should just serve up the object whenever you think you can find a corresponding file though. When you serve a file through two different addresses, you are actually serving two completely separate objects in terms of HTTP. /foo and /foo.html may be the same file on your server, but they are completely different documents to a web browser, a proxy, or any HTTP client.

    The most immediate effect of this is to interfere with caching for no good reason. This bogs down your server with requests and file transfers that are completely unnecessary, and can be very annoying for people with intermittent access to the web, for instance, people who use WWWoffle.

    What the documents you refer to advocate, and which I do as well, is designing a good, stable scheme for URLs, and implementing a sensible way of moving to that scheme.

    So if you have already published under the .html extensions, you need to construct 301 notices to inform people of their new location, rather than having two separate objects. I don't know if you can do 301s through mod_rewrite or not, but [R] is a decent substitute, certainly a lot better than a duplicate object.

    When I referred to mapping it the other way, perhaps I wasn't being clear. What I meant was that in the face of /foo.html, you should provide a 301 notice pointing to /foo. You can then use whatever server mechanism you choose (such as mod_rewrite) to serve it.

    Multiviews is a different animal, and isn't really designed for this. You can run into trouble if you have different types of files in the same directory. It also reduces the effectiveness of caching, by serving individual objects to each unique Accept header that the server sees.

    Jim Dabell - 7th August 2003 00:28 - #
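
    On the question of 301s: mod_rewrite's R flag accepts a status code, so [R=301] sends a permanent redirect. The sketch below is one way to combine that with the rule from the post. It is not taken from any commenter; it assumes Apache 2.x (for the \s regex shorthand) and an .htaccess file in the document root.

    ```apache
    RewriteEngine on
    RewriteBase /

    # Redirect direct client requests for /foo.html to /foo with a 301.
    # THE_REQUEST holds the original request line (e.g. "GET /foo.html HTTP/1.1"),
    # so requests rewritten internally by the rule below are not redirected again.
    RewriteCond %{THE_REQUEST} ^[A-Z]+\s/[^?\s]*\.html[?\s]
    RewriteRule ^(.+)\.html$ /$1 [R=301,L]

    # Internally serve foo.html when /foo is requested (the rule from the post).
    RewriteCond %{REQUEST_FILENAME}.html -f
    RewriteRule (.*) $1.html [L]
    ```

    Testing THE_REQUEST rather than the current URL is what keeps the two rules from bouncing a request back and forth.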

  7. Jim, I agree it is not so nice to have the same file available at two different addresses.

    I did not write a rule to redirect foo.html to foo because I did not yet have any inbound links -- or any visitors -- at the time I implemented the rewrite rule.

    The best thing to do IMHO is not to redirect, but to respond with a 404-Not Found when someone requests something with a .html extension.

    Thijs van der Vossen - 7th August 2003 08:37 - #
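
    A possible way to return the 404 described above, offered as an untested sketch. It assumes Apache 2.x, where the R flag accepts non-redirect status codes, and an .htaccess file in the document root; the rule would need to sit before the rule that appends .html. It has not been checked against a custom ErrorDocument whose own name ends in .html.

    ```apache
    RewriteEngine on

    # Answer direct client requests for anything ending in .html with 404.
    # THE_REQUEST is the original request line, so the internal rewrite
    # from /foo to foo.html is not affected.
    RewriteCond %{THE_REQUEST} ^[A-Z]+\s/[^?\s]*\.html[?\s]
    RewriteRule \.html$ - [R=404,L]
    ```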

  8. "The best thing to do IMHO is not to redirect, but to respond with a 404-Not Found when someone requests something with a .html extension."

    Absolutely, if you have not published anything under the .html URL. Of course, it's different if people have bookmarked the page, or linked to it or anything. No sense in breaking things when you can clearly signal the move.

    Jim Dabell - 7th August 2003 12:50 - #

  9. Sasa, I do the same type of thing you do, but most of what you're doing is completely unnecessary. This is what I use: RewriteCond %{REQUEST_FILENAME} !-f RewriteCond %{REQUEST_FILENAME} !-d RewriteRule . dispatcher.php The query string and all environment variables are *already* available from your PHP script, and there's no reason to do special cases for the index page, etc.

    Keith - 7th August 2003 14:41 - #

  10. Grrr.... sorry for not previewing.

    RewriteCond %{REQUEST_FILENAME} !-f
    RewriteCond %{REQUEST_FILENAME} !-d
    RewriteRule . dispatcher.php

    Keith - 7th August 2003 14:45 - #

  11. Thijs, you might want to review your mod_rewrite because just adding a trailing slash to a url confuses it in a way that doesn't seem to be pleasant for your webserver.

    Thorn - 7th August 2003 14:58 - #

  12. Yes, I know. I have absolutely no idea why this is happening. Any suggestions would be greatly appreciated. ;-)

    Thijs van der Vossen - 7th August 2003 15:18 - #

  13. Well, I'm not much of an Apache guy but it looks as if it gets caught in a loop of redirects or something. You know, one page redirecting to another which in its turn redirects back to the first page.
    I don't know if it's possible with Apache to have the mod_rewrite also apply to the standard HTTP error pages. If so, you might want to take a look at these error pages as the culprit for the redirecting behaviour.
    I cannot verify if it eventually times out (our own proxy server times out before that) but when it doesn't, I might be on to something ;-)

    Thorn - 7th August 2003 16:04 - #

  14. It does not appear to be a redirection problem. I checked this with Mozilla's Live HTTP Headers and by doing a 'raw' HTTP request using telnet. In fact, the server does not respond with anything at all...

    Thijs van der Vossen - 7th August 2003 17:01 - #

  15. It is an internal redirection problem. This is what I got in the rewrite logs:

    217.19.22.138 - - [07/Aug/2003:18:33:09 +0200] [www.vandervossen.net/sid#81256f4][rid#830119c/initial/redir#69] (3) [per-dir /var/www/vandervossen/htdocs/] add path-info postfix: /var/www/vandervossen/htdocs/2003/08/meesterlijk -> /var/www/vandervossen/htdocs/2003/08/meesterlijk/.html.html.html.html.html.html

    I don't know why the rewrite module thinks %{REQUEST_FILENAME}.html is a file, but adding RewriteCond %{REQUEST_URI} !/$ appears to fix this problem.

    Thijs van der Vossen - 7th August 2003 17:45 - #
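
    Put together, the rule from the post with the extra condition reported above would look roughly like this. The comments are added for illustration; placing the new condition first is an assumption, since the comment does not say where it went.

    ```apache
    RewriteEngine on
    RewriteBase /
    # Skip any URL that ends in a slash (the fix described above).
    RewriteCond %{REQUEST_URI} !/$
    # Rewrite only when <requested file>.html exists on disk.
    RewriteCond %{REQUEST_FILENAME}.html -f
    RewriteRule (.*) $1\.html [L]
    ```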

  16. I've implemented this solution on my site, but have two problems:

    1. My server also freaks out when a trailing slash is added to a clean URL. I'm concerned that some users will think URLs without a file extension lead to a directory; because of this, they may become confused if adding a trailing slash doesn't work.
    2. I'd rather not have two URLs point to the same resource (clean and with extension). If anyone can point me to a resource showing me how to serve files saved without an .html extension as HTML, I'd appreciate it. I'm a bit new to this.

    Wayne Burkett - 11th August 2003 07:40 - #
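
    For the second question, one option is to have Apache assign the text/html type to files that have no extension. This is a sketch only: it assumes Apache 2.x, an .htaccess file in the relevant directory, and AllowOverride FileInfo. The pattern is an illustrative choice that matches any file name containing no dot.

    ```apache
    # Serve files whose names contain no dot (e.g. "clean_url") as HTML.
    <FilesMatch "^[^.]+$">
        ForceType text/html
    </FilesMatch>
    ```

    With this in place the files can be saved without the .html extension, so only one URL exists for each document.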

  17. I've been playing around with this technique, and I've run into a wall. Apparently, when you type the following address:

    www.example.com/usability/sample_document

    ... when "usability.shtml" exists, there is no "usability" directory, and "sample_document" would otherwise return a 404 error, the %{REQUEST_FILENAME} variable contains the full path to "usability". As a result, the %{REQUEST_FILENAME}.shtml -f condition tests true instead of false. This sends mod_rewrite into an infinite loop. It never reaches a 404 error!

    Any suggestions anyone? Has anyone else run into this problem? Thanks for any help! I wasn't sure where else to post this specific problem. :)

    Ben Clark - 6th April 2004 21:09 - #
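
    One commonly used workaround for this kind of path-info match is to test the path captured by the RewriteRule pattern instead of %{REQUEST_FILENAME}. The sketch below assumes an .htaccess file in the document root and that %{DOCUMENT_ROOT} matches the real filesystem path, which may not hold on some shared hosts.

    ```apache
    RewriteEngine on
    RewriteBase /
    # $1 is the full path captured by the RewriteRule pattern below, so
    # /usability/sample_document is rewritten only if
    # usability/sample_document.shtml really exists.
    RewriteCond %{DOCUMENT_ROOT}/$1.shtml -f
    RewriteRule ^(.+)$ $1.shtml [L]
    ```

    Under these assumptions a request for /usability/sample_document fails the file test and falls through to the normal 404 handling.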


Comments are closed.

Previously hosted at http://simon.incutio.com/archive/2003/08/06/cleanURLtip
