Simon Willison’s Weblog

The good and the ugly

PHP.net has a new feature on their search page: a really nice implementation of an auto complete text widget in Javascript. Even better, the search page is valid XHTML 1.0 Strict and uses CSS for the layout. Let’s hope this is an indication of things to come for the rest of the site, which still mostly consists of tag soup.

Here’s the ugly bit: the javascript for the auto complete function is deliberately obfuscated. Now I know that this decision is completely up to the author of the script, but personally I find it exasperating. PHP is an open source project, and obfuscation in this way is the antithesis of the open source ideal. A big part of open source is that people shouldn’t have to invent something twice. Why duplicate effort when sharing code costs nothing and benefits everyone? I’m sure the author had their reasons for hiding the code in this way, but to me it seems like a wasted opportunity to teach site visitors a useful new trick. A bug concerning the obfuscation has already been raised in PHP’s bug tracker but was closed without a full explanation.

Obfuscation of client side code such as Javascript is a pretty futile exercise in any case. Most of the effect of the obfuscation can be easily reversed using a tool such as Jesse Ruderman’s view variables bookmarklet, which displays all variables on a page (including ones that contain decoded content from obfuscated variables) and pretty-prints functions to make them more readable.
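
To illustrate why this kind of obfuscation is so easy to undo, here is a minimal sketch of the idea behind a variable viewer. It is not Jesse Ruderman’s bookmarklet; `describeVariables` is a hypothetical name, and the sketch only assumes that decoded values end up in ordinary variables that script on the page can enumerate.

```javascript
// Summarise every enumerable property of a scope object.
// In a browser the scope would typically be `window`.
function describeVariables(scope) {
  var lines = [];
  for (var name in scope) {
    try {
      var value = scope[name];
      if (typeof value === "function") {
        // toString() on a script-defined function returns its source text.
        lines.push(name + " = " + value.toString());
      } else if (typeof value === "string") {
        // Quote strings so decoded content is easy to spot.
        lines.push(name + " = " + JSON.stringify(value));
      } else {
        lines.push(name + " = " + String(value));
      }
    } catch (err) {
      // Some host properties throw when read; skip them.
    }
  }
  return lines;
}

// A stand-in for a page whose script decodes a hidden string at load time.
var fakePage = {
  decoded: decodeURIComponent("%68%69"),
  count: 3,
  helper: function () { return 1; }
};
var report = describeVariables(fakePage);
```

Whatever encoding the script uses, the decoded value has to exist in memory before the browser can use it, which is where a tool like this reads it from.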

It’s impossible to prevent “theft” of your Javascript, but if you really want to stop people from using it the best you can do is to place a copyright notice in the code and ask people to contact you for licensing options. If it’s on the web, people can take it. Clear copyright messages are a far more ethical deterrent than ineffective tricks.

Update: It turns out the obfuscation was the result of compressing the Javascript for efficiency reasons—see my apology for further information.

This is The good and the ugly by Simon Willison, posted on 13th November 2003.

19 comments

  1. Completely agree with you on obfuscation: useless, and not the way open source evangelism works. Check out a script by Dave at stilleye: this IS an excellent autocomplete script

    Sergi - 14th November 2003 00:12 - #

  2. Something's wrong here, I'm not seeing any of this on the "Glazman loser list". This can't be!

    neverFails - 14th November 2003 00:23 - #

  3. PHP's search page should use autocomplete=off to discourage browsers from using built-in autocomplete features for that field. It doesn't look good to have two autocomplete dropdowns on the same textbox :)

    Jesse Ruderman - 14th November 2003 00:25 - #

  4. Jesse: But then the site wouldn't be valid XHTML 1.0 Strict anymore. :)

    Matt Brubeck - 14th November 2003 01:15 - #
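
One way to reconcile comments 3 and 4 is to set the attribute from script, so that it never appears in the markup the validator checks. This is a minimal sketch; `disableNativeAutocomplete` and the element id in the usage comment are hypothetical, not taken from PHP.net’s code.

```javascript
// Turn off the browser's built-in completion for one field without
// putting the non-standard attribute into the XHTML source.
function disableNativeAutocomplete(field) {
  field.setAttribute("autocomplete", "off");
  return field;
}

// In a page (hypothetical id):
//   disableNativeAutocomplete(document.getElementById("searchField"));
```

The served document stays valid XHTML 1.0 Strict because the attribute only exists in the live DOM, not in the source.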

  5. Might just be a way to compress it. I always do that with my js.

    stylo~ - 14th November 2003 03:42 - #

  6. Performance reasons? What performance? It can't be quicker to run on the client if it's obfuscated, and they obviously aren't worried about transmission time, since they haven't bothered to tune the HTTP headers sent along with that resource. So what are they trying to achieve with this obfuscation?

    Jim Dabell - 14th November 2003 12:32 - #

  7. Dave's autocomplete script isn't cross-browser. :(

    Where can I learn more about autocomplete=off ? I've never heard of that.

    Sam - 14th November 2003 14:16 - #

  8. Answering my own question: http://devedge.netscape.com/viewsource/2003/form-autocompletion/

    Sam - 14th November 2003 14:22 - #

  9. There are two new comments on that bug. Both claim it's for performance reasons. The good news I guess is this: [14 Nov 3:14am EST] didou@php.net I'll add some infos about the compression method on the search.php page.

    Stinn - 14th November 2003 15:46 - #

  10. I was the evil one who came up with the idea for this feature, but I suggested that it be implemented with Macromedia Flash. As one of the webmasters responsible for content on PHP.net, I know that adding more and more bytes to the size of pages does not help our servers to better serve the visitors. I thought that this could be implemented in Flash in a small size. I faced great opposition, and so some people started to implement prototypes in JavaScript. The main problem was that the size of the function list and the code to generate the dropdown was way too big, so we needed some method to compress the file. People came up with compression ideas, and finally this implementation produced the smallest file size possible. I doubt people would have been happy to wait long to get the function list downloaded...

    Simon, you correctly pointed out that PHP.net is the home of an open source project. You can find all the discussion regarding the implementation of this feature on our news server in the php.mirrors group. Search for "quickref.swf". The thread was titled after my first proposal. You will also be able to find the sources used to generate that file, including the original JS file without the obfuscation. We are open source you know :)

    The original function list uncompressed is 46Kb in size. The original JavaScript code (without the compressed function list string) includes comments and sensible wrapping and is 6Kb in size. Add this up, and you get a 52Kb file. Have you checked the size of the script on PHP.net? It is less than 23Kb. You can guess how much load this size decrease takes off the PHP.net mirror sites, given that we have a huge user base and many concurrent requests. You can also imagine how much better the user experience is. We have not developed that code to show off to people, or to present a JavaScript tutorial. We have added this feature to make the users of our site more comfortable, and to let them get to the information they want quickly.

    Simon, you also complain about the closed bug report (which is currently not closed anyway), and that we have not provided enough information. As far as I can see, our users benefit more if we spend our time brainstorming on these kinds of things and fixing site problems, rather than replying to JavaScript related questions. We are at PHP.net after all, and we are doing this in our free time... I have also posted links to the threads where we discussed this feature to another bug report.

    Gabor Hojtsy - 14th November 2003 17:03 - #

  11. That makes perfect sense. Apology forthcoming.

    Simon Willison - 14th November 2003 17:10 - #

  12. The main problem was that the size of the function list and the code to generate the dropdown was way too big, so we needed some method to compress the file. People came up with compression ideas, and finally this implementation produced the smallest file size possible.

    Then you need to look at the HTTP headers you are sending out with that resource. Why worry about the difference between the two file sizes when you can eliminate downloading the resource entirely in a lot of cases?

    Jim Dabell - 14th November 2003 18:10 - #

  13. Jim, on which of our 109 mirror sites did you check the HTTP headers? These are all operated by different maintainers, and thus configured somewhat differently.

    Gabor Hojtsy - 14th November 2003 18:29 - #

  14. I checked the main php.net website. It never even occurred to me that you'd be sending different headers for what is supposed to be the same resource. It doesn't seem very robust to me, there are plenty of things that can go wrong with bad headers, I would have thought they'd be controlled just as much as the HTML/CSS/Javascript.

    Jim Dabell - 14th November 2003 18:56 - #

  15. The headers generated for JS files are handled by Apache completely. I have checked hu.php.net/functions.js as an example, and it properly sets the Last-modified and Etag headers, so the client can cache the file.

    You say we send "different headers for what is supposed to be the same resource". Actually we have at least 109 copies of the functions.js file around the world (plus all the unofficial mirrors we don't count). These are not the same. Browsers will not load a cached functions.js file downloaded from us.php.net if you visit it.php.net. How the mirrors' Apache servers are configured to handle JS files depends on their static file handling.

    If you visit php.net/search.php, you will certainly be forced to a mirror site, so I doubt you actually checked the main site. If you had checked the main site, you would have already figured out that we have a lightweight server for serving static content, and that it is used to emit the functions.js file. That also has a last-modified header, which browsers can use to cache the content.

    Please be more specific about what your problem with the headers is.

    Gabor Hojtsy - 14th November 2003 19:35 - #

  16. The headers generated for JS files are handled by Apache completely. I have checked hu.php.net/functions.js as an example, and it properly sets the Last-modified and Etag headers, so the client can cache the file.

    There's more to caching than that. Caching Tutorial for Web Authors and Webmasters is a good tutorial. Perhaps you are thinking of cache validation? Specifically, you want to set the Expires header. Without this, any cache is simply guessing at whether the object is fresh or not - which can not only lead to sub-par caching, but also over-zealous caching. For something like your Javascript file, I'd set a relatively long expiry time, such as a month or so.

    Also, remember that caching isn't just the domain of browsers. Even if your favourite browser does what you want it to, there is no guarantee that caching proxies will too, and those can often be more important than browser caches.

    You say we send "different headers for what is supposed to be the same resource". Actually we have at least 109 copies of the functions.js file around the world (plus all the unofficial mirrors we don't count). These are not the same.

    Apologies, that was lax terminology on my part. I meant to say that you send different headers for what are supposed to be identical resources.

    If you visit php.net/search.php, you will certainly be forced to a mirror site, so I doubt you actually checked the main site.

    Simon linked directly to http://www.php.net/functions.js and that was what I looked at. As of this moment, it has no Expires header.

    That also has a last-modified header, which browsers can use to cache the content.

    Small nitpick: browsers can cache it without a Last-Modified header. By default, unauthenticated GET responses are cachable. How long a cache can keep it for is entirely up to the cache; it could keep it for a week or not at all. While caches often use Last-Modified headers to estimate an object's freshness, this is by no means required behaviour, and the usual use of the Last-Modified header is simply to validate the copy held by a cache.

    Jim Dabell - 14th November 2003 21:09 - #
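
The distinction drawn in comment 16 between an explicit expiry time and a cache's own guess can be sketched as follows. Both function names are hypothetical. The 30 days stands in for the "month or so" suggested above, and the 10% fraction is an assumption, based on the typical value that RFC 2616 mentions for heuristic expiration; real caches are free to choose differently.

```javascript
var DAY_MS = 24 * 60 * 60 * 1000;

// Build an explicit Expires header `days` after `now` (a Date).
// toUTCString() produces an HTTP-style date ending in "GMT".
function expiresHeader(now, days) {
  return "Expires: " + new Date(now.getTime() + days * DAY_MS).toUTCString();
}

// What a cache might guess when there is no Expires header:
// treat the object as fresh for 10% of the time since it last changed.
function guessedFreshnessSeconds(now, lastModified) {
  var ageMs = now.getTime() - lastModified.getTime();
  return Math.max(0, Math.floor(ageMs / 10000)); // 10% of the age, in seconds
}

// Example: the time of the comment above, 14th November 2003 21:09 UTC.
var now = new Date(Date.UTC(2003, 10, 14, 21, 9, 0));
var header = expiresHeader(now, 30);
var guess = guessedFreshnessSeconds(now, new Date(Date.UTC(2003, 10, 4, 21, 9, 0)));
```

With the explicit header, every cache agrees on when to come back. With the guess, a file changed ten days ago would be considered fresh for roughly a day by this particular heuristic, and for some other period by a cache using a different one.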

  17. People came up with compression ideas, and finally this implementation produced the smallest file size possible.

    You could probably knock off another 1500 characters or so (that's 25% of the size of the functions) by writing things a bit smarter, without obfuscating things (further)...

    Take, for example, fh_EKeyUp:

        function fh_EKeyUp(evt) {
          if (f_s.value!="quickref") return true;
          evt=(evt)?evt:((event)?event:null);
          if (!evt) return true;
          var charCode=evt.charCode?evt.charCode:((evt.keyCode)? evt.keyCode:evt.which);
          if (charCode==38 || charCode==40 || charCode==57385 || charCode==57386) return false;
          if (f_p.value!=fh_currenttext) {
            fh_currenttext=f_p.value;
            fh_NewText();
          }
          return true;
        }

    Compare this with:

        function fh_EKeyUp(e) {
          e = e || event || null;
          if (f_s.value != "quickref" && e) {
            var c = e.charCode || e.keyCode || e.which;
            if (/[&(\u57385\u57386]/.test(c)) return false;
            if (f_p.value != fh_currenttext) {
              fh_currenttext = f_p.value;
              fh_NewText();
            }
          }
          return true;
        }

    (I've replaced every tab with two spaces for display purposes. The preview doesn't show <code> as preformatted, so you might have to view the source instead...)

    The difference:

        $ wc -c js-*
        290 js-after
        391 js-before
        681 total

    That's 25% fewer characters right there.

    Arien - 15th November 2003 09:22 - #
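
The shortened handler in comment 17 appears to behave differently from the original in two places: it runs the body when the selector is not "quickref" (the original returns early in that case), and it tests a numeric key code against a character-class regular expression, which would not match the codes 38 or 40. Below is a minimal sketch of a shortened version that keeps the original behaviour. The globals declared at the top are hypothetical stand-ins for the ones PHP.net's script defines, added only so the sketch runs on its own.

```javascript
// Hypothetical stand-ins for the page globals used by the handler.
var f_s = { value: "quickref" }; // search-type selector
var f_p = { value: "" };         // search text field
var fh_currenttext = "";
var newTextCalls = 0;
function fh_NewText() { newTextCalls++; }

function fh_EKeyUp(e) {
  // typeof avoids a ReferenceError where no global `event` exists.
  e = e || (typeof event != "undefined" ? event : null);
  // Same early exits as the original: wrong search type, or no event.
  if (f_s.value != "quickref" || !e) return true;
  var c = e.charCode || e.keyCode || e.which;
  // Swallow the same four key codes as the original.
  if (c == 38 || c == 40 || c == 57385 || c == 57386) return false;
  if (f_p.value != fh_currenttext) {
    fh_currenttext = f_p.value;
    fh_NewText();
  }
  return true;
}
```

This keeps the `||` shortcuts from the suggestion above and gives up only the regular expression, so most of the size saving should survive.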

  18. Arien: Thanks for the tips. We will consider these kinds of improvements for the next version of our script.

    Jim Dabell: Thanks for the tips on caching. I need to underline that, as I have pointed out in my previous report, no PHP.net page links to http://www.php.net/functions.js. We actually point clients to http://static.php.net/www.php.net/functions.js which is a lightweight server set up especially for serving static content. And before you go and check, it does not emit an Expires header :)

    Gabor Hojtsy - 17th November 2003 14:23 - #

  19. Has anybody 'translated' these files into a clear example? I looked at the files and at php/search.php and I don't understand how they work together... The 'search field' has no JavaScript (in search.php) to interact with them. There is a function.js at the bottom of the page. (Sorry, but my level of JS is so low that I can't understand it.) I understand that 'originalafter.js' contains the JS that interacts with the form, but how to use it is not clear. I'm more interested in the 'autosearch' feature than in the 'compress' feature. Any help? Where can I find it? I can't believe that I'm the only person on the web who thinks this feature is 'essential' for a site with a lot of items to choose from... (I have seen some other approaches to this feature, but I think this is the best one, very close to what Windows Access offers...) All the best. Antonio Garcia

    antonio garcia - 23rd March 2004 21:10 - #

Comments are closed.

Previously hosted at http://simon.incutio.com/archive/2003/11/13/goodAndUgly
