Simon Willison’s Weblog

October 2010

Oct. 15, 2010

Is there any consensus yet on link rel=shorturl vs rev=canonical?

It’s pretty clear from the answers that rev=canonical vs. rel=canonical is way too confusing, so it’s down to rel=shortlink vs. rel=shorturl.

[... 38 words]

What are the main weaknesses of Java as a programming language?

A cultural bias towards over-engineering. In my experience Java code often ends up a huge network of Factories and AbstractFactories and Visitors and XML configuration files and every design pattern you care to mention, dozens of classes many of which contain hardly any procedural code at all. A lot of Java projects are essentially impossible to navigate without an IDE.

[... 77 words]

Did you mean rel=shortlink vs. rel=shorturl?

That was the cute trick in the initial proposal: it’s REV=canonical, not REL=canonical, suggesting a reverse relationship.

[... 29 words]

Oct. 16, 2010

JS had to “look like Java” only less so, be Java’s dumb kid brother or boy-hostage sidekick. Plus, I had to be done in ten days or something worse than JS would have happened.

Brendan Eich

# 8:25 am / brendan-eich, javascript, recovered

Why do some websites implement their logout link as a form post via JavaScript versus a plain old GET request?

Probably because if you implement logout as a GET action, I can force you to log out of a site by tricking you into visiting a page with an <img src="http://yoursite.com/logout/" width="1" height="1"> element on it.

[... 64 words]
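To make the GET problem concrete, here is a minimal sketch of a logout handler that ignores anything other than a POST. The function name handle_logout, the dict-based session and the csrf_token key are all hypothetical, and the token check is an assumption of mine that goes a step beyond the point above (a plain POST can still be submitted by a form on another site).

```python
import secrets


def handle_logout(method, session, submitted_token=None):
    """Return an HTTP status code; only a valid POST clears the session.

    `session` is a plain dict standing in for a real session store
    (an assumption for this sketch, not any framework's API).
    """
    # An <img> tag can only trigger a GET, so refusing non-POST requests
    # stops the forced-logout trick described above.
    if method != "POST":
        return 405  # Method Not Allowed

    # Assumed extra safeguard: require a per-session token in the form,
    # so that a form hosted on another site cannot submit a valid POST.
    expected = session.get("csrf_token")
    if not expected or submitted_token is None:
        return 403
    if not secrets.compare_digest(expected, submitted_token):
        return 403

    session.clear()
    return 303  # See Other: redirect to a logged-out page
```

With this shape, a request for /logout/ made by an image element comes back as a 405 and the session is left alone.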

What is the best way to hire Solr developers?

Do you really need to hire a Solr specialist? It shouldn’t take a competent developer more than a few days to get familiar with Solr—the HTTP API is extremely easy to work with in my experience. You can always hire in a consultant from one of the companies that provide commercial Solr support for a few days to help your developers get up to scratch.

[... 82 words]

Oct. 17, 2010

jQuery 1.4.3 Released. Once again, the thing that impresses me most about this jQuery release is how stable the core API is. Hardly any new methods added, but the existing methods are made faster, more flexible and more predictable. The same has been true for the past several releases as well. It just keeps getting more and more polished.

# 12:15 am / api-design, javascript, jquery, recovered

Is it possible to make API calls without cURL installed?

Yes:

[... 26 words]

Oct. 19, 2010

Linked Data at the Guardian. The Guardian’s Open Platform API can now be queried by MusicBrainz ID and ISBN, opening up some extremely useful new types of query.

# 7:11 pm / guardian, openplatform, semanticweb, recovered

Oct. 22, 2010

What does an ideal Django workflow setup look like?

Short answer: virtualenv, pip, south for migrations, fabric for deployment.

[... 57 words]

Oct. 25, 2010

Firesheep (via) Oh wow. A Firefox extension that makes sniffing for unsecured (non-HTTPS) cookie requests on your current WiFi network, and logging in as the person who sent them, a case of clicking a couple of buttons. Always possible of course, but it’s never been made easy before. Private VPNs are about to become a lot more popular.

# 9:11 am / cookies, firesheep, security, wifi, recovered

What is the best way to maintain an API wrapper class across multiple languages?

1. Use JSON for your API. That takes away a lot of the necessity for an API wrapper, since it means you’re automatically returning native data types (hashes, lists, strings etc) for most programming languages.

[... 175 words]
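As a rough illustration of the first point, this is what a client sees when it parses a JSON response in Python using only the standard library. The response body and its field names are made up for the example.

```python
import json

# Hypothetical API response body; the fields are invented for illustration.
body = '{"name": "Example", "tags": ["a", "b"], "count": 3, "active": true, "owner": null}'

# json.loads maps JSON values straight onto native Python types:
# object -> dict, array -> list, string -> str, number -> int/float,
# true/false -> bool, null -> None.
data = json.loads(body)

# No wrapper class needed: ordinary dict and list operations just work.
first_tag = data["tags"][0]
```

Most other languages have an equivalent one-call JSON parser, which is why a thin wrapper (or none at all) is often enough.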

What are the best APIs for creating location-based Wikipedia mashups?

GeoNames has a fantastic API for finding Wikipedia articles near a specific latitude/longitude pair:

[... 32 words]

Bleach, HTML sanitizer and auto-linker. HTML sanitisation is notoriously difficult to do correctly, but Bleach (a Python library) looks like an excellent effort. It uses the html5lib parsing library to deal with potentially malformed HTML, uses a whitelist rather than a blacklist and has a neat feature for auto-linking URLs that is aware of the DOM (so it won’t try to auto-link a URL that is already wrapped in a link element). It was written by the Mozilla team for addons.mozilla.org and support.mozilla.org so it should be production ready.

# 1:32 pm / bleach, django, python, security, recovered

What is the best Mac OS X text editor for a web developer? And what makes it great?

It’s still TextMate for me. It gets the basics right—syntax highlighting, sensible indentation, a good project pane (I use “mate ~/Development/my-project” at the terminal to open TextMate with my entire folder hierarchy), solid extensions and good unix integration (Filter selection through command).

[... 77 words]

Oct. 26, 2010

Is it a good idea for a new start-up to outsource Software/App Development?

It depends on what you mean by “outsourcing”.

[... 130 words]

What are some of your favorite complicated diagrams?

This one’s pretty nuts:

[... 25 words]

What’s a good book about basic usage techniques and patterns in Python? (a la Effective Java/C++)

Dive into Python 3 is well worth a look: http://diveintopython3.org/

[... 48 words]

What is the story of Advogato?

There’s a Google Tech Talk about Advogato: http://video.google.com/videopla...

[... 21 words]

Oct. 27, 2010

What is the best lightweight jQuery tooltip plugin? Why?

Last time I went looking, I was very impressed by qTip: http://craigsworks.com/projects/...

[... 28 words]

Why does Python load imported modules separately for different files, unlike C or PHP? Isn’t that inefficient in terms of memory usage?

It doesn’t—you’re misunderstanding how Python’s module system works. If two different places have “import os” in them, the os module is only imported and executed once—it’s cached in the sys.modules dictionary so you can see it happen if you want to. The key thing to understand is that “import os” attaches the os module to the “os” symbol within the current file’s scope, loading it only if it hasn’t been loaded already.

[... 104 words]
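You can check the caching behaviour for yourself. This small sketch imports os twice, once with a regular import statement and once through importlib, and compares the results against sys.modules. The helper name same_module_object is just something I made up for the example.

```python
import importlib
import sys

import os


def same_module_object():
    """Return True if a second import of os yields the cached module."""
    # A second import does not re-execute the module; it looks the name
    # up in sys.modules and returns the object that is already there.
    second = importlib.import_module("os")
    return second is os and sys.modules["os"] is os
```

Calling same_module_object() returns True: both imports are just different names bound to the one module object held in sys.modules.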

What are all the advantages of jQuery?

jQuery’s API is astonishingly well designed. It’s extremely consistent once you learn its rules (e.g. methods often take one argument to read a value and two arguments to set one, e.g. .css(), .attr(), .width(), .height()) and its functionality is so complete that the last few major releases of the library have hardly added any new methods at all.

[... 166 words]

What is the best way to integrate MongoDB with Django?

Personally, I just “import pymongo” and start calling the regular Python API—no need for any special treatment to get it working with Django.

[... 41 words]

Bees with machine guns! Low-cost, distributed load-testing using EC2. Great name for a useful project—Bees with machine guns is a Fabric script which fires up a bunch of EC2 instances, uses them to load test a website and then spins them back down again.

# 11:04 pm / ec2, fabric, load-testing, performance, scaling, recovered

Using MySQL as a NoSQL—A story for exceeding 750,000 qps on a commodity server. Very interesting approach: much of the speed difference between MySQL/InnoDB and memcached is due to the overhead involved in parsing and processing SQL, so the team at DeNA wrote their own MySQL plugin, HandlerSocket, which exposes a NoSQL-style network protocol for directly calling the low level MySQL storage engine APIs—resulting in a 7.5x performance increase.

# 11:10 pm / mysql, nosql, scaling, recovered

Oct. 28, 2010

Is there a blog that covers open source Python projects?

No, but I wish there was.

[... 29 words]

What is the best JS library for automated cropping?

Not entirely clear what you’re looking for, but if you mean a UI tool for letting people resize and crop an image, Jcrop is really nice: http://deepliquid.com/content/Jc...

[... 43 words]

Is there a good online calendar for upcoming technology conferences?

We’re trying to build exactly this with http://lanyrd.com/—not just for technology conferences, but they are definitely our largest niche.

[... 208 words]

Oct. 29, 2010

What are people’s experiences using Memcached?

That it’s so obviously a good idea (and works so well) that you’d be crazy not to use it. As far as I’m concerned, it’s part of the default stack for any web application.

[... 46 words]

If I have data that loads using JSON / JavaScript will it get indexed by Google?

No. Personally I dislike sites with content that is only accessible through JavaScript, but if you absolutely insist on doing this you should look into implementing the Google Ajax Crawling mechanism: http://code.google.com/web/ajaxc...

[... 56 words]
