Simon Willison’s Weblog

Blogmarks

Sign in with Twitter. Intriguing: Twitter are now an OpenID-style identity provider... using OAuth.

# 20th April 2009, 4:10 am / oauth, openid, twitter

Haystack (via) A brand new modular search plugin for Django, by Daniel Lindsley. The interface is modelled after the Django ORM (complete with declarative classes for defining your search schema) and it ships with backends for both Solr and pure-Python Whoosh, with more on the way. Excellent documentation.

# 17th April 2009, 9:53 pm / daniel-lindsley, django, haystack, orm, python, search, solr, whoosh

Paul Buchheit: Make your site faster and cheaper to operate in one easy step. Paul promotes gzip encoding using nginx as a proxy, and mentions that FriendFeed use a “custom, epoll-based python server” as their application server. Does that mean that they’re serving their real-time comet feeds directly from Python?
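To get a feel for why gzip is such an easy win, here is a rough sketch using Python's standard `gzip` module. The page body is a made-up stand-in (repetitive markup, as most HTML is), so the exact saving is illustrative only; in the setup Paul describes the compression would happen in nginx, not in application code.

```python
import gzip

# Hypothetical page body: repetitive markup, which is typical of HTML
# and compresses very well. Real pages will vary.
page = ('<li class="entry"><a href="/item">An item title</a></li>\n' * 200).encode("utf-8")

compressed = gzip.compress(page)


def saving(original: bytes, packed: bytes) -> float:
    """Return the fraction of bytes saved by compression."""
    return 1 - len(packed) / len(original)


print(f"{len(page)} bytes -> {len(compressed)} bytes "
      f"({saving(page, compressed):.0%} saved)")
```

Fewer bytes on the wire means both faster page loads and a smaller bandwidth bill, which is the "faster and cheaper" of the title.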

# 17th April 2009, 5:19 pm / comet, epoll, friendfeed, gzip, nginx, paul-buchheit, python

Drop ACID and think about data. I’ve been very impressed with the quality and speed with which the PyCon 2009 videos have been published. Here’s Bob Ippolito on distributed databases and key/value stores.

# 17th April 2009, 5:13 pm / acid, bob-ippolito, data, databases, pycon, pycon2009, python

Installing CouchDB from source on OS X. So far I’ve just been playing with it in an Ubuntu virtual machine.

# 17th April 2009, 4:22 pm / building, couchdb, macos, ubuntu

Cross Browser Base64 Encoded Images Embedded in HTML (via) Scarily clever. View the PHP source to see what’s going on—most browsers get image tags that use data URIs starting with data:image/png;base64, but IE gets served a Content-type:message/rfc822 header and a MIME formatted multipart/related document, as used by e-mail clients to embed inline image attachments.
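The data URI half of the trick is simple enough to sketch in a few lines of Python (the original is PHP; this is my own illustration, not a port of that code). The function name is hypothetical, and the "image" here is just the eight-byte PNG signature standing in for a real file.

```python
import base64


def png_data_uri(png_bytes: bytes) -> str:
    """Build a data URI that embeds PNG bytes directly in the markup."""
    encoded = base64.b64encode(png_bytes).decode("ascii")
    return f"data:image/png;base64,{encoded}"


# Stand-in for a real image: only the 8-byte PNG file signature.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

uri = png_data_uri(PNG_SIGNATURE)
img_tag = f'<img src="{uri}" alt="example">'
print(img_tag)
```

The IE branch (the multipart/related document) is the clever part and is not covered by this sketch.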

# 17th April 2009, 4:12 pm / base64, browsers, hedger-wang, internet-explorer, mime, php

Developing Django apps with zc.buildout. Jacob went ahead and actually documented one of Python’s myriad packaging options.

# 16th April 2009, 9:50 am / jacob-kaplan-moss, packaging, python, zcbuildout

(Yet) Another DiggBar Update. Digg are responding in exactly the right way in my opinion—the DiggBar will start returning 301 redirects for anonymous users, while users who are logged in to Digg can opt-out of the feature if they want to (usage statistics show that most Digg users are fine with the feature).

# 16th April 2009, 12:50 am / digg, diggbar, redirects, urls

10 Cool Things We’ll Be Able To Do Once IE6 Is Dead. Highlights include child and attribute selectors, 24-bit PNGs, and max-width and min-width. Simple pleasures, but I can hardly wait.

# 15th April 2009, 2:17 pm / brothercake, browsers, css, ie6, maxwidth, minwidth, pngs, selectors, standards

London’s abandoned Underground Stations on Google Street View. “The network is littered with buildings that belonged to stations that closed their doors to the public because routes were changed and diverted, or because there was just too little traffic to make them viable. Here are some of the remnants of disused Underground stations that you can see on Google’s Street View of London.”

# 14th April 2009, 2:51 pm / google, london, martin-belam, streetview, underground

Counting the ways that rev=“canonical” hurts the Web. Mark Nottingham complains about misapplied trust (a page can falsely claim to be the canonical URL for another page), the easy confusion between rev and rel and the lack of discussion with relevant communities.

# 14th April 2009, 2:11 pm / mark-nottingham, revcanonical, standards, urls

Reducing XSS by way of Automatic Context-Aware Escaping in Template Systems (via) The Google Online Security Blog reminds us that simply HTML-escaping everything isn’t enough—the type of escaping needed depends on the current markup context, for example variables inside JavaScript blocks should be escaped differently. Google’s open source Ctemplate library uses an HTML parser to keep track of the current context and apply the correct escaping function automatically.
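A small Python sketch of why the context matters. This is not how Ctemplate works (it tracks context automatically with an HTML parser); here the caller picks the escaping function by hand, and both function names are my own. The JavaScript escaper leans on `json.dumps` to produce a quoted string literal, then escapes `<`, `>` and `&` so that a value containing `</script>` cannot close the surrounding block.

```python
import html
import json


def escape_html_text(value: str) -> str:
    """Escape a value for use in HTML text or a quoted attribute."""
    return html.escape(value, quote=True)


def escape_js_string(value: str) -> str:
    """Escape a value as a JavaScript string literal inside a <script> block."""
    literal = json.dumps(value)  # quoted, with backslashes and quotes escaped
    # Keep the HTML parser from ever seeing "</script>" or an entity.
    return (literal.replace("<", "\\u003c")
                   .replace(">", "\\u003e")
                   .replace("&", "\\u0026"))


payload = "</script><script>alert(1)</script>"

# HTML escaping does nothing to a value like this, so dropping it
# unquoted into a script block would execute it:
print(escape_html_text("alert(1)"))

# The JavaScript escaper turns the payload into inert string data:
print("var comment = " + escape_js_string(payload) + ";")
```

The point of the automatic approach is that template authors never have to make this choice themselves, and so can never make it wrongly.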

# 14th April 2009, 9:26 am / ctemplate, django, escaping, google, html, open-source, security, xss

Visualising Sorting Algorithms. Aldo Cortesi dislikes animations of sorting algorithms, so he designed a beautiful technique for statically visualising them instead (using Python and Cairo to generate the images).
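The data behind a static picture like this can be sketched without any drawing code: record the list after every swap, then read off each element's position over time. This is my own rough illustration using bubble sort and assuming distinct values, not Aldo's code, and it stops short of the Cairo rendering step.

```python
def bubble_sort_snapshots(values):
    """Bubble sort a copy of values, recording the list after every swap."""
    items = list(values)
    snapshots = [list(items)]
    for end in range(len(items) - 1, 0, -1):
        for i in range(end):
            if items[i] > items[i + 1]:
                items[i], items[i + 1] = items[i + 1], items[i]
                snapshots.append(list(items))
    return snapshots


def trajectories(snapshots):
    """Map each (distinct) value to its index in every snapshot."""
    return {value: [snap.index(value) for snap in snapshots]
            for value in snapshots[0]}


paths = trajectories(bubble_sort_snapshots([3, 1, 2]))
for value, positions in paths.items():
    print(value, positions)
```

Plot time on the x axis and position on the y axis, one line per value, and you have the skeleton of a static sorting diagram.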

# 14th April 2009, 8:55 am / aldo-cortesi, algorithms, cairo, python, sorting, visualisation

Amazon Says Listing Problem Was an Error, Not a Hack (via) “A friend within the company told him that someone working on Amazon’s French site mistagged a number of keyword categories, including the ’Gay and Lesbian’ category, as pornographic, using what’s known internally as the Browse Nodes tool. Soon the mistake affected Amazon sites worldwide.”

# 14th April 2009, 8:32 am / amazon, amazonfail, csrf, security

tinyarchive.org. Blaine Cook’s archive of 301 and 302 redirects—needs to be automatically updated by a crawler for it to be really useful though.

# 13th April 2009, 9:57 pm / archive, tinyarchive, tinyurls, urls

How to cause moral outrage from the entire Internet in ten lines of code. Looks legit: the author claims to have sparked this weekend’s #amazonfail moral outrage (where Amazon were accused of removing Gay and Lesbian books from their best seller rankings) by exploiting a CSRF hole in Amazon’s “report as inappropriate” feature to trigger automatic takedowns. EDIT: His claim is disputed elsewhere (see comments).

# 13th April 2009, 7:48 pm / amazon, amazonfail, csrf, prdisaster, security

favikon.com. Small, easy to use online favicon generator.

# 13th April 2009, 12:09 pm / favicons, favikon

I like rev=“canonical”. Les Orchard summarises the current debate over what colour to paint the rev=“canonical” bikeshed.

# 13th April 2009, 10:41 am / les-orchard, revcanonical, urls

django-shorturls. Jacob took my self-admittedly shonky shorter URL code and turned it into a proper reusable Django application.

# 13th April 2009, 9:31 am / django, djangoshorturls, jacob-kaplan-moss, python, revcanonical

17-year-old claims responsibility for Twitter worm. It was a textbook XSS attack: the URL on the user profile wasn’t properly escaped, allowing an attacker to insert a script element linking out to externally hosted JavaScript, which then used Ajax to steal any logged-in user’s anti-CSRF token and use it to self-replicate into their profile.

# 12th April 2009, 7:22 pm / csrf, security, twitter, worms, xss

Tweenbots: Cute Beats Smart. How do you build a robot that can get from one end of Washington Square Park to the other without your help? Give it a cute smile and a sign explaining where it’s going and rely on strangers to point it in the right direction along the way.

# 12th April 2009, 1:47 pm / cute, robots, tweenbots

Running Rhino and Helma NG on Google App Engine. Helma NG is a JavaScript web app framework, which now works on App Engine out of the box.

# 12th April 2009, 12:52 pm / google, google-app-engine, helma, helmang, javascript, rhino

A rev=“canonical” HTTP Header. Chris Shiflett proposes optionally exposing rev=canonical information in an HTTP header, thus allowing sites to discover shorter URLs using just a HEAD request and removing the need to parse HTML. The pingback specification also uses this shortcut.
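A sketch of what a client might look like, assuming the short URL arrives in a header called `X-Rev-Canonical`. That header name is an assumption for illustration, as are the function names; check the proposal itself for the real details.

```python
import urllib.request
from typing import Mapping, Optional

# Hypothetical header name, assumed for this sketch.
SHORT_URL_HEADER = "X-Rev-Canonical"


def short_url_from_headers(headers: Mapping[str, str]) -> Optional[str]:
    """Return the advertised short URL, matching the header case-insensitively."""
    for name, value in headers.items():
        if name.lower() == SHORT_URL_HEADER.lower():
            return value.strip() or None
    return None


def fetch_short_url(url: str) -> Optional[str]:
    """Issue a HEAD request and look for a short URL in the response headers."""
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request, timeout=10) as response:
        return short_url_from_headers(dict(response.headers.items()))
```

Because a HEAD response has no body, the client never downloads or parses any HTML, which is the whole appeal.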

# 12th April 2009, 12:33 pm / chris-shiflett, head, headers, http, pingback, revcanonical

Revving up. Jeremy Keith advocates adding the rev=“canonical” attribute to regular A elements as well as / instead of hiding it in the head of the document, following the microformats design principle that invisible metadata is less valuable than augmenting visible links. I’ve updated my shorten bookmarklet to handle this case.
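Handling both cases takes very little code. Here is a minimal Python sketch (not the bookmarklet, which is JavaScript) that uses the standard library's `html.parser` to collect the `href` of any `link` or `a` element carrying rev="canonical". The class name, function name and example URLs are mine.

```python
from html.parser import HTMLParser


class RevCanonicalFinder(HTMLParser):
    """Collect hrefs from <link> and <a> elements with rev="canonical"."""

    def __init__(self):
        super().__init__()
        self.short_urls = []

    def handle_starttag(self, tag, attrs):
        if tag not in ("link", "a"):
            return
        attributes = dict(attrs)
        # rev may hold several space-separated values; an absent or
        # valueless attribute is treated as empty.
        rev_values = (attributes.get("rev") or "").lower().split()
        if "canonical" in rev_values and attributes.get("href"):
            self.short_urls.append(attributes["href"])


def find_short_urls(markup: str) -> list:
    """Return every rev="canonical" href found in the markup, in order."""
    finder = RevCanonicalFinder()
    finder.feed(markup)
    return finder.short_urls


page = """
<html><head><link rev="canonical" href="http://example.com/s/1"></head>
<body><a rev="canonical" href="http://example.com/s/1">Short URL</a>
<a href="/about">About</a></body></html>
"""
print(find_short_urls(page))
```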

# 12th April 2009, 12:29 pm / jeremy-keith, metadata, microformats, revcanonical

Using Scala with Google App Engine. Scala works, but I haven’t seen confirmation on actors yet (which are likely to break due to their dependency on threads).

# 11th April 2009, 3:28 pm / google, google-app-engine, java, scala, threads

Digg Search: Now With 99.987% Less Suck. Really nice implementation of faceted search, still using Lucene and Solr under the hood.

# 10th April 2009, 10:17 pm / digg, facets, full-text-search, lucene, search, solr

Experiences deploying a large-scale infrastructure in Amazon EC2. “At OpenX we recently completed a large-scale deployment of one of our server farms to Amazon EC2. Here are some lessons learned from that experience.”

# 10th April 2009, 9:43 am / amazon, ec2, griggheorghiu, openx, scaling

Scaling Django web apps on Apache. Cool to see this kind of article cropping up on IBM developerWorks, but it’s a shame they don’t mention mod_wsgi.

# 10th April 2009, 9:23 am / apache, django, ibm, modwsgi, python

Browsing my browsing. Roo Reynolds used the MeeTimer Firefox extension to gather statistics on his browsing habits, then extracted data directly from the SQLite database and generated his own graphs using PHP and the canvas element.

# 10th April 2009, 8:48 am / canvas, firefox, javascript, meetimer, php, rooreynolds, sqlite

Protovis. JavaScript graphing library based on canvas, with an elegant chaining style API.

# 10th April 2009, 8:43 am / canvas, graphs, javascript, protovis, visualisation
