Simon Willison’s Weblog

Blogmarks

Newspaper Club—A work in progress. “We’re building a service to help people make their own newspapers. This is the blog where we’re alarmingly honest about where it’s all going wrong.”

# 2nd July 2009, 7:34 pm / newspaperclub, newspapers, tom-taylor

Video for Everybody! Reminiscent of the early days of Web Standards, Kroc Camen has created a fiendishly clever chunk of HTML which can play a video on any browser, starting with HTML5 video then falling back on Flash and eventually just an HTML message telling the user where they can download the file. No JavaScript to be seen, but conditional comments abound. Requires you to encode as both Ogg and H.264, but Kroc includes detailed instructions for doing that using Handbrake.

# 2nd July 2009, 7:33 pm / codecs, encoding, h264, hacks, handbrake, html, html5, kroccamen, ogg, video

Modernizr (via) Neat idea and an unobtrusive implementation: a JavaScript library that runs feature tests for various HTML5 features (canvas, box shadow, CSS transforms and so on) and adds classes to the HTML body element, allowing you to write CSS selectors that only apply if a feature is present. Detected features are exposed to JavaScript as boolean properties, e.g. Modernizr.multiplebgs.

# 2nd July 2009, 10:56 am / css, faruk-ates, html5, javascript, modernizr

Codecs for <audio> and <video>. HTML 5 will not be requiring support for specific audio and video codecs—Ian Hickson explains why, in great detail. Short version: Apple won’t implement Theora due to lack of hardware support and an “uncertain patent landscape”, while open source browsers (Chromium and Mozilla) can’t support H.264 due to the cost of the licenses.

# 2nd July 2009, 10:16 am / audio, chromium, codecs, google, h264, html5, ian-hickson, mozilla, ogg, patents, theora, video

PubSub-over-Webhooks with RabbitHub. RabbitMQ, the Erlang-powered AMQP message queue, is growing an HTTP interface based on webhooks and PubSubHubBub.

# 1st July 2009, 8:22 pm / amqp, erlang, http, message-queues, pubsubhubbub, rabbitmq, webhooks

Address Extractor. Running on App Engine, an address extractor web service using code from the EveryBlock open source release.

# 1st July 2009, 8:03 pm / addressextractor, everyblock, google-app-engine, python

EveryBlock source code released. EveryBlock’s Knight Foundation grant required them to release the source code after two years, under the GPL. Lots of neat Django / PostgreSQL / GIS tricks to be found within.

# 1st July 2009, 8:01 pm / django, everyblock, geospatial, gpl, open-source, postgresql, python

Using Mongo for Real-Time Analytics. MongoDB supports an “upsert” query, which when combined with the $inc operator can cause counter fields to be incremented if they exist and created otherwise. This makes it a great fit for real-time analytics applications (one increment per page view), something that regular relational databases aren’t particularly good at.
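
The upsert-plus-$inc behaviour described above can be sketched in plain Python. This is only an in-memory simulation using a dict in place of a real MongoDB collection; the function name `upsert_inc` and the `views` field are hypothetical, chosen for illustration, and no MongoDB driver is involved.

```python
from typing import Dict

# In-memory stand-in for a collection, keyed by page path (illustration only).
Collection = Dict[str, Dict[str, int]]


def upsert_inc(collection: Collection, key: str, field: str, amount: int = 1) -> None:
    """Increment `field` on the document at `key`, creating either one if missing."""
    doc = collection.setdefault(key, {})      # "upsert": create the document if absent
    doc[field] = doc.get(field, 0) + amount   # "$inc": a missing field starts from 0


# One increment per page view, with no separate "does this row exist?" check.
page_views: Collection = {}
for path in ["/", "/about/", "/"]:
    upsert_inc(page_views, path, "views")
```

The point of the pattern is that the caller issues the same operation for the first view of a page and for every later one.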

# 30th June 2009, 7:28 pm / counters, databases, increment, mongodb, upsert

MongoDB. Lots of discussions about this at EuroPython today—it’s a document database, very similar to CouchDB but significantly faster and suggested for production use. Best of all, trying it out on OS X is as easy as extracting the tarball and running “bin/mongod --dbpath /tmp/test-mongo-db run”.

# 30th June 2009, 7:13 pm / couchdb, documentstore, europython, json, keyvaluestore, macos, mongodb, nonrelationaldatabase

Firefox 3.5 for developers. It’s out today, and the feature list is huge. Highlights include HTML 5 drag ’n’ drop, audio and video elements, offline resources, downloadable fonts, text-shadow, CSS transforms with -moz-transform, localStorage, geolocation, web workers, trackpad swipe events, native JSON, cross-site HTTP requests, text API for canvas, defer attribute for the script element and TraceMonkey for better JS performance!

# 30th June 2009, 6:08 pm / audio, browsers, canvas, crossdomain, csstransforms, dragndrop, firefox, firefox35, fonts, geolocation, html5, javascript, json, localstorage, mozilla, offlineresources, performance, textshadow, tracemonkey, video, webworkers

cache-money. A “write-through caching library for ActiveRecord”, maintained by Nick Kallen from Twitter. Queries hit memcached first, and caches are automatically kept up-to-date when objects are created, updated and deleted. Only some queries are supported—joins and comparisons won’t hit the cache, for example.
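
To illustrate what "write-through" means here, a toy sketch in Python follows. It is not cache-money's API (that library is Ruby); the class `WriteThroughStore` and its attributes are hypothetical, with plain dicts standing in for the database and for memcached.

```python
from typing import Dict, Optional


class WriteThroughStore:
    """Toy write-through cache: every write updates the database and the cache together."""

    def __init__(self) -> None:
        self.database: Dict[int, str] = {}   # stand-in for the relational database
        self.cache: Dict[int, str] = {}      # stand-in for memcached
        self.database_reads = 0              # counts reads that missed the cache

    def save(self, key: int, value: str) -> None:
        self.database[key] = value
        self.cache[key] = value              # write-through keeps the cache current

    def delete(self, key: int) -> None:
        self.database.pop(key, None)
        self.cache.pop(key, None)            # deletes are propagated to the cache too

    def get(self, key: int) -> Optional[str]:
        if key in self.cache:                # the cache is consulted first
            return self.cache[key]
        self.database_reads += 1             # cache miss: fall back to the database
        value = self.database.get(key)
        if value is not None:
            self.cache[key] = value
        return value
```

Because writes go to both places, a lookup by key after a create or update never needs to touch the database in this sketch.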

# 28th June 2009, 3:17 pm / activerecord, cachemoney, caching, memcached, rails, twitter

Twitter, an Evolving Architecture. The most detailed write-up of Twitter’s current architecture I’ve seen, explaining the four layers of cache (all memcached) used by the Twitter API.

# 28th June 2009, 3:09 pm / caching, memcached, twitter, software-architecture

BashReduce. Map/Reduce in Bash is no longer a joke project (if it ever was)—Richard Crowley is extending it and using it for analysis at OpenDNS.

# 28th June 2009, 3:03 pm / bash, bashreduce, mapreduce, opendns, richard-crowley

What’s New In Python 3.1. Lots of stuff, but the best bits are an ordered dictionary type (congrats, Armin), a Counter class for counting unique items in an iterable (I do this on an almost daily basis) and a bunch of performance improvements including a rewrite of the Python 3.0 IO system in C.
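
A quick sketch of the two collections additions mentioned above, using only the standard library. The sample keys and words are made up for illustration.

```python
from collections import Counter, OrderedDict

# OrderedDict remembers the order in which keys were inserted.
d = OrderedDict()
d["first"] = 1
d["second"] = 2
d["third"] = 3
keys = list(d.keys())

# Counter tallies how often each item appears in an iterable.
words = "spam egg spam ham spam egg".split()
counts = Counter(words)
top = counts.most_common(2)   # the two most frequent items with their counts
```

Looking up an item that was never counted, such as `counts["missing"]`, returns 0 rather than raising a KeyError.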

# 28th June 2009, 3:02 pm / armin-ronacher, performance, python, python3, python31, releases

The Resource Expert Droid. Like the HTML Validator but for your server’s HTTP headers—extremely useful.

# 25th June 2009, 10:06 am / headers, http, mark-nottingham, resourceexpertdroid, validator

Four crowdsourcing lessons from the Guardian’s (spectacular) expenses-scandal experiment. Michael Andersen from the Nieman Journalism Lab interviewed me about the MP expenses crowdsourcing site.

# 24th June 2009, 3:31 pm / crowdsourcing, guardian, interviews, mpsexpences

Test-Driven Heresy. Tim Bray advocates TDD for maintenance development, but argues that it may not be as useful during the exploratory, greenfield development phase of a project.

# 24th June 2009, 11:03 am / tdd, testing, tim-bray

To Sprite Or Not To Sprite. CSS sprite images are decompressed to full bitmaps by browsers before they are rendered, so sprite files with large numbers of pixels will dramatically increase the memory footprint of your site.
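
As a rough back-of-the-envelope sketch of why pixel count matters more than file size: the decoded bitmap grows with width times height. The 4 bytes per pixel figure (RGBA at 8 bits per channel) and the sprite dimensions below are assumptions for illustration; actual browser behaviour may differ.

```python
def decoded_size_bytes(width: int, height: int, bytes_per_pixel: int = 4) -> int:
    """Estimate memory for a fully decoded bitmap (assumes 4 bytes per pixel by default)."""
    return width * height * bytes_per_pixel


# A hypothetical 1000x1000 sprite sheet, however small the compressed file is.
sprite_bytes = decoded_size_bytes(1000, 1000)
sprite_megabytes = sprite_bytes / 1_000_000
```

Under these assumptions a mostly empty 1000x1000 sprite still costs about 4 MB once decoded, which is why tightly packed sprites are preferable to sparse ones.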

# 24th June 2009, 10:33 am / css, csssprites, performance, velocityconference

Google asked people in Times Square: “What is a browser?”. Stuff like this makes me despair for creating a secure web: what chance do people have of surfing safely if they don’t understand browsers, web sites, operating systems, DNS, URLs, SSL, certificates...

# 20th June 2009, 1:25 am / browsers, google, realhumans, security, usability

The breakneck race to build an application to crowdsource MPs’ expenses. Charles Arthur wrote up a very nice piece on the development effort behind the Guardian’s crowdsourcing expenses app.

# 19th June 2009, 10:16 pm / charles-aurthur, crowdsourcing, guardian, mpsexpenses

Towards a Standard for Django Session Messages. I completely agree that Django’s user.message_set (which I helped design) is unfit for purpose, but I don’t think sessions are the right solution for messages sent to users. A signed cookie containing either the full message or a key referencing the message body on the server is a much more generally useful solution as it avoids the need for a round trip to a persistent store entirely.
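
A minimal sketch of the signed-cookie idea, using only Python's standard library rather than any Django API. The names `sign`, `unsign` and `SECRET_KEY` are hypothetical, and the payload.signature layout is just one possible format: the message travels in the cookie itself, and an HMAC lets the server detect tampering without consulting a persistent store.

```python
import base64
import hashlib
import hmac
from typing import Optional

SECRET_KEY = b"replace-with-a-real-secret"  # hypothetical; keep the real one private


def _signature(payload: str) -> str:
    return hmac.new(SECRET_KEY, payload.encode("utf-8"), hashlib.sha256).hexdigest()


def sign(message: str) -> str:
    """Return a cookie value of the form <base64 payload>.<hex HMAC>."""
    payload = base64.urlsafe_b64encode(message.encode("utf-8")).decode("ascii")
    return payload + "." + _signature(payload)


def unsign(cookie_value: str) -> Optional[str]:
    """Return the original message, or None if the signature does not match."""
    payload, separator, signature = cookie_value.rpartition(".")
    if not separator:
        return None
    expected = _signature(payload)
    # Constant-time comparison avoids leaking how much of the signature matched.
    if not hmac.compare_digest(signature.encode("utf-8"), expected.encode("utf-8")):
        return None
    return base64.urlsafe_b64decode(payload.encode("ascii")).decode("utf-8")
```

For example, `unsign(sign("Profile saved"))` gives back the message, while a value whose signature has been altered yields `None`. The payload is signed, not encrypted, so the user can still read it.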

# 19th June 2009, 9:57 pm / cookies, django, flash, messages, python, sessions, signedcookies

Unimpressed by NodeIterator. John Resig, one of the most talented API designers I’ve ever come across, posts some well-earned criticism of the document.createNodeIterator DOM traversal API.

# 19th June 2009, 9:53 pm / api-design, dom, javascript, john-resig, nodeiterator

Investigate your MP’s expenses. Launched today, this is the project that has been keeping me ultra-busy for the past week—we’re crowdsourcing the analysis of the 700,000+ scanned MP expenses documents released this morning. It’s the Guardian’s first live Django-powered application, and also the first time we’ve hosted something on EC2.

# 18th June 2009, 11:16 pm / crowdsourcing, django, ec2, guardian, mpexpenses, projects, python

Jython 2.5.0 Final is out! It’s been a long time coming—congratulations to the team.

# 16th June 2009, 11:21 pm / java, jython, python

SWFUpload jQuery Plugin. Nice looking plugin around an invisible Flash shim that provides multiple file uploads and client-side progress indicators.

# 16th June 2009, 11:46 am / flash, javascript, jquery, swfupload, upload

Opera Unite. Opera’s big announcement: a developer preview (“labs release”) of their new web-server-in-your-browser feature, Unite. Includes an Opera-hosted proxy to help break through your firewall. The web server can be customised using server-side JavaScript running in an Opera Widget.

# 16th June 2009, 11 am / javascript, opera, operaunite, unite, webservers, widgets

Mr. Penumbra’s Twenty-Four-Hour Book Store. Enormously entertaining short story about data visualisation and creepy San Francisco bookshops by Robin Sloan.

# 12th June 2009, 6:07 pm / bookshops, robin-sloan, san-francisco, shortstory, visualisation

Dealing with election results data. Alf Eaton loaded the Guardian’s European election results spreadsheet into Google’s new Fusion Tables tool.

# 12th June 2009, 6:06 pm / alf-eaton, datablog, datastore, elections, fusiontables, google, guardian
