Simon Willison’s Weblog

February 2010

Feb. 11, 2010

WARNING: Google Buzz Has A Huge Privacy Flaw. Interesting one this: by default, Buzz creates a public profile for you that lists the people you follow—but your default set of followers is derived from the people you contact most frequently using Gmail. This means users of Buzz may inadvertently reveal their most frequent contacts, which is an issue for people like journalists with anonymous sources, unhappy employees seeking new work or even people having e-mail based affairs.

# 11:30 am / privacy, buzz, google, followers, gmail

HTML5 video markup, compatibility and playback. Everything you need to know about embedding HTML5 video on a page, complete with multiple codecs to cover the various supporting browsers and a fallback to Flash.

# 5:49 pm / html5, video, niallkennedy, flash

Kottke on Chatroulette. Jason Kottke: “In short, Chatroulette is pretty much the best site going on the internet right now.”

# 5:52 pm / jason-kottke, chatroulette

Elastic Search (via) Solr has competition! Like Solr, Elastic Search provides a RESTful JSON HTTP interface to Lucene. The focus here is on distribution, auto-sharding and high availability. It’s even easier to get started with than Solr, partly due to the focus on providing a schema-less document store, but it’s currently missing out on a bunch of useful Solr features (a web interface and faceting are the two that stand out). The high availability features look particularly interesting. UPDATE: I was incorrect, basic faceted queries are already supported.

# 6:33 pm / search, scaling, rest, lucene, java, elasticsearch, json, http, sharding, solr

Feb. 12, 2010

My email contacts list is not a social graph. It is not a group of people I have chosen to follow, but is instead full of people with whom I have a (sometimes very tenuous) professional relationship, as well as my family and some of my friends. Interestingly, my best friends don’t email me very often, so they do not show up as a part of my Buzz following list.

Suw Charman-Anderson

# 9:13 am / suwcharmananderson, buzz, friends, followers, socialgraph, email

Why toppcloud will not be agnostic. Ian Bicking’s toppcloud aims to offer deployment with the ease of use of AppEngine against a standard, open source Ubuntu + Python 2.6 + mod_wsgi + Varnish stack. Here he explains why he’s not going to vary the required components: keeping everything completely standardised means everyone gets the same bugs (and the same fixes).

# 9:21 am / ian-bicking, deployment, django, python, modwsgi, wsgi, ubuntu, varnish, toppcloud, appengine

Google Image Charts: Mathematical (TeX) Formulas (via) I’m not sure when they added this, but you can now use the Google Charts Image API to render mathematical formulas, specified using TeX syntax. WordPress.com and Wikipedia have both offered this feature for quite a while, but now you can use it anywhere on the Web.

# 9:42 am / google, google-charts, maths, formula, tex
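
A rough sketch of what using this looks like: the formula is passed as a URL parameter and the API returns an image. This assumes the `cht=tx` chart type with the TeX source in `chl`, and the `chart.apis.google.com` host. The `tex_formula_url` helper is a made-up name for illustration, and the endpoint may not match what is live today.

```python
from urllib.parse import urlencode

# Assumed endpoint for the Google Charts Image API (not taken from the post).
CHART_ENDPOINT = "http://chart.apis.google.com/chart"


def tex_formula_url(tex):
    """Build an image URL for a TeX formula (hypothetical helper)."""
    # cht=tx selects the formula chart type; chl carries the TeX source.
    params = {"cht": "tx", "chl": tex}
    return f"{CHART_ENDPOINT}?{urlencode(params)}"


url = tex_formula_url(r"x = \frac{-b \pm \sqrt{b^2-4ac}}{2a}")
print(url)
```

The resulting URL can be dropped straight into an `img` tag. `urlencode` takes care of escaping the backslashes and braces.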

Algorithmic recruitment with GitHub. Matt Biddulph crawls GitHub’s social graph using JUNG (the Java Universal Network/Graph Framework), JRuby and Yahoo! BOSS to find good leads on interesting developers in specific geographic locations.

# 1:17 pm / matt-biddulph, socialgraph, recruiting, github

ElasticSearch: Your Data, Your Search. A neat example of how ElasticSearch’s schemaless indexes and native JSON support make it ridiculously easy to index different types of data and run queries across them.

# 3:22 pm / elasticsearch, java, search, schemaless, json

Around the World by Zeppelin. If you’re in the UK, you have four days left to catch this fantastic 90 minute documentary on iPlayer. It covers the first ever flight around the world, in the Graf Zeppelin in 1929, from the point of view of Lady Grace Drummond-Hay, a reporter for the Hearst media empire and the only woman on the voyage. The archive footage is incredible.

# 10:37 pm / zeppelins, bbc, documentary

Feb. 14, 2010

A new Buzz start-up experience based on your feedback. Buzz is switching to the more obvious model: use existing Gmail behaviour to suggest a list of people to follow, rather than auto-following them. It feels pretty clear to me that this is how following recommendations should work.

# 10:12 am / follow, following, privacy, google-buzz, buzz

Redis in Practice: Who’s Online? Using Redis to implement a “which of your friends are online now” feature, by maintaining a set of active user IDs for every minute, then intersecting the past five minutes of user IDs with a set containing the IDs of your friends.

# 5:17 pm / redis, social, friends, online, lukemelia
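
The set logic described above is simple enough to sketch without a Redis server. This version uses plain Python sets as a stand-in, where real code would presumably use Redis commands such as SADD, SUNION and SINTER. The key naming scheme and function names are my own assumptions, not taken from the linked article.

```python
from collections import defaultdict

# In-memory stand-in for Redis sets: key -> set of user IDs.
store = defaultdict(set)


def minute_key(minute):
    # Hypothetical key naming scheme: one set per minute.
    return f"online:{minute}"


def mark_active(user_id, minute):
    """Record that user_id was active during the given minute."""
    store[minute_key(minute)].add(user_id)


def online_friends(friend_ids, current_minute, window=5):
    """Return the friends seen in any of the last `window` minutes."""
    recent = set()
    for m in range(current_minute - window + 1, current_minute + 1):
        recent |= store.get(minute_key(m), set())  # union of per-minute sets
    return recent & set(friend_ids)  # intersect with the friends set


mark_active("alice", 100)
mark_active("bob", 97)
mark_active("carol", 90)  # too long ago to count
mark_active("dave", 99)   # active, but not a friend

print(sorted(online_friends({"alice", "bob", "carol"}, current_minute=100)))
# prints ['alice', 'bob']
```

Keeping one set per minute means stale activity can be discarded by dropping old keys, rather than tracking a timestamp per user.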

Feb. 15, 2010

Revisiting the click track. Paul Lamere uses the new Echo Nest API to access analysis data for music tracks and plot the beats per minute, making it easy to spot bands or drummers using a click track or drum machine to stay in tempo.

# 9:35 am / clicktrack, music, echonest
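
A small sketch of the underlying arithmetic, assuming the analysis data includes a timestamp (in seconds) for each beat: the tempo between two beats is 60 divided by the gap between them, and a track played to a click should show very little variation. The beat times below are invented for illustration and the function names are hypothetical. This does not call the Echo Nest API.

```python
def beat_tempos(beat_times):
    """Tempo in BPM for each interval between consecutive beats."""
    return [60.0 / (b - a) for a, b in zip(beat_times, beat_times[1:])]


def tempo_spread(beat_times):
    """Difference between the fastest and slowest interval tempo."""
    tempos = beat_tempos(beat_times)
    return max(tempos) - min(tempos)


# Invented beat timestamps, in seconds.
steady = [0.0, 0.5, 1.0, 1.5, 2.0]       # evenly spaced, like a click track
drifting = [0.0, 0.5, 1.02, 1.5, 2.05]   # a drummer speeding up and slowing down

print(tempo_spread(steady))
print(tempo_spread(drifting))
```

Plotting the output of `beat_tempos` over time gives the kind of chart described: a flat line suggests a click track, a wobbly one suggests a human keeping time unaided.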

At this point all I could honestly tell you from the point of view of the editor of several of the HTML5 documents being held up is that the W3C have said they won't publish without the objections being resolved, and that the objection is from Adobe. I can't even tell what I could do to resolve the objection. It seems to be entirely a process-based objection.

Ian Hickson

# 7:38 pm / ian-hickson, adobe, hixie, html5, w3c, canvas, process

No part of HTML5 is, or was ever, "blocked" in the W3C HTML Working Group -- not HTML5, not Canvas 2D Graphics, not Microdata, not Video -- not by me, not by Adobe. Neither Adobe nor I oppose, are fighting, are trying to stop, slow down, hinder, oppose, or harm HTML5, Canvas 2D Graphics, Microdata, video in HTML, or any of the other significant features in HTML5. Claims otherwise are false. Any other disclaimers needed?

Larry Masinter

# 9:31 pm / adobe, html5, canvas, larry-masinter, w3c

The Widening HTML5 Chasm. Simon St. Laurent’s commentary on the HTML5/Adobe situation. The most interesting piece I’ve read on it so far.

# 9:51 pm / html5, simon-st-laurent, adobe, w3c, whatwg

Feb. 16, 2010

Some questions about the “blocking” of HTML5

Some background reading. I was planning to fill in answers as they arrive, but I screwed up the moderation of the comments and got flooded with detailed responses—I strongly recommend reading the comments.

The magic of sub-editors. A neat illustration of how sub-editors work their magic, using the original article with strikes through the parts that were edited out.

# 10:44 am / subeditors, john-graham-cumming, science, writing

A Collection Of Redis Use Cases. Lots of interesting case studies here, collated by Mathias Meyer. Redis clearly shines for anything involving statistics or high volumes of small writes.

# 3:04 pm / redis, nosql, mathiasmeyer

Django Advent. I can’t believe I haven’t linked to this already—Django Advent is “a series of articles about upcoming releases of the Django web framework”. Seven have been posted so far, covering topics from 1.2 including multi-db, messages, object permissions and natural keys.

# 4:06 pm / django, djangoadvent

Feb. 17, 2010

A new global visual language for the BBC’s digital services. Detailed explanation of the BBC’s new “visual language” for their digital properties.

# 12:34 pm / design, bbc

Werewolf: How a parlour game became a tech phenomenon. The legendary “everyone’s a villager” game from Foo Camp ’08 gets a write-up in Wired.

# 5:30 pm / foocamp, werewolf, wired, games

How To Node. New blog about Node.js, with a superb series of tutorials aimed at both experienced and new JavaScript developers. The stuff on managing callbacks (including running them in both series and parallel) is pretty eye-opening.

# 5:42 pm / node, javascript, callbacks, nodejs

do. A library for Node that adds a higher level abstraction for dealing with chained and parallel callbacks.

# 5:43 pm / do, node, nodejs, javascript

The Case For An Older Woman. OK Cupid’s fascinating statistics blog uses cleverly plotted aggregate data from the dating site to illustrate the difference in age tastes between the genders (men try to date younger women) and show why that might not be the best strategy. An infographics tour-de-force.

# 10:20 pm / dating, graphs, data, infographics, okcupid

Search Engine Time Machine. Detailed explanation of how ElasticSearch provides high availability, through clever sharding and replication strategies and configurable gateways for long-term persistent storage.

# 10:32 pm / elasticsearch, highavailability, scaling, search

Feb. 19, 2010

Making Facebook 2x Faster. Facebook have a system called BigPipe which allows them to progressively send their pages to the browser as the server-side processing completes to optimise client loading time. Anyone reverse engineered this yet to figure out how they actually do it?

# 9:14 am / bigpage, facebook, performance, optimisation
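
I don't know how Facebook actually does it, so the following is only a sketch of the general idea of sending page fragments as soon as each one finishes rendering. It is not BigPipe itself. The section names, delays and function names are all invented, and a real server would write each chunk to the HTTP response rather than collect them in a list.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed


def render_section(name, delay):
    """Pretend to do slow server-side work, then return an HTML fragment."""
    time.sleep(delay)
    return f'<div id="{name}">{name} ready</div>'


def stream_page(sections):
    """Yield the page in chunks, each section as soon as it is done."""
    yield "<html><body>"  # the page skeleton goes out immediately
    with ThreadPoolExecutor() as pool:
        futures = [pool.submit(render_section, n, d) for n, d in sections]
        for future in as_completed(futures):  # completion order, not submit order
            yield future.result()
    yield "</body></html>"


chunks = list(stream_page([("slow", 0.2), ("fast", 0.01)]))
for chunk in chunks:
    print(chunk)
```

The point of the generator is that the browser can start work on the first chunk while the slower sections are still being built on the server.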

jacobian’s django-deployment-workshop. Notes and resources from Jacob’s 3 hour Django deployment workshop at PyCon, including example configuration files for Apache2 + mod_wsgi, nginx, PostgreSQL and pgpool.

# 2:28 pm / django, python, deployment, pycon, jacob-kaplan-moss, sysadmin, apache, modwsgi, nginx, postgresql, pgpool

Feb. 22, 2010

Ryan Tomayko on Github’s development process. In the comments: a fascinating insight into how GitHub’s “developers work on whatever is most interesting to them” process manages to achieve really good results.

# 9:18 am / github, ryan-tomayko, process, agile
