Simon Willison’s Weblog

Blogmarks

Guardian + Lucene = Similar Articles + Categorisation. Alf Eaton loaded 13,000 Guardian articles tagged Science into Solr and Lucene and is using Solr’s MoreLikeThisHandler to find related articles and automatically apply Guardian tags to Nature News articles.

# 11th March 2009, 12:53 pm / alf-eaton, full-text-search, guardian, lucene, naturenews, openplatform, search, solr, tagging

django-gae2django. An implementation of the Google App Engine API (datastore, memcache, urlfetch, users and mail) that runs on Django, allowing you to take an existing application written for App Engine and deploy it on your own server on top of Django.

# 9th March 2009, 3:37 pm / django, gae2django, google, google-app-engine

What happened to Hot Standby? Hot Standby (the ability to have read-only replication slaves) has been dropped from PostgreSQL 8.4 and is now scheduled for 8.5. “Making hard decisions to postpone features which aren’t quite ready is how PostgreSQL makes sure that our DBMS is ‘bulletproof’ and that we release close to on-time every year”.

# 8th March 2009, 9:28 am / databases, hotstandby, josh-berkus, postgresql, replication, scaling

Lovecraftian School Board Member Wants Madness Added To Curriculum. “West says the school inadequately prepares students for the black seas of infinity.”

# 7th March 2009, 11:11 am / funny, lovecraft, the-onion

Imminent Death of the Net Predicted. Well, maybe not, but the way Windows Vista deals with round-robin DNS A records (using a new IPv6 algorithm from RFC3484 backported to IPv4) means that domains that serve up multiple A records to load balance between data centres will find that the IP nearest to the 192.168.* range will get the vast majority of Vista traffic.
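
To make the effect concrete, here is a minimal Python sketch of the longest-matching-prefix rule from RFC 3484 (Rule 9 of destination address selection) applied to IPv4 addresses. It is an illustration only, not Vista's implementation, and it ignores every other rule in the RFC. The function names and the example addresses (taken from the documentation ranges) are my own assumptions.

```python
import ipaddress


def common_prefix_len(a: str, b: str) -> int:
    """Return how many leading bits two IPv4 addresses have in common."""
    differing = int(ipaddress.IPv4Address(a)) ^ int(ipaddress.IPv4Address(b))
    # bit_length() gives the position of the highest differing bit.
    return 32 - differing.bit_length()


def order_destinations(source: str, destinations: list[str]) -> list[str]:
    """Sort A records so the one sharing the longest prefix with source comes first.

    sorted() is stable, so records with equal prefix lengths keep the order
    in which the DNS server returned them.
    """
    return sorted(destinations, key=lambda d: -common_prefix_len(source, d))


# A client behind NAT with a 192.168.* source address (hypothetical values).
records = ["203.0.113.1", "198.51.100.1", "192.0.2.1"]
print(order_destinations("192.168.1.10", records))
```

Because so many home clients sit behind NAT with a 192.168.* source address, they would all compute the same ordering regardless of how the DNS server rotates the records, which is why round-robin stops spreading the load.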

# 5th March 2009, 9:50 am / dns, microsoft, networking, vista, windows

It’s time for a change. Jacob Kaplan-Moss is joining Revolution Systems, who will now be offering professional Django support “to companies who need a Django expert on staff, but can’t afford someone full-time.”

# 4th March 2009, 10:30 pm / django, jacob-kaplan-moss, python, support

Panda Tuesday; The History of the Panda, New APIs, Explore and You. Flickr’s Rainbow Vomiting Panda of Awesomeness now has a family of associated APIs.

# 4th March 2009, 11:49 am / apis, flickr, pandas

Combine JSONP and jQuery to quickly build powerful mashups. jQuery’s JSONP support is one of my favourite little-known features of the library.

# 3rd March 2009, 3:17 pm / javascript, jquery, json, jsonp

Django snippets: Smart {% if %} template tag. Chris Beaven's drop-in replacement for Django's {% if %} tag that adds comparison operators (less than, greater than, not equal, etc.) while staying backwards compatible with the less able original. I love it. This is one place where I no longer favour Django's stated philosophy: I think it's perfectly reasonable to use comparisons in presentation logic, and I've found that in my own code the lack of an advanced if tag frequently leads to pure presentation logic sneaking into my view functions.

# 3rd March 2009, 3:03 pm / chris-beaven, django, if, python, templating

How search.twitter.com uses Varnish. Includes examples of the configuration options they use.

# 2nd March 2009, 5:08 pm / caching, search, twitter, varnish

Database Sharding at Netlog, with MySQL and PHP. Detailed MySQL sharding case study from Netlog, who serve five billion page requests a month using thousands of shards across more than 80 database servers.

# 2nd March 2009, 10:22 am / databases, mysql, netlog, php, scaling, sharding

jQuery Sparklines. Delightful Sparklines implementation, using canvas or VML in IE. A neat nod towards unobtrusiveness as well: you can specify your data as comma-separated values inside a span, then use a single jQuery method call to convert the span into a sparkline image.

# 27th February 2009, 8:43 pm / canvas, gareth-watts, graphs, javascript, jquery, sparklines, vml

Magic properties make Firefox synchronously load the Java plugin. Even defining a function called sun() (or several other symbols) will trigger the Java VM to be loaded, dramatically hurting the performance of your page.

# 27th February 2009, 4:03 pm / firefox, java, javascript, mark-pilgrim, performance

How FriendFeed uses MySQL to store schema-less data. The pain of altering or adding indexes on tables with 250 million rows was killing their ability to try out new features, so they’ve moved to storing pickled Python objects and manually creating the indexes they need as denormalised two-column tables. These can be created and dropped much more easily, and are continually populated by an off-line index building process.
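
A rough sketch of the pattern as I understand it from the description above: one table of opaque pickled blobs, plus a separate two-column table per indexed field that can be dropped and rebuilt at will. This is not FriendFeed's code. I'm using an in-memory SQLite database as a stand-in for MySQL so the example runs anywhere, and the table and function names are invented for illustration.

```python
import pickle
import sqlite3
import uuid


def connect() -> sqlite3.Connection:
    """Open an in-memory database holding a single table of pickled entities."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE entities (id TEXT PRIMARY KEY, body BLOB NOT NULL)")
    return db


def save_entity(db: sqlite3.Connection, entity: dict) -> str:
    """Store the whole dict as a pickled blob; the database never sees its fields."""
    entity_id = entity.setdefault("id", uuid.uuid4().hex)  # adds an id to the dict if missing
    db.execute(
        "INSERT OR REPLACE INTO entities (id, body) VALUES (?, ?)",
        (entity_id, pickle.dumps(entity)),
    )
    return entity_id


def build_index(db: sqlite3.Connection, field: str) -> None:
    """(Re)build a two-column index table for one field, as an offline job might."""
    if not field.isidentifier():  # the field name is interpolated into SQL below
        raise ValueError(f"unsafe field name: {field!r}")
    table = f"index_{field}"
    db.execute(f"DROP TABLE IF EXISTS {table}")
    db.execute(
        f"CREATE TABLE {table} (value TEXT, entity_id TEXT, PRIMARY KEY (value, entity_id))"
    )
    for entity_id, body in db.execute("SELECT id, body FROM entities").fetchall():
        entity = pickle.loads(body)
        if field in entity:
            db.execute(
                f"INSERT INTO {table} (value, entity_id) VALUES (?, ?)",
                (str(entity[field]), entity_id),
            )


def find_by(db: sqlite3.Connection, field: str, value) -> list[dict]:
    """Look entities up through the index table, then unpickle the matching blobs."""
    if not field.isidentifier():
        raise ValueError(f"unsafe field name: {field!r}")
    rows = db.execute(
        f"SELECT e.body FROM index_{field} AS i "
        "JOIN entities AS e ON e.id = i.entity_id WHERE i.value = ?",
        (str(value),),
    ).fetchall()
    return [pickle.loads(body) for (body,) in rows]
```

Adding a new queryable field is then a call to `build_index(db, "user")` rather than an ALTER TABLE against the main table, which is the property the post highlights.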

# 27th February 2009, 2:33 pm / bret-taylor, databases, friendfeed, mysql, python, scaling, sharding

Kestrel. Twitter’s Robey Pointer rewrote their Starling message queue in 1500 lines of Scala, adding reliable fetch (where consumers can confirm their receipt of an item) and blocking fetches, which reduce the need for consumers to poll for updates (and hence solve my only beef with the original Starling). I haven’t tried running this on a low-spec VPS yet but it looks very promising.
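
For anyone unfamiliar with the two ideas, here is a small in-process Python sketch of what reliable and blocking fetches mean. It says nothing about how Kestrel implements them or what its wire protocol looks like; the class and method names are hypothetical.

```python
import itertools
import threading
from collections import deque


class ReliableQueue:
    """Toy queue showing blocking fetch and confirm/abort semantics."""

    def __init__(self):
        self._items = deque()
        self._open = {}  # transaction id -> item handed out but not yet confirmed
        self._ids = itertools.count(1)
        self._cond = threading.Condition()

    def put(self, item):
        with self._cond:
            self._items.append(item)
            self._cond.notify()  # wake one consumer blocked in get()

    def get(self, timeout=None):
        """Blocking fetch: wait up to `timeout` seconds instead of polling.

        Returns (transaction_id, item), or None if nothing arrived in time.
        """
        with self._cond:
            if not self._cond.wait_for(lambda: bool(self._items), timeout=timeout):
                return None
            item = self._items.popleft()
            txn = next(self._ids)
            self._open[txn] = item  # remembered until the consumer confirms
            return txn, item

    def confirm(self, txn):
        """Reliable fetch: the consumer processed the item, so forget it."""
        with self._cond:
            del self._open[txn]

    def abort(self, txn):
        """The consumer failed: put the item back at the head of the queue."""
        with self._cond:
            self._items.appendleft(self._open.pop(txn))
            self._cond.notify()
```

A consumer loop would call `get(timeout=...)`, do its work, then `confirm()` on success or `abort()` on failure, so an item is not lost if the worker fails halfway through.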

# 26th February 2009, 10:20 am / kestrel, message-queues, robey-pointer, scala, starling, twitter

What I’ve Learned from Hacker News. I’m always fascinated by online community war stories.

# 25th February 2009, 11:16 pm / community, hacker-news, paul-graham

The Cost of Accessibility. Drew McLellan comments on the seemingly inevitable march towards JavaScript-dependent applications, and argues that JavaScript frameworks such as Cappuccino have a duty to integrate accessibility into their core.

# 25th February 2009, 10:31 pm / accessibility, cappuccino, drew-mclellan, javascript

django-springsteen and Distributed Search. Will Larson’s Django search library currently just talks to Yahoo! BOSS, but is designed to be extensible for other external search services. Interestingly, it uses threads to fire off several HTTP requests in parallel from within the Django view.
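
The general shape of that technique, sketched in Python under some loud assumptions: I'm using the standard library's `ThreadPoolExecutor` rather than whatever threading code django-springsteen actually contains, and the backends are plain callables standing in for real HTTP calls so the example runs without a network. `search_all` and the backend names are made up for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable

# A backend takes a query string and returns a list of result titles.
Backend = Callable[[str], list[str]]


def search_all(query: str, backends: dict[str, Backend]) -> dict[str, list[str]]:
    """Run every backend in its own thread and collect results by backend name.

    A backend that raises contributes an empty list instead of failing the
    whole search. Each callable is expected to enforce its own HTTP timeout.
    """
    results: dict[str, list[str]] = {}
    with ThreadPoolExecutor(max_workers=max(1, len(backends))) as pool:
        # Submit everything first so the requests overlap in time.
        futures = {name: pool.submit(fn, query) for name, fn in backends.items()}
        for name, future in futures.items():
            try:
                results[name] = future.result()
            except Exception:
                results[name] = []
    return results


# Stand-in backends; a real one would make an HTTP request to a search service.
def fake_boss(query: str) -> list[str]:
    return [f"{query} (web result)"]


def fake_internal(query: str) -> list[str]:
    return [f"{query} (internal result)"]


print(search_all("django", {"boss": fake_boss, "internal": fake_internal}))
```

Inside a view the total wait is roughly that of the slowest backend rather than the sum of all of them, which is the appeal of doing this per request.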

# 25th February 2009, 10:28 pm / concurrency, django, djangospringsteen, http, python, search, threads, will-larson, yahoo-boss

FAPWS3-0.2 (WSGI server based on libev). Another strong contender for Python’s answer to Mongrel: 3500 requests/s for static files, 43 for simple dynamic (Django-powered) pages and 4.8 for a heavy SQL query, all benchmarked with 300 concurrent requests.

# 25th February 2009, 10:21 pm / django, fapws, mongrel, python, webservers, wsgi

Building and Scaling a Startup on Rails: 12 Things We Learned the Hard Way. Lessons learned from Posterous. Some good advice in here, in particular “Memcache later: If you memcache first, you will never feel the pain and never learn how bad your database indexes and Rails queries are”. Also recommends using job queues for offline processing of anything that takes more than 200ms.

# 23rd February 2009, 8:28 am / memcache, message-queues, posterous, rails, scaling

Oscars 2009: the interactive results | guardian.co.uk. My latest project for the Guardian, put together on very short notice. Updates live as the results are announced, and allows Twitter users to vote on their favourite for each category by sending a specially formatted message to @guardianfilm—jQuery and Ajax polling against S3 under the hood.

# 23rd February 2009, 2:19 am / guardian, javascript, jquery, oscars, projects, s3, twitter

jQuery.Rule (via) jQuery plugin for manipulating stylesheet rules. For me, this is the single most important piece of functionality currently missing from the core jQuery API. The ability to add new CSS rules makes an excellent complement to the .live() method added in jQuery 1.3.

# 22nd February 2009, 5:53 pm / arielflesler, css, javascript, jquery, plugins

Introducing the Karmic Koala, our mascot for Ubuntu 9.10 (via) Ubuntu 9.10 will have a strong focus on cloud computing, including tools for easily creating EC2 AMIs and Eucalyptus, an open-source system for running an EC2-compatible cloud in your own data centre.

# 21st February 2009, 5:19 pm / cloud-computing, ec2, eucalyptus, karmickoala, linux, mark-shuttleworth, ubuntu

jQuery 1.3.2 release notes. Not just a bug fix—there are a number of subtle behaviour changes, including to the :visible/:hidden selectors and the appendTo/prependTo/*To family of methods. I strongly recommend testing and reviewing those changes before upgrading.

# 21st February 2009, 4:42 pm / javascript, jquery

Mapping with Isotype (via) I hadn’t heard of Isotype (International System of Typographic Picture Education), a beautiful pictographic language created in the 1930s. This Isotype-inspired atlas is pretty spectacular.

# 21st February 2009, 11:09 am / design, isotype, mapping

Map Maker for Developers. Tiles from Google’s Map Maker crowdsourcing effort are now available in the JS and static maps APIs on an opt-in basis. Maybe I’m misunderstanding something here, but Google Map Maker seems like a big step backwards for open geographic data. People donate their mapping efforts to Google, who keep them—unlike OpenStreetMap, where the donated efforts are made available under a Creative Commons license.

# 21st February 2009, 9:05 am / creativecommons, crowdsourcing, google, googlemapmaker, google-maps-api, openstreetmap, staticmaps

The History of Python: Adding Support for User-defined Classes. Guido designed the run-time representation first, and tried to design the syntax to include as few new parsing concepts as possible. The origins of explicit self are also explained.

# 18th February 2009, 11 pm / css-classes, guido-van-rossum, object-oriented-programming, python

DB2 support for Django is coming. From IBM, under the Apache 2.0 License. I’m not sure if this makes it hard to bundle it with the rest of Django, which uses the BSD license.

# 18th February 2009, 10:58 pm / antonio-cangiano, bsd, databases, db2, django, ibm, licenses, open-source, orm, python

Found in space. The Astrometry bot on Flickr (which detects which part of the night sky is contained within your photo and adds notes to some of the more interesting stars) is the most delightful use of the Flickr API I’ve ever seen. This interview provides some background, including a link to a paper on the “scale and rotation invariant hashing algorithm” that is used to build the index.

# 18th February 2009, 10:52 pm / astrometry, astronomy, flickr
