Simon Willison’s Weblog

Blogmarks

Tracking UK Liberal Indecency. The mashup I’ve been waiting for: Tom Hume used the Guardian Content API to track swearword usage over time.

# 2nd April 2009, 4:44 pm / contentapi, guardian, mashup, obscenity, swearing, tom-hume

Google uncloaks once-secret server. Instead of a data-centre-wide UPS and redundant power supplies, each Google server has its own 12V battery. The servers live in standard shipping containers, each holding 1,160 servers.

# 2nd April 2009, 10:47 am / datacentres, google, operations, power, servers, ups

Heap Dump Analysis. Using jmap to dump the JVM’s memory to disk, then analysing it using the visualvm GUI tool.

# 2nd April 2009, 10:34 am / dominic-mitchell, heapdump, java, jvm, memory, profiling, visualvm

Amazon Elastic MapReduce (via) Hadoop as a service: essentially a web-based GUI around Hadoop. You could roll this yourself on EC2, but for a small markup on regular EC2 prices you avoid the extra work of setting everything up. Data processing scripts can be written in Java, Ruby, Perl, Python, PHP, R, or C++ and are loaded into S3 before firing off the job.
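As a rough illustration of what such a data processing script might look like in Python, here is a minimal word-count sketch in the Hadoop Streaming style, where the mapper and reducer exchange tab-separated lines. The function names and the word-count task are my own assumptions for illustration; nothing here is specific to Elastic MapReduce.

```python
from itertools import groupby


def mapper(lines):
    """Emit one 'word<TAB>1' record per word in the input lines."""
    for line in lines:
        for word in line.split():
            yield f"{word.lower()}\t1"


def reducer(records):
    """Sum the counts per word; records must arrive sorted by key."""
    parsed = (record.split("\t", 1) for record in records)
    for word, group in groupby(parsed, key=lambda pair: pair[0]):
        yield f"{word}\t{sum(int(count) for _, count in group)}"


if __name__ == "__main__":
    sample = ["the quick brown fox", "the lazy dog"]
    # Hadoop sorts mapper output by key before the reduce phase;
    # sorted() stands in for that shuffle step in this local run.
    for line in reducer(sorted(mapper(sample))):
        print(line)
```

In a real streaming job the mapper and reducer would typically be separate scripts reading stdin and writing stdout; they are combined here only so the sketch runs locally.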

# 2nd April 2009, 10:25 am / amazon, amazon-web-services, cloud-computing, ec2, hadoop, mapreduce, s3

Continuous deployment in 5 easy steps. A classic case of a number in a title making the article look less interesting than it actually is. Lots of useful information here from IMVU’s Eric Ries.

# 1st April 2009, 12:25 am / continuous-deployment, eric-ries, ivmu, testing

How to use Django with Apache and mod_wsgi. My favourite deployment option is now included in the official Django docs, thanks to Alex Gaynor. I tend to run a stripped-down Apache with mod_wsgi behind an nginx proxy, and have nginx serve static files directly. This avoids the need for a completely separate media server (although a separate media domain is still a good idea for better client-side performance).
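For readers new to this setup: mod_wsgi loads a Python script file and, by default, looks in it for a callable named `application`. The sketch below uses a plain standard-library WSGI callable rather than Django's own handler, so that it runs without Django installed; in a Django project the callable would come from Django as described in the linked docs. The `call_app` helper is a hypothetical convenience for exercising the app without a server.

```python
from wsgiref.util import setup_testing_defaults


def application(environ, start_response):
    """Minimal WSGI callable; mod_wsgi looks for this name by default."""
    body = b"Hello from behind the proxy\n"
    start_response("200 OK", [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    return [body]


def call_app(app, path="/"):
    """Invoke a WSGI app once without a server (illustrative helper)."""
    environ = {}
    setup_testing_defaults(environ)  # fills in the required WSGI keys
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = dict(headers)

    chunks = app(environ, start_response)
    return captured["status"], captured["headers"], b"".join(chunks)


if __name__ == "__main__":
    status, headers, body = call_app(application)
    print(status, body.decode())
```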

# 1st April 2009, 12:24 am / alex-gaynor, deployment, django, modwsgi, nginx, proxies, python, wsgi

Dojo 1.3 now available. Looks like an excellent release. dojo.create is particularly nice. I’d be interested to know why something similar has never shipped with jQuery (presumably there’s a reason), as it feels a lot more elegant than gluing together an HTML-style string. Also interesting: you can swap between Dojo’s Acme selector engine and John Resig’s Sizzle.

# 1st April 2009, 12:19 am / acme, dojo, dojocreate, javascript, jquery, releases, selectors, sizzle

My Guardian OpenPlatform API’n’Data Hacks’n’Mashups Roundup. Superb collection of Guardian Open Platform mashups from Tony Hirst, all of which use free online tools such as Yahoo! Pipes and Many Eyes. We invited Tony in to give a tech talk at the Guardian last week.

# 31st March 2009, 10:04 pm / guardian, manyeyes, mashups, openplatform, tony-hirst, yahoo-pipes

Special Events in jQuery. How to add a custom “tripleclick” event to jQuery, using the jQuery.event.special extension hook.

# 30th March 2009, 10:15 am / brandon-aaron, events, javascript, jquery

Help! My iPod thinks I’m emo—Part 1. Detailed write-up of one of my favourite panels from this year’s SxSW, on music recommendation engines.

# 30th March 2009, 10:11 am / music, recommendation, sxsw

ProjectPlan—unladen-swallow. A branch of Python 2.6 aiming to radically improve performance (the target is a 5x improvement), by compiling Python to machine code using LLVM’s JIT engine. I think this is a Google 20% time project (or maybe not, see the comments). An early version without LLVM is already available for download.

# 30th March 2009, 10:09 am / google, jit, llvm, performance, python, unladenswallow

Development virtual machines on OS X using VMWare and Ubuntu. Bradley Wright provides detailed instructions for getting the JeOS (VM-optimised) flavour of Ubuntu running with VMWare tools so you don’t need to run Samba just to share your desktop.

# 24th March 2009, 2:31 pm / bradley-wright, jeos, ubuntu, virtualisation, vmware, vmwarefusion

Future roadmap for mod_wsgi. mod_wsgi 3.0 isn’t too far off, and will include Python 3.0 support, WSGI application preloading and internal web server redirection (similar to nginx X-Accel-Redirect). Version 4.0 plans a major architectural change that will allow multiple versions of Python to be run from the same Apache.

# 19th March 2009, 5:27 pm / apache, graham-dumpleton, modwsgi, nginx, python, wsgi

Building Fast Client-side Searches. Flickr now lazily loads your entire contact list into memory for auto-completion. Extensive benchmarking found that a control-character-delimited string was the fastest option for shipping thousands of contacts around as quickly as possible.
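To make the format concrete, here is a small sketch of packing contacts into a single control-character-delimited string and splitting it back out for prefix matching. Flickr's implementation is JavaScript; this Python version only illustrates the idea, and the separator characters, field layout and function names are assumptions of mine rather than details from the article.

```python
FIELD_SEP = "\x1f"   # ASCII unit separator (assumed choice)
RECORD_SEP = "\x1e"  # ASCII record separator (assumed choice)


def pack(contacts):
    """Serialise (id, name) pairs into one delimited string."""
    return RECORD_SEP.join(f"{cid}{FIELD_SEP}{name}" for cid, name in contacts)


def unpack(payload):
    """Split the payload back into (id, name) tuples."""
    if not payload:
        return []
    return [tuple(record.split(FIELD_SEP, 1))
            for record in payload.split(RECORD_SEP)]


def autocomplete(contacts, prefix):
    """Return names where any word starts with the prefix (case-insensitive)."""
    prefix = prefix.lower()
    return [name for _, name in contacts
            if any(word.startswith(prefix) for word in name.lower().split())]


if __name__ == "__main__":
    contacts = [("1", "Ada Lovelace"), ("2", "Alan Turing"), ("3", "Grace Hopper")]
    payload = pack(contacts)
    print(autocomplete(unpack(payload), "a"))
```

The appeal of this format is that parsing is a pair of plain string splits, with no quoting or escaping to handle, provided the separators never appear in the data.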

# 19th March 2009, 3:35 pm / ajax, autocomplete, flickr, javascript, json

Pwn2Own trifecta: Hacker exploits IE8, Firefox, Safari. You just can’t trust browser security: Current versions of Safari, IE8 and Firefox all fell to zero-day flaws at an exploit competition. None of the vulnerabilities have been disclosed yet.

# 19th March 2009, 3:30 pm / browsers, firefox, ie8, internet-explorer, pwn2own, safari, security

Parrot 1.0.0 “Haru Tatsu” Released! Parrot hits 1.0! Does anyone know how complete Pynie, the Python implementation on top of Parrot, is?

# 19th March 2009, 3:17 pm / parrot, pynie, python

Streams, affordances, Facebook, and rounding errors. I asked Kellan about scaling activity streams the other day. Here he suggests the best technique is not to promise a perfect stream (like Twitter does)—Facebook used to get away with 80% loss of update messages, but their new redesign has changed the contract with their users.

# 19th March 2009, 2:02 pm / activitystreams, facebook, kellan-elliott-mccrea, scaling, twitter

Load spikes and excessive memory usage in mod_python. “The final answer? Stop using mod_python, use mod_wsgi and run it with daemon mode instead. You will save yourself a lot of headaches by doing so.”

# 16th March 2009, 5:26 pm / apache, graham-dumpleton, modpython, modwsgi, python, wsgi

slippy faumaxion, take two. Mike Migurski made a slippy map using triangular tiles, based on the same principle as Buckminster Fuller’s famous Dymaxion World Map.

# 15th March 2009, 3:40 pm / buckminsterfuller, faumaxion, mapping, michal-migurski

Parallel merge sort in Erlang. Thoughts on an Erlang-y way of implementing a combined activity stream (e.g. Facebook and Twitter). Activity streams are a Really Hard Problem; as far as I know there’s no best practice for implementing them yet.
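The merge step at the heart of a combined stream can be sketched in a few lines. This is a sequential k-way merge in Python using `heapq.merge`, not the parallel Erlang approach from the linked post, and the `(timestamp, source, text)` event shape is an assumption for illustration. Each input stream is expected to be sorted newest first already.

```python
import heapq
from itertools import islice


def combined_stream(streams, limit=10):
    """Merge per-source streams (each newest-first) into one newest-first list."""
    # reverse=True tells heapq.merge the inputs are sorted largest-first.
    merged = heapq.merge(*streams, key=lambda event: event[0], reverse=True)
    # The merge is lazy, so only `limit` events are ever pulled.
    return list(islice(merged, limit))


if __name__ == "__main__":
    facebook = [(5, "facebook", "posted a photo"), (2, "facebook", "joined a group")]
    twitter = [(4, "twitter", "tweeted"), (3, "twitter", "replied"), (1, "twitter", "followed")]
    for event in combined_stream([facebook, twitter], limit=3):
        print(event)
```

The hard part of activity streams is keeping each per-source list available and sorted at scale; the merge itself is the easy bit.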

# 15th March 2009, 1:36 pm / activitystreams, erlang, facebook, twitter

Fixing IE by porting Canvas to Flash. Implementing canvas using Flash is an obvious step, but personally I’m much more interested in an SVG renderer using Flash that finally brings non-animated SVGs to IE.

# 15th March 2009, 1:34 pm / canvas, flash, internet-explorer, svg

redis (via) An in-memory scalable key/value store but with an important difference: this one lets you perform list and set operations against keys, opening up a whole new set of possibilities for application development. It’s very young but already supports persistence to disk and master-slave replication.
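To show why list and set values matter compared with a plain string-only key/value store, here is a toy in-memory model in Python. It is not Redis and does not use a Redis client; the class and method names are hypothetical and only mimic the kind of operations described above.

```python
from collections import defaultdict, deque


class ToyStore:
    """In-memory stand-in (not Redis): values can be lists or sets."""

    def __init__(self):
        self.lists = defaultdict(deque)
        self.sets = defaultdict(set)

    def push_front(self, key, value):
        """Prepend a value to the list stored at key."""
        self.lists[key].appendleft(value)

    def list_range(self, key, start, stop):
        """Return list items from index start to stop, inclusive."""
        return list(self.lists[key])[start:stop + 1]

    def set_add(self, key, member):
        """Add a member to the set stored at key."""
        self.sets[key].add(member)

    def set_intersection(self, *keys):
        """Return members common to the sets at all the given keys."""
        return set.intersection(*(self.sets[k] for k in keys))


if __name__ == "__main__":
    store = ToyStore()
    for post_id in ("post:1", "post:2", "post:3"):
        store.push_front("recent", post_id)
    print(store.list_range("recent", 0, 1))  # the two newest posts

    store.set_add("tag:python", "post:1")
    store.set_add("tag:python", "post:3")
    store.set_add("tag:scaling", "post:3")
    print(store.set_intersection("tag:python", "tag:scaling"))
```

With these primitives, a "latest items" list or a tag intersection becomes a single operation on the store rather than a read-modify-write cycle in application code.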

# 15th March 2009, 1:32 pm / keyvaluepairs, masterslave, redis, replication, salvatore-sanfilippo, scaling

Concurrence. Exciting: a Python framework for “creating massively concurrent network applications” (the tutorial benchmarks a Hello World web server at over 8,000 requests a second). It’s implemented on top of libevent using pyrex, can run on either Stackless Python or Greenlets from the py library and ships with a WSGI server, an HTTP client and a DBAPI 2.0 compliant MySQL driver.

# 15th March 2009, 1:28 pm / greenlets, http, libevent, mysql, pyrex, python, stacklesspython, wsgi

Ruby on Rails 2.3 Release Notes. I’m impressed with how thoroughly Rails has embraced Rack (Ruby’s standardised web framework API, inspired by Python’s WSGI).

# 15th March 2009, 1:22 pm / python, rack, rails, ruby, wsgi

maps from scratch. An idea whose time has come: using EC2 AMIs for tutorial sessions to give everyone a pre-configured environment.

# 15th March 2009, 1:20 pm / cloud-computing, ec2, mapping, michal-migurski, tutorials

Southerly Breezes. Andrew Godwin is slowly assimilating the best ideas from other Django migration systems into South. The latest additions include ORM Freezing from Migratory and automatic change detection. Exciting stuff.

# 15th March 2009, 1:17 pm / andrew-godwin, databases, django, migrations, orm, south

Understanding Bidirectional (BIDI) Text in Unicode. It turns out you need to sanitise user input to ensure it contains no Unicode characters that switch your site’s regular text to RTL.
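A minimal sketch of that kind of sanitising in Python, assuming the simplest policy of stripping directional control characters outright. The list of code points is my own selection of Unicode's bidi marks, embeddings, overrides and isolates, not something taken from the linked article, and the function name is hypothetical.

```python
# Directional control characters (assumed list, see lead-in).
BIDI_CONTROLS = {
    "\u061c",            # ARABIC LETTER MARK
    "\u200e", "\u200f",  # LEFT-TO-RIGHT MARK, RIGHT-TO-LEFT MARK
    "\u202a", "\u202b",  # LEFT-TO-RIGHT / RIGHT-TO-LEFT EMBEDDING
    "\u202c",            # POP DIRECTIONAL FORMATTING
    "\u202d", "\u202e",  # LEFT-TO-RIGHT / RIGHT-TO-LEFT OVERRIDE
    "\u2066", "\u2067",  # LEFT-TO-RIGHT / RIGHT-TO-LEFT ISOLATE
    "\u2068", "\u2069",  # FIRST STRONG ISOLATE, POP DIRECTIONAL ISOLATE
}


def strip_bidi_controls(text):
    """Remove directional control characters, leaving all other text intact."""
    return "".join(ch for ch in text if ch not in BIDI_CONTROLS)


if __name__ == "__main__":
    hostile = "username\u202egnp.exe"
    print(repr(strip_bidi_controls(hostile)))
```

Stripping is blunt: genuine right-to-left text such as Hebrew or Arabic is untouched, but legitimate uses of the marks in mixed-direction content are lost, so a site with many RTL users may prefer to balance or isolate the controls instead.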

# 15th March 2009, 4:37 am / bidi, cal-henderson, filtering, security, unicode, userinput

Practical, maintainable CSS (via) Nat’s posted slides and a video from her latest talk at last week’s Brighton Girl Geeks evening.

# 12th March 2009, 12:46 am / css, girlgeeks, natalie-downe

Get our full university data. “The Guardian’s university rankings are the most visited part of Education Guardian”—and now they’re available as a spreadsheet.

# 11th March 2009, 1:52 pm / datastore, guardian, leaguetables, openplatform, university
