Simon Willison’s Weblog


Tuesday, 7th November 2017

Feature Visualization (via) Another gorgeous paper published on Distill, the journal that prides itself on including interactive visualizations to help provide clear explanations of machine learning. # 8:48 pm

GOV.UK Registers (via) Canonical sources of “lists of information” intended for use by GDS teams building software for the UK government, but available for anyone. 17 registers are “ready for use”, 45 are “in progress”. Covers things like the FCO’s country list, the official list of prison estates, and DEFRA’s list of public bodies in England that manage drainage systems. # 3:31 pm

Pull request #4120 · python/cpython. I just had my first ever change merged into Python! It was a one-sentence documentation improvement (on how to cancel SQLite operations) but it was fascinating seeing how Python’s GitHub flow is set up: clever use of labels, plus a bot that automatically checks that you have signed a copy of their CLA. # 2:06 pm

Cloud SQL for PostgreSQL adds high availability and replication. Google Cloud Platform now offers PostgreSQL with automatic asynchronous disk-level replication to a separate instance in a different availability zone, via their new “Regional Disks” feature. Between this, Heroku, Citus and Amazon RDS, the appeal of a self-maintained PostgreSQL instance continues to fall. # 1:49 pm

Something is wrong on the internet. James Bridle takes a fascinating and deeply troubling dive into the world of Kids’ YouTube videos, which appear to be increasingly algorithmically generated and are evolving in a very dark direction. # 12:40 pm

In the official timeline, Peppa is appropriately reassured by a kindly dentist. In the version above, she is basically tortured, before turning into a series of Iron Man robots and performing the Learn Colours dance. A search for “peppa pig dentist” returns the above video on the front page, and it only gets worse from here.

James Bridle # 12:34 pm

The only thing that would have been nice is that after the project had been finished and the chip deployed, that someone from Intel would have told me, just as a courtesy, that MINIX 3 was now probably the most widely used operating system in the world on x86 computers. That certainly wasn’t required in any way, but I think it would have been polite to give me a heads up, that’s all.

Andrew S. Tanenbaum # 11:50 am

How Balanced does Database Migrations with Zero-Downtime. I’m fascinated by the idea of “pausing” traffic during a blocking site maintenance activity (like a database migration) and then un-pausing when the operation is complete, so end clients just see some of their requests taking a few seconds longer than expected. I first saw this trick described by Braintree. Balanced wrote about a neat way of doing this using just HAProxy, which lets you reconfigure the maxconn setting for your backend down to zero on the fly (causing traffic to be queued up) and then bring the setting back up again a few seconds later to un-pause those requests. # 11:36 am
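
To make the pause/un-pause idea concrete, here is a minimal pure-Python sketch of the behaviour, not of HAProxy itself. The `PausableGate` class, `demo` function and fake backend are hypothetical names invented for illustration. In a real deployment the gate would be the proxy, and the pause would presumably be triggered through HAProxy's admin socket rather than from application code.

```python
import asyncio


class PausableGate:
    """Hypothetical stand-in for a proxy that can hold requests in a queue."""

    def __init__(self):
        # When the event is set, traffic flows; when cleared, requests wait.
        self._open = asyncio.Event()
        self._open.set()

    def pause(self):
        # Roughly analogous to dropping the backend's connection limit to zero.
        self._open.clear()

    def resume(self):
        # Roughly analogous to restoring the original connection limit.
        self._open.set()

    async def handle(self, request_id, backend):
        # Requests arriving during a pause block here instead of failing.
        await self._open.wait()
        return await backend(request_id)


async def demo():
    gate = PausableGate()
    served = []

    async def backend(request_id):
        served.append(request_id)
        return request_id

    gate.pause()  # e.g. just before starting a blocking migration
    tasks = [asyncio.create_task(gate.handle(i, backend)) for i in range(3)]
    await asyncio.sleep(0.01)  # the "migration" runs while requests queue up
    served_during_pause = list(served)

    gate.resume()  # migration finished, release the queued requests
    results = await asyncio.gather(*tasks)
    return served_during_pause, results
```

Running `asyncio.run(demo())` shows that nothing reaches the backend while the gate is paused, and that every queued request completes once it is resumed. From the client's point of view the only visible effect is extra latency.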

Secondary indexing with Redis. I haven’t seen this section of the official Redis documentation before, and it’s absolutely fantastic: well worth reading the whole thing. It talks through various ways in which you can set up indexes in Redis, mainly by leaning on sorted sets, which it turns out sort members that share the same score lexicographically by their raw bytes. This makes it easy to implement autocomplete with Redis, but if you use them creatively you can implement subject/predicate/object graph searches or even N-dimensional range queries as well. # 2 am
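
The autocomplete trick can be sketched without a Redis server. The following in-memory stand-in (the `LexIndex` class is a hypothetical name, not part of any Redis client) keeps members sorted by raw bytes, as a sorted set does when every member has the same score, and answers prefix queries with a range scan. Against a real server, the equivalent would presumably be adding every term with score 0 and querying with `ZRANGEBYLEX`.

```python
import bisect


class LexIndex:
    """In-memory imitation of a Redis sorted set whose members all share one score."""

    def __init__(self):
        self._members = []  # list of bytes, kept in byte-wise sorted order

    def add(self, term):
        key = term.encode("utf-8")
        i = bisect.bisect_left(self._members, key)
        # Sorted sets hold each member once, so skip duplicates.
        if i == len(self._members) or self._members[i] != key:
            self._members.insert(i, key)

    def complete(self, prefix, limit=10):
        lo = prefix.encode("utf-8")
        # 0xFF never occurs in valid UTF-8, so prefix + b"\xff" sorts after
        # every member that starts with the prefix.
        hi = lo + b"\xff"
        start = bisect.bisect_left(self._members, lo)
        end = bisect.bisect_left(self._members, hi)
        return [m.decode("utf-8") for m in self._members[start:end][:limit]]


index = LexIndex()
for word in ["banana", "bar", "baz", "apple", "bar"]:
    index.add(word)

suggestions = index.complete("ba")
```

Because the members are ordered byte by byte, all completions of a prefix sit next to each other, so a lookup is two binary searches plus a slice.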

I’m a Unicorn. I got to try out Animoji on an iPhone X, and it was amazing. # 1:50 am