Simon Willison’s Weblog

Items in 2019


Language support on Glitch: a list (via) This is really useful: it’s essentially “Glitch: the missing manual” for running languages other than JavaScript. The Glitch community forums are a gold mine of useful information like this. # 23rd April 2019, 4:28 pm

Running Datasette on Glitch

The worst part of any software project is setting up a development environment. It’s by far the biggest barrier for anyone trying to get started learning to code. I’ve been a developer for more than twenty years and I still feel the pain any time I want to do something new.

[... 998 words]

Lots of people calling for more aggressive moderation seem to imagine that if they yell enough the companies have a thoughtful, unbiased and nuance-understanding HAL 9000 they can deploy. It’s really more like the Censorship DMV.

Alex Stamos # 21st April 2019, 4:36 pm

The Behavioral Change Stairway Model. BCSM is the FBI’s model for crisis negotiation, but it looks like it could be a useful negotiation framework for all kinds of other conflict mediation as well. # 19th April 2019, 5:46 pm

In Kākāpō breeding season news…. I posted on MetaFilter about this year’s record-breaking Kākāpō breeding season. # 19th April 2019, 3:11 am

Exploring Neural Networks with Activation Atlases. Another promising attempt at visualizing what’s going on inside a neural network. # 19th April 2019, 2:24 am

Using the HTML lang attribute (via) TIL the HTML lang attribute is used by screen readers to understand how to provide the correct accent and pronunciation. # 18th April 2019, 9:09 pm

How Zoom’s web client avoids using WebRTC (via) It turns out video conferencing app Zoom uses their own WebAssembly compiled video and audio codecs and transmits H264 over WebSockets. # 18th April 2019, 6:20 pm

An Intro to Threading in Python (via) Real Python consistently produces really comprehensive, high quality articles and tutorials. This is an excellent introduction to threading in Python, covering threads, locks, queues, ThreadPoolExecutor and more. # 18th April 2019, 5:24 am
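As a quick taste of two of the pieces mentioned above, here is a minimal sketch (not taken from the Real Python article) that uses a ThreadPoolExecutor to run several workers and a Lock to protect a shared counter. The names add_many and run are made up for this example.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

counter = 0
counter_lock = threading.Lock()


def add_many(n):
    """Increment the shared counter n times, taking the lock for each update."""
    global counter
    for _ in range(n):
        with counter_lock:
            counter += 1
    return n


def run(workers=4, per_worker=10_000):
    """Reset the counter, run the workers in a thread pool, return the totals."""
    global counter
    counter = 0
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # list() collects the results in submission order; leaving the
        # with block waits for every submitted task to finish.
        results = list(pool.map(add_many, [per_worker] * workers))
    return counter, results


if __name__ == "__main__":
    total, results = run()
    print(total, results)
```

The lock matters because counter += 1 is a read-modify-write operation, so unsynchronised threads could lose updates.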

Pyodide: Bringing the scientific Python stack to the browser (via) More fun with WebAssembly: Pyodide attempts (and mostly succeeds) to bring the full Python data stack to the browser: CPython, NumPy, Pandas, Scipy, and Matplotlib. Also includes interesting bridge tools for e.g. driving a canvas element from Python. Really interesting project from the Firefox Data Platform team. # 17th April 2019, 4:23 am

Wasmer: a Python library for executing WebAssembly binaries. This is a really interesting new tool: “pip install wasmer” and now you can load code that has been compiled to WebAssembly and call those functions directly from Python. It’s built on top of the wasmer universal WebAssembly runtime, written over just the past year in Rust by a team led by Syrus Akbary, the author of the Graphene GraphQL library for Python. # 16th April 2019, 6:04 pm

ripgrep is faster than {grep, ag, git grep, ucg, pt, sift} (via) Andrew Gallant’s post from September 2016 introducing ripgrep, the command-line grep tool he wrote using Rust (on top of the Rust regular expression library also written by Andrew). ripgrep has a beautifully designed CLI and is crazy fast, and this post describes how it gets its performance in a huge amount of detail, right down to comparing the different algorithmic approaches used by other similar tools. I recently learned that ripgrep ships as part of VS Code, which is why VS Code’s search-across-project feature is so fast. In fact, if you dig around in the OS X package you can find the rg binary already installed on your Mac: find /Applications/Visual* | grep bin/rg # 16th April 2019, 5:52 pm

Datasette: ?_where=sql-fragment parameter for table views. I just shipped a small but really useful new feature to Datasette master: you can now add ?_where=sql-fragment to the URL of any table view to inject additional SQL directly into the underlying WHERE clause. It has some really interesting applications: I created it because I wanted to be able to run more complex custom SQL queries without losing access to the conveniences of Datasette’s table view, in particular the built-in faceting support. The feature fits in well with Datasette’s philosophy of allowing arbitrary SQL to be executed against a read-only database, and you can turn this ability off using the allow_sql config flag. # 13th April 2019, 2 am
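A SQL fragment usually contains spaces, quotes and percent signs, so it needs URL-encoding before it goes into the query string. The sketch below shows one way to build such a URL with the standard library. The host, database, table and column names are all hypothetical, and table_url_with_where is a helper invented for this example, not part of Datasette.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def table_url_with_where(base_table_url, sql_fragment, **extra_params):
    """Return base_table_url with a URL-encoded _where= parameter appended.

    Assumes base_table_url has no query string of its own.
    """
    params = {"_where": sql_fragment, **extra_params}
    return f"{base_table_url}?{urlencode(params)}"


# Hypothetical instance, database, table and columns, for illustration only.
url = table_url_with_where(
    "https://datasette.example.com/mydb/trees",
    "planted_year > 2010 and species like '%oak%'",
    _facet="species",  # combine the custom WHERE clause with faceting
)
print(url)
```

Adding _facet alongside _where reflects the motivation above: the custom filter is applied while the table view's faceting stays available.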

How to Create an Index in Django Without Downtime (via) Excellent advanced tutorial on Django migrations, which uses a desire to create indexes in PostgreSQL without locking the table (with CREATE INDEX CONCURRENTLY) to explain the SeparateDatabaseAndState and atomic features of Django’s migration framework. # 11th April 2019, 3:06 pm

Using 6 Page and 2 Page Documents To Make Organizational Decisions (via) I’ve been thinking a lot recently about the challenges of efficiently getting to consensus within a larger organization spread across multiple locations and time zones. This model described by Ian Nowland based on his experience at AWS seems very promising. The goal is to achieve a decision or “disagree and commit” consensus using a max 6 page document and a one hour meeting. The first fifteen minutes of the meeting are dedicated to silently reading the document—if you’ve read it already you are given the option of arriving fifteen minutes late. # 11th April 2019, 3:46 am

Ministry of Silly Runtimes: Vintage Python on Cloud Run (via) Cloud Run is an exciting new hosting service from Google that lets you define a container using a Dockerfile and then run that container in a “scale to zero” environment, so you only pay for time spent serving traffic. It’s similar to the now-deprecated Zeit Now 1.0 which inspired me to create Datasette. Here Dustin Ingram demonstrates how powerful Docker can be as the underlying abstraction by deploying a web app using a 25 year old version of Python 1.x. # 9th April 2019, 5:33 pm

Generator Tricks for Systems Programmers (via) David Beazley’s definitive generators tutorial from 2008, updated for Python 3.7 in October 2018. # 9th April 2019, 5:13 pm
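The core idea is chaining generators into a processing pipeline, where each stage pulls items lazily from the one before it. Here is a minimal sketch of that pattern (written for this note, not copied from the tutorial), summing the byte counts of successful requests in a made-up log. The log format and function names are assumptions.

```python
import re


def grep(pattern, lines):
    """Yield only the lines that match the regular expression."""
    regex = re.compile(pattern)
    return (line for line in lines if regex.search(line))


def last_field(lines):
    """Yield the final whitespace-separated field of each non-empty line."""
    for line in lines:
        parts = line.rsplit(None, 1)
        if parts:
            yield parts[-1]


def to_ints(values):
    """Yield values converted to int, skipping anything that is not all digits."""
    for value in values:
        if value.isdigit():
            yield int(value)


def total_bytes(lines):
    """Sum the byte counts of lines whose status code is 200."""
    return sum(to_ints(last_field(grep(r'" 200 ', lines))))


# Made-up sample data: client, request, status, bytes sent
log = [
    '127.0.0.1 "GET /index.html" 200 512',
    '127.0.0.1 "GET /missing" 404 -',
    '10.0.0.7 "GET /about.html" 200 2048',
]

if __name__ == "__main__":
    print(total_bytes(log))
```

Nothing is read until sum() starts pulling values, so the same pipeline works unchanged on an open file handle for a log too large to fit in memory.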

In the five years since the shark was erected, no other examples have occurred … any system of control must make some small place for the dynamic, the unexpected, the downright quirky. I therefore recommend that the Headington Shark be allowed to remain.

Peter Macdonald # 9th April 2019, 1:58 pm

What is a Self-XSS scam? Facebook links to this page from a console.log message that they display in the browser devtools console, specifically warning that “If someone told you to copy-paste something here to enable a Facebook feature or hack someone’s account, it is a scam and will give them access to your Facebook account.” # 8th April 2019, 6:01 pm

Colm MacCárthaigh tells the inside story of how AWS responded to Heartbleed. The Heartbleed SSL vulnerability came out five years ago. In this Twitter thread Colm, who was Amazon’s principal engineer for Elastic Load Balancer at the time, describes how the AWS team responded to something that “was scarier than any bug I’d ever seen”. It’s a cracking story. # 7th April 2019, 8:32 pm

tsv-utils (via) Powerful collection of CLI tools for processing TSV files, written in D for performance and released by eBay. Includes a csv2tsv conversion tool. You can download an archive of pre-built binaries for Linux and OS X from their releases page: worked fine on my Mac. # 7th April 2019, 8:29 pm

csv-diff 0.3.1 (via) I released a minor update to my csv-diff CLI tool today which does a better job of displaying a human-readable representation of rows that have been added or removed from a file—previously they were represented as an ugly JSON dump. My script monitoring changes to the official list of trees in San Francisco has been running for a month now and has captured 23 commits! # 7th April 2019, 8:03 pm

The problem with laziness: minimising performance issues caused by Django’s implicit database queries (via) The ability to accidentally execute further database queries by traversing objects from a Django template is a common source of unexpected performance regressions. django-zen-queries is a neat new library which provides a context manager for disabling database queries during a render (or elsewhere), forcing queries to be explicitly executed in view functions. # 3rd April 2019, 3:49 pm

zson (via) “ZSON is a PostgreSQL extension for transparent JSONB compression. Compression is based on a shared dictionary of strings most frequently used in specific JSONB documents [...] In some cases ZSON can save half of your disk space and give you about 10% more TPS.” # 2nd April 2019, 9:26 pm

For the Fairmont, the Tonga Room is an inherited embarrassment, as though it were a local lord whose ancestors captured a repellent goblin and chained him up in the cellar, but the goblin is inexplicably adored by the townsfolk and the children, who sneak the goblin food and treats, and cry when the goblin’s master moves to strike it.

In the Basement of the King # 28th March 2019, 9:11 pm

The Next CEO of Stack Overflow. “Including the Stack Exchange network of 174 sites, we have over 100 million monthly visitors. Every month, over 125,000 wonderful people write answers”—this fits the rule of thumb for user-generated content that only a tiny portion of your audience will actively create content: in this case it’s just 0.125% (one eighth of one percent). I’d love to know how many people are upvoting or performing other more lightweight interactions. # 28th March 2019, 3:12 pm
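The percentage is easy to check. Both quoted figures are lower bounds ("over"), so this is only a rough ratio:

```python
monthly_visitors = 100_000_000  # "over 100 million monthly visitors"
monthly_answerers = 125_000     # "over 125,000 wonderful people write answers"

share = monthly_answerers / monthly_visitors
visitors_per_answerer = round(monthly_visitors / monthly_answerers)

print(f"{share:.3%} of visitors write answers")
print(f"roughly 1 in {visitors_per_answerer} visitors")
```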

Programmer migration patterns. Avery Pennarun explores the history of modern programming languages and how developers have migrated from one to another over time. Lots of fun insights in this. # 28th March 2019, 4:59 am

VisiData (via) Intriguing tool by Saul Pwanson: VisiData is a command-line “textpunk utility” for browsing and manipulating tabular data. “pip3 install visidata” and then “vd myfile.csv” (or .json or .xls or SQLite or others) and get an interactive terminal UI for quickly searching through the data, conducting frequency analysis of columns, manipulating it and much more besides. Two tips if you start playing with it: hit “gq” to exit, and hit “Ctrl+H” to view the help screen. # 18th March 2019, 3:45 am

The Cloud and Open Source Powder Keg (via) Stephen O’Grady’s analysis of the Elastic vs. AWS situation, where Elastic started mixing their open source and non-open source code together and Amazon responded by releasing their own forked “open distribution for Elasticsearch”. World War One analogies included! # 17th March 2019, 7:08 pm

What the Hell is Going On? (via) David Perell discusses how the shift from information scarcity to information abundance is reshaping commerce, education, and politics. Long but worthwhile. # 17th March 2019, 4:50 pm