Simon Willison’s Weblog

Blogmarks in 2020

Filters: Type: blogmark × Year: 2020 ×

Replicating SQLite with rqlite (via) I’ve been trying out rqlite, a “lightweight, distributed relational database, which uses SQLite as its storage engine”. It’s written in Go and uses the Raft consensus algorithm to allow a cluster of nodes to elect a leader and replicate SQLite statements between them. By default it uses in-memory SQLite databases with an on-disk Raft replication log—here are my notes on running it in “on disk” mode as a way to run multiple Datasette processes against replicated SQLite database files. # 28th December 2020, 7:51 pm

How Shopify Uses WebAssembly Outside of the Browser (via) I’m fascinated by applications of WebAssembly outside the browser. As a Python programmer I’m excited to see native code libraries getting compiled to WASM in a way that lets me call them from Python code via a bridge, but the other interesting application is executing untrusted code in a sandbox. Shopify are doing exactly that—they are building a kind-of plugin mechanism where partner code compiled to WASM runs inside their architecture using Fastly’s Lucet. The performance numbers are in the same ballpark as native code. Also interesting: they’re recommending AssemblyScript, a TypeScript-style language designed to compile directly to WASM without needing any additional interpreter support, as required by dynamic languages such as JavaScript, Python or Ruby. # 19th December 2020, 4:46 pm

Commits are snapshots, not diffs (via) Useful, clearly explained revision of some Git fundamentals. # 17th December 2020, 10:01 pm (via) Datasette finally has an official project website, three years after the first release of the software. I built it using Datasette, with custom templates to define the various pages. The site includes news, latest releases, example sites and a new searchable plugin directory. # 11th December 2020, 4:11 am

Deno 1.6 Release Notes. Two signature features in Deno 1.6 worth paying attention to: a built-in language server for code editors like VS Code, and the “deno compile” command which can build Deno JavaScript/TypeScript projects into standalone binaries. The ability to build binaries has turned out to be a killer feature of both Go and Rust, so seeing it ship as a default capability of a interpreted dynamic language is fascinating. I would love it if Python followed Deno’s example. # 10th December 2020, 1:25 am

The case against client certificates (via) Colm MacCárthaigh provides a passionately argued Twitter thread about client certificates and why they should be avoided. I tried using them as an extra layer of protection fir my personal Dogsheep server and ended up abandoning them—certificate management across my devices was too fiddly. # 9th December 2020, 2:41 pm

Cameras and Lenses (via) Fabulous explotable interactive essay by Bartosz Ciechanowski explaining how cameras and lenses work. # 8th December 2020, 3:38 am

The secrets of Monkey Island’s source code (via) To celebrate the thirty year anniversary of the Secret of Monkey Island the Video Game History Foundation interviewed developer Rod Gilbert and produced this comprehensive collection of cut content and material showing how the game was originally constructed. # 5th December 2020, 4:32 pm

Command Line Interface Guidelines (via) Aanand Prasad, Ben Firshman, Carl Tashian and Eva Parish provide the missing manual for designing CLI tools in 2020. Deeply researched and clearly presented—I picked up a bunch of useful tips and ideas from reading this, and I’m looking forward to applying them to my own CLI projects. # 4th December 2020, 8:44 pm

Scaling Datastores at Slack with Vitess (via) Slack spent three years migrating 99% of their MySQL query load to run against Vitess, the open source MySQL sharding system originally built by YouTube. “Today, we serve 2.3 million QPS at peak. 2M of those queries are reads and 300K are writes. Our median query latency is 2 ms, and our p99 query latency is 11 ms.” # 1st December 2020, 9:30 pm

New for AWS Lambda – Container Image Support. “You can now package and deploy Lambda functions as container images of up to 10 GB in size”—can’t wait to try this out with Datasette. # 1st December 2020, 5:34 pm

Why is Apple’s M1 Chip So Fast? (via) This explanation by Erik Engheim is exactly the right level of nerdery for me. # 30th November 2020, 11:20 pm

Datasette Weekly volume 4. Datasette Office Hours, Personal Data Warehouses, datasette-ripgrep, datasette-indieauth, Datasette Client for Observable and more. # 30th November 2020, 11:02 pm

Datasette Office Hours (via) I’ve decided to try running Datasette Office Hours every Friday. If you’d like to chat with me over Zoom about the project for 20 minutes for any reason at all you can grab a slot here using Calendly. # 30th November 2020, 11:01 pm

Datasette 0.52. A relatively small release—it has a new plugin hook (database_actions(), for adding links to a new database actions menu), renames the --config option to --setting and adds a new “datasette publish cloudrun --apt-get-install” option. # 29th November 2020, 12:56 am

Unravelling `not` in Python (via) Part of a series where Brett Cannon looks at how fundamental Python syntactic sugar works, including a clearly explained dive into the underlying op codes and C implementation. # 27th November 2020, 5:59 pm

Datasette Client for Observable (via) Really elegant piece of code design from Alex Garcia: DatasetteClient is a client library he built designed to work in Observable notebooks, which uses JavaScript tagged template literals to allow SQL query results to be executed against a Datasette instance and displayed as inline tables in a notebook, or used to return JSON data for further processing. His example notebook includes a neat d3 stacked area chart example built against a Datasette of congresspeople, plus examples using interactive widgets to update the Notebook. # 24th November 2020, 6:53 pm

datasette-graphql 1.2 (via) A new release of the datasette-graphql plugin, fixing a minor security flaw: previous versions of the plugin could expose the schema (but not the actual data) of tables in databases that were otherwise protected by Datasette’s permission system. # 21st November 2020, 10:21 pm

I Lived Through A Stupid Coup. America Is Having One Now (via) If, like me, you have been avoiding the word “coup” since it feels like a clear over-reaction to what’s going on, I challenge you to read this piece and not change your mind. # 21st November 2020, 1:21 pm

The trouble with transaction.atomic (via) David Seddon provides a detailed explanation of Django’s nestable transaction.atomic() context manager and describes a gotcha that can occur if you lose track of whether your code is already running in a transaction block, since you may be working with savepoints instead—along with some smart workarounds. # 20th November 2020, 3:57 pm

Internet Archive Software Library: Flash (via) A fantastic new initiative from the Internet Archive: they’re now archiving Flash (.swf) files and serving them for modern browsers using Ruffle, a Flash Player emulator written in Rust and compiled to WebAssembly. They are fully interactive and audio works too. Considering the enormous quantity of creative material released in Flash over the decades this helps fill a big hole in the Internet’s cultural memory. # 19th November 2020, 9:19 pm

Security vulnerability in datasette-indieauth: Implementation trusts the “me” field returned by the authorization server without verifying it. I spotted a critical security vulnerability in my new datasette-indieauth plugin: it accepted the “me” profile URL value returned from the authorization server in the final step of the IndieAuth flow without verifying it, which means a malicious server could imitate any user. I’ve shipped 1.1 with a fix and posted a security advisory to the GitHub repository. # 19th November 2020, 9:14 pm

Amstelvar (via) A real showcase of what variable fonts can do: this open source font by David Berlow has 17 different variables controlling many different aspects of the font. # 17th November 2020, 3:24 pm

Ok Google: please publish your DKIM secret keys (via) The DKIM standard allows email providers such as Gmail to include cryptographic headers that protect against spoofing, proving that an email was sent by a specific host and has not been tampered with. But it has an unintended side effect: if someone’s email is leaked (as happened to John Podesta in 2016) DKIM headers can be used to prove the validity of the leaked emails. This makes DKIM an enabling factor for blackmail and other security breach related crimes. Matthew Green proposes a neat solution: providers like Gmail should rotate their DKIM keys frequently and publish the PRIVATE key after rotation. By enabling spoofing of past email headers they would provide deniability for victims of leaks, fixing this unintended consequence of the DKIM standard. # 16th November 2020, 10:02 pm

CoronaFaceImpact (via) Variable fonts are fonts that can be customized by passing in additional parameters, which is done in CSS using the font-variation-settings property. Here’s a ​variable font that shows multiple effects of Covid-19 lockdown on a bearded face, created by Friedrich Althausen. # 15th November 2020, 10:41 pm

The Cleanest Trick for Autogrowing Textareas (via) This is a very clever trick. Textarea content is mirrored into a data attribute using a JavaScript one-liner, then a visibility: hidden ::after element clones that content using content: attr(data-replicated-value). The hidden element exists in a CSS grid with the textarea which allows the textarea to resize within the grid when the hidden element increases its height. # 14th November 2020, 5:24 am

Hunting for Malicious Packages on PyPI (via) Jordan Wright installed all 268,000 Python packages from PyPI in containers, and ran Sysdig to capture syscalls made during installation to see if any of them were making extra network calls or reading or writing from the filesystem. Absolutely brilliant piece of security engineering and research. # 14th November 2020, 4:48 am

Intent to Remove: HTTP/2 and gQUIC server push (via) The Chrome / Blink team announce their intent to remove HTTP/2 server push support, where servers can start pushing an asset to a client before it has been requested. It’s been in browsers for over five years now and adoption is terrible. “Over the past 28 days [...] 99.97% of connections never received a pushed stream that got matched with a request [...] These numbers are exactly the same as in June 2019”. Datasette serves redirects with Link: preload headers that cause smart proxies (like Cloudflare) to push the redirected page to the client along with the redirect, but I don’t exepect to miss that optimization if it quietly stops working. # 12th November 2020, 1:44 am

nyt-2020-election-scraper. Brilliant application of git scraping by Alex Gaynor and a growing team of contributors. Takes a JSON snapshot of the NYT’s latest election poll figures every five minutes, then runs a Python script to iterate through the history and build an HTML page showing the trends, including what percentage of the remaining votes each candidate needs to win each state. This is the perfect case study in why it can be useful to take a “snapshot if the world right now” data source and turn it into a git revision history over time. # 6th November 2020, 2:24 pm