Simon Willison’s Weblog


Items in 2022



Single dependency stacks (via) Brandur Leach notes that the core services at Crunchy (admittedly a PostgreSQL hosting and consultancy company) have only one stateful dependency – Postgres. No Redis, ElasticSearch or anything else. This means that problems like rate limiting and search, which are often farmed out to external services, are all handled using either PostgreSQL or in-memory mechanisms on their servers. # 9th February 2022, 6:43 pm
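
As an illustration of the in-memory option for rate limiting, here is a minimal token bucket sketch. The post does not say which algorithm Crunchy uses, so the token bucket approach, the `TokenBucket` class and its parameters are all assumptions made for this example.

```python
import time


class TokenBucket:
    """Hypothetical in-memory limiter: `rate` requests per second, bursts up to `capacity`."""

    def __init__(self, rate: float, capacity: int, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        self.tokens = float(capacity)
        self.updated = clock()

    def allow(self) -> bool:
        now = self.clock()
        # Refill in proportion to elapsed time, capped at the bucket capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

State like this lives in a single process, so each server would enforce its own limit. A limit shared across servers would need the counters stored somewhere common, which in a single dependency stack means PostgreSQL.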

Sha256 Algorithm Explained (via) Absolutely beautiful interactive animated explanation by Domingo Martin of the SHA256 hashing algorithm. # 7th February 2022, 7:27 pm

Every few weeks, someone on Twitter notices how demented the content on Facebook is. I’ve covered a lot of these stories. The quick TL;DR is that Facebook’s video section is essentially run by a network of magicians and Vegas stage performers who hack the platform’s algorithm with surreal low-value content designed to distract users long enough to trigger an in-video advertisement and anger them enough to leave a comment.

Ryan Broderick # 5th February 2022, 10:41 pm

Help scraping: track changes to CLI tools by recording their --help using Git

I’ve been experimenting with a new variant of Git scraping this week which I’m calling Help scraping. The key idea is to track changes made to CLI tools over time by recording the output of their --help commands in a Git repository.

[... 978 words]
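
The key idea can be sketched in a few lines of Python. This is an illustration only, not the scripts from the full post: the `record_help` function, its arguments and the one-file-per-tool layout are assumptions.

```python
import subprocess
from pathlib import Path


def record_help(name: str, command: list[str], output_dir: Path) -> Path:
    """Run `command --help` and save the output as <output_dir>/<name>.txt."""
    result = subprocess.run([*command, "--help"], capture_output=True, text=True)
    output_dir.mkdir(parents=True, exist_ok=True)
    path = output_dir / f"{name}.txt"
    # Some tools print their help to stderr, so keep both streams.
    path.write_text(result.stdout + result.stderr)
    return path
```

Running this on a schedule inside a Git repository and committing the resulting files after each run would turn the commit history into a log of how each tool's options changed.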

webvm.io (via) This is one heck of a tech demo: it’s a full copy of Debian, compiled to WebAssembly and running in your browser. It’s fully stocked with Python, Perl, Ruby, Node.js and even a working gcc compiler! The underlying technology, CheerpX, is a closed-source WebAssembly virtualization platform. # 2nd February 2022, 2:29 am

Writing better release notes

Release notes are an important part of the open source process. I’ve been thinking about these a lot recently, and I’ve assembled some thoughts on how to do a better job with them.

[... 918 words]

A CGo-free port of SQLite. Fascinating Go version of SQLite, which uses Go code that has been translated from the original SQLite C using ccgo, a package by the same author which “translates cc ASTs to Go source code”. It claims to pass the full public SQLite test suite, which is very impressive. # 30th January 2022, 10:25 pm

Mypyc (via) Spotted this in the Black release notes: “Black is now compiled with mypyc for an overall 2x speed-up”. Mypyc is a tool that compiles Python modules (written in a subset of Python) to C extensions—similar to Cython but using just Python syntax, taking advantage of type annotations to perform type checking and type inference. It’s part of the mypy type checking project, which has been using it since 2019 to gain a 4x performance improvement over regular Python. # 30th January 2022, 1:31 am
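
For a sense of what mypyc takes as input, here is a small type-annotated module. The `fib` function and the `fib.py` filename are hypothetical examples, not code from Black or mypy.

```python
def fib(n: int) -> int:
    """Return the n-th Fibonacci number, with fib(0) == 0."""
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a
```

Saved as `fib.py`, this runs unchanged under the regular interpreter. Running `mypyc fib.py` should build a C extension that can be imported under the same module name, with the annotations used to generate specialized code. Any speed-up depends heavily on the workload.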

Black 22.1.0 (via) Black, the uncompromising code formatter for Python, has had its first stable non-beta release after almost four years of releases. I adopted Black a few years ago for all of my projects and I wouldn’t release Python code without it now—the productivity boost I get from not spending even a second thinking about code formatting and indentation is huge.

I know Django has been holding off on adopting it until a stable release was announced, so hopefully that will happen soon. # 30th January 2022, 1:23 am

The baseline for web development in 2022 (via) “TL;DR: The baseline for web development in 2022 is: low-spec Android devices in terms of performance, Safari from two years before in terms of Web Standards, and 4G in terms of networks. The web in general is not answering those needs properly, especially in terms of performance where factors such as an over-dependence on JavaScript are hindering our sites’ performance.” # 27th January 2022, 8:09 pm

Consistent with the practices outlined in SP 800-63B, agencies must remove password policies that require special characters and regular password rotation from all systems within one year of the issuance of this memorandum. These requirements have long been known to lead to weaker passwords in real-world use and should not be employed by the Federal Government.

Memo: Moving the U.S. Government Toward Zero Trust Cybersecurity Principles # 27th January 2022, 7:18 pm

Two reasons Kubernetes is so complex (via) I like how this article proposes that Kubernetes isn’t trying to be a tool for deploying containers—it’s more like an operating system for a cluster of machines, responsible for the same kind of goals as a regular operating system such as resource sharing and portability. And since everything is built as control loops which attempt to modify actual state to fit the declarative desired state, errors can occur asynchronously seconds or even minutes after the desired state has been updated. # 27th January 2022, 6:25 pm
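
The control loop idea can be shown with a toy reconciler. This is a sketch of the general pattern, not Kubernetes code: the `reconcile` and `run_until_converged` functions and the dictionaries of replica counts are invented for illustration.

```python
def reconcile(desired: dict[str, int], actual: dict[str, int]) -> list[str]:
    """One pass of a control loop: move each replica count one step toward the desired value."""
    actions = []
    for name in sorted(set(desired) | set(actual)):
        want = desired.get(name, 0)
        have = actual.get(name, 0)
        if have < want:
            actual[name] = have + 1
            actions.append(f"start {name}")
        elif have > want:
            actual[name] = have - 1
            actions.append(f"stop {name}")
    return actions


def run_until_converged(desired: dict[str, int], actual: dict[str, int], max_passes: int = 100) -> int:
    """Repeat reconcile() until a pass makes no changes; return the number of passes that did work."""
    passes = 0
    while passes < max_passes and reconcile(desired, actual):
        passes += 1
    return passes
```

Updating `desired` returns immediately, and the work happens in later passes. That gap is why a failure in one of those steps can surface well after the change that caused it.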

Weeknotes: python_requires, documentation SEO

Fixed Datasette on Python 3.6 for the last time. Worked on documentation infrastructure improvements. Spent some time with Fly Volumes.

[... 1497 words]

Observable Plot Cheatsheets (via) Beautiful new set of cheatsheets by Mike Freeman for the Observable Plot charting library. This is really top notch documentation—the cheatsheets are available as printable PDFs but the real value here is in the interactive versions of them, which include Observable-powered sliders to tweak the different examples and copy out the resulting generated code. # 25th January 2022, 10:12 pm

Roblox Return to Service 10/28-10/31 2021 (via) A particularly good example of a public postmortem on an outage. Roblox was down for 72 hours last year, as a result of an extremely complex set of circumstances which took a lot of effort to uncover. It’s interesting to think through what kind of monitoring you would need to have in place to help identify the root cause of this kind of issue. # 21st January 2022, 4:41 pm

How to Add a Favicon to Your Django Site (via) Adam Johnson did the research on the best way to handle favicons—Safari still doesn’t handle SVG icons so the best solution today is a PNG served from the /favicon.ico path. This article inspired me to finally add a proper favicon to Datasette. # 20th January 2022, 7:03 am

Tricking Postgres into using an insane – but 200x faster – query plan. Jacob Martin talks through a PostgreSQL query optimization they implemented at Spacelift, showing in detail how to interpret the results of EXPLAIN (FORMAT JSON, ANALYZE) using the explain.dalibo.com visualization tool. # 18th January 2022, 8:53 pm

Weeknotes: s3-credentials prefix and Datasette 0.60

A new release of s3-credentials with support for restricting access to keys that start with a prefix, Datasette 0.60 and a write-up of my process for shipping a feature.

[... 1134 words]

SQLime: SQLite Playground (via) Anton Zhiyanov built this useful mobile-friendly online playground for trying things out in SQLite. It uses the sql.js library which compiles SQLite to WebAssembly, so everything runs in the browser, and it also supports saving your work to Gists via the GitHub API. The JavaScript source code is fun to read: the site doesn’t use npm or Webpack or similar, opting instead to implement everything library-free using modern JavaScript modules and Web Components. # 17th January 2022, 7:08 pm

Abusing AWS Lambda to make an Aussie Search Engine (via) Ben Boyter built a search engine that only indexes .au Australian websites, with the novel approach of compiling the search index directly into 250 different Lambda functions written in Go, each around 40MB, then running searches across 12 million pages by farming them out to all of the lambdas and combining the results. His write-up includes all sorts of details about how he built this, including how he ran the indexer and how he solved the surprisingly hard problem of returning good-enough text snippets for the results. # 16th January 2022, 8:52 pm
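
The fan-out and merge pattern described here can be sketched as a toy scatter-gather search. The real project is written in Go and runs on Lambda; this Python version, with its in-memory shards and naive term-count scoring, is an assumption-laden stand-in.

```python
import heapq
from concurrent.futures import ThreadPoolExecutor


def search_shard(shard: dict[str, str], term: str) -> list[tuple[int, str]]:
    """Score each page in one shard by how often the term appears in its text."""
    hits = []
    for url, text in shard.items():
        score = text.lower().count(term.lower())
        if score:
            hits.append((score, url))
    return hits


def search(shards: list[dict[str, str]], term: str, limit: int = 10) -> list[tuple[int, str]]:
    """Query every shard in parallel, then merge the partial results into one ranked list."""
    with ThreadPoolExecutor() as pool:
        partials = list(pool.map(lambda shard: search_shard(shard, term), shards))
    merged = [hit for part in partials for hit in part]
    return heapq.nlargest(limit, merged)
```

Each shard here stands in for one function with its slice of the index compiled in, and the merge step is where results from all of them are combined and ranked.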

Writing a minimal Lua implementation with a virtual machine from scratch in Rust. Phil Eaton implements a subset of Lua in Rust in this detailed tutorial. # 15th January 2022, 6:29 pm

Datasette 0.60: The annotated release notes

I released Datasette 0.60 today. It’s a big release, incorporating 61 commits and 18 issues. Here are the annotated release notes.

[... 1119 words]

Announcing Parcel CSS: A new CSS parser, compiler, and minifier written in Rust! An interesting thing about tools like this being written in Rust is that since the Rust-to-WASM pipeline is well trodden at this point, the live demo that this announcement links to runs entirely in the browser. # 13th January 2022, 8:40 pm

How I build a feature

I’m maintaining a lot of different projects at the moment. I thought it would be useful to describe the process I use for adding a new feature to one of them, using the new sqlite-utils create-database command as an example.

[... 2779 words]

What’s new in sqlite-utils 3.20 and 3.21: --lines, --text, --convert

sqlite-utils is my combined CLI tool and Python library for manipulating SQLite databases. Consider this the annotated release notes for sqlite-utils 3.20 and 3.21, both released in the past week.

[... 2456 words]

Before May 2021, the master key in MetaMask was called the “Seed Phrase”. Through user research and insights from our customer support team, we have concluded that this name does not properly convey the critical importance that this master key has for user security. This is why we will be changing our naming of this master key to “Secret Recovery Phrase”. Through May and June of 2021, we will be phasing out the use of “seed phrase” in our application and support articles, and eventually exclusively calling it a “Secret Recovery Phrase.” No action is required, this is only a name change. We will be rolling this out on both the extension and the mobile app for all users.

MetaMask Support # 9th January 2022, 5:44 am

Hashids (via) Confusingly named because it’s not really a hash—this library (available in 40+ languages) offers a way to convert integer IDs to and from short strings of text based on a salt which, if kept secret, should help prevent people from deriving the IDs and using them to measure growth of your service. It works using a base62 alphabet that is shuffled using the salt. # 8th January 2022, 7:31 pm
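
To illustrate the shuffled-alphabet idea, here is a simplified sketch. It is not the actual Hashids algorithm and it is not cryptographically secure: the `shuffled_alphabet`, `encode` and `decode` functions, and the use of SHA-256 to seed the shuffle, are assumptions made for this example.

```python
import hashlib
import random
import string

BASE62 = string.digits + string.ascii_letters  # 62 characters


def shuffled_alphabet(salt: str) -> str:
    """Derive a deterministic permutation of the base62 alphabet from the salt."""
    seed = int.from_bytes(hashlib.sha256(salt.encode()).digest(), "big")
    chars = list(BASE62)
    random.Random(seed).shuffle(chars)
    return "".join(chars)


def encode(n: int, salt: str) -> str:
    """Write a non-negative integer in base 62 using the salted alphabet."""
    if n < 0:
        raise ValueError("only non-negative integers are supported")
    alphabet = shuffled_alphabet(salt)
    if n == 0:
        return alphabet[0]
    digits = []
    while n:
        n, remainder = divmod(n, 62)
        digits.append(alphabet[remainder])
    return "".join(reversed(digits))


def decode(text: str, salt: str) -> int:
    """Reverse encode() by looking up each character in the same salted alphabet."""
    alphabet = shuffled_alphabet(salt)
    n = 0
    for ch in text:
        n = n * 62 + alphabet.index(ch)
    return n
```

Without the salt, an observer sees short strings but cannot easily tell which characters stand for which digits. Since this is obfuscation rather than encryption, it should not be used to protect anything sensitive.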

Crypto creates a massively multiplayer online game where the game is “currency speculation”, and it’s very realistic because it really is money, at least if enough people get involved. [...] NFTs add another layer to the game. Instead of just currency speculation, you’re now simulating art speculation too! The fact that you don’t actually own the art and the fact that the art is randomly generated cartoon images of monkeys is entirely beside the point: the point is the speculation, and winning the game by making money. This is, again, a lot of fun to some people, and in addition to the piles of money they also in some very limited sense own a picture of a cartoon monkey that some people recognize as being very expensive, so they can brag without having to actually post screenshots of their bank balance, which nobody believed anyway.

Laurie Voss # 6th January 2022, 7:35 am

Weeknotes: Taking a break in Moss Landing

Took some time off. Saw some whales and sea otters. Added a new spot to Niche Museums.

[... 578 words]