Simon Willison’s Weblog

Items in Aug

Filters: Month: Aug ×


airtable-export. I wrote a command-line utility for exporting data from Airtable and dumping it to disk as YAML, JSON or newline delimited JSON files. This means you can backup an Airtable database from a GitHub Action and get a commit history of changes made to your data. # 29th August 2020, 9:48 pm

Mexican bean shakshuka

The first time I ever tried a shakshuka I cooked it from this recipe by J. Kenji López-Alt. It quickly became my favourite breakfast.

[... 448 words]

Weeknotes: California Protected Areas in Datasette

This week I built a geospatial search engine for protected areas in California, shipped datasette-graphql 1.0 and started working towards the next milestone for Datasette Cloud.

[... 1099 words]

California Protected Areas Database in Datasette (via) I built this yesterday: it’s a Datasette interface on top of the CPAD 2020 GIS database of protected areas in California maintained by GreenInfo Network. This was a useful excuse to build a GitHub Actions flow that builds a SpatiaLite database using my shapefile-to-sqlite tool, and I fixed a few bugs in my datasette-leaflet-geojson plugin as well. # 21st August 2020, 11:15 pm

Why weekly? You want to keep your finger on the pulse of what’s really going on. When 1:1s are scheduled bi-weekly, and either of you have to cancel, you’ll likely be going a month between conversations and that is far too long to go without having a 1:1 with your direct report. Think of how much happens in a month. You don’t want to be that far behind!

Adrienne Lowe # 21st August 2020, 5:02 pm

Weeknotes: Rocky Beaches, Datasette 0.48, a commit history of my database

This week I helped Natalie launch Rocky Beaches, shipped Datasette 0.48 and several releases of datasette-graphql, upgraded the CSRF protection for datasette-upload-csvs and figured out how to get a commit log of changes to my blog by backing up its database to a GitHub repository.

[... 1294 words]

Announcing the Consortium for Python Data API Standards (via) Interesting effort to unify the fragmented DataFrame API ecosystem, where increasing numbers of libraries offer APIs inspired by Pandas that imitate each other but aren’t 100% compatible. The announcement includes some very clever code to support the effort: custom tooling to compare the existing APIs, and an ingenious GitHub Actions setup to run traces (via sys.settrace), derive type signatures and commit those generated signatures back to a repository. # 19th August 2020, 5:48 am

Weeknotes: Installing Datasette with Homebrew, more GraphQL, WAL in SQLite

This week I’ve been working on making Datasette easier to install, plus wide-ranging improvements to the Datasette GraphQL plugin.

[... 1009 words]

We’re generally only impressed by things we can’t do—things that are beyond our own skill set. So, by definition, we aren’t going to be that impressed by the things we create. The end user, however, is perfectly able to find your work impressive.

@gamemakerstk # 13th August 2020, 3:12 pm

Datasette 0.46 (via) I just released Datasette 0.46 with a security fix for an issue involving CSRF tokens on canned query pages, plus a new debugging tool, improved file downloads and a bunch of other smaller improvements. # 9th August 2020, 4:57 pm

COVID-19 attacks our physical bodies, but also the cultural foundations of our lives, the toolbox of community and connectivity that is for the human what claws and teeth represent to the tiger.

Wade Davis # 8th August 2020, 3:48 pm

Pysa: An open source static analysis tool to detect and prevent security issues in Python code (via) Interesting new static analysis tool for auditing Python for security vulnerabilities—things like SQL injection and os.execute() calls. Built by Facebook and tested extensively on Instagram, a multi-million line Django application. # 7th August 2020, 8:50 pm

Design Docs at Google. Useful description of the format used for software design docs at Google—informal documents of between 3 and 20 pages that outline the proposed design of a new project, discuss trade-offs that were considered and solicit feedback before the code starts to be written. # 7th August 2020, 4:31 pm

GraphQL in Datasette with the new datasette-graphql plugin

This week I’ve mostly been building datasette-graphql, a plugin that adds GraphQL query support to Datasette.

[... 1249 words]

Zero Downtime Release: Disruption-free Load Balancing of a Multi-Billion User Website (via) I remain fascinated by techniques for zero downtime deployment—once you have it working it makes shipping changes to your software so much less stressful, which means you can iterate faster and generally be much more confident in shipping code. Facebook have invested vast amounts of effort into getting this right, and their new paper for the ACM SIGCOMM conference goes into detail about how it all works. # 5th August 2020, 3:27 am

When you talk with cheese aficionados, it doesn’t usually take long for the conversation to veer this way: away from curds, whey, and mold, and toward matters of life and death. With the zeal of nineteenth-century naturalists, they discuss great lineages and endangered species, painstakingly cataloguing those cheeses that are thriving and those that are lost to history.

Ruby Tandoh # 2nd August 2020, 6:57 pm

How a Cheese Goes Extinct (via) Ruby Tandoh writes for the New Yorker about the culture, history and anthropology of cheesemaking through the lens of the British cheese industry. I learned that two of my favourite British cheeses—Tymsboro and Innes Log, have sadly ceased production. Beautifully written. # 2nd August 2020, 5:51 pm

sqlite-utils 2.14 (via) I finally figured out porter stemming with SQLite full-text search today—it turns out it’s as easy as adding tokenize=’porter’ to the CREATE VIRTUAL TABLE statement. So I just shipped sqlite-utils 2.14 with a tokenize= option (plus the ability to insert binary file data from stdin). # 1st August 2020, 9:19 pm

The impact of crab mentality on performance was quantified by a New Zealand study in 2015 which demonstrated up to an 18% average exam result improvement for students when their grades were reported in a way that prevented others from knowing their position in published rankings.

Crab mentality on Wikipedia # 1st August 2020, 4:25 pm

James Bennett on why Django should not support JWT in core (via) The topic of adding JWT support to Django core comes up occasionally—here’s James Bennett’s detailed argument for not doing that. The short version is that the JWT specification isn’t just difficult to implement securely: it’s fundamentally flawed, which results in things like five implementations in three different languages all manifesting the same vulnerability. Third party modules exist that add JWT support to Django, but baking it into core would act as a form of endorsement and Django’s philosophy has always been to encourage people towards best practices. # 1st August 2020, 12:28 am

Subsume JSON a.k.a. JSON ⊂ ECMAScript (via) TIL that JSON isn’t a subset of ECMAScript after all! “In ES2018, ECMAScript string literals couldn’t contain unescaped U+2028 LINE SEPARATOR and U+2029 PARAGRAPH SEPARATOR characters, because they are considered to be line terminators even in that context.” # 15th August 2019, 10:30 am

Y’all decided you could send 6x as much script because the high-end could take it...but the next billion users can’t. There might have been budget for 2x, but not 6x. Not by a long shot.

Alex Russell # 9th August 2019, 6:53 am

OPP (Other People’s Problems) (via) Camille Fournier provides a comprehensive guide to picking your battles: in a large organization how can you navigate the enormous array of problems you can see that you’d like to fix, especially when so many of those problems aren’t directly in your area of control? # 7th August 2019, 1:58 pm

This is when I pull out “we don’t do that here.” It is a conversation ender. If you are the newcomer and someone who has been around a long time says “we don’t do that here”, it is hard to argue. This sentence doesn’t push my morality on anyone. If they want to do whatever it is elsewhere, I’m not telling them not to. I’m just cluing them into the local culture and values.

Aja Hammerly # 5th August 2019, 3:59 pm

Optimizing for the mobile web: Moving from Angular to Preact. Grubhub reduced their mobile web load times from 9-11s to 3-4s by replacing Angular with Preact (and replacing other libraries such as lodash with native JavaScript code). The conversion took 6 months and involved running Angular and Preact simultaneously during the transition—not a huge additional overhead as Preact itself is only 4KB. They used TypeScript throughout and credit it with providing a great deal of confidence and productivity to the overall refactoring. # 5th August 2019, 12:26 pm

Working with many-to-many relationships in sqlite-utils (via) I just released sqlite-utils 1.9 with syntactic sugar support for creating many-to-many relationships for records stored in SQLite databases. # 4th August 2019, 3:57 am

Logs vs. metrics: a false dichotomy (via) Nick Stenning discusses the differences between logs and metrics: most notably that metrics can be derived from logs but logs cannot be reconstituted starting with time-series metrics. # 3rd August 2019, 4:46 pm

Documentation needs to include and be structured around its four different functions: tutorials, how-to guides, explanation and technical reference. Each of them requires a distinct mode of writing. People working with software need these four different kinds of documentation at different times, in different circumstances—so software usually needs them all.

Daniele Procida # 3rd August 2019, 8:29 am

PyPI now supports uploading via API token (via) All of my open source Python libraries are set up to automatically deploy new tagged releases as PyPI packages using Circle CI or Travis, but I’ve always get a bit uncomfortable about sharing my PyPI password with those CI platforms to get this to work. PyPI just added scopes authentication tokens, which means I can issue a token that’s only allowed to upload a specific project and see an audit log of when that token was last used. # 1st August 2019, 4:03 pm

Advice for a new executive, by Chad Dickerson (via) Lara Hogan shares the advice she was given by Chad Dickerson (CTO and then CEO of Etsy) when she first became VP Engineering at Kickstarter. There is so much good material in here. I can vouch for the “peer support group” recommendation: Natalie and I benefited from that through Y Combinator and ended up building our own founder peer support group when we moved our startup back to London. Having a confidential trusted group with which to discuss the challenges of growing a company was invaluable. # 31st August 2018, 1:45 pm