Simon Willison’s Weblog

Items in Aug, 2020

Filters: Year: 2020 × Month: Aug ×

airtable-export. I wrote a command-line utility for exporting data from Airtable and dumping it to disk as YAML, JSON or newline delimited JSON files. This means you can backup an Airtable database from a GitHub Action and get a commit history of changes made to your data. # 29th August 2020, 9:48 pm

Mexican bean shakshuka

The first time I ever tried a shakshuka I cooked it from this recipe by J. Kenji López-Alt. It quickly became my favourite breakfast.

[... 448 words]

Weeknotes: California Protected Areas in Datasette

This week I built a geospatial search engine for protected areas in California, shipped datasette-graphql 1.0 and started working towards the next milestone for Datasette Cloud.

[... 1099 words]

California Protected Areas Database in Datasette (via) I built this yesterday: it’s a Datasette interface on top of the CPAD 2020 GIS database of protected areas in California maintained by GreenInfo Network. This was a useful excuse to build a GitHub Actions flow that builds a SpatiaLite database using my shapefile-to-sqlite tool, and I fixed a few bugs in my datasette-leaflet-geojson plugin as well. # 21st August 2020, 11:15 pm

Why weekly? You want to keep your finger on the pulse of what’s really going on. When 1:1s are scheduled bi-weekly, and either of you have to cancel, you’ll likely be going a month between conversations and that is far too long to go without having a 1:1 with your direct report. Think of how much happens in a month. You don’t want to be that far behind!

Adrienne Lowe # 21st August 2020, 5:02 pm

Weeknotes: Rocky Beaches, Datasette 0.48, a commit history of my database

This week I helped Natalie launch Rocky Beaches, shipped Datasette 0.48 and several releases of datasette-graphql, upgraded the CSRF protection for datasette-upload-csvs and figured out how to get a commit log of changes to my blog by backing up its database to a GitHub repository.

[... 1294 words]

Announcing the Consortium for Python Data API Standards (via) Interesting effort to unify the fragmented DataFrame API ecosystem, where increasing numbers of libraries offer APIs inspired by Pandas that imitate each other but aren’t 100% compatible. The announcement includes some very clever code to support the effort: custom tooling to compare the existing APIs, and an ingenious GitHub Actions setup to run traces (via sys.settrace), derive type signatures and commit those generated signatures back to a repository. # 19th August 2020, 5:48 am

Weeknotes: Installing Datasette with Homebrew, more GraphQL, WAL in SQLite

This week I’ve been working on making Datasette easier to install, plus wide-ranging improvements to the Datasette GraphQL plugin.

[... 1009 words]

We’re generally only impressed by things we can’t do—things that are beyond our own skill set. So, by definition, we aren’t going to be that impressed by the things we create. The end user, however, is perfectly able to find your work impressive.

@gamemakerstk # 13th August 2020, 3:12 pm

Datasette 0.46 (via) I just released Datasette 0.46 with a security fix for an issue involving CSRF tokens on canned query pages, plus a new debugging tool, improved file downloads and a bunch of other smaller improvements. # 9th August 2020, 4:57 pm

COVID-19 attacks our physical bodies, but also the cultural foundations of our lives, the toolbox of community and connectivity that is for the human what claws and teeth represent to the tiger.

Wade Davis # 8th August 2020, 3:48 pm

Pysa: An open source static analysis tool to detect and prevent security issues in Python code (via) Interesting new static analysis tool for auditing Python for security vulnerabilities—things like SQL injection and os.execute() calls. Built by Facebook and tested extensively on Instagram, a multi-million line Django application. # 7th August 2020, 8:50 pm

Design Docs at Google. Useful description of the format used for software design docs at Google—informal documents of between 3 and 20 pages that outline the proposed design of a new project, discuss trade-offs that were considered and solicit feedback before the code starts to be written. # 7th August 2020, 4:31 pm

GraphQL in Datasette with the new datasette-graphql plugin

This week I’ve mostly been building datasette-graphql, a plugin that adds GraphQL query support to Datasette.

[... 1249 words]

Zero Downtime Release: Disruption-free Load Balancing of a Multi-Billion User Website (via) I remain fascinated by techniques for zero downtime deployment—once you have it working it makes shipping changes to your software so much less stressful, which means you can iterate faster and generally be much more confident in shipping code. Facebook have invested vast amounts of effort into getting this right, and their new paper for the ACM SIGCOMM conference goes into detail about how it all works. # 5th August 2020, 3:27 am

When you talk with cheese aficionados, it doesn’t usually take long for the conversation to veer this way: away from curds, whey, and mold, and toward matters of life and death. With the zeal of nineteenth-century naturalists, they discuss great lineages and endangered species, painstakingly cataloguing those cheeses that are thriving and those that are lost to history.

Ruby Tandoh # 2nd August 2020, 6:57 pm

How a Cheese Goes Extinct (via) Ruby Tandoh writes for the New Yorker about the culture, history and anthropology of cheesemaking through the lens of the British cheese industry. I learned that two of my favourite British cheeses—Tymsboro and Innes Log, have sadly ceased production. Beautifully written. # 2nd August 2020, 5:51 pm

sqlite-utils 2.14 (via) I finally figured out porter stemming with SQLite full-text search today—it turns out it’s as easy as adding tokenize=’porter’ to the CREATE VIRTUAL TABLE statement. So I just shipped sqlite-utils 2.14 with a tokenize= option (plus the ability to insert binary file data from stdin). # 1st August 2020, 9:19 pm

The impact of crab mentality on performance was quantified by a New Zealand study in 2015 which demonstrated up to an 18% average exam result improvement for students when their grades were reported in a way that prevented others from knowing their position in published rankings.

Crab mentality on Wikipedia # 1st August 2020, 4:25 pm

James Bennett on why Django should not support JWT in core (via) The topic of adding JWT support to Django core comes up occasionally—here’s James Bennett’s detailed argument for not doing that. The short version is that the JWT specification isn’t just difficult to implement securely: it’s fundamentally flawed, which results in things like five implementations in three different languages all manifesting the same vulnerability. Third party modules exist that add JWT support to Django, but baking it into core would act as a form of endorsement and Django’s philosophy has always been to encourage people towards best practices. # 1st August 2020, 12:28 am