Simon Willison’s Weblog

February 2021

80 posts: 11 entries, 11 links, 6 quotes, 52 beats

Feb. 8, 2021

Finally, remember that whatever choice is made, you’re going to need to get behind it! You should be able to make a compelling positive case for any of the options you present. If there’s an option you can’t support, don’t present it.

Jacob Kaplan-Moss

# 3:21 pm / jacob-kaplan-moss, management

Feb. 10, 2021

Dependency Confusion: How I Hacked Into Apple, Microsoft and Dozens of Other Companies (via) Alex Birsan describes a new category of security vulnerability he discovered in the npm, pip and gem packaging ecosystems: if a company uses a private repository with internal package names, uploading a package with the same name to the public repository can often result in an attacker being able to execute their own code inside the networks of their target. Alex scored over $130,000 in bug bounties from this one, from a number of name-brand companies. Of particular note for Python developers: the --extra-index-url argument to pip will consult both public and private registries and install the package with the highest version number!
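To make the version-number point concrete, here is a toy model of that selection rule. It is a sketch, not pip's actual resolver: the index contents, the package name `acme-internal-utils` and both helper functions are hypothetical, and versions are assumed to be plain dotted integers.

```python
def parse_version(version):
    """Turn '1.4.2' into (1, 4, 2) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def pick_candidate(package, indexes):
    """Return (index_name, version) for the highest version found in any index.

    Every index is treated as equally trusted, which is the behaviour
    that makes the attack possible in this simplified model.
    """
    candidates = [
        (parse_version(version), index_name)
        for index_name, packages in indexes.items()
        for version in packages.get(package, [])
    ]
    if not candidates:
        return None
    version, index_name = max(candidates)
    return index_name, ".".join(str(part) for part in version)


# Hypothetical registry contents: the attacker publishes a very high
# version number under the same name on the public index.
INDEXES = {
    "private": {"acme-internal-utils": ["1.2.0", "1.3.1"]},
    "public": {"acme-internal-utils": ["9000.0.0"]},
}

CHOICE = pick_candidate("acme-internal-utils", INDEXES)
print(CHOICE)
```

In this model the public upload wins purely because its version number is larger, regardless of which index the package was meant to come from.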

# 8:42 pm / pip, python, security, npm

Feb. 11, 2021

Release datasette-tiles 0.6 — Mapping tile server for Datasette, serving tiles from MBTiles packages
Release datasette-tiles 0.6.1 — Mapping tile server for Datasette, serving tiles from MBTiles packages

Why I Built Litestream. Litestream is a really exciting new piece of technology by Ben Johnson, who previously built BoltDB, the key-value store written in Go that is used by etcd. It adds replication to SQLite by running a process that converts the SQLite WAL log into a stream that can be saved to another folder or pushed to S3. The S3 option is particularly exciting: Ben estimates that keeping a full point-in-time recovery log of a high-write SQLite database should cost on the order of a few dollars a month. I think this could greatly expand the set of use-cases for which SQLite is a sensible choice.
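For readers who have not used SQLite's write-ahead log, the following minimal sketch shows the file that a WAL-based replication tool would be reading. It uses only Python's standard `sqlite3` module and does not involve Litestream itself; the `enable_wal` helper, the `notes` table and the `demo.db` filename are hypothetical.

```python
import os
import sqlite3
import tempfile


def enable_wal(db_path):
    """Switch a database to WAL mode, write one row, and report what happened."""
    conn = sqlite3.connect(db_path)
    # The PRAGMA returns the journal mode that is now in effect.
    mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]
    with conn:  # commits on success
        conn.execute("CREATE TABLE IF NOT EXISTS notes (body TEXT)")
        conn.execute("INSERT INTO notes (body) VALUES (?)", ("hello",))
    # While a connection is open, committed writes sit in a sidecar
    # file named after the database with a "-wal" suffix.
    wal_exists = os.path.exists(db_path + "-wal")
    conn.close()
    return mode, wal_exists


with tempfile.TemporaryDirectory() as tmp:
    MODE, WAL_EXISTS = enable_wal(os.path.join(tmp, "demo.db"))

print(MODE, WAL_EXISTS)
```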

# 7:25 pm / replication, scaling, sqlite, ben-johnson

trustme (via) This looks incredibly useful. Run python -m trustme and it will create three files for you: server.pem, server.key and a client.pem client certificate, providing a certificate for "localhost" (or another host you specify) using a fake certificate authority. Looks like it should be the easiest way to test TLS locally.

# 8 pm / certificates, tls

Litestream runs continuously on a test server with generated load and streams backups to S3. It uses physical replication so it'll actually restore the data from S3 periodically and compare the checksum byte-for-byte with the current database.

Ben Johnson

# 8:50 pm / testing, litestream, ben-johnson

Release evernote-to-sqlite 0.3 — Tools for converting Evernote content to SQLite
Release evernote-to-sqlite 0.3.1 — Tools for converting Evernote content to SQLite

Feb. 14, 2021

Weeknotes: Finally, an intro video for Datasette

My big project this week was this Video introduction to Datasette and sqlite-utils. I recorded the video a few weeks ago in advance of FOSDEM, but this week I put together the annotated version. I’m really happy with it, and I’ve added it to the datasette.io homepage as a starting point for helping people understand the project.

[... 690 words]

Release sqlite-utils 3.5 — Python CLI utility and library for manipulating SQLite databases

Feb. 15, 2021

TIL Handling CSV files with wide columns in Python — Users [were reporting](https://github.com/simonw/sqlite-utils/issues/229) the following error using `sqlite-utils` to import some CSV files:
TIL Using io.BufferedReader to peek against a non-peekable stream — When building the [--sniff option](https://github.com/simonw/sqlite-utils/issues/230) for `sqlite-utils insert` (which attempts to detect the correct CSV delimiter and quote character by looking at the first 2048 bytes of a CSV file) I had the need to peek ahead in an incoming stream of data.
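The peeking technique named in the second TIL can be sketched with the standard library alone. This is an illustration under stated assumptions, not the code from sqlite-utils: `NonPeekableStream`, `sniff_dialect` and the sample data are hypothetical, and the 2048-byte sample size is taken from the description above.

```python
import csv
import io


class NonPeekableStream(io.RawIOBase):
    """Hypothetical stand-in for a raw stream that has no peek() method."""

    def __init__(self, data):
        self._inner = io.BytesIO(data)

    def readable(self):
        return True

    def readinto(self, buffer):
        return self._inner.readinto(buffer)


def sniff_dialect(raw, sample_size=2048):
    """Guess the CSV dialect from the first bytes without consuming them."""
    # The buffer must be at least as large as the sample we want to peek at.
    buffered = io.BufferedReader(raw, buffer_size=sample_size)
    # peek() may return more or fewer bytes than requested, so slice it.
    sample = buffered.peek(sample_size)[:sample_size]
    dialect = csv.Sniffer().sniff(sample.decode("utf-8", errors="replace"))
    return dialect, buffered


DATA = b'id;name;note\n1;Ana;"x;y"\n2;Bo;"plain"\n'
DIALECT, STREAM = sniff_dialect(NonPeekableStream(DATA))
REMAINING = STREAM.read()  # still starts at byte 0 because peek() did not advance

print(DIALECT.delimiter, DIALECT.quotechar)
```

Returning the `BufferedReader` matters: later reads have to go through the wrapper, because the peeked bytes have already been pulled out of the underlying raw stream.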

Feb. 16, 2021

Release download-tiles 0.4.1 — Download map tiles and store them in an MBTiles database
Release higher-lower 0.1 — Functions for finding numbers using higher/lower

Feb. 18, 2021

TIL Loading radio.garden into SQLite using jq — http://radio.garden/ is an amazing website which displays a 3D globe covered in radio stations and lets you listen to any of them.
TIL Using sphinx.ext.extlinks for issue links — Datasette's [release notes](https://github.com/simonw/datasette/blob/main/docs/changelog.rst) are formatted using Sphinx. Almost every bullet point links to the corresponding GitHub issue, so they were full of lines that look like this:
Release datasette-graphql 1.4 — Datasette plugin providing an automatic GraphQL API for your SQLite databases
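For the `sphinx.ext.extlinks` TIL listed above, a `conf.py` fragment along these lines is the usual shape. Treat it as a sketch: the `issue` role name and the URL pattern are assumptions, and the `"#%s"` caption form is the one accepted by recent Sphinx releases (older releases took a plain prefix such as `"#"` instead).

```python
# conf.py fragment (sketch): enable the extension and define one short role.
extensions = ["sphinx.ext.extlinks"]

extlinks = {
    # role name: (URL pattern, caption pattern); %s is replaced by the role's text.
    # The URL below is an assumed issue-tracker pattern, used for illustration.
    "issue": ("https://github.com/simonw/datasette/issues/%s", "#%s"),
}
```

With a definition like this, a role written as :issue:`N` in the reStructuredText source expands to a link to issue N with the caption #N.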

Feb. 19, 2021

One of the hardest things I’ve had to learn is that humans aren’t pure functions: an input that works one day and gets one result, then again another day and get an entirely different result.

Sarah Drasner

# 12 am / management

Release datasette 0.55 — An open source multi-tool for exploring and publishing data
Release sqlite-utils 3.6 — Python CLI utility and library for manipulating SQLite databases

Open source projects: consider running office hours

Back in December I decided to try something new for my Datasette open source project: Datasette Office Hours. The idea is simple: anyone can book a 25-minute conversation with me on a Friday to talk about the project. I’m interested in talking to people who are using Datasette, or who are considering using it, or who just want to have a chat.

[... 786 words]

Feb. 20, 2021

Release datasette-json-preview 0.3 — Preview of new JSON default format for Datasette

Feb. 21, 2021

Cross-database queries in SQLite (and weeknotes)

I released Datasette 0.55 and sqlite-utils 3.6 this week with a common theme across both releases: supporting cross-database joins.

[... 720 words]
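As background on what a cross-database join looks like in SQLite itself, here is a minimal sketch using `ATTACH DATABASE` from Python's standard `sqlite3` module. It does not use Datasette or sqlite-utils, and the `other` alias, table names and rows are hypothetical.

```python
import sqlite3

# One connection, two databases: the default "main" plus an attached one.
conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS other")

conn.execute("CREATE TABLE main.authors (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE other.posts (author_id INTEGER, title TEXT)")
conn.execute("INSERT INTO main.authors VALUES (1, 'Ana')")
conn.execute("INSERT INTO other.posts VALUES (1, 'Hello')")

# Tables are addressed as <database alias>.<table>, so a single query
# can join rows that live in different database files.
ROWS = conn.execute(
    """
    SELECT authors.name, posts.title
    FROM main.authors
    JOIN other.posts ON posts.author_id = authors.id
    """
).fetchall()

print(ROWS)
```

In practice the attached database would normally be a path to a second file rather than a second in-memory database.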

Feb. 22, 2021

Release airtable-export 0.5 — Export Airtable data to YAML, JSON or SQLite files on disk

Getting started

Here we go then... I’ve signed up to work on this project full-time, four days a week!

[... 609 words]

Release sqlite-transform 0.4 — Tool for running transformations on columns in a SQLite database

People, processes, priorities. Twitter thread from Adrienne Porter Felt outlining her model for thinking about engineering management. I like this trifecta of “people, processes, priorities” a lot.

# 5:21 pm / management

Blazing fast CI with pytest-split and GitHub Actions (via) pytest-split is a neat-looking variant on the pattern of splitting up a test suite to run different parts of it in parallel on different machines. It involves maintaining a periodically updated JSON file in the repo recording the average runtime of different tests, to enable them to be more fairly divided among test runners. Includes a recipe for running as a matrix in GitHub Actions.
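The idea of dividing tests fairly by recorded runtime can be sketched as a greedy "longest test first" assignment. This is an illustration of the general pattern, not pytest-split's own algorithm or file format; `split_tests`, the test names and the durations are hypothetical.

```python
import heapq


def split_tests(durations, groups):
    """Assign tests to groups so the total runtimes come out roughly even.

    durations maps test name -> recorded runtime in seconds.
    Each test, longest first, goes to the currently lightest group.
    """
    heap = [(0.0, index) for index in range(groups)]  # (total seconds, group index)
    buckets = [[] for _ in range(groups)]
    ordered = sorted(durations.items(), key=lambda item: item[1], reverse=True)
    for name, seconds in ordered:
        total, index = heapq.heappop(heap)
        buckets[index].append(name)
        heapq.heappush(heap, (total + seconds, index))
    return buckets


# Hypothetical recorded runtimes, as might be loaded from a JSON file.
DURATIONS = {"test_a": 8.0, "test_b": 5.0, "test_c": 4.0, "test_d": 3.0}
BUCKETS = split_tests(DURATIONS, groups=2)
print(BUCKETS)
```

Each CI runner would then execute only the bucket matching its own index, which is where a GitHub Actions matrix fits in.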

# 7:06 pm / testing, pytest, github-actions

Business rules engines are li’l Conway’s Law devices: a manifestation of the distrust between stakeholders, client and contractor. We require BREs so that separate business units need not talk to each other to solve problems. They are communication and organizational dysfunction made silicon.

Paul Smith

# 8:34 pm / software-engineering
