Simon Willison’s Weblog

September 2020

87 posts: 8 entries, 10 links, 5 quotes, 64 beats

Sept. 19, 2020

DuckDB (via) This is a really interesting, relatively new database. It’s kind of a weird hybrid between SQLite and PostgreSQL: it uses the PostgreSQL parser but models itself after SQLite in that databases are a single file and the code is designed for use as an embedded library, distributed in a single amalgamation C++ file (SQLite uses a C amalgamation). It features a “columnar-vectorized query execution engine” inspired by MonetDB (also by the DuckDB authors) and is hence designed to run analytical queries really quickly. You can install it using “pip install duckdb”—the resulting module feels similar to Python’s sqlite3, and follows roughly the same DBAPI pattern.
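As a rough illustration of the DB-API pattern mentioned above, here is a minimal sketch using Python's built-in `sqlite3` module. The `readings` table and its data are made up for the example. The assumption, not verified here, is that swapping in `import duckdb` and `duckdb.connect()` would look much the same.

```python
import sqlite3

# Connect, execute, fetch: the DB-API shape that sqlite3 follows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (city TEXT, temp_c REAL)")
conn.executemany(
    "INSERT INTO readings VALUES (?, ?)",
    [("Oslo", 4.0), ("Oslo", 6.0), ("Lima", 19.0)],
)

# A small aggregate query of the analytical kind a columnar engine targets.
rows = conn.execute(
    "SELECT city, AVG(temp_c) FROM readings GROUP BY city ORDER BY city"
).fetchall()
print(rows)
```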

# 11:43 pm / databases, postgresql, sqlite, duckdb

Sept. 20, 2020

One academic who interviewed attendees of a flat-earth convention found that, almost to a person, they'd discovered the subculture via YouTube recommendations.

YouTube’s Plot to Silence Conspiracy Theories

# 1:27 am / conspiracy, youtube

Release sqlite-utils 2.19 — Python CLI utility and library for manipulating SQLite databases
Release dogsheep-beta 0.9a0 — Build a search index across content from multiple SQLite database tables and run faceted searches against it using Datasette

Sept. 22, 2020

TIL Understanding option names in Click — I hit [a bug today](https://github.com/simonw/datasette/issues/973) where I had defined a Click option called `open` but in doing so I replaced the Python built-in `open()` function.
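The underlying problem can be reproduced without Click at all. Click passes each option to the command function as a keyword argument, so a parameter named `open` shadows the built-in inside that function. The sketch below uses plain functions (`serve` and `serve_fixed` are hypothetical names) to show the shadowing and one way around it. In Click itself, an extra un-dashed name in the decorator, as in `@click.option("--open", "open_browser", is_flag=True)`, should give the same effect of a differently named parameter.

```python
import builtins


def serve(open=False):
    # Inside this function the name `open` is the boolean parameter,
    # so the built-in open() is not reachable under that name.
    try:
        open("example.txt")
    except TypeError as ex:
        return "shadowed: " + type(ex).__name__


def serve_fixed(open_browser=False):
    # With a differently named parameter, `open` is the built-in again.
    return open is builtins.open


result = serve(open=True)
fixed = serve_fixed(open_browser=True)
print(result, fixed)
```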

Sept. 23, 2020

Release sqlite-utils 2.20 — Python CLI utility and library for manipulating SQLite databases

Executing advanced ALTER TABLE operations in SQLite

SQLite’s ALTER TABLE has some significant limitations: it can’t drop columns (UPDATE: that was fixed in SQLite 3.35.0 in March 2021), it can’t alter NOT NULL status, it can’t change column types. Since I spend a lot of time with SQLite these days I’ve written some code to fix this—both from Python and as a command-line utility.

[... 689 words]
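For context, the general workaround for these limitations is to build a new table with the desired schema, copy the rows across, drop the old table and rename the new one. The sketch below does that by hand with `sqlite3`. The `dogs` table and its rows are invented for illustration, and this is not claimed to be exactly what sqlite-utils runs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dogs (id INTEGER PRIMARY KEY, name TEXT, age TEXT);
INSERT INTO dogs VALUES (1, 'Cleo', '5'), (2, 'Pancakes', '4');
""")

# Change age from TEXT to INTEGER NOT NULL by rebuilding the table
# inside a single transaction.
conn.executescript("""
BEGIN;
CREATE TABLE dogs_new (
    id INTEGER PRIMARY KEY,
    name TEXT,
    age INTEGER NOT NULL
);
INSERT INTO dogs_new (id, name, age)
    SELECT id, name, CAST(age AS INTEGER) FROM dogs;
DROP TABLE dogs;
ALTER TABLE dogs_new RENAME TO dogs;
COMMIT;
""")

# (name, declared type, notnull flag) for each column of the rebuilt table
columns = [(r[1], r[2], r[3]) for r in conn.execute("PRAGMA table_info(dogs)")]
rows = conn.execute("SELECT id, name, age FROM dogs ORDER BY id").fetchall()
print(columns)
print(rows)
```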

Refactoring databases with sqlite-utils extract

Yesterday I described the new sqlite-utils transform mechanism for applying SQLite table transformations that go beyond those supported by ALTER TABLE. The other new feature in sqlite-utils 2.20 builds on that capability to allow you to refactor a database table by extracting columns into separate tables. I’ve called it sqlite-utils extract.

[... 1,345 words]
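To make the idea of extracting a column into a separate table concrete, here is a hand-written sketch of that refactoring using only `sqlite3`. The `trees` and `species` tables are hypothetical sample data, and the SQL is an approximation of the general technique rather than the statements sqlite-utils generates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE trees (id INTEGER PRIMARY KEY, address TEXT, species TEXT);
INSERT INTO trees VALUES
    (1, '12 Oak St', 'Palm'),
    (2, '9 Elm Ave', 'Oak'),
    (3, '4 Pine Rd', 'Palm');

-- Lookup table holding each distinct species once
CREATE TABLE species (id INTEGER PRIMARY KEY, name TEXT UNIQUE);
INSERT INTO species (name)
    SELECT DISTINCT species FROM trees ORDER BY species;

-- Rebuild trees with a foreign key in place of the text column
CREATE TABLE trees_new (
    id INTEGER PRIMARY KEY,
    address TEXT,
    species_id INTEGER REFERENCES species(id)
);
INSERT INTO trees_new (id, address, species_id)
    SELECT trees.id, trees.address, species.id
    FROM trees JOIN species ON species.name = trees.species;
DROP TABLE trees;
ALTER TABLE trees_new RENAME TO trees;
""")

species_rows = conn.execute("SELECT name FROM species ORDER BY name").fetchall()
joined = conn.execute("""
    SELECT trees.id, species.name
    FROM trees JOIN species ON species.id = trees.species_id
    ORDER BY trees.id
""").fetchall()
print(species_rows)
print(joined)
```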

Sept. 24, 2020

Release sqlite-utils 2.21 — Python CLI utility and library for manipulating SQLite databases

Sept. 26, 2020

Weeknotes: software carpentry, compiling modules for SQLite

This week I completed the Software Carpentry instructor training course, added two foundational features to sqlite-utils and learned how to compile modules for SQLite.

[... 805 words]

The Bias-for-Building Fallacy is most common in orgs that worship speed. That's fine, but if you go speedily in the wrong direction, you will end up in the wrong place. That’s why teams should value velocity much more than speed: velocity being a combo of speed & direction.

Shreyas Doshi

# 2:07 pm / product-management

Sept. 27, 2020

Inevitably we got round to talking about async.

As much of an unneeded complication as it is for so many day-to-day use-cases, it’s important for Python because, if and when you do need the high throughput handling of these io-bound use-cases, you don’t want to have to switch language.

The same for Django: most of what you’re doing has no need of async but you don’t want to have to change web framework just because you need a sprinkling of non-blocking IO.

Carlton Gibson

# 3:09 pm / async, django, python, carlton-gibson

TIL Figuring out if a text value in SQLite is a valid integer or float — Given a table with a `TEXT` column in SQLite I want to figure out if every value in that table is actually the text representation of an integer or floating point value, so I can decide if it's a good idea to change the type of the column (using [sqlite-utils transform](https://sqlite-utils.datasette.io/en/stable/python-api.html#transforming-a-table)).
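One possible way to run this check, sketched here as an assumption rather than as the TIL's exact query, is to cast each value to a number and back to text and see whether it survives the round trip. The `items` table and its values are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (value TEXT)")
conn.executemany(
    "INSERT INTO items VALUES (?)",
    [("1",), ("42",), ("-7",), ("3.5",), ("abc",)],
)

# Values that do not survive a TEXT -> INTEGER -> TEXT round trip
not_integers = [r[0] for r in conn.execute("""
    SELECT value FROM items
    WHERE CAST(CAST(value AS INTEGER) AS TEXT) != value
    ORDER BY rowid
""")]

# Values that do not survive TEXT -> REAL -> TEXT, allowing for the
# '.0' suffix SQLite adds when a whole number is cast to REAL
not_floats = [r[0] for r in conn.execute("""
    SELECT value FROM items
    WHERE CAST(CAST(value AS REAL) AS TEXT) NOT IN (value, value || '.0')
    ORDER BY rowid
""")]

print(not_integers)
print(not_floats)
```

This round-trip test is strict: text such as `007` or `1e5` would be reported as invalid even though SQLite can cast it, so it is best treated as a conservative first pass.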

Sept. 28, 2020

Release datasette-dateutil 0.1 — dateutil functions for Datasette

datasette-dateutil (via) New Datasette plugin exposing date/time parsing custom SQL functions powered by the classic dateutil Python library.
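The general mechanism behind a custom SQL function can be sketched with `sqlite3.create_function`. The function below, `parse_iso_date`, is a hypothetical stand-in that uses the standard library's `datetime` rather than dateutil, and it is not one of the plugin's actual functions.

```python
import sqlite3
from datetime import datetime


def parse_iso_date(text):
    # Hypothetical helper: return the date part of an ISO 8601 string,
    # or None (SQL NULL) if the value cannot be parsed.
    try:
        return datetime.fromisoformat(text).date().isoformat()
    except (TypeError, ValueError):
        return None


conn = sqlite3.connect(":memory:")
# Register the Python function so SQL queries can call it by name.
conn.create_function("parse_iso_date", 1, parse_iso_date)

result = conn.execute(
    "SELECT parse_iso_date('2020-09-28T00:33:00'), parse_iso_date('not a date')"
).fetchone()
print(result)
```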

# 12:33 am / dateutil, plugins, projects, datasette

elite-source.asm—annotated source code for Elite on the BBC Micro (via) Mark Moxon has annotated every single line of the source code for Elite on the BBC Micro, and his annotations are so clear and in-depth that I can follow it despite knowing next to nothing about assembly code (and certainly nothing about writing it for the BBC).

# 2:30 am / programming, retro

Release datasette-import-table 0.1a0 — Datasette plugin for importing tables from other Datasette instances
Release datasette-import-table 0.1a1 — Datasette plugin for importing tables from other Datasette instances
Release datasette-import-table 0.1a2 — Datasette plugin for importing tables from other Datasette instances
Release datasette-import-table 0.1 — Datasette plugin for importing tables from other Datasette instances
Release datasette-import-table 0.2 — Datasette plugin for importing tables from other Datasette instances

I was wrong. CRDTs are the future (via) Joseph Gentle has been working on collaborative editors since being a developer on Google Wave back in 2010, later building ShareJS. He’s used Operational Transforms throughout, due to their performance and memory benefits over CRDTs (Conflict-free replicated data types)—but the latest work in that space from Martin Kleppmann and other researchers has seen him finally switch allegiance to these newer algorithms. As a long-time fan of collaborative editing (ever since the Hydra/SubEthaEdit days) I thoroughly enjoyed this as an update on how things have evolved over the past decade.
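For readers new to the term, a grow-only counter is about the simplest CRDT there is. The sketch below is a textbook example and is far simpler than the text-editing algorithms the linked article discusses. The class and replica names are illustrative only.

```python
class GCounter:
    """State-based grow-only counter: one tally per replica."""

    def __init__(self, replica_id):
        self.replica_id = replica_id
        self.counts = {}

    def increment(self, amount=1):
        # A replica only ever raises its own entry.
        current = self.counts.get(self.replica_id, 0)
        self.counts[self.replica_id] = current + amount

    def merge(self, other):
        # Element-wise max is commutative, associative and idempotent,
        # so replicas converge regardless of merge order or repetition.
        for rid, n in other.counts.items():
            self.counts[rid] = max(self.counts.get(rid, 0), n)

    def value(self):
        return sum(self.counts.values())


a, b = GCounter("a"), GCounter("b")
a.increment()
a.increment()
b.increment(3)

# Exchange state in both directions, then merge once more to show
# that a repeated merge changes nothing.
a.merge(b)
b.merge(a)
a.merge(b)
print(a.value(), b.value())
```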

# 9:03 pm / algorithms, collaboration, crdt, martin-kleppmann

Sept. 29, 2020

Release datasette-dateutil 0.2 — dateutil functions for Datasette

Sept. 30, 2020

Release datasette-dateutil 0.2.1 — dateutil functions for Datasette
Release datasette-edit-schema 0.3a — Datasette plugin for modifying table schemas
Release datasette-edit-schema 0.3a1 — Datasette plugin for modifying table schemas
Release datasette-cluster-map 0.12.1 — Datasette plugin that shows a map for any data with latitude/longitude columns
