August 2022
67 posts: 5 entries, 24 links, 2 quotes, 36 beats
Aug. 14, 2022
Aug. 16, 2022
Efficient Pagination Using Deferred Joins (via) Surprisingly simple trick for speeding up deep OFFSET x LIMIT y pagination queries, which get progressively slower as you paginate deeper into the data. Instead of applying them directly, apply them to a “select id from ...” query to fetch just the IDs, then either use a join or run a separate “select * from table where id in (...)” query to fetch the full records for that page.
Aug. 17, 2022
Plugin support for Datasette Lite
I’ve added a new feature to Datasette Lite, my distribution of Datasette that runs entirely in the browser using Python and SQLite compiled to WebAssembly. You can now install additional Datasette plugins by passing them in the URL.
[... 865 words]Crunchy Data: Learn Postgres at the Playground (via) Crunchy Data have a new PostgreSQL tutorial series, with a very cool twist: they have a build of PostgreSQL compiled to WebAssembly which runs in the browser, so each tutorial is accompanied by a working psql terminal that lets you try out the tutorial contents interactively. It even has support for PostGIS, though that particular tutorial has to load 80MB of assets in order to get it to work!
Building games and apps entirely through natural language using OpenAI’s code-davinci model. A deeply sophisticated example of using prompts to generate entire working JavaScript programs and games using the new code-davinci OpenAI model.
Aug. 18, 2022
Aug. 19, 2022
The Datasette Newsletter: Datasette Lite, Datasette Tutorials, Datasette Cloud. It’s been quite a while since I’ve sent one of these out now—hoping to get this on to a more regular schedule.
Aug. 20, 2022
Shoelace (via) Saw this for the first time today: it’s a relatively new library of framework-agnostic Web Components, built on lit-html and covering a huge array of common functionality: buttons and sliders and dialogs and drawer interfaces and dropdown menus and so on. The design is very clean, the documentation is superb—and it looks like you can cherry pick just the components you are using for a pretty lean addition to your page weight. So refreshing to see libraries like this that really take advantage of modern web standards.
Show HN: A new way to use GPT-3 to generate code (and everything else).
Riley Goodside is my favourite Twitter follow for GPT-3 tips. Here he describes a powerful prompt pattern he's designed which lets you generate extremely complex code output by asking GPT-3 to fill in $$areas like this$$ with different patterns, then stitch them together into full HTML or other source code files. It's really clever.
Aug. 21, 2022
Analyzing ScotRail audio announcements with Datasette—from prototype to production
Scottish train operator ScotRail released a two-hour long MP3 file containing all of the components of its automated station announcements. Messing around with them is proving to be a huge amount of fun.
[... 4,428 words]Turning SQLite into a distributed database (via) Heyang Zhou introduces mvSQLite, his brand new open source “SQLite-compatible distributed database” built in Rust on top of Apple’s FoundationDB. This is a very promising looking new entry into the distributed/replicated SQLite space: FoundationDB was designed to provide low-level primitives that tools like this could build on top of.
Aug. 22, 2022
Digitizing 55,000 pages of civic meetings (via) Philip James has been building public, searchable archives of city council meetings for various cities—Oakland and Alamedia so far—using my s3-ocr script to run Textract OCR against the PDFs of the minutes, and deploying them to Fly using Datasette. This is a really cool project, and very much the kind of thing I’ve been hoping to support with the tools I’ve been building.
Stable Diffusion Public Release (via) New AI just dropped. Stable Diffusion is similar to DALL-E, but completely open source and with a CC0 license applied to everything it generates. I have a Twitter thread (the via) link of comparisons I’ve made between its output and my previous DALL-E experiments. The announcement buries the lede somewhat: to try it out, visit beta.dreamstudio.ai—which you can use for free at the moment, but it’s unclear to me how billing is supposed to work.
Aug. 23, 2022
Aug. 24, 2022
How SQLite Scales Read Concurrency (via) Ben Johnson’s series on SQLite internals continues—this time with a detailed explanation of how the SQLite WAL (Write-Ahead Log) is implemented.
To make the analogy explicit, in Software 1.0, human-engineered source code (e.g. some .cpp files) is compiled into a binary that does useful work. In Software 2.0 most often the source code comprises 1) the dataset that defines the desirable behavior and 2) the neural net architecture that gives the rough skeleton of the code, but with many details (the weights) to be filled in. The process of training the neural network compiles the dataset into the binary — the final neural network. In most practical applications today, the neural net architectures and the training systems are increasingly standardized into a commodity, so most of the active “software development” takes the form of curating, growing, massaging and cleaning labeled datasets.
Aug. 25, 2022
Building a searchable archive for the San Francisco Microscopical Society
The San Francisco Microscopical Society was founded in 1870 by a group of scientists dedicated to advancing the field of microscopy.
[... 1,845 words]Aug. 28, 2022
Aug. 29, 2022
Stable Diffusion is a really big deal
If you haven’t been paying attention to what’s going on with Stable Diffusion, you really should be.
[... 1,443 words]


