Simon Willison’s Weblog


Items tagged sqlite in Jan, 2023

sqlite-jsonschema. “A SQLite extension for validating JSON objects with JSON Schema”, building on the jsonschema Rust crate. SQLite and JSON are already a great combination—Alex suggests using this extension to implement check constraints to validate JSON columns before inserting into a table, or just to run queries finding existing data that doesn’t match a given schema. # 28th January 2023, 3:50 am
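
To make the check-constraint idea concrete, here is a minimal sketch of the pattern. It does not load sqlite-jsonschema at all: SQLite's built-in json_valid() and json_type() functions stand in for real JSON Schema validation. The events table, the name field and both helper functions are hypothetical names invented for the example.

```python
import sqlite3


def make_db():
    """Create an in-memory database whose JSON column is guarded by a CHECK constraint."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE events (
            id INTEGER PRIMARY KEY,
            body TEXT NOT NULL CHECK (
                -- CASE keeps json_type() from ever seeing malformed JSON.
                -- IS (not =) makes a missing key evaluate to 0 rather than NULL,
                -- because a CHECK constraint treats NULL as passing.
                CASE WHEN json_valid(body)
                     THEN json_type(body, '$.name') IS 'text'
                     ELSE 0
                END
            )
        )
        """
    )
    return conn


def try_insert(conn, body):
    """Return True if the row was accepted, False if the CHECK constraint rejected it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO events (body) VALUES (?)", (body,))
        return True
    except sqlite3.IntegrityError:
        return False
```

With the extension installed, the CASE expression would presumably be swapped for a call to its schema-matching function. The same expression also works in a WHERE clause to find existing rows that fail validation.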

sqlite-ulid. Alex Garcia’s sqlite-ulid adds lightning-fast SQL functions for generating ULIDs—Universally Unique Lexicographically Sortable Identifiers. These work like UUIDs but are smaller and faster to generate, and can be canonically encoded as a URL-safe 26 character string (UUIDs are 36 characters). Again, this builds on a Rust crate—ulid-rs—and can generate 1 million byte-represented ULIDs with the ulid_bytes() function in just 88.4ms. # 28th January 2023, 3:45 am
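
For a sense of what is inside one of these identifiers, here is a rough pure-Python sketch of the ULID layout: a 48-bit millisecond timestamp followed by 80 random bits, encoded as 26 Crockford base32 characters. This is not how sqlite-ulid or ulid-rs are implemented, and new_ulid_bytes() and encode_ulid() are hypothetical helper names.

```python
import os
import time

# Crockford base32 alphabet: no I, L, O or U, to avoid ambiguous characters.
CROCKFORD = "0123456789ABCDEFGHJKMNPQRSTVWXYZ"


def new_ulid_bytes(timestamp_ms=None):
    """Return 16 raw bytes: 6 bytes of millisecond timestamp plus 10 random bytes."""
    if timestamp_ms is None:
        timestamp_ms = time.time_ns() // 1_000_000
    return timestamp_ms.to_bytes(6, "big") + os.urandom(10)


def encode_ulid(raw):
    """Encode 16 bytes as the canonical 26-character string."""
    if len(raw) != 16:
        raise ValueError("a ULID is exactly 16 bytes")
    value = int.from_bytes(raw, "big")
    chars = []
    for _ in range(26):  # 26 characters x 5 bits covers the 128-bit value
        chars.append(CROCKFORD[value & 0x1F])
        value >>= 5
    return "".join(reversed(chars))
```

Because the timestamp occupies the most significant bits, strings generated in different milliseconds sort in creation order, which is the "lexicographically sortable" part of the name.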

sqlite-fastrand. Alex Garcia just dropped three new SQLite extensions, and I’m going to link to all of them. The first is sqlite-fastrand, which adds new functions for generating random numbers (and alphanumeric characters too). Impressively, these out-perform the default SQLite random() and randomblob() functions by about 1.6-2.6x, thanks to being built on the Rust fastrand crate which builds on wyhash, an extremely fast (though not cryptographically secure) hashing function. # 28th January 2023, 3:41 am

[On SQLite for production concurrent writes] In general, WAL mode “just works” as Simon said. You just need to make sure you don’t have long running write transactions, although those are somewhat problematic in any database system. Don’t do stuff like starting a write txn and then calling a remote API and then committing. That’ll kill your write throughput.

Ben Johnson # 26th January 2023, 7:36 pm
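
Here is a small sketch of that advice using Python's sqlite3 module: turn on WAL mode, then do the slow work before opening the write transaction so the write lock is held only for the insert itself. The prices table and the slow_remote_lookup() function are hypothetical, and the sleep is just a stand-in for a real network call.

```python
import sqlite3
import time


def open_wal(path):
    """Open a file-backed database in WAL mode; returns (connection, journal mode)."""
    # isolation_level=None puts the connection in autocommit mode so that
    # transactions are controlled explicitly with BEGIN / COMMIT below.
    conn = sqlite3.connect(path, isolation_level=None)
    mode = conn.execute("PRAGMA journal_mode=WAL").fetchone()[0]
    conn.execute(
        "CREATE TABLE IF NOT EXISTS prices (symbol TEXT PRIMARY KEY, price REAL)"
    )
    return conn, mode


def slow_remote_lookup(symbol):
    """Stand-in for a remote API call (hypothetical)."""
    time.sleep(0.05)
    return 42.0


def record_price(conn, symbol):
    """Fetch first, then hold the write lock only for the insert."""
    price = slow_remote_lookup(symbol)  # no transaction is open during the slow part
    conn.execute("BEGIN IMMEDIATE")  # take the write lock as late as possible
    try:
        conn.execute(
            "INSERT OR REPLACE INTO prices (symbol, price) VALUES (?, ?)",
            (symbol, price),
        )
        conn.execute("COMMIT")
    except Exception:
        conn.execute("ROLLBACK")
        raise
    return price
```

The anti-pattern from the quote would be moving the slow_remote_lookup() call to after BEGIN IMMEDIATE, which would block every other writer for the full duration of the network round trip.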

Wildebeest (via) New project from Cloudflare, first quietly unveiled three weeks ago: “Wildebeest is an ActivityPub and Mastodon-compatible server”. It’s built using a flurry of Cloudflare-specific technology, including Workers, Pages and their SQLite-based D1 database. # 23rd January 2023, 12:03 am

Hctree Design Documentation. More detailed information on the design of the new Hctree SQLite branch. # 20th January 2023, 12:50 am

Hctree: an experimental high-concurrency database backend for SQLite (via) Really interesting new research branch from the core SQLite team. “Hctree uses optimistic row-level locking and is designed to support dozens of concurrent writers running at full-speed”—with very impressive benchmarks supporting that claim. Also two bonuses: it has a replication mechanism based on the existing SQLite sessions extension, and it bumps up the maximum size of a SQLite database from 16TiB to 1EiB (roughly one million TiB). # 20th January 2023, 12:47 am
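
The "roughly one million TiB" figure is easy to check with binary units, as this quick bit of arithmetic shows (the constant names are just for illustration):

```python
TIB = 2**40  # bytes in one tebibyte
EIB = 2**60  # bytes in one exbibyte

tib_per_eib = EIB // TIB            # how many TiB fit in 1 EiB
growth_factor = EIB // (16 * TIB)   # increase over the previous 16 TiB limit

print(tib_per_eib)    # 1048576, i.e. 2**20
print(growth_factor)  # 65536, i.e. 2**16
```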

Introducing sqlite-xsv: The Fastest CSV Parser for SQLite. Alex Garcia continues to push the boundaries of SQLite extensions. This new extension in Rust wraps the lightning fast Rust csv crate and provides a new csv_reader() virtual table that can handle regular, gzipped and zstd compressed files. # 14th January 2023, 9:54 pm

How to implement Q&A against your documentation with GPT3, embeddings and Datasette

If you’ve spent any time with GPT-3 or ChatGPT, you’ve likely thought about how useful it would be if you could point them at a specific, current collection of text or documentation and have them use that as part of their input for answering questions.

[... 3491 words]