Simon Willison’s Weblog

Subscribe

Blogmarks tagged databases, sqlite

Filters: Type: blogmark × databases × sqlite × Sorted by date

What to do about SQLITE_BUSY errors despite setting a timeout (via) Bert Hubert takes on the challenge of explaining SQLite's single biggest footgun: in WAL mode you may see SQLITE_BUSY errors even when you have a generous timeout set if a transaction attempts to obtain a write lock after initially running at least one SELECT. The fix is to use BEGIN IMMEDIATE if you know your transaction is going to make a write.

Bert provides the clearest explanation I've seen yet of why this is necessary:

When the transaction on the left wanted to upgrade itself to a read-write transaction, SQLite could not allow this since the transaction on the right might already have made changes that the transaction on the left had not yet seen.

This in turn means that if left and right transactions would commit sequentially, the result would not necessarily be what would have happened if all statements had been executed sequentially within the same transaction.

I've written about this a few times before, so I just started a sqlite-busy tag to collect my notes together on a single page.

# 17th February 2025, 7:04 am / databases, sqlite, transactions, sqlite-busy

Why SQLite Uses Bytecode (via) Brand new SQLite architecture documentation by D. Richard Hipp explaining the trade-offs between a bytecode based query plan and a tree of objects.

SQLite uses the bytecode approach, which provides an important characteristic that SQLite can very easily execute queries incrementally—stopping after each row, for example. This is more useful for a local library database than for a network server where the assumption is that the entire query will be executed before results are returned over the wire.

# 30th April 2024, 5:32 am / databases, sql, sqlite, d-richard-hipp

Optimizing SQLite for servers (via) Sylvain Kerkour's comprehensive set of lessons learned running SQLite for server-based applications.

There's a lot of useful stuff in here, including detailed coverage of the different recommended PRAGMA settings.

There was also a tip I haven't seen before about BEGIN IMMEDIATE transactions:

By default, SQLite starts transactions in DEFERRED mode: they are considered read only. They are upgraded to a write transaction that requires a database lock in-flight, when query containing a write/update/delete statement is issued.

The problem is that by upgrading a transaction after it has started, SQLite will immediately return a SQLITE_BUSY error without respecting the busy_timeout previously mentioned, if the database is already locked by another connection.

This is why you should start your transactions with BEGIN IMMEDIATE instead of only BEGIN. If the database is locked when the transaction starts, SQLite will respect busy_timeout.

# 31st March 2024, 8:16 pm / databases, performance, sql, sqlite, sqlite-busy

Announcing DuckDB 0.10.0. Somewhat buried in this announcement: DuckDB has Fixed-Length Arrays now, along with array_cross_product(a1, a2), array_cosine_similarity(a1, a2) and array_inner_product(a1, a2) functions.

This means you can now use DuckDB to find related content (and other tricks) using vector embeddings!

Also notable:

DuckDB can now attach MySQL, Postgres, and SQLite databases in addition to databases stored in its own format. This allows data to be read into DuckDB and moved between these systems in a convenient manner, as attached databases are fully functional, appear just as regular tables, and can be updated in a safe, transactional manner.

# 13th February 2024, 5:57 pm / databases, mysql, postgresql, sql, sqlite, duckdb, embeddings

SQLite 3.45. Released today. The big new feature is JSONB support, a new, specific-to-SQLite binary internal representation of JSON which can provide up to a 3x performance improvement for JSON-heavy operations, plus a 5-10% saving it terms of bytes stored on disk.

# 15th January 2024, 8:15 pm / databases, json, sqlite

Upsert in SQL (via) Anton Zhiyanov is currently on a one-man quest to write detailed documentation for all of the fundamental SQL operations, comparing and contrasting how they work across multiple engines, generally with interactive examples.

Useful tips in here on why “insert... on conflict” is usually a better option than “insert or replace into” because the latter can perform a delete and then an insert, firing triggers that you may not have wanted to be fired.

# 25th September 2023, 8:34 pm / databases, postgresql, sql, sqlite

JSON Changelog with SQLite (via) One of my favourite database challenges is how to track changes to rows over time. This is a neat recipe from 2018 which uses SQLite triggers and the SQLite JSON functions to serialize older versions of the rows and store them in TEXT columns.

# 16th November 2022, 3:41 am / databases, json, sqlite

Introducing LiteFS (via) LiteFS is the new SQLite replication solution from Fly, now ready for beta testing. It’s from the same author as Litestream but has a very different architecture; LiteFS works by implementing a custom FUSE filesystem which spies on SQLite transactions being written to the journal file and forwards them on to other nodes in the cluster, providing full read-replication. The signature Litestream feature of streaming a backup to S3 should be coming within the next few months.

# 21st September 2022, 6:56 pm / databases, replication, sqlite, fly, litestream, ben-johnson

How the SQLite Virtual Machine Works. The latest entry in Ben Johnson’s series about SQLite internals.

# 7th September 2022, 8:49 pm / databases, sqlite, ben-johnson

How SQLite Scales Read Concurrency (via) Ben Johnson’s series on SQLite internals continues—this time with a detailed explanation of how the SQLite WAL (Write-Ahead Log) is implemented.

# 24th August 2022, 4:16 pm / databases, sqlite, ben-johnson

Turning SQLite into a distributed database (via) Heyang Zhou introduces mvSQLite, his brand new open source “SQLite-compatible distributed database” built in Rust on top of Apple’s FoundationDB. This is a very promising looking new entry into the distributed/replicated SQLite space: FoundationDB was designed to provide low-level primitives that tools like this could build on top of.

# 21st August 2022, 5:40 pm / databases, sqlite, rust

SQLite Internals: Pages & B-trees (via) Ben Johnson provides a delightfully clear introduction to SQLite internals, describing the binary format used to store rows on disk and how SQLite uses 4KB pages for both row storage and for the b-trees used to look up records.

# 27th July 2022, 2:57 pm / algorithms, databases, sqlite, ben-johnson

Sqitch tutorial for SQLite (via) Sqitch is an interesting implementation of database migrations: it’s a command-line tool written in Perl with an interface similar to Git, providing commands to create, run, revert and track migration scripts. The scripts the selves are written as SQL in whichever database engine you are using. The tutorial for SQLite gives a good idea as to how the whole system works.

# 24th July 2022, 11:44 pm / databases, migrations, sqlite

DuckDB (via) This is a really interesting, relatively new database. It’s kind of a weird hybrid between SQLite and PostgreSQL: it uses the PostgreSQL parser but models itself after SQLite in that databases are a single file and the code is designed for use as an embedded library, distributed in a single amalgamation C++ file (SQLite uses a C amalgamation). It features a “columnar-vectorized query execution engine” inspired by MonetDB (also by the DuckDB authors) and is hence designed to run analytical queries really quickly. You can install it using “pip install duckdb”—the resulting module feels similar to Python’s sqlite3, and follows roughly the same DBAPI pattern.

# 19th September 2020, 11:43 pm / databases, postgresql, sqlite, duckdb