Cross-database queries in SQLite (and weeknotes)
21st February 2021
Cross-database queries in Datasette
SQLite databases are single files on disk. I really love this characteristic—it makes them easy to create, copy and move around. All you need is a disk volume and you can create as many SQLite databases as you like.
A lesser-known feature of SQLite is that you can run queries, including joins, across tables from more than one database. The secret sauce is the ATTACH DATABASE command. Run the following SQL:
ATTACH 'other.db' AS other;
And now you can reference tables in that database as other.tablename. You can then join against them, combine them with UNION and generally treat them as if they were another table in your first connected database.
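To see ATTACH in action outside of Datasette, here is a minimal sketch using Python's built-in sqlite3 module. The file names, table names and sample rows are all made up for illustration; the only part that comes from the text above is the ATTACH statement and the other.tablename syntax.

```python
import os
import sqlite3
import tempfile

# Hypothetical file and table names, used only for this illustration.
workdir = tempfile.mkdtemp()
first_path = os.path.join(workdir, "first.db")
other_path = os.path.join(workdir, "other.db")

# Create two separate database files, each holding one table.
first = sqlite3.connect(first_path)
first.execute("create table authors (id integer primary key, name text)")
first.execute("insert into authors values (1, 'Ada'), (2, 'Grace')")
first.commit()
first.close()

other = sqlite3.connect(other_path)
other.execute("create table books (title text, author_id integer)")
other.execute("insert into books values ('Notes', 1), ('Compilers', 2)")
other.commit()
other.close()

# Connect to the first file, then attach the second under the alias "other".
conn = sqlite3.connect(first_path)
conn.execute("attach database ? as other", (other_path,))

# Join a table in the main database against a table in the attached one.
rows = conn.execute(
    """
    select authors.name, other.books.title
    from authors
    join other.books on other.books.author_id = authors.id
    order by authors.id
    """
).fetchall()
print(rows)
conn.close()
```

The attached database stays available for the lifetime of that connection, so later queries on conn could keep referring to other.books.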
I’ve wanted to add support for cross-database queries to Datasette since May 2018. It took me quite a while to settle on a design—SQLite defaults to only allowing ten databases to be attached together, and I needed to figure out how multiple connected databases would fit with the design of the rest of Datasette.
In the end, I decided on the simplest option that would unlock the feature. Run Datasette with the new --crossdb option and the first ten databases passed to Datasette will be ATTACHed to an in-memory database. The latest.datasette.io demo now exposes two databases using this feature. Here’s an illustrative example query that performs a UNION across the sqlite_master metadata table in two databases:
select
  'fixtures' as database, *
from
  fixtures.sqlite_master
union
select
  'extra_database' as database, *
from
  extra_database.sqlite_master
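The general idea of an in-memory database with several files attached to it can be sketched with the standard sqlite3 module. This is not Datasette's actual implementation, just an approximation of the approach described above. The make_db helper and the table_a and table_b tables are hypothetical; only the fixtures and extra_database aliases come from the example query.

```python
import os
import sqlite3
import tempfile

workdir = tempfile.mkdtemp()


def make_db(name, table):
    """Create a small database file containing a single empty table (hypothetical helper)."""
    path = os.path.join(workdir, name + ".db")
    db = sqlite3.connect(path)
    db.execute(f"create table [{table}] (id integer primary key)")
    db.commit()
    db.close()
    return path


# Alias -> file path. The table names are invented for this sketch.
paths = {
    "fixtures": make_db("fixtures", "table_a"),
    "extra_database": make_db("extra_database", "table_b"),
}

# Start from an in-memory database and attach each file under its own alias.
conn = sqlite3.connect(":memory:")
for alias, path in paths.items():
    conn.execute(f"attach database ? as [{alias}]", (path,))

# UNION across the sqlite_master table of both attached databases.
rows = conn.execute(
    """
    select 'fixtures' as database, type, name from fixtures.sqlite_master
    union
    select 'extra_database' as database, type, name from extra_database.sqlite_master
    order by 1
    """
).fetchall()
print(rows)
conn.close()
```

Each row is tagged with the database it came from, which is the role the literal 'fixtures' and 'extra_database' columns play in the query above.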
Cross-database queries in sqlite-utils
sqlite-utils offers both a Python library and a command-line utility in one package. I’ve added ATTACH support to both.
The Python library support looks like this:
from sqlite_utils import Database

db = Database("first.db")
# Attach second.db under the alias "second"
db.attach("second", "second.db")
# Now you can run queries like this:
cursor = db.execute("""
select * from table_in_first
union all
select * from second.table_in_second
""")
print(cursor.fetchall())
The command-line tool now has a new --attach option which lets you attach a database using an alias. The equivalent query to the above would look like this:
$ sqlite-utils first.db --attach second second.db '
select * from table_in_first
union all
select * from second.table_in_second'
This defaults to returning results as a JSON array, but you can add --tsv or other options to get the results back in different output formats.
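To give a feel for the difference between those two output shapes, here is a rough sketch that turns the same query result into a JSON array of objects and into tab-separated text using only the Python standard library. It imitates the shape of the output rather than reproducing how sqlite-utils generates it, and the sample rows are invented.

```python
import csv
import io
import json
import sqlite3

# Invented sample data in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("create table table_in_first (id integer, name text)")
conn.execute("insert into table_in_first values (1, 'one'), (2, 'two')")

cursor = conn.execute("select * from table_in_first order by id")
columns = [d[0] for d in cursor.description]
rows = cursor.fetchall()

# JSON array of objects: one object per row, keyed by column name.
as_json = json.dumps([dict(zip(columns, row)) for row in rows])

# Tab-separated values: a header row followed by one line per row.
buffer = io.StringIO()
writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
writer.writerow(columns)
writer.writerows(rows)
as_tsv = buffer.getvalue()

print(as_json)
print(as_tsv)
conn.close()
```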
A cosmetic upgrade to tags on my blog
I noticed that Will Larson’s blog shows little numbers next to the tags indicating how many times they have been used. I really liked that, so I’ve implemented it here as well.
Each entry (and quotation and link) now gets a block in the sidebar that looks like this:
As a long-time fan of faceted search interfaces I really like this upgrade—it helps indicate at a glance the kind of content I have stashed away in my blog’s archive.
Releases this week
Preview of new JSON default format for Datasette
Python CLI utility and library for manipulating SQLite databases
An open source multi-tool for exploring and publishing data
Datasette plugin providing an automatic GraphQL API for your SQLite databases
Functions for finding numbers using higher/lower
Download map tiles and store them in an MBTiles database
TIL this week