Simon Willison’s Weblog

Subscribe

Executing advanced ALTER TABLE operations in SQLite

23rd September 2020

SQLite’s ALTER TABLE has some significant limitations: it can’t drop columns (UPDATE: that was fixed in SQLite 3.35.0 in March 2021), it can’t alter NOT NULL status, it can’t change column types. Since I spend a lot of time with SQLite these days I’ve written some code to fix this—both from Python and as a command-line utility.

To SQLite’s credit, not only are these limitations well explained in the documentation but the explanation is accompanied by a detailed description of the recommended workaround. The short version looks something like this:

  • Start a transaction
  • Create a new temporary table with the exact shape you would like
  • Copy all of your old data across using INSERT INTO temp_table SELECT FROM old_table
  • Drop the old table
  • Rename the temp table to the old table
  • Commit the transaction

My sqlite-utils tool and Python library aims to make working with SQLite as convenient as possible. So I set out to build a utility method for performing this kind of large scale table transformation. I’ve called it table.transform(...).

Here are some simple examples of what it can do, lifted from the documentation:

# Convert the 'age' column to an integer, and 'weight' to a float
table.transform(types={"age": int, "weight": float})

# Rename the 'age' column to 'initial_age':
table.transform(rename={"age": "initial_age"})

# Drop the 'age' column:
table.transform(drop={"age"})

# Make `user_id` the new primary key
table.transform(pk="user_id")

# Make the 'age' and 'weight' columns NOT NULL
table.transform(not_null={"age", "weight"})

# Make age allow NULL and switch weight to being NOT NULL:
table.transform(not_null={"age": False, "weight": True})

# Set default age to 1:
table.transform(defaults={"age": 1})

# Now remove the default from that column:
table.transform(defaults={"age": None})

Each time the table.transform(...) method runs it will create a brand new table, copy the data across and then drop the old table. You can combine multiple operations together in a single call, avoiding copying the table multiple times.

The table.transform_sql(...) method returns the SQL that would be executed instead of executing it directly, useful if you want to handle even more complex requirements.

The “sqlite-utils transform” command-line tool

Almost every feature in sqlite-utils is available in both the Python library and as a command-line utility, and .transform() is no exception. The sqlite-utils transform command can be used to apply complex table transformations directly from the command-line.

Here’s an example, starting with the fixtures.db database that powers Datasette’s unit tests:

$ wget https://static.simonwillison.net/static/2020/fixtures.db
$ sqlite3 fixtures.db '.schema facetable'
CREATE TABLE facetable (
    pk integer primary key,
    created text,
    planet_int integer,
    on_earth integer,
    state text,
    city_id integer,
    neighborhood text,
    tags text,
    complex_array text,
    distinct_some_null,
    FOREIGN KEY ("city_id") REFERENCES [facet_cities](id)
);
$ sqlite-utils transform fixtures.db facetable \
  --type on_earth text \
  --drop complex_array \
  --drop state \
  --rename tags the_tags
$ sqlite3 fixtures.db '.schema facetable'       
CREATE TABLE IF NOT EXISTS "facetable" (
   [pk] INTEGER PRIMARY KEY,
   [created] TEXT,
   [planet_int] INTEGER,
   [on_earth] TEXT,
   [city_id] INTEGER REFERENCES [facet_cities]([id]),
   [neighborhood] TEXT,
   [the_tags] TEXT,
   [distinct_some_null] TEXT
);

You can use the --sql option to see the SQL that would be executed without actually running it:

$ wget https://latest.datasette.io/fixtures.db
$ sqlite-utils transform fixtures.db facetable \
  --type on_earth text \
  --drop complex_array \
  --drop state \
  --rename tags the_tags \
  --sql
CREATE TABLE [facetable_new_442f07e26eef] (
   [pk] INTEGER PRIMARY KEY,
   [created] TEXT,
   [planet_int] INTEGER,
   [on_earth] TEXT,
   [city_id] INTEGER REFERENCES [facet_cities]([id]),
   [neighborhood] TEXT,
   [the_tags] TEXT,
   [distinct_some_null] TEXT
);
INSERT INTO [facetable_new_442f07e26eef] ([pk], [created], [planet_int], [on_earth], [city_id], [neighborhood], [the_tags], [distinct_some_null])
   SELECT [pk], [created], [planet_int], [on_earth], [city_id], [neighborhood], [tags], [distinct_some_null] FROM [facetable];
DROP TABLE [facetable];
ALTER TABLE [facetable_new_442f07e26eef] RENAME TO [facetable];

Plenty more tricks

sqlite-utils has plenty more tricks up its sleeve. I suggest spending some time browsing the Python library reference and the sqlite-utils CLI documentation, or taking a look through through the release notes.

This is Executing advanced ALTER TABLE operations in SQLite by Simon Willison, posted on 23rd September 2020.

Part of series New features in sqlite-utils

  1. sqlite-utils: a Python library and CLI tool for building SQLite databases - Feb. 25, 2019, 3:29 a.m.
  2. Fun with binary data and SQLite - July 30, 2020, 11:22 p.m.
  3. Executing advanced ALTER TABLE operations in SQLite - Sept. 23, 2020, 1 a.m.
  4. Refactoring databases with sqlite-utils extract - Sept. 23, 2020, 4:02 p.m.
  5. Joining CSV and JSON data with an in-memory SQLite database - June 19, 2021, 10:55 p.m.
  6. Apply conversion functions to data in SQLite columns with the sqlite-utils CLI tool - Aug. 6, 2021, 6:05 a.m.
  7. What's new in sqlite-utils 3.20 and 3.21: --lines, --text, --convert - Jan. 11, 2022, 6:19 p.m.
  8. … more

Next: Refactoring databases with sqlite-utils extract

Previous: Weeknotes: datasette-seaborn, fivethirtyeight-polls