Weeknotes: Datasette column actions, plus three new plugins
2nd October 2020
A renewed emphasis on building out Datasette Cloud has produced three new plugins this week: datasette-dateutil, datasette-import-table and datasette-edit-schema, plus a major improvement to Datasette’s default interface for browsing tables.
Datasette column actions
If you take a look at the latest.datasette.io demo (continuously deployed with the latest Datasette main branch) you’ll see a new UI element: cog icons on each table column, which reveal a drop-down menu.
I’m calling these “column actions”, and they’re an improvement I’ve been planning for a long time.
So far these solve two problems:
- Sorting a column in descending order previously required two full page load clicks—one to sort ascending, and another to reverse that order. You can now sort descending directly from the menu.
- Datasette lets you facet by any column by appending
?_facet=column-nameto the URL, but this is not a very discoverable feature! It suggests facets for columns that have less than 30 unique values, but sometimes you want to facet by something else—US states for example. You can now facet any column using the cog menu, with the exception of single unique primary key columns since faceting by those is guaranteed to only return one result per value.
Additionally, if a column has visible blank values on the page a “Show not-blank rows” option will be displayed, for quickly seeing rows that have something interesting in that column.
I’m planning to make these available for extension by plugins—I’m sure there are all sorts of other potentially useful per-column actions.
One concern I had in building these was accessibility: I was particularly worried that adding the markup for a dropdown menu to a
<th> element would interfere with screenreaders when they were using those headers to describe the contents of a cell.
In the end I solved this by creating the menu as a separate absolutely positioned element, which appears in the correct place when the cog icon is clicked.
This still needs more work—I need to learn how to use ARIA to make the menus fully accesible—but I’m optimistic in the approach so far. If you use a screen reader I would deeply value your feedback in the column action accessibility issue.
If you want to try out column actions they’re available in the latest alpha release, Datasette 0.50a0.
I’m working with students in the Stanford Big Local Journalism course this term, and it’s really helping me figure out the gaps in Datasette and my forthcoming Datasette Cloud hosted offering.
I want Datasette to be as useful to journalists as possible. This means handling messy data from all manner of sources (many of them government).
One missing feature has been flexible date handling. I’ve been succesfully using the excellent python-dateutil library for this for fifteen years now, so I decided to expose its functionality within Datasette as the datasette-dateutil plugin.
The plugin adds several new custom SQL functions. Here’s an example query:
select dateutil_parse("10 october 2020 3pm"), dateutil_parse_fuzzy("This is due 10 september"), dateutil_parse("1/2/2020"), dateutil_parse("2020-03-04"), dateutil_parse_dayfirst("2020-03-04"), dateutil_easter(2020);
Try that query here. It outputs the following:
2020-10-10T15:00:00 2020-09-10T00:00:00 2020-01-02T00:00:00 2020-03-04T00:00:00 2020-04-03T00:00:00 2020-04-12
I’ve also exposed the
rrule module, which implements recurring date rules as described in the iCalendar specification. The
dateutil_rrule_date() functions return JSON arrays of dates matching the rule, for example:
select dateutil_rrule('FREQ=HOURLY;COUNT=5'), dateutil_rrule_date( 'FREQ=DAILY;COUNT=3', '1st jan 2020' );
Try it—here are the results:
[ "2020-10-02T23:12:55", "2020-10-03T00:12:55", "2020-10-03T01:12:55", "2020-10-03T02:12:55", "2020-10-03T03:12:55" ] [ "2020-01-01", "2020-01-02", "2020-01-03" ]
These are SQLite strings containing JSON arrays—but you can use SQLite’s
json_each() function to expand them out into tables that can then be joined against other tables! The README goes into more details on how you can do that.
This is particularly useful for solving problems like generating the number-of-items per day for a given time period while still returning a 0 for dates that do not have any corresponding items.
I previewed this last week, when it was called
datasette-edit-table. I realized that
datasette-edit-schema is a better name as it doesn’t imply that you can edit the rows within the table—just the schema of the table itself.
Release 0.3 is the first non-alpha release, and provides the following capabilities:
- Rename a table
- Add new columns to a table
- Rename columns in a table
- Modify the type of columns in a table
- Re-order the columns in a table
- Delete a table
This is particularly useful should you import a messy CSV file (using datasette-upload-csvs) and then need to rename some columns, change their types (so that you can sort by integers and floats instead of everything being text), then re-order them to make more sense.
It also means you can rename
LON columns to
longitude so that datasette-cluster-map knows that it can render them on a map.
If you’re going to modify and clean data, it can be helpful to start with an existing table. datasette-import-table lets you provide the URL to an existing Datasette table hosted anywhere on the internet—it will then create a local copy of that table by paginating through the JSON API for that table and importing all of the rows.
TIL this week
- Figuring out if a text value in SQLite is a valid integer or float
- Escaping strings in Bash using !:q
- Installing Selenium for Python on macOS with ChromeDriver
Releases this week
- datasette 0.50a0—2020-10-01
- datasette-cluster-map 0.12.4—2020-10-01
- datasette-edit-schema 0.3—2020-10-01
- datasette-edit-schema 0.3a2—2020-10-01
- datasette-cluster-map 0.12.3—2020-10-01
- datasette-cluster-map 0.12.2—2020-10-01
- datasette-cluster-map 0.12.1—2020-09-30
- datasette-edit-schema 0.3a1—2020-09-30
- datasette-edit-schema 0.3a—2020-09-30
- datasette-dateutil 0.2.1—2020-09-30
- datasette-dateutil 0.2—2020-09-29
- datasette-import-table 0.2—2020-09-28
- datasette-import-table 0.1—2020-09-28
- datasette-import-table 0.1a2—2020-09-28
- datasette-import-table 0.1a1—2020-09-28
- datasette-import-table 0.1a0—2020-09-28
- datasette-dateutil 0.1—2020-09-28
More recent articles
- llamafile is the new best way to run a LLM on your own computer - 29th November 2023
- Prompt injection explained, November 2023 edition - 27th November 2023
- I'm on the Newsroom Robots podcast, with thoughts on the OpenAI board - 25th November 2023
- Weeknotes: DevDay, GitHub Universe, OpenAI chaos - 22nd November 2023
- Deciphering clues in a news article to understand how it was reported - 22nd November 2023
- Exploring GPTs: ChatGPT in a trench coat? - 15th November 2023
- Financial sustainability for open source projects at GitHub Universe - 10th November 2023
- ospeak: a CLI tool for speaking text in the terminal via OpenAI - 7th November 2023
- DALL-E 3, GPT4All, PMTiles, sqlite-migrate, datasette-edit-schema - 30th October 2023
- Now add a walrus: Prompt engineering in DALL-E 3 - 26th October 2023