Weeknotes: Datasette 0.40, various projects, Dogsheep photos
A new release of Datasette, two new projects and progress towards a Dogsheep photos solution.
I released Datasette 0.40 last night. Full release notes are here, but the key feature in this release is the ability to provide metadata in a metadata.yaml file as an alternative to metadata.json. This is particularly useful for embedded multi-line SQL queries: I’ve upgraded simonw/museums and simonw/til to take advantage of this, since they both use their metadata to define SQL queries that power their search pages and Atom feeds.
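To illustrate why multi-line SQL is awkward in metadata.json, here is a small Python sketch that serializes a canned query to JSON. The database name, query name and SQL are made up for illustration; only the databases / queries / sql nesting follows Datasette's metadata layout.

```python
import json

# Hypothetical canned query: the "til" database name, the "search"
# query name and the SQL itself are illustrative assumptions.
SEARCH_SQL = """
select title, url
from til
where title like :q
order by created desc
""".strip()

metadata = {
    "databases": {
        "til": {
            "queries": {
                "search": {"sql": SEARCH_SQL}
            }
        }
    }
}

# JSON has no multi-line string literal, so the whole query ends up
# on a single line with its newlines escaped.
encoded = json.dumps(metadata)
print(encoded)
```

In a YAML file the same query can be written as a block scalar (for example `sql: |-` followed by the indented query), which keeps it readable as ordinary multi-line SQL.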
A JSK fellows directory and twitter-to-sqlite 0.21
It’s a neat example of taking advantage of SQLite views to build faceted search across a subset of data. A script constructs the jsk_fellows view at build time, then metadata.json configures that view to run full-text search and facet by the derived fellowship column.
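A minimal sketch of the view-plus-facet idea, using Python's sqlite3 module. The users and list_members tables and their columns are hypothetical stand-ins, not the real twitter-to-sqlite schema, and the full-text search part is left out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical source tables (assumed shape, for illustration only)
create table users (id integer primary key, name text, description text);
create table list_members (list text, user_id integer references users(id));

insert into users values
  (1, 'Ada', 'Reporter'),
  (2, 'Grace', 'Editor'),
  (3, 'Linus', 'Developer');
insert into list_members values
  ('jsk-fellows-2019', 1),
  ('jsk-fellows-2020', 2),
  ('jsk-fellows-2020', 3);

-- The view exposes a subset of the data plus a derived "fellowship" column
create view jsk_fellows as
select users.id, users.name, users.description,
       replace(list_members.list, 'jsk-fellows-', '') as fellowship
from users
join list_members on users.id = list_members.user_id;
""")

# Faceting by a column boils down to a count grouped by that column
facet_counts = conn.execute(
    "select fellowship, count(*) as n from jsk_fellows "
    "group by fellowship order by n desc, fellowship"
).fetchall()
print(facet_counts)
```

Because the derived column lives in the view, anything that can query the view can facet on it without knowing how it was computed.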
I shipped twitter-to-sqlite 0.21 with a new twitter-to-sqlite lists username command as part of this project.
TILs and datasette-template-sql 1.0
I described my new TILs project on Monday. I’ve published 15 so far—the format is working really well for me.
Hacking on simonw/til reminded me of a feature gap in my datasette-template-sql plugin: it didn’t have a solution for safely escaping parameters in SQL queries, leading to nasty string-concatenated SQL queries.
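The general problem can be shown with plain sqlite3. This is not the plugin's own API, just a sketch of the difference between concatenating a value into SQL and passing it as a named parameter. The til table and the titles_for_topic helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("create table til (topic text, title text)")
conn.executemany(
    "insert into til values (?, ?)",
    [("sqlite", "Views"), ("python", "It's a trap")],
)


def titles_for_topic(conn, topic):
    # The value is bound as a named parameter, so it is never parsed as SQL
    rows = conn.execute(
        "select title from til where topic = :topic order by title",
        {"topic": topic},
    )
    return [row[0] for row in rows]


hostile = "x' or '1'='1"

# String concatenation: the quote in the input changes the query itself
unsafe_sql = "select title from til where topic = '" + hostile + "'"
unsafe = conn.execute(unsafe_sql).fetchall()

# Parameter binding: the same input is treated as an ordinary value
safe = titles_for_topic(conn, hostile)

print(len(unsafe), safe)
```

The concatenated version matches every row in the table, while the parameterized version matches none, because no topic has that literal value.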
I released github-to-sqlite 2.0 with a small backwards-incompatible change to the database schema (hence the major version increment). It builds on 1.1 from a few days ago, which added a new github-to-sqlite contributors command for fetching statistics on contributors to repositories.
More importantly, I improved the live demo running at github-to-sqlite.dogsheep.net.
Want to see which of my projects have had the most releases? Facet releases by repo.
Bringing all of my different project data together in one place like this is really powerful.
I think it’s a great illustration of the Datasette/Dogsheep philosophy of pulling down a complete SQLite-powered copy of data from external services so you can query and join across your data without being limited to the functionality that those services provide through their own interfaces or APIs.
Dogsheep is about bringing all of my interesting personal and social data into a single, private place.
The biggest thing missing at the moment is photos. I want to be able to query my photos with SQL, and eventually combine them with tweets, checkins etc in a unified timeline.
Last week I took a step towards this goal with heic-to-jpeg, a proxy to let me display my iPhone’s HEIC photos online.
This week I started work on photos-to-sqlite, the set of tools I’ll use to turn my photos into something I can run queries against.
So far I’ve mainly been figuring out how to get them into an S3 bucket that I control. Once configured, running photos-to-sqlite upload photos.db ~/Pictures/Photos\ Library.photoslibrary/originals will start uploading every photo it can find in that directory to the S3 bucket.
The filename it uses is the sha256 hash of the photo file contents, which I’m hoping will let me de-dupe photos from multiple sources in the future. It also writes basic metadata on the photos to that photos.db SQLite database.
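A rough sketch of the content-hash idea, not the actual photos-to-sqlite code. The photos table schema and the record_photo helper are assumptions made for illustration, and the S3 upload step is left out entirely.

```python
import hashlib
import sqlite3
import tempfile
from pathlib import Path


def sha256_of_file(path, chunk_size=1024 * 1024):
    # Hash in chunks so large photo files are never fully loaded into memory
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def record_photo(conn, path):
    # Hypothetical schema: the hash is the primary key, so a second copy
    # of the same bytes is skipped by "insert or ignore"
    path = Path(path)
    sha = sha256_of_file(path)
    conn.execute(
        "create table if not exists photos "
        "(sha256 text primary key, filepath text, size integer)"
    )
    conn.execute(
        "insert or ignore into photos values (?, ?, ?)",
        (sha, str(path), path.stat().st_size),
    )
    return sha


# Two files with identical contents but different names
conn = sqlite3.connect(":memory:")
with tempfile.TemporaryDirectory() as tmp:
    first = Path(tmp) / "IMG_0001.jpg"
    second = Path(tmp) / "copy-of-IMG_0001.jpg"
    first.write_bytes(b"same bytes")
    second.write_bytes(b"same bytes")
    key_a = record_photo(conn, first)
    key_b = record_photo(conn, second)

row_count = conn.execute("select count(*) from photos").fetchone()[0]
print(key_a == key_b, row_count)
```

Because the key depends only on the file contents, the same photo arriving from two different sources maps to the same name and a single row.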
This is going to be a big project. I’m investigating osxphotos to liberate the metadata from Apple Photos, and various Python libraries for extracting EXIF data from the files themselves.
Once I’ve got that working, I can experiment with things like piping photos through Google Cloud Vision to label them based on their contents.
This is all a very, very early alpha at the moment, but I’m cautiously optimistic about progress so far.