Weeknotes: Velma, more Django SQL Dashboard
Matching locations for Vaccinate The States, fun with GeoJSON and more improvements to Django SQL Dashboard.
Short version: we have scrapers and data ingesters for a whole bunch of different sources (see the vaccine-feed-ingest repository).
Part of the challenge here is how to deal with duplicates: with multiple sources of data, chances are high that the same location will show up in more than one of our input feeds.
So over the past few weeks we’ve been building a new tool code-named Velma to help handle this. It shows our volunteers a freshly scraped location and asks them to either match it to one of our existing locations (based on automated suggestions) or use it to create a brand new location in our database.
I’ve been working exclusively on the backend APIs for Velma: APIs that return new scraped data and accept and process the human matching decisions from our volunteers.
This week we’ve been expanding Velma to also cover merging potential duplicate locations within our existing corpus, so I’ve been building out the APIs for that effort as well.
I’ve also been working on new export code for making our entire set of locations available to partners and interested outside developers. We hope to launch that fully in the next few days.
One of the export formats we are working with is GeoJSON. I have a tool called geojson-to-sqlite which I released last year: this week I released an updated version with the ability to create SpatiaLite indexes and a --nl option for consuming newline-delimited GeoJSON, contributed by Chris Amico.
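To make "newline-delimited GeoJSON" concrete, here is a minimal Python sketch of reading that format, assuming each non-blank line holds one complete GeoJSON Feature object. The function names and sample data are made up for illustration; this is not the code inside geojson-to-sqlite.

```python
import json


def features_from_nl_geojson(text):
    """Parse newline-delimited GeoJSON text into a list of Feature dicts.

    Assumption: every non-blank line is a single JSON-encoded Feature.
    """
    features = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        obj = json.loads(line)
        if obj.get("type") != "Feature":
            raise ValueError(
                f"line {line_number}: expected a Feature, got {obj.get('type')!r}"
            )
        features.append(obj)
    return features


def to_feature_collection(features):
    """Wrap a list of Features in a regular GeoJSON FeatureCollection."""
    return {"type": "FeatureCollection", "features": features}


# Hypothetical sample data: two point locations, one Feature per line.
sample = "\n".join(
    [
        json.dumps(
            {
                "type": "Feature",
                "properties": {"name": "Site A"},
                "geometry": {"type": "Point", "coordinates": [-122.4, 37.8]},
            }
        ),
        json.dumps(
            {
                "type": "Feature",
                "properties": {"name": "Site B"},
                "geometry": {"type": "Point", "coordinates": [-118.2, 34.1]},
            }
        ),
    ]
)

collection = to_feature_collection(features_from_nl_geojson(sample))
```

The appeal of the line-per-feature layout is that a large export can be processed one record at a time rather than parsing a single huge FeatureCollection in one go.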
I’ve also been experimenting with SpatiaLite’s KNN mechanism using geojson-to-sqlite to load in data. Here’s a TIL showing how to use those tools together.
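For readers unfamiliar with the term, a KNN (k-nearest-neighbours) query answers the question "which k locations are closest to this point?". The sketch below shows that idea in plain Python using a brute-force haversine distance scan. It is not SpatiaLite's KNN mechanism, which I understand relies on a spatial index rather than scanning every row, and the function names and sample coordinates are illustrative assumptions.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlambda = math.radians(lon2 - lon1)
    a = (
        math.sin(dphi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    )
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def nearest(locations, lat, lon, k=3):
    """Return the k locations closest to (lat, lon), nearest first.

    Brute force: computes the distance to every location, then sorts.
    """
    return sorted(
        locations,
        key=lambda loc: haversine_km(lat, lon, loc["lat"], loc["lon"]),
    )[:k]


# Hypothetical sample data.
locations = [
    {"name": "San Francisco", "lat": 37.7749, "lon": -122.4194},
    {"name": "Oakland", "lat": 37.8044, "lon": -122.2712},
    {"name": "Los Angeles", "lat": 34.0522, "lon": -118.2437},
    {"name": "New York", "lat": 40.7128, "lon": -74.0060},
]

# The two locations closest to a point near Berkeley.
closest = nearest(locations, 37.87, -122.27, k=2)
```

A scan like this is fine for a few thousand rows, but it touches every location on every query, which is the cost an indexed KNN lookup avoids.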
Django SQL Dashboard
I released the first non-alpha version of this last week and it’s started to gain some traction: I’ve heard from a few people who are trying it out on their projects and it seems to work, so that’s good!
I released version 0.14 yesterday with a bunch of fixes based on feedback from users, plus a security fix that closes a hole where users without the execute_sql permission but with access to the Django Admin could modify the SQL in saved dashboards and hence execute their own custom queries.
I also made a bunch of improvements to the documentation, including adding screenshots and demo links to the widgets page.
TIL this week
- The Wikipedia page stats API
- Vega-Lite bar charts in the same order as the data
- Enabling a gin index for faster LIKE queries
- KNN queries with SpatiaLite
- Django data migration using a PostgreSQL CTE