Trying to end the pandemic a little earlier with VaccinateCA
This week I got involved with the VaccinateCA effort. We are trying to end the pandemic a little earlier, by building the most accurate database possible of vaccination locations and availability in California.
I’ve been following this project for a while through Twitter, mainly via Patrick McKenzie—here’s his tweet about the project from January 20th.
https://t.co/JrD5mb4TAN calls medical professionals daily to ask who they could vaccinate and how to get in line. We publish this, covering the entire state of California, to help more people get their vaccines faster. Please tell your friends and networks.
- Patrick McKenzie (@patio11) January 20, 2021
The core idea is one of those things that sounds obviously correct the moment you hear it. The Covid vaccination roll-out is decentralized and pretty chaotic. VaccinateCA figured out that the best way to figure out where the vaccine is available is to call the places that are distributing it—pharmacies, hospitals, clinics—as often as possible and ask if they have any in stock, who is eligible for the shot and how people can sign up for an appointment.
What We’ve Learned (So Far) by Patrick talks about lessons learned in the first 42 days of the project.
There are three public-facing components to VaccinateCA:
- www.vaccinateca.com is a website to help you find available vaccines near you.
- help.vaccinateca is the web app used by volunteers who make calls: it provides a script and buttons to submit information gleaned from the call. If you’re interested in volunteering, there’s information on the website.
- api.vaccinateca is the public API, which is documented here and is also used by the end-user facing website. It provides a full dump of collected location data, plus information on county policies and large-scale providers (pharmacy chains, health care providers).
The system currently mostly runs on Airtable, and takes advantage of pretty much every feature of that platform.
Why I got involved
Jesse Vincent convinced me to get involved. It turns out to be a perfect fit for both my interests and my skills and experience.
I’ve built crowdsourcing platforms before: for MPs’ expense reports at the Guardian, and then for conference and event listings with our startup, Lanyrd.
VaccinateCA is a very data-heavy organization: the key goal is to build a comprehensive database of vaccine locations and availability. My background in data journalism and the last three years I’ve spent working on Datasette have given me a wealth of relevant experience here.
And finally… VaccinateCA is quickly running up against the limits of what you can sensibly do with Airtable, especially given Airtable’s hard limit of 100,000 records. They need to port critical tables to a custom PostgreSQL database, while keeping as much as possible of the agility that Airtable has enabled for them.
Django is a great fit for this kind of challenge, and I know quite a bit about both Django and using Django to quickly build robust, scalable and maintainable applications!
So I spent this week starting a Django replacement for the Airtable backend used by the volunteer calling application. I hope to get to feature parity (at least as an API backend that the application can write to) in the next few days, to demonstrate that a switch-over is both possible and a good idea.
What about Datasette?
On Monday I spun up a Datasette instance at vaccinateca.datasette.io (underlying repository) against data from the public VaccinateCA API. The map visualization of all of the locations instantly proved useful in helping spot locations that had been recorded with incorrect latitudes and longitudes, placing them outside of California.
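The same sanity check that the map made visible can also be done in code. Here is a minimal sketch, assuming each location is a dictionary with `latitude` and `longitude` keys (hypothetical field names, not necessarily what the real API returns) and using a rough bounding box for California that I picked by hand. A rectangle like this also covers parts of neighbouring states, so it only catches the more obvious mistakes.

```python
# Approximate bounding box for California. These values are assumptions,
# chosen to be slightly generous; the box also overlaps Nevada and Arizona.
CA_LAT_RANGE = (32.5, 42.1)
CA_LON_RANGE = (-124.5, -114.0)


def outside_california(locations):
    """Return locations whose coordinates are missing or fall outside the box."""
    suspicious = []
    for loc in locations:
        lat, lon = loc.get("latitude"), loc.get("longitude")
        if lat is None or lon is None:
            suspicious.append(loc)
            continue
        in_lat = CA_LAT_RANGE[0] <= lat <= CA_LAT_RANGE[1]
        in_lon = CA_LON_RANGE[0] <= lon <= CA_LON_RANGE[1]
        if not (in_lat and in_lon):
            suspicious.append(loc)
    return suspicious


# Made-up sample records for illustration only.
sample = [
    {"name": "Pharmacy A", "latitude": 37.77, "longitude": -122.42},  # San Francisco
    {"name": "Clinic B", "latitude": 34.05, "longitude": 118.24},     # longitude sign flipped
    {"name": "Hospital C", "latitude": 40.71, "longitude": -74.01},   # New York
]
flagged = [loc["name"] for loc in outside_california(sample)]
```

A check like this complements the map rather than replacing it: the map is better for spotting points that are inside the state but in the wrong city.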
I hope to use Datasette for a variety of tasks like this, but it shouldn’t be the core of the solution. VaccinateCA is the perfect example of a problem that needs to be solved with Boring Technology—it needs to Just Work, and time that could be spent learning exciting new technologies needs to be spent building what’s needed as quickly, robustly and risk-free as possible.
That said, I’m already starting to experiment with the new JSONField introduced in Django 3.1—I’m hoping that a few JSON columns can help compensate for the lack of flexibility compared to Airtable, which makes it ridiculously easy for anyone to add additional columns.
(To be fair, JSONField has been a feature of Django’s PostgreSQL extension since version 1.9 in 2015, so it has just about made it into the boring technology bucket by now.)
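To illustrate the idea of a JSON column as an escape hatch for ad-hoc fields, here is a rough sketch. It deliberately uses Python's standard library `sqlite3` and `json` modules rather than Django and PostgreSQL, so that it runs without any setup; in a Django model the equivalent would be a field such as `extra = models.JSONField(default=dict)`. The table name, column names and sample values are all made up for this example.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
# Fixed columns for the fields we know about, plus one "extra" column
# holding a JSON object for anything that does not have a column yet.
conn.execute(
    "CREATE TABLE location ("
    "id INTEGER PRIMARY KEY, name TEXT NOT NULL, extra TEXT NOT NULL DEFAULT '{}')"
)


def add_location(name, **extra):
    """Insert a location; any extra keyword arguments are stored as JSON."""
    conn.execute(
        "INSERT INTO location (name, extra) VALUES (?, ?)",
        (name, json.dumps(extra)),
    )


# Two records with different ad-hoc fields, no schema change needed.
add_location("Pharmacy A", walk_ins=True)
add_location("Clinic B", languages=["en", "es"], parking="street")

rows = [
    (name, json.loads(extra))
    for name, extra in conn.execute("SELECT name, extra FROM location ORDER BY id")
]
```

The trade-off is that nothing validates what goes into the JSON column, so fields that turn out to matter are probably worth promoting to real columns later.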
Also this week
Working on VaccinateCA has given me a chance to use some of my tools in new and interesting ways, so I got to ship a bunch of small fixes, detailed in Releases this week below.
I also recorded a five minute lightning talk about Git Scraping for next week’s NICAR 2021 data journalism conference.
I also made a few small cosmetic upgrades to the way tags are displayed on my blog—they now show with a rounded border and purple background, and include a count of items published with that tag. My tags page is one example of where I’ve now applied this style.
TIL this week
- Using sphinx.ext.extlinks for issue links
- Show the SQL schema for a PostgreSQL database
- Running tests against PostgreSQL in a service container
- Adding extra read-only information to a Django admin change page
- Granting a PostgreSQL user read-only access to some tables
Releases this week
Given a JSON list of objects, flatten any keys which always contain single item arrays to just a single value
datasette-auth-github: 0.13.1—(25 releases total)—2021-02-25
Datasette plugin that authenticates users against GitHub
datasette-block: 0.1.1—(2 releases total)—2021-02-25
Block all access to specific path prefixes
Python class for reading and writing data to a GitHub repository
csv-diff: 1.1—(9 releases total)—2021-02-23
Python CLI tool and library for diffing CSV and JSON files
sqlite-transform: 0.4—(5 releases total)—2021-02-22
Tool for running transformations on columns in a SQLite database
airtable-export: 0.5—(7 releases total)—2021-02-22
Export Airtable data to YAML, JSON or SQLite files on disk
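One of the release descriptions above, flattening keys that always contain single-item arrays, is easy to show in a few lines. This is my own minimal sketch of that rule, not the code from the released tool. It assumes that a key counts as "always" a single-item array if that holds in every object where the key is present.

```python
def flatten_single_item_arrays(rows):
    """Unwrap keys whose value is a one-item list in every row that has the key."""
    keys = {key for row in rows for key in row}
    flattenable = {
        key
        for key in keys
        if all(
            isinstance(row[key], list) and len(row[key]) == 1
            for row in rows
            if key in row
        )
    }
    return [
        {key: (value[0] if key in flattenable else value) for key, value in row.items()}
        for row in rows
    ]


# Made-up sample data: "county" is always a one-item list, "tags" is not.
sample_rows = [
    {"id": 1, "county": ["Alameda"], "tags": ["a", "b"]},
    {"id": 2, "county": ["Marin"], "tags": ["c"]},
]
flattened = flatten_single_item_arrays(sample_rows)
```

Here `county` becomes a plain string in both rows, while `tags` stays a list because one row has two items.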