Weeknotes: CDC vaccination history fixes, developing in GitHub Codespaces
I spent the last week mostly surrounded by boxes: we’re completing our move to the new place and life is mostly unpacking now. I did find some time to fix some issues with my CDC vaccination history Datasette instance though.
Fixing my CDC vaccination history site
I started tracking changes made to the CDC’s COVID Data Tracker website back in February. I created a git scraper repository for it as part of my five-minute lightning talk on git scraping (notes and video) at this year’s NICAR data journalism conference.
Since then it’s been quietly ticking along, recording the latest data in a git repository that now has 335 commits.
In March I added a script to build the collected historic data into a SQLite database and publish it to Vercel using GitHub. That started breaking a few weeks ago, and it turned out that was because the database file had grown to the point where it was too large to deploy to Vercel (~100MB).
I got a bug report about this, so I took some time to move the deployment over to Google Cloud Run, which doesn’t have a documented size limit (though in my experience it starts to creak once you go above about 2GB).
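A size check like the following could catch this kind of failure before a deploy is attempted. This is only a sketch: the function name, the target labels and the exact threshold are assumptions for illustration (the only figure given here is "~100MB"), and it is not the script used for this site.

```python
import os

# Assumed threshold: the post only says "~100MB", so the exact limit is a guess.
ASSUMED_VERCEL_LIMIT_BYTES = 100 * 1024 * 1024


def pick_deploy_target(db_path, limit_bytes=ASSUMED_VERCEL_LIMIT_BYTES):
    """Return "vercel" if the database file is under the limit, else "cloudrun"."""
    size = os.path.getsize(db_path)
    return "vercel" if size < limit_bytes else "cloudrun"
```

A build script could call `pick_deploy_target("cdc.db")` (hypothetical filename) and fail loudly, or switch hosts, instead of discovering the problem from a broken deploy.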
I also started publishing the raw collected data directly as a CSV file, partly as an excuse to learn how to publish to Google Cloud Storage.
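As a rough illustration of the CSV step, here is one way collected snapshots could be flattened into a single CSV using only the standard library. The snapshot shape (a commit date plus a list of row dicts) and the `commit_date` column are assumptions made for this sketch, and the upload to Google Cloud Storage is deliberately left out.

```python
import csv
import io


def snapshots_to_csv(snapshots):
    """Flatten (commit_date, rows) snapshots into one CSV string.

    Each row is a dict. The header is the union of all keys seen,
    in first-seen order, with commit_date as the first column.
    """
    fieldnames = ["commit_date"]
    for _, rows in snapshots:
        for row in rows:
            for key in row:
                if key not in fieldnames:
                    fieldnames.append(key)

    out = io.StringIO()
    # restval fills in columns that an older snapshot did not have.
    writer = csv.DictWriter(out, fieldnames=fieldnames, restval="")
    writer.writeheader()
    for commit_date, rows in snapshots:
        for row in rows:
            writer.writerow({"commit_date": commit_date, **row})
    return out.getvalue()
```

Taking the union of keys means a column added upstream partway through the history does not break the export. Earlier rows simply get an empty value for it.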
I released an extremely simple plugin this week called datasette-template-request—all it does is expose Datasette’s request object in the context passed to custom templates, for people who want to update their custom page based on incoming request parameters.
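A plugin like this can be very small. The sketch below uses Datasette's `extra_template_vars` plugin hook, which returns extra variables for the template context. It is an approximation written for illustration, not a copy of the released plugin's source, and the `ImportError` fallback exists only so the snippet can run where Datasette is not installed.

```python
try:
    from datasette import hookimpl
except ImportError:
    # Fallback (an assumption for this sketch) so the code can be
    # exercised without Datasette installed.
    def hookimpl(func):
        return func


@hookimpl
def extra_template_vars(request):
    # The returned keys are added to the context of rendered templates.
    return {"request": request}
```

With that in place, a custom template could read a query string parameter with something like `{{ request.args.get("q") }}`.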
More notable is how I built the plugin: this is the first plugin I’ve developed, tested and released entirely in my browser using the new GitHub Codespaces online development environment.
I created the new repo using my Datasette plugin template repository, opened it up in Codespaces, implemented the plugin and tests, tried it out using the port forwarding feature and then published it to PyPI using the publish.yml workflow.
Not having to even open a text editor on my laptop (let alone get a new Python development environment up and running) felt really good. I should turn this into a tutorial.
Releases this week
Expose the Datasette request object to custom templates
datasette-notebook: 0.1a1—(2 releases total)—2021-09-22
A markdown wiki and dashboarding system for Datasette
datasette-render-markdown: 2.0—(8 releases total)—2021-09-22
Datasette plugin for rendering Markdown
sqlite-utils: 3.17.1—(87 releases total)—2021-09-22
Python CLI utility and library for manipulating SQLite databases
twitter-to-sqlite: 0.22—(28 releases total)—2021-09-21
Save data from Twitter to a SQLite database
TIL this week