Simon Willison’s Weblog

Building a Covid sewage Twitter bot (and other weeknotes)

I built a new Twitter bot today: @covidsewage. It tweets a daily screenshot of the latest Covid sewage monitoring data published by Santa Clara county.

I’m increasingly distrustful of Covid numbers as fewer people are tested in ways that feed into the official statistics. But the sewage numbers don’t lie! As the Santa Clara county page explains:

SARS-CoV-2 (the virus that causes COVID-19) is shed in feces by infected individuals and can be measured in wastewater. More cases of COVID-19 in the community are associated with increased levels of SARS-CoV-2 in wastewater, meaning that data from wastewater analysis can be used as an indicator of the level of transmission of COVID-19 in the community.

That page also embeds some beautiful charts of the latest numbers, powered by an embedded Observable notebook built by Zan Armstrong.

Once a day, my bot tweets a screenshot of those latest charts that looks like this:

Screenshot of a tweet that says "Latest Covid sewage charts for the SF Bay Area" with an attached screenshot of some charts. The numbers are trending up in an alarming direction.

How the bot works

The bot runs once a day using this scheduled GitHub Actions workflow.

Here’s the bit of the workflow that generates the screenshot:

- name: Generate screenshot with shot-scraper
  run: |-
    shot-scraper https://covid19.sccgov.org/dashboard-wastewater \
      -s iframe --wait 3000 -b firefox --retina -o /tmp/covid.png

This uses my shot-scraper screenshot tool, described here previously. It takes a retina screenshot just of the embedded iframe, and uses Firefox because for some reason the default Chromium screenshot failed to load the embed.
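
For anyone scripting this outside of GitHub Actions, the same invocation can be assembled from Python. This is only a minimal sketch, assuming shot-scraper is installed and on the PATH. The helpers `build_shot_scraper_command` and `take_screenshot` are hypothetical names, and the only flags used are the ones that appear in the workflow step above.

```python
import subprocess


def build_shot_scraper_command(url, output, selector="iframe",
                               wait_ms=3000, browser="firefox", retina=True):
    """Assemble the argv list for the shot-scraper call shown in the workflow."""
    command = [
        "shot-scraper", url,
        "-s", selector,          # screenshot only the element matching this selector
        "--wait", str(wait_ms),  # pause (in milliseconds) before taking the shot
        "-b", browser,           # browser engine to use
    ]
    if retina:
        command.append("--retina")
    command += ["-o", output]
    return command


def take_screenshot(url, output, **options):
    """Run shot-scraper; raises CalledProcessError if it exits non-zero."""
    subprocess.run(build_shot_scraper_command(url, output, **options), check=True)
```

Calling `take_screenshot("https://covid19.sccgov.org/dashboard-wastewater", "/tmp/covid.png")` should then behave like the workflow step. Passing the arguments as a list avoids any shell quoting problems.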

This bit sends the tweet:

- name: Tweet the new image
  env:
    TWITTER_CONSUMER_KEY: ${{ secrets.TWITTER_CONSUMER_KEY }}
    TWITTER_CONSUMER_SECRET: ${{ secrets.TWITTER_CONSUMER_SECRET }}
    TWITTER_ACCESS_TOKEN_KEY: ${{ secrets.TWITTER_ACCESS_TOKEN_KEY }}
    TWITTER_ACCESS_TOKEN_SECRET: ${{ secrets.TWITTER_ACCESS_TOKEN_SECRET }}
  run: |-
    tweet-images "Latest Covid sewage charts for the SF Bay Area" \
      /tmp/covid.png --alt "Screenshot of the charts" > latest-tweet.md

tweet-images is a tiny new tool I built for this project. It uses the python-twitter library to send a tweet with one or more images attached to it.
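
To give a rough idea of what sending a tweet with images looks like in python-twitter, here is a minimal sketch. It is not the actual tweet-images source: `build_api` and `tweet_with_images` are hypothetical helper names, and it assumes the four credentials are available under the same environment variable names used in the workflow.

```python
import os


def build_api():
    """Create a python-twitter client from the four environment variables."""
    import twitter  # python-twitter; imported lazily so the sketch loads without it

    return twitter.Api(
        consumer_key=os.environ["TWITTER_CONSUMER_KEY"],
        consumer_secret=os.environ["TWITTER_CONSUMER_SECRET"],
        access_token_key=os.environ["TWITTER_ACCESS_TOKEN_KEY"],
        access_token_secret=os.environ["TWITTER_ACCESS_TOKEN_SECRET"],
    )


def tweet_with_images(api, text, image_paths, alts=()):
    """Upload each image, attach optional alt text, then post a single tweet."""
    media_ids = []
    for index, path in enumerate(image_paths):
        with open(path, "rb") as image_file:
            media_id = api.UploadMediaSimple(image_file)
        # Only set alt text when one was supplied for this image
        if index < len(alts) and alts[index]:
            api.PostMediaMetadata(media_id, alts[index])
        media_ids.append(media_id)
    # Pass the uploaded media IDs to PostUpdate via its media argument
    return api.PostUpdate(text, media=media_ids)
```

Uploading the images first and attaching them by ID is what makes it possible to set alt text before the tweet goes out.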

The hardest part of the project was getting the credentials for sending tweets with the bot! I had to go through Twitter’s manual verification flow, presumably because I checked the “bot” option when I applied for the new developer account. I also had to figure out how to extract all four credentials (with write permissions) from the Twitter developer portal.

I wrote up full notes on this in a TIL: How to get credentials for a new Twitter bot.

Datasette for geospatial analysis

I stumbled across datanews/amtrak-geojson, a GitHub repository containing GeoJSON files (from 2015) showing all of the Amtrak stations and sections of track in the USA.

I decided to try exploring it using my geojson-to-sqlite tool, which revealed a bug triggered by records with a geometry but no properties. I fixed that in version 1.0.1, and later shipped version 1.1 with improvements by Chris Amico.
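
This class of bug is easy to hit because the GeoJSON specification allows a feature's `properties` member to be `null`. Here is a minimal sketch of the defensive pattern. It is an illustration only, not the actual geojson-to-sqlite code, and `feature_to_row` is a hypothetical helper.

```python
import json


def feature_to_row(feature):
    """Flatten one GeoJSON feature into a dict suitable for a table row."""
    # "properties" may be missing or null, so fall back to an empty dict
    row = dict(feature.get("properties") or {})
    geometry = feature.get("geometry")
    # Store the geometry as a JSON string, or None when the feature has none
    row["geometry"] = json.dumps(geometry) if geometry is not None else None
    return row
```

A feature with a geometry and `"properties": null` then produces a row containing only the `geometry` column instead of raising an error.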

In exploring the Amtrak data I found myself needing to learn how to use the SpatiaLite GUnion function to aggregate multiple geometries together. This resulted in a detailed TIL on using GUnion to combine geometries in SpatiaLite, which further evolved as I used it as a chance to learn how to use Chris’s datasette-geojson-map and sqlite-colorbrewer plugins.
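
As a rough illustration of the idea, GUnion can be used as an aggregate function alongside `group by` to merge many geometries into one per group. This sketch assumes a hypothetical `tracks` table with `route` and `geometry` columns, which may not match the real Amtrak data. `union_by_route` and `connect_with_spatialite` are hypothetical helpers too, and loading the extension requires SpatiaLite to be installed.

```python
import sqlite3

# Hypothetical table and column names; GUnion() and AsGeoJSON() come from SpatiaLite
UNION_SQL = """
select route, AsGeoJSON(GUnion(geometry)) as geometry
from tracks
group by route
order by route
"""


def connect_with_spatialite(path, extension="mod_spatialite"):
    """Open a database and load the SpatiaLite extension into the connection."""
    conn = sqlite3.connect(path)
    conn.enable_load_extension(True)
    conn.load_extension(extension)
    return conn


def union_by_route(conn):
    """Return one combined geometry per route as (route, geojson) tuples."""
    return conn.execute(UNION_SQL).fetchall()
```

Each row returned holds a single GeoJSON geometry covering every track segment in that group, which is a convenient shape for plotting on a map.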

This was so much fun that I was inspired to add a new “uses” page to the official Datasette website: Datasette for geospatial analysis now gathers together links to plugins, tools and tutorials for handling geospatial data.

sqlite-utils 3.26

I’ll quote the release notes for sqlite-utils 3.26 in full:

shot-scraper 0.12

In addition to support for WebKit contributed by Ryan Murphy, shot-scraper 0.12 adds options for taking a screenshot that encompasses all of the elements on a page that match a CSS selector.

It also adds a new --js-selector option, suggested by Tony Hirst. This covers the case where you want to take a screenshot of an element on the page that cannot be easily specified using a CSS selector. For example, this expression takes a screenshot of the first paragraph on a page that includes the text “shot-scraper”:

shot-scraper https://simonwillison.net/2022/Apr/8/weeknotes/ \
  --js-selector 'el.tagName == "P" && el.innerText.includes("shot-scraper")' \
  --padding 15 --retina

And an airship museum!

I finally got to add another listing to www.niche-museums.com, my website about small or niche museums I have visited.

The Moffett Field Historical Society museum in Mountain View is situated in the shadow of Hangar One, an airship hangar built in 1933 to house the mighty USS Macon.

It’s the absolute best kind of local history museum. Our docent was a retired pilot who had landed planes on aircraft carriers using the kind of equipment now on display in the museum. They had dioramas and models. They even had a model railway. It was superb.

Releases this week

TIL this week